Amazon RDS - Comprehensive Guide
🗄️ RDS Overview
What is RDS?
- Relational Database Service: Managed database service
- Multiple engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora
- Automated operations: Backup, patching, monitoring, scaling
- High availability: Multi-AZ deployments
RDS vs Self-managed Database
| Feature | RDS | Self-managed EC2 |
|---|---|---|
| Setup | Minutes | Hours/Days |
| Patching | Automated | Manual |
| Backup | Automated | Manual setup |
| Monitoring | Built-in | Setup required |
| Scaling | Push-button | Manual |
| Cost | Higher per hour | Lower per hour, higher ops cost |
🏗️ Database Engines
1. Amazon Aurora
- MySQL/PostgreSQL compatible
- Performance: Up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL (per AWS)
- Storage: Auto-scaling from 10GB to 128TB
- Replicas: Up to 15 read replicas
- Multi-master: Write scaling
2. MySQL
- Versions: 5.6, 5.7, 8.0
- Storage: gp2, gp3, io1, io2
- Max storage: 64TB
- Read replicas: Up to 15 per source instance (older documentation lists 5)
3. PostgreSQL
- Versions: 11, 12, 13, 14, 15
- Extensions: PostGIS, pg_stat_statements
- Logical replication: Cross-region
- Performance Insights: Query analysis
4. MariaDB
- MySQL compatible: Drop-in replacement
- Storage engines: InnoDB, MyRocks
- Thread pooling: Better concurrency
5. Oracle
- Editions: Standard, Enterprise
- Licensing: BYOL or License Included
- Features: Advanced Security, Partitioning
6. SQL Server
- Editions: Express, Web, Standard, Enterprise
- Windows Authentication: Domain integration
- Always On: High availability
💾 Storage Types
General Purpose SSD (gp2/gp3)
- Use case: Balanced price-performance
- IOPS: 3 IOPS per GB (gp2), up to 16,000 (gp3)
- Throughput: Up to 1,000 MiB/s (gp3)
- Burst capability: Up to 3,000 IOPS for gp2
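To make the gp2 numbers concrete, here is a small sketch of the baseline calculation. It assumes the general EBS gp2 rules (3 IOPS per GiB, a 100 IOPS floor, a 16,000 IOPS cap). The floor and the function names are not from the list above, and RDS may stripe larger volumes, so treat the results as approximate.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 IOPS per GiB, floor 100, cap 16,000 (assumed)."""
    return max(100, min(3 * size_gib, 16_000))


def gp2_can_burst(size_gib: int) -> bool:
    """Bursting to 3,000 IOPS only matters while the baseline is below 3,000."""
    return gp2_baseline_iops(size_gib) < 3_000


for size in (20, 100, 500, 1_000, 6_000):
    print(f"{size:>5} GiB: baseline {gp2_baseline_iops(size):>6} IOPS, "
          f"burst useful: {gp2_can_burst(size)}")
```

Under this model, volumes below 1,000 GiB depend on burst credits for peaks, which is one reason gp3 (with IOPS decoupled from size) is often preferred for small databases.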
Provisioned IOPS (io1/io2)
- Use case: I/O intensive workloads
- IOPS: Up to 80,000 IOPS
- Durability: 99.999% (io2)
- IOPS:Storage ratio: 50:1 maximum
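The 50:1 ratio and the IOPS ceiling interact, which a quick calculation makes easier to see. The sketch below uses only the two limits quoted in this list; the function names are hypothetical, and actual limits vary by engine and storage type.

```python
MAX_IOPS = 80_000       # ceiling quoted in the list above
MAX_IOPS_PER_GIB = 50   # 50:1 IOPS-to-storage ratio quoted above


def max_provisionable_iops(storage_gib: int) -> int:
    """Highest IOPS that can be provisioned for a given allocated storage size."""
    return min(storage_gib * MAX_IOPS_PER_GIB, MAX_IOPS)


def min_storage_for_iops(iops: int) -> int:
    """Smallest storage size (GiB) that satisfies the ratio for a target IOPS."""
    if iops > MAX_IOPS:
        raise ValueError(f"{iops} exceeds the {MAX_IOPS} IOPS ceiling")
    return -(-iops // MAX_IOPS_PER_GIB)  # ceiling division


print(max_provisionable_iops(100))   # small volume: limited by the ratio
print(max_provisionable_iops(4_000)) # large volume: limited by the ceiling
print(min_storage_for_iops(20_000))
```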
Magnetic (Standard)
- Legacy storage: Not recommended for production
- Use case: Infrequent access
- Performance: ~100 IOPS average
🔧 Instance Types
Burstable Performance (T3/T4g)
- Use case: Variable workloads
- CPU credits: Burst capability
- Cost-effective: Development/testing
Memory Optimized (R5/R6i/X1e)
- Use case: In-memory databases
- High memory-to-vCPU ratio
- SAP HANA workloads
Compute Optimized (C5/C6i)
- Use case: CPU-intensive
- High-frequency processors
- Web servers, scientific computing
🌐 High Availability & Disaster Recovery
Multi-AZ Deployments
  Primary AZ          Standby AZ
┌─────────────┐    ┌─────────────┐
│   Primary   │────│   Standby   │
│  Instance   │    │  Instance   │
│      ↓      │    │             │
│   EBS Vol   │────│   EBS Vol   │
└─────────────┘    └─────────────┘
Features:
- Synchronous replication: No data loss
- Automatic failover: 1-2 minutes
- Single endpoint: Application transparency
- Backup from standby: No I/O suspension
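As a rough illustration of how a Multi-AZ deployment is requested, the sketch below builds the keyword arguments for boto3's `create_db_instance`. The identifier, instance class, storage size, and username are placeholder assumptions; the request itself is shown only as a comment so the snippet runs without AWS credentials.

```python
def multi_az_instance_params(identifier: str,
                             instance_class: str = "db.r6i.large") -> dict:
    """Build create_db_instance arguments for an encrypted Multi-AZ MySQL instance."""
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": "mysql",
        "DBInstanceClass": instance_class,   # placeholder size
        "AllocatedStorage": 100,             # GiB, placeholder
        "StorageType": "gp3",
        "MasterUsername": "admin",           # placeholder
        "ManageMasterUserPassword": True,    # let RDS store the password in Secrets Manager
        "MultiAZ": True,                     # synchronous standby in another AZ
        "StorageEncrypted": True,
        "BackupRetentionPeriod": 7,          # days
    }


params = multi_az_instance_params("orders-db")
print(params["MultiAZ"])

# With boto3 installed and credentials configured, the request would be:
#   import boto3
#   boto3.client("rds").create_db_instance(**params)
```

The application keeps using the single instance endpoint; after a failover the same DNS name resolves to the promoted standby.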
Read Replicas
                 Primary Instance
                        ↓ (Async replication)
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ Read Replica │ │ Read Replica │ │ Read Replica │
│    (AZ-a)    │ │    (AZ-b)    │ │  (Different  │
│              │ │              │ │    Region)   │
└──────────────┘ └──────────────┘ └──────────────┘
Features:
- Read scaling: Distribute read traffic
- Cross-region: Disaster recovery
- Eventual consistency: Async replication
- Promote to primary: Manual process
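Because each read replica has its own endpoint, the application has to decide where each statement goes. The sketch below shows one possible approach, a minimal round-robin router. The class, the hostnames, and the `require_fresh` flag are hypothetical, and detecting reads by a leading `SELECT` is a simplification.

```python
import itertools
from typing import Sequence


class EndpointRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, writer: str, readers: Sequence[str] = ()):
        self.writer = writer
        # cycle() gives simple round-robin; None means no replicas are configured.
        self._readers = itertools.cycle(readers) if readers else None

    def endpoint_for(self, sql: str, require_fresh: bool = False) -> str:
        """Pick an endpoint; require_fresh forces the primary to avoid replica lag."""
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and not require_fresh and self._readers is not None:
            return next(self._readers)
        return self.writer


router = EndpointRouter(
    writer="primary.example.internal",
    readers=["replica-a.example.internal", "replica-b.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET total = 0"))
```

The `require_fresh` flag reflects the eventual-consistency point above: a read that must see a write made moments earlier should go to the primary, since replicas can lag.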
📊 Backup & Recovery
Automated Backups
- Point-in-time recovery: 1-35 days retention
- Daily snapshots: During backup window
- Transaction logs: Continuous backup
- Low performance impact: With Multi-AZ, backups are taken from the standby; Single-AZ instances may see a brief I/O suspension
Manual Snapshots
- User-initiated: On-demand backup
- Retention: Until manually deleted
- Cross-region copy: Disaster recovery
- Share snapshots: Between accounts
Backup Best Practices
Backup Strategy:
  Automated Backups:
    Retention: 7 days (minimum for production)
    Window: Low-traffic period (e.g., 3:00-4:00 AM)
  Manual Snapshots:
    Before major changes: Application deployments
    Monthly: Long-term retention
    Cross-region: Disaster recovery
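The retention and window settings above can be checked before they are applied. The sketch below validates them against the 1-35 day retention range mentioned earlier and assumes the RDS rule that a backup window is written as `hh:mm-hh:mm` (UTC) and lasts at least 30 minutes. The function names are hypothetical.

```python
def _to_minutes(hhmm: str) -> int:
    """Convert 'hh:mm' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def validate_backup_settings(retention_days: int, window: str) -> int:
    """Return the window length in minutes, or raise ValueError if a setting is invalid."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention must be 1-35 days for point-in-time recovery")
    start, end = window.split("-")
    # Modulo handles windows that cross midnight, e.g. 23:45-00:15.
    length = (_to_minutes(end) - _to_minutes(start)) % (24 * 60)
    if length < 30:
        raise ValueError("backup window must be at least 30 minutes")
    return length


print(validate_backup_settings(7, "03:00-04:00"))
```

The validated values map to the `BackupRetentionPeriod` and `PreferredBackupWindow` settings of the instance.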
🔒 Security Features
Encryption
- At rest: AES-256 encryption
- In transit: SSL/TLS
- Key management: AWS KMS
- Encrypted snapshots: Snapshots of an encrypted instance are encrypted automatically
Network Security
- VPC: Private networking
- Security Groups: Instance-level firewall
- Subnet Groups: Multi-AZ placement
- VPC endpoints: Private connectivity
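A common way to apply the security group point is to allow the database port only from the application tier's security group instead of from IP ranges. The sketch below builds the arguments for boto3's EC2 `authorize_security_group_ingress`; the group IDs are placeholders and the helper name is hypothetical.

```python
def db_ingress_rule(db_sg_id: str, app_sg_id: str, port: int = 3306) -> dict:
    """Allow TCP traffic on the database port only from the app tier's security group."""
    return {
        "GroupId": db_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # Referencing a group (not a CIDR) keeps the rule valid as app hosts change.
                "UserIdGroupPairs": [
                    {"GroupId": app_sg_id, "Description": "App tier to database"}
                ],
            }
        ],
    }


rule = db_ingress_rule("sg-0123456789abcdef0", "sg-0fedcba9876543210")
print(rule["IpPermissions"][0]["FromPort"])

# With boto3 installed and credentials configured, the request would be:
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(**rule)
```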
Access Control
- IAM database authentication: Token-based
- Master user: Database admin
- Parameter groups: Configuration management
- Option groups: Feature enablement
📈 Performance Optimization
Performance Insights
- Query analysis: Top SQL statements
- Wait events: Bottleneck identification
- Historical data: 7 days free, extended with fee
- Database load: Normalized metrics
Parameter Groups
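-- Example MySQL parameters (set in a custom DB parameter group)
innodb_buffer_pool_size = {DBInstanceClassMemory*7/10} -- About 70% of instance memory
innodb_flush_log_at_trx_commit = 2 -- Faster writes, but up to about 1 second of commits can be lost on a crash
slow_query_log = 1 -- Enable slow query logging
long_query_time = 1 -- Log queries that run longer than 1 second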
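Parameters like these are changed through a custom parameter group, not a configuration file on the host. As a sketch, the helper below builds the arguments for boto3's `modify_db_parameter_group` to turn on slow query logging. The group name and helper name are hypothetical.

```python
def slow_query_log_params(group_name: str, threshold_seconds: int = 1) -> dict:
    """Build modify_db_parameter_group arguments that enable slow query logging."""
    return {
        "DBParameterGroupName": group_name,
        "Parameters": [
            {
                "ParameterName": "slow_query_log",
                "ParameterValue": "1",
                "ApplyMethod": "immediate",  # dynamic parameter, no reboot needed
            },
            {
                "ParameterName": "long_query_time",
                "ParameterValue": str(threshold_seconds),
                "ApplyMethod": "immediate",
            },
        ],
    }


request = slow_query_log_params("orders-mysql8-params")
print([p["ParameterName"] for p in request["Parameters"]])

# With boto3 installed and credentials configured, the request would be:
#   import boto3
#   boto3.client("rds").modify_db_parameter_group(**request)
```

Static parameters would use `"ApplyMethod": "pending-reboot"` instead and take effect only after the instance restarts.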
Connection Pooling
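# Application-level pooling with SQLAlchemy
from sqlalchemy import create_engine

engine = create_engine(
    'mysql://user:pass@rds-endpoint:3306/db',
    pool_size=20,        # connections kept open per process
    max_overflow=30,     # extra connections allowed under load
    pool_recycle=3600,   # replace connections older than one hour
    pool_pre_ping=True,  # test a connection before use (helps after a failover)
)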
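Pool settings are per process, so the total demand on the database grows with the number of application instances. The sketch below compares that demand with an estimate of `max_connections`. It assumes the commonly documented RDS MySQL default of `DBInstanceClassMemory/12582880` and approximates `DBInstanceClassMemory` with the nominal instance memory, so the real limit will be somewhat lower; check the parameter group for the actual value.

```python
BYTES_PER_CONNECTION = 12_582_880  # divisor in the assumed RDS MySQL default formula


def estimated_max_connections(instance_memory_gib: int) -> int:
    """Rough max_connections estimate from nominal instance memory."""
    return instance_memory_gib * 1024**3 // BYTES_PER_CONNECTION


def pool_demand(app_instances: int, pool_size: int = 20, max_overflow: int = 30) -> int:
    """Worst-case connections opened by all application processes together."""
    return app_instances * (pool_size + max_overflow)


limit = estimated_max_connections(16)   # e.g. a 16 GiB instance
for instances in (10, 30):
    demand = pool_demand(instances)
    status = "fits" if demand <= limit else "exceeds limit"
    print(f"{instances} app instances -> {demand} connections ({status}, limit ~{limit})")
```

When the demand exceeds the limit, the usual options are smaller pools, a larger instance, or a shared proxy layer in front of the database.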
💰 Pricing Models
On-Demand Instances
- Pay per hour: No commitment
- Flexible: Start/stop anytime
- Use case: Development, testing, unpredictable workloads
Reserved Instances
- 1 or 3 year terms: Up to 69% savings
- Payment options: All upfront, partial, no upfront
- Size flexibility: For most engines, coverage applies across instance sizes within the same family (RDS has no convertible RIs)
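Whether a reservation pays off depends on how many hours the instance actually runs. The sketch below works through that break-even point. The hourly prices are made-up placeholders, not AWS list prices, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 8_760


def break_even_utilization(on_demand_hourly: float, reserved_effective_hourly: float) -> float:
    """Fraction of the year an instance must run for the reservation to cost less."""
    return reserved_effective_hourly / on_demand_hourly


def annual_savings(on_demand_hourly: float, reserved_effective_hourly: float,
                   utilization: float = 1.0) -> float:
    """On-demand cost at the given utilization minus the fixed reserved cost."""
    on_demand_cost = on_demand_hourly * HOURS_PER_YEAR * utilization
    reserved_cost = reserved_effective_hourly * HOURS_PER_YEAR  # paid whether used or not
    return on_demand_cost - reserved_cost


# Placeholder prices: $0.50/h on-demand, $0.30/h effective reserved rate.
print(f"break-even at {break_even_utilization(0.50, 0.30):.0%} utilization")
print(f"savings at 100% utilization: ${annual_savings(0.50, 0.30):,.0f} per year")
print(f"savings at 40% utilization:  ${annual_savings(0.50, 0.30, 0.40):,.0f} per year")
```

A negative result at low utilization is the case where on-demand (or stopping the instance when idle) is the cheaper choice.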
Aurora Serverless
- Pay per second: When database is active
- Auto-scaling: 0.5 to 256 ACUs
- Auto-pause: Cost savings for intermittent workloads
🔄 Migration Strategies
Database Migration Service (DMS)
Source DB    →  DMS Replication Instance  →  Target RDS
    ↓                    ↓                       ↓
On-premises  →  AWS DMS                   →  AWS RDS
Oracle       →  Continuous Replication    →  PostgreSQL (with SCT)
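A DMS task decides which schemas and tables to move through a table-mapping JSON document. As a sketch, the snippet below builds a single selection rule in that format; the schema name `HR` and the helper name are hypothetical.

```python
import json


def selection_rule(schema: str, table: str = "%", rule_id: int = 1) -> dict:
    """Build one DMS selection rule; '%' is the wildcard for all tables."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": f"include-{schema}",
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": "include",
    }


# DMS expects the mapping as a JSON string, not a Python dict.
table_mappings = json.dumps({"rules": [selection_rule("HR")]}, indent=2)
print(table_mappings)
```

The resulting string is passed as `TableMappings` when creating a replication task; a `MigrationType` of `full-load-and-cdc` corresponds to the continuous replication path in the diagram.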
Schema Conversion Tool (SCT)
- Heterogeneous migrations: Oracle to PostgreSQL
- Assessment report: Conversion complexity
- Code conversion: Stored procedures, functions
Native Tools
# MySQL dump and restore
mysqldump --single-transaction -h source-host -u user -p database > backup.sql
mysql -h rds-endpoint -u user -p new-database < backup.sql
# PostgreSQL pg_dump
pg_dump -h source-host -U user -d database > backup.sql
psql -h rds-endpoint -U user -d new-database < backup.sql
🧪 Monitoring & Troubleshooting
CloudWatch Metrics
- CPU Utilization: Instance performance
- Database Connections: Connection count
- Free Storage Space: Disk usage
- IOPS: I/O performance
- Latency: Read/Write response time
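These metrics are most useful with alarms attached. The sketch below builds arguments for boto3's CloudWatch `put_metric_alarm` using metric names from the `AWS/RDS` namespace. The instance name and all thresholds are placeholder assumptions to be tuned per workload.

```python
def rds_alarm(instance_id: str, metric: str, threshold: float, comparison: str) -> dict:
    """Build put_metric_alarm arguments for one RDS instance metric."""
    return {
        "AlarmName": f"{instance_id}-{metric}",
        "Namespace": "AWS/RDS",
        "MetricName": metric,
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,              # seconds per datapoint
        "EvaluationPeriods": 3,     # 3 x 5 minutes must breach before alarming
        "Threshold": threshold,
        "ComparisonOperator": comparison,
    }


alarms = [
    rds_alarm("orders-db", "CPUUtilization", 80.0, "GreaterThanThreshold"),
    # FreeStorageSpace is reported in bytes, so 10 GiB is written out explicitly.
    rds_alarm("orders-db", "FreeStorageSpace", 10 * 1024**3, "LessThanThreshold"),
    rds_alarm("orders-db", "DatabaseConnections", 500, "GreaterThanThreshold"),
]
for alarm in alarms:
    print(alarm["AlarmName"], alarm["ComparisonOperator"], alarm["Threshold"])

# With boto3 installed and credentials configured, each alarm would be created with:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```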
Enhanced Monitoring
- OS-level metrics: Every 1-60 seconds
- Process list: Running processes
- Memory usage: Detailed memory metrics
Common Issues & Solutions
High CPU:
  Causes: Inefficient queries, missing indexes
  Solutions: Query optimization, read replicas
Storage Full:
  Causes: Rapid data growth, large transactions
  Solutions: Storage auto-scaling, log rotation
Connection Limits:
  Causes: Too many connections
  Solutions: Connection pooling, parameter tuning
Slow Queries:
  Causes: Missing indexes, lock contention
  Solutions: Query optimization, index creation
🚀 Best Practices
Performance Best Practices
- Right-size instances: Monitor and adjust
- Use read replicas: Scale read traffic
- Optimize queries: Use EXPLAIN plans
- Index strategy: Cover frequently queried columns
- Connection pooling: Reduce connection overhead
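The effect of an index on a query plan can be seen without an RDS instance. The sketch below uses Python's built-in SQLite as a stand-in, so the plan text differs from MySQL or PostgreSQL `EXPLAIN` output, but the idea (full scan before the index, index lookup after) carries over. The table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1_000)],
)

QUERY = "SELECT id, total FROM orders WHERE customer_id = ?"


def plan(connection: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return the human-readable detail column of SQLite's EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn, QUERY, (42,))
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, QUERY, (42,))

print("before index:", plan_before)
print("after index: ", plan_after)
```

On a real RDS instance the same check is done with `EXPLAIN` in MySQL or `EXPLAIN ANALYZE` in PostgreSQL against queries taken from the slow query log or Performance Insights.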
Security Best Practices
- Encryption: Enable at rest and in transit
- VPC deployment: Private networking
- Least privilege: Minimal database permissions
- Regular patching: Keep engine updated
- Backup verification: Test restore procedures
Cost Optimization
- Reserved instances: For predictable workloads
- Right-sizing: Match instance to workload
- Storage optimization: Choose appropriate type
- Delete unused: Remove old snapshots
- Aurora Serverless: For intermittent workloads
📝 Exam Tips for AWS SAA
Scenario-based Questions
- High availability: Multi-AZ for production
- Read scaling: Read replicas for read-heavy workloads
- Cross-region DR: Cross-region read replicas
- Cost optimization: Reserved instances for steady workloads
Key Differences
Multi-AZ vs Read Replicas:
  Multi-AZ: HA, automatic failover, same region
  Read Replicas: Read scaling, manual promotion, cross-region
Aurora vs Standard RDS:
  Aurora: Cloud-native, better performance, auto-scaling storage
  Standard: Traditional engines, predictable costs, established tools
📖 Summary
Amazon RDS provides a managed relational database service with:
- Multiple database engines to fit a range of use cases
- High availability through Multi-AZ deployments
- Read scaling through read replicas
- Comprehensive security features
- Performance monitoring and optimization tools
- Flexible pricing models for different workloads
A solid understanding of RDS deployment patterns, backup strategies, and performance optimization is critical for the AWS SAA exam.