Amazon RDS - Comprehensive Guide

🗄️ RDS Overview

What is RDS?

  • Relational Database Service: Managed database service
  • Multiple engines: MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, Aurora
  • Automated operations: Backup, patching, monitoring, scaling
  • High availability: Multi-AZ deployments

RDS vs Self-managed Database

Feature       RDS                Self-managed EC2
Setup         Minutes            Hours/Days
Patching      Automated          Manual
Backup        Automated          Manual setup
Monitoring    Built-in           Setup required
Scaling       Push-button        Manual
Cost          Higher per hour    Lower per hour, higher ops cost

🏗️ Database Engines

1. Amazon Aurora

  • MySQL/PostgreSQL compatible
  • Performance: Up to 5x the throughput of standard MySQL and up to 3x that of standard PostgreSQL (per AWS)
  • Storage: Grows automatically in 10 GiB increments, up to 128 TiB
  • Replicas: Up to 15 read replicas
  • Multi-master: Write scaling (offered only for Aurora MySQL version 1, MySQL 5.6 compatible)

2. MySQL

  • Versions: 5.6, 5.7, 8.0
  • Storage: gp2, gp3, io1, io2
  • Max storage: 64TB
  • Read replicas: Up to 15 (the limit was 5 in older RDS documentation)

3. PostgreSQL

  • Versions: 11, 12, 13, 14, 15
  • Extensions: PostGIS, pg_stat_statements
  • Logical replication: Cross-region
  • Performance Insights: Query analysis

4. MariaDB

  • MySQL compatible: Drop-in replacement
  • Storage engines: InnoDB, MyRocks
  • Thread pooling: Better concurrency

5. Oracle

  • Editions: Standard, Enterprise
  • Licensing: BYOL or License Included
  • Features: Advanced Security, Partitioning

6. SQL Server

  • Editions: Express, Web, Standard, Enterprise
  • Windows Authentication: Domain integration
  • Always On: High availability

💾 Storage Types

General Purpose SSD (gp2/gp3)

  • Use case: Balanced price-performance
  • IOPS: Baseline of 3 IOPS per GiB (gp2); up to 16,000 (gp3)
  • Throughput: Up to 1,000 MiB/s (gp3)
  • Burst capability: Up to 3,000 IOPS for gp2 volumes under 1 TiB
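
The gp2 figures above can be turned into a quick calculation. This is a minimal sketch, assuming the commonly documented EBS gp2 rules (3 IOPS per GiB, a floor of 100, a cap of 16,000, and a 3,000 IOPS burst for volumes whose baseline is lower). The function name `gp2_iops` is hypothetical, and RDS may apply engine-specific limits on top of these.

```python
def gp2_iops(size_gib: int) -> dict:
    """Estimate baseline and burst IOPS for a gp2 volume of the given size."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    # Baseline scales at 3 IOPS per GiB, clamped to the 100..16,000 range.
    baseline = max(100, min(3 * size_gib, 16_000))
    # Small volumes can burst to 3,000; larger ones already exceed that.
    burst = max(baseline, 3_000)
    return {"baseline": baseline, "burst": burst}


if __name__ == "__main__":
    for size in (20, 100, 500, 2_000, 10_000):
        print(size, gp2_iops(size))
```

A volume that needs more than its baseline for sustained periods is usually a candidate for gp3 or Provisioned IOPS rather than a larger gp2 volume.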

Provisioned IOPS (io1/io2)

  • Use case: I/O intensive workloads
  • IOPS: Up to 80,000 IOPS
  • Durability: 99.999% (io2)
  • IOPS:Storage ratio: 50:1 maximum (IOPS per GiB, io1)
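
The ratio limit means a small volume cannot be given arbitrarily high IOPS. A minimal sketch of that check, using only the two numbers quoted in this guide (50 IOPS per GiB and an 80,000 IOPS ceiling); the function names are hypothetical and current limits vary by engine and storage type.

```python
MAX_IOPS = 80_000   # upper bound quoted in this guide
MAX_RATIO = 50      # maximum IOPS per GiB quoted in this guide


def max_provisioned_iops(size_gib: int) -> int:
    """Largest IOPS value the ratio and the ceiling allow for this size."""
    return min(size_gib * MAX_RATIO, MAX_IOPS)


def is_valid_iops(size_gib: int, iops: int) -> bool:
    """True if the requested IOPS fits within the limits for this size."""
    return 0 < iops <= max_provisioned_iops(size_gib)


if __name__ == "__main__":
    print(max_provisioned_iops(100))   # limited by the ratio
    print(max_provisioned_iops(4000))  # limited by the ceiling
```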

Magnetic (Standard)

  • Legacy storage: Not recommended for production
  • Use case: Infrequent access
  • Performance: ~100 IOPS average

🔧 Instance Types

Burstable Performance (T3/T4g)

  • Use case: Variable workloads
  • CPU credits: Burst capability
  • Cost-effective: Development/testing

Memory Optimized (R5/R6i/X1e)

  • Use case: In-memory databases
  • High memory-to-vCPU ratio
  • X1e: Very large in-memory working sets (on EC2, the family used for SAP HANA)

Compute Optimized (C5/C6i)

  • Use case: CPU-intensive
  • High-frequency processors
  • Typical for this family on EC2: Web servers, scientific computing

🌐 High Availability & Disaster Recovery

Multi-AZ Deployments

Primary AZ          Standby AZ
┌─────────────┐    ┌─────────────┐
│   Primary   │────│   Standby   │
│   Instance  │    │   Instance  │
│      ↓      │    │             │
│   EBS Vol   │────│   EBS Vol   │
└─────────────┘    └─────────────┘

Features:

  • Synchronous replication: No data loss
  • Automatic failover: 1-2 minutes
  • Single endpoint: Application transparency
  • Backup from standby: No I/O suspension
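
Because the endpoint stays the same during a failover, the application mainly needs to survive the 1-2 minutes in which connections fail. A minimal sketch of a reconnect loop with capped exponential backoff; `connect_with_retry` and its parameters are hypothetical, and a real driver will raise its own exception type (for example an `OperationalError`) rather than the built-in `ConnectionError` assumed here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_wait_s: float = 180.0,
    base_delay_s: float = 1.0,
    max_delay_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the wait budget is used up."""
    waited = 0.0
    delay = base_delay_s
    while True:
        try:
            return connect()
        except ConnectionError:
            # Give up once the next sleep would exceed the budget.
            if waited + delay > max_wait_s:
                raise
            sleep(delay)
            waited += delay
            delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

The default budget of 180 seconds is an assumption chosen to sit above the failover time quoted above; the `sleep` parameter exists only so the loop can be tested without waiting.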

Read Replicas

Primary Instance
       ↓ (Async replication)
┌─────────────┐  ┌─────────────┐  ┌─────────────┐
│ Read Replica │  │ Read Replica │  │ Read Replica │
│   (AZ-a)     │  │   (AZ-b)     │  │ (Different  │
│              │  │              │  │  Region)    │
└─────────────┘  └─────────────┘  └─────────────┘

Features:

  • Read scaling: Distribute read traffic
  • Cross-region: Disaster recovery
  • Eventual consistency: Async replication
  • Promote to primary: Manual process
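
Read replicas only help if the application actually sends reads to them. A minimal sketch of read/write splitting, assuming a hypothetical `EndpointRouter` class and placeholder endpoint names. The statement classification is deliberately naive, and reads that must see the latest write are sent to the primary because replication is asynchronous.

```python
import itertools
from typing import List


class EndpointRouter:
    """Pick a database endpoint per statement: replicas for reads, primary otherwise."""

    def __init__(self, primary: str, replicas: List[str]):
        self.primary = primary
        # Round-robin over replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql: str, needs_fresh_read: bool = False) -> str:
        lowered = sql.lstrip().lower()
        # Naive check: plain SELECTs are reads; SELECT ... FOR UPDATE takes locks.
        is_read = lowered.startswith("select") and "for update" not in lowered
        if is_read and not needs_fresh_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


if __name__ == "__main__":
    router = EndpointRouter("primary.example", ["replica-a.example", "replica-b.example"])
    print(router.endpoint_for("SELECT * FROM orders"))
    print(router.endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 1"))
```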

📊 Backup & Recovery

Automated Backups

  • Point-in-time recovery: 1-35 days retention
  • Daily snapshots: During backup window
  • Transaction logs: Continuous backup
  • Minimal performance impact with Multi-AZ: Backups are taken from the standby (Single-AZ instances can see a brief I/O suspension)
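
Point-in-time recovery restores into a new instance, and the target time has to fall inside the retention window. A minimal sketch that validates the request and builds the keyword arguments for boto3's `restore_db_instance_to_point_in_time`; the helper name and instance identifiers are hypothetical, and the real restorable window should be read from the instance's `LatestRestorableTime` rather than approximated from the clock as done here.

```python
from datetime import datetime, timedelta, timezone


def pitr_request(
    source_id: str,
    target_id: str,
    restore_time: datetime,
    now: datetime,
    retention_days: int,
) -> dict:
    """Validate a point-in-time restore and return the API keyword arguments."""
    if not 1 <= retention_days <= 35:
        raise ValueError("retention_days must be between 1 and 35")
    if restore_time.tzinfo is None or now.tzinfo is None:
        raise ValueError("use timezone-aware datetimes")
    earliest = now - timedelta(days=retention_days)
    if not earliest <= restore_time <= now:
        raise ValueError("restore_time is outside the retention window")
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,
        "RestoreTime": restore_time,
    }


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    params = pitr_request("prod-db", "prod-db-restored", now - timedelta(hours=6), now, 7)
    # With boto3 installed and credentials configured, this could be passed as:
    #   boto3.client("rds").restore_db_instance_to_point_in_time(**params)
    print(params)
```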

Manual Snapshots

  • User-initiated: On-demand backup
  • Retention: Until manually deleted
  • Cross-region copy: Disaster recovery
  • Share snapshots: Between accounts

Backup Best Practices

Backup Strategy:
  Automated Backups:
    Retention: 7 days (minimum for production)
    Window: Low-traffic period (e.g., 3:00-4:00 AM)

  Manual Snapshots:
    Before major changes: Application deployments
    Monthly: Long-term retention
    Cross-region: Disaster recovery
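
Manual snapshots are kept until deleted, so a monthly retention rule needs something to enforce it. A minimal sketch of one possible policy: keep the newest snapshot of each calendar month for the most recent N months that have snapshots, and list everything else for deletion. The function name and the policy itself are assumptions, not an RDS feature.

```python
from datetime import datetime
from typing import Dict, List


def snapshots_to_delete(snapshots: Dict[str, datetime], keep_months: int) -> List[str]:
    """Return snapshot IDs that fall outside a newest-per-month retention policy."""
    newest_per_month = {}
    for snap_id, created in snapshots.items():
        key = (created.year, created.month)
        current = newest_per_month.get(key)
        if current is None or snapshots[current] < created:
            newest_per_month[key] = snap_id
    # Most recent months first, then keep one snapshot for each of them.
    kept_months = sorted(newest_per_month, reverse=True)[:keep_months]
    keep = {newest_per_month[month] for month in kept_months}
    return sorted(snap_id for snap_id in snapshots if snap_id not in keep)


if __name__ == "__main__":
    example = {
        "snap-jan-05": datetime(2024, 1, 5),
        "snap-jan-20": datetime(2024, 1, 20),
        "snap-feb-01": datetime(2024, 2, 1),
        "snap-mar-03": datetime(2024, 3, 3),
    }
    print(snapshots_to_delete(example, keep_months=2))
```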

🔒 Security Features

Encryption

  • At rest: AES-256 encryption
  • In transit: SSL/TLS
  • Key management: AWS KMS
  • Encrypted snapshots: Automatic when the source instance is encrypted

Network Security

  • VPC: Private networking
  • Security Groups: Instance-level firewall
  • Subnet Groups: Multi-AZ placement
  • VPC endpoints: Private connectivity
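
A common misconfiguration is a security group that opens the database port to the whole internet. A minimal sketch of an audit check over one security group dictionary in the shape returned by EC2's `describe_security_groups` (`IpPermissions`, `IpRanges`, `Ipv6Ranges`); the function name and the sample group are hypothetical.

```python
def open_to_world(security_group: dict, db_port: int) -> bool:
    """True if any inbound rule exposes db_port to 0.0.0.0/0 or ::/0."""
    for perm in security_group.get("IpPermissions", []):
        protocol = perm.get("IpProtocol")
        from_port = perm.get("FromPort")
        to_port = perm.get("ToPort")
        # "-1" means all protocols and ports; otherwise require TCP and a matching range.
        covers_port = protocol == "-1" or (
            protocol == "tcp"
            and from_port is not None
            and to_port is not None
            and from_port <= db_port <= to_port
        )
        if not covers_port:
            continue
        if any(r.get("CidrIp") == "0.0.0.0/0" for r in perm.get("IpRanges", [])):
            return True
        if any(r.get("CidrIpv6") == "::/0" for r in perm.get("Ipv6Ranges", [])):
            return True
    return False


if __name__ == "__main__":
    sample = {
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 3306, "ToPort": 3306,
             "IpRanges": [{"CidrIp": "10.0.0.0/16"}]}
        ]
    }
    print(open_to_world(sample, 3306))
```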

Access Control

  • IAM database authentication: Token-based
  • Master user: Database admin
  • Parameter groups: Configuration management
  • Option groups: Feature enablement

📈 Performance Optimization

Performance Insights

  • Query analysis: Top SQL statements
  • Wait events: Bottleneck identification
  • Historical data: 7 days free, extended with fee
  • Database load: Normalized metrics

Parameter Groups

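-- Example MySQL parameters (applied through a DB parameter group)
innodb_buffer_pool_size = 70% of available memory  -- RDS default is {DBInstanceClassMemory*3/4}
innodb_flush_log_at_trx_commit = 2  -- Better performance; an OS crash or power loss can lose about 1 second of commits
slow_query_log = 1                  -- Enable slow query logging
long_query_time = 1                 -- Log queries > 1 second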
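
The settings above can be expressed as the `Parameters` list that boto3's `modify_db_parameter_group` expects. A minimal sketch, assuming a hypothetical helper name and a hypothetical parameter group; `pending-reboot` is used for every entry because it is accepted for both static and dynamic parameters.

```python
GIB = 1024 ** 3


def mysql_parameter_changes(instance_memory_gib: int, buffer_pool_percent: int = 70) -> list:
    """Build the Parameters list for the example MySQL settings."""
    if not 1 <= buffer_pool_percent <= 100:
        raise ValueError("buffer_pool_percent must be between 1 and 100")
    # Integer arithmetic keeps the byte count exact.
    buffer_pool_bytes = instance_memory_gib * GIB * buffer_pool_percent // 100
    values = {
        "innodb_buffer_pool_size": buffer_pool_bytes,
        "innodb_flush_log_at_trx_commit": 2,
        "slow_query_log": 1,
        "long_query_time": 1,
    }
    return [
        {"ParameterName": name, "ParameterValue": str(value), "ApplyMethod": "pending-reboot"}
        for name, value in values.items()
    ]


if __name__ == "__main__":
    changes = mysql_parameter_changes(instance_memory_gib=16)
    # With boto3 installed and credentials configured, this could be applied as:
    #   boto3.client("rds").modify_db_parameter_group(
    #       DBParameterGroupName="my-mysql-params",  # hypothetical group name
    #       Parameters=changes,
    #   )
    for change in changes:
        print(change)
```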

Connection Pooling

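# Application-level pooling with SQLAlchemy
from sqlalchemy import create_engine

# Placeholder credentials and endpoint; load real values from configuration or a secrets store
engine = create_engine(
    'mysql://user:pass@rds-endpoint:3306/db',
    pool_size=20,        # connections kept open per process
    max_overflow=30,     # extra connections allowed under burst load
    pool_recycle=3600,   # replace connections older than 1 hour
    pool_pre_ping=True   # test each connection on checkout; helps after a failover
)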
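
Pool settings apply per process, so the total across all application instances has to stay below the database's connection limit. A minimal sketch of that budget check; the function names, the instance count, and the reserved headroom are hypothetical, and the actual limit comes from the engine's `max_connections` setting.

```python
def peak_connections(app_instances: int, pool_size: int, max_overflow: int) -> int:
    """Worst-case connections if every process fills its pool and overflow."""
    return app_instances * (pool_size + max_overflow)


def fits_within_limit(
    app_instances: int,
    pool_size: int,
    max_overflow: int,
    max_connections: int,
    reserved: int = 10,
) -> bool:
    """True if the worst case still leaves `reserved` connections for admin use."""
    return peak_connections(app_instances, pool_size, max_overflow) <= max_connections - reserved


if __name__ == "__main__":
    # Four application processes using the pool settings from the example above.
    print(peak_connections(4, pool_size=20, max_overflow=30))
    print(fits_within_limit(4, 20, 30, max_connections=150))
```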

💰 Pricing Models

On-Demand Instances

  • Pay per hour: No commitment
  • Flexible: Start/stop anytime
  • Use case: Development, testing, unpredictable workloads

Reserved Instances

  • 1 or 3 year terms: Up to 69% savings
  • Payment options: All upfront, partial, no upfront
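  • Size flexibility: A reservation can apply across instance sizes in the same family for some engines (RDS has no Convertible option, unlike EC2)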
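
Whether a reservation pays off depends on how much of the term the instance actually runs. A minimal sketch of the comparison; the function name is hypothetical and the prices in the usage example are made-up placeholders, not AWS list prices.

```python
HOURS_PER_YEAR = 8760


def reserved_vs_on_demand(
    on_demand_hourly: float,
    upfront: float,
    reserved_hourly: float,
    term_years: int = 1,
) -> dict:
    """Compare total cost over the term for an always-on instance."""
    hours = HOURS_PER_YEAR * term_years
    on_demand_total = on_demand_hourly * hours
    reserved_total = upfront + reserved_hourly * hours
    ratio = reserved_total / on_demand_total
    return {
        "on_demand_total": on_demand_total,
        "reserved_total": reserved_total,
        "savings_percent": 100 * (1 - ratio),
        # Fraction of the term the instance must run for the reservation to win.
        "break_even_utilization": ratio,
    }


if __name__ == "__main__":
    # Placeholder prices for illustration only.
    print(reserved_vs_on_demand(on_demand_hourly=0.20, upfront=500.0, reserved_hourly=0.05))
```

If the expected utilization is below the break-even fraction, on-demand remains the cheaper choice.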

Aurora Serverless

  • Pay per second: When database is active
  • Auto-scaling: 0.5 to 256 ACUs (the exact range depends on the Serverless version)
  • Auto-pause: Cost savings for intermittent workloads

🔄 Migration Strategies

Database Migration Service (DMS)

Source DB      →  DMS Replication Instance  →  Target RDS
(on-premises)     (AWS DMS)                    (AWS RDS)
Oracle         →  Continuous replication    →  PostgreSQL (with SCT)

Schema Conversion Tool (SCT)

  • Heterogeneous migrations: Oracle to PostgreSQL
  • Assessment report: Conversion complexity
  • Code conversion: Stored procedures, functions

Native Tools

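# MySQL dump and restore
# --single-transaction takes a consistent InnoDB dump without locking tables
mysqldump -h source-host -u user -p --single-transaction database > backup.sql
mysql -h rds-endpoint -u user -p new-database < backup.sql

# PostgreSQL dump and restore (the target database must already exist)
pg_dump -h source-host -U user -d database > backup.sql
psql -h rds-endpoint -U user -d new-database < backup.sql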

🧪 Monitoring & Troubleshooting

CloudWatch Metrics

  • CPU Utilization: Instance performance
  • Database Connections: Connection count
  • Free Storage Space: Disk usage
  • IOPS: I/O performance
  • Latency: Read/Write response time
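
These metrics are most useful with alarms attached. A minimal sketch that builds the keyword arguments for CloudWatch's `put_metric_alarm` on the `CPUUtilization` metric in the `AWS/RDS` namespace; the helper name, alarm name pattern, threshold, and evaluation settings are assumptions to adapt.

```python
from typing import Optional


def cpu_alarm_params(
    db_instance_id: str,
    threshold_percent: float = 80.0,
    sns_topic_arn: Optional[str] = None,
) -> dict:
    """Build put_metric_alarm arguments for sustained high CPU on one instance."""
    params = {
        "AlarmName": f"rds-{db_instance_id}-high-cpu",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,              # seconds per datapoint
        "EvaluationPeriods": 3,     # 3 x 5 minutes above threshold
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }
    if sns_topic_arn:
        params["AlarmActions"] = [sns_topic_arn]
    return params


if __name__ == "__main__":
    params = cpu_alarm_params("prod-db")
    # With boto3 installed and credentials configured, this could be passed as:
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
    print(params)
```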

Enhanced Monitoring

  • OS-level metrics: Every 1-60 seconds
  • Process list: Running processes
  • Memory usage: Detailed memory metrics

Common Issues & Solutions

High CPU:
  Causes: Inefficient queries, missing indexes
  Solutions: Query optimization, read replicas

Storage Full:
  Causes: Rapid data growth, large transactions
  Solutions: Storage auto-scaling, log rotation

Connection Limits:
  Causes: Too many connections
  Solutions: Connection pooling, parameter tuning

Slow Queries:
  Causes: Missing indexes, lock contention
  Solutions: Query optimization, index creation
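
The table above can double as a first-pass triage rule set. A minimal sketch that maps a snapshot of metrics to the matching entries; the metric key names and every threshold are hypothetical and should be tuned to the workload.

```python
def triage(metrics: dict) -> list:
    """Return the issue/solution entries that match the given metrics."""
    findings = []
    if metrics.get("cpu_percent", 0) > 80:
        findings.append("High CPU: query optimization, read replicas")
    if metrics.get("free_storage_percent", 100) < 10:
        findings.append("Storage full: storage auto-scaling, log rotation")
    limit = metrics.get("max_connections", float("inf"))
    if metrics.get("connections", 0) > 0.9 * limit:
        findings.append("Connection limits: connection pooling, parameter tuning")
    if metrics.get("slow_queries_per_min", 0) > 0:
        findings.append("Slow queries: query optimization, index creation")
    return findings


if __name__ == "__main__":
    sample = {
        "cpu_percent": 92,
        "free_storage_percent": 35,
        "connections": 480,
        "max_connections": 500,
        "slow_queries_per_min": 0,
    }
    for finding in triage(sample):
        print(finding)
```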

🚀 Best Practices

Performance Best Practices

  1. Right-size instances: Monitor and adjust
  2. Use read replicas: Scale read traffic
  3. Optimize queries: Use EXPLAIN plans
  4. Index strategy: Cover frequently queried columns
  5. Connection pooling: Reduce connection overhead

Security Best Practices

  1. Encryption: Enable at rest and in transit
  2. VPC deployment: Private networking
  3. Least privilege: Minimal database permissions
  4. Regular patching: Keep engine updated
  5. Backup verification: Test restore procedures

Cost Optimization

  1. Reserved instances: For predictable workloads
  2. Right-sizing: Match instance to workload
  3. Storage optimization: Choose appropriate type
  4. Delete unused: Remove old snapshots
  5. Aurora Serverless: For intermittent workloads

📝 Exam Tips for AWS SAA

Scenario-based Questions

  • High availability: Multi-AZ for production
  • Read scaling: Read replicas for read-heavy workloads
  • Cross-region DR: Cross-region read replicas
  • Cost optimization: Reserved instances for steady workloads

Key Differences

Multi-AZ vs Read Replicas:
  Multi-AZ: HA, automatic failover, same region
  Read Replicas: Read scaling, manual promotion, cross-region

Aurora vs Standard RDS:
  Aurora: Cloud-native, better performance, auto-scaling storage
  Standard: Traditional engines, predictable costs, established tools

📖 Summary

Amazon RDS provides a managed relational database service with:

  • Multiple database engines to fit a range of use cases
  • High availability through Multi-AZ deployments
  • Read scaling through read replicas
  • Comprehensive security features
  • Performance monitoring and optimization tools
  • Flexible pricing models for different workloads

A solid understanding of RDS deployment patterns, backup strategies, and performance optimization is critical for the AWS SAA exam.