Domain 2: Design High-Performing Architectures (28%)

🎯 Overview

Design architectures that scale, stay elastic, and deliver high performance to meet business and technical requirements.

📊 Performance Metrics

Key Performance Indicators (KPIs)

  • Latency: Response time (ms)
  • Throughput: Requests per second (RPS)
  • Bandwidth: Data transfer rate (Mbps/Gbps)
  • IOPS: Input/Output operations per second
  • CPU Utilization: Processor usage (%)
  • Memory Utilization: RAM usage (%)

Performance Targets

  • Web Applications: < 200ms response time
  • APIs: < 100ms response time
  • Database Queries: < 50ms for simple queries
  • File Uploads: Depends on file size and network bandwidth
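Targets like these are usually checked against a percentile rather than an average, because a few slow requests can hide behind a good mean. A minimal sketch, assuming latency samples in milliseconds; the function names and the sample values are illustrative, not part of any AWS API.

```python
import math

# Targets from the list above, in milliseconds.
TARGETS_MS = {"web": 200, "api": 100, "db_simple_query": 50}

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_target(kind, samples, pct=95):
    """True if the pct-th percentile latency is below the target for this kind of workload."""
    return percentile(samples, pct) < TARGETS_MS[kind]

# Hypothetical API latency samples (ms).
api_latencies = [42, 55, 61, 48, 97, 73, 58, 66, 51, 120]
print("p50:", percentile(api_latencies, 50))
print("p95 under target:", meets_target("api", api_latencies))
```

With only ten samples the p95 is simply the slowest request, so a single outlier fails the check; real measurements would use far more samples.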

🚀 Scalable & Elastic Architectures

1. Horizontal vs Vertical Scaling

Horizontal Scaling (Scale Out)

  • Pros: Better fault tolerance, cost-effective
  • Cons: Complex application design
  • AWS Services: Auto Scaling Groups, Load Balancers
  • Use Cases: Web applications, microservices

Vertical Scaling (Scale Up)

  • Pros: Simple implementation, no application changes
  • Cons: Single point of failure, expensive
  • AWS Services: EC2 instance resizing
  • Use Cases: Databases, legacy applications

2. Auto Scaling Strategies

EC2 Auto Scaling

Scaling Policies:
  Target Tracking:
    - CPU Utilization: 70%
    - Request Count per Target: 1000
    - Network In/Out: Based on workload

  Step Scaling:
    - CPU > 80%: Add 2 instances
    - CPU > 90%: Add 4 instances
    - CPU < 30%: Remove 1 instance

  Scheduled Scaling:
    - Business hours: Min 10, Max 50
    - Off hours: Min 2, Max 10
    - Black Friday: Pre-scale to 100
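The difference between target tracking and step scaling is easier to see in code. The sketch below is an approximation under stated assumptions: AWS does not publish the exact target tracking algorithm, so `target_tracking_desired` uses a simple proportional estimate, and `step_scaling_adjustment` hard-codes the thresholds listed above. Both function names are hypothetical.

```python
import math

def target_tracking_desired(current_capacity, metric_value, target_value, min_size, max_size):
    """Proportional estimate: scale capacity by how far the metric is from its target."""
    desired = math.ceil(current_capacity * metric_value / target_value)
    # The group never goes outside its configured min/max size.
    return max(min_size, min(max_size, desired))

def step_scaling_adjustment(cpu_percent):
    """Step policy from the list above: returns the change in instance count."""
    if cpu_percent > 90:
        return 4
    if cpu_percent > 80:
        return 2
    if cpu_percent < 30:
        return -1
    return 0  # between 30% and 80%: no change

# 10 instances at 84% CPU with a 70% target.
print(target_tracking_desired(10, 84, 70, min_size=2, max_size=50))
print(step_scaling_adjustment(85))
```

Target tracking needs only one number (the target), while step scaling lets you decide how aggressively to react at each threshold.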

Application Auto Scaling

  • ECS Services: Task-based scaling
  • DynamoDB: Read/Write capacity scaling
  • Aurora: Aurora Replica count scaling
  • Lambda: Provisioned concurrency scaling

3. Elastic Load Balancing

Application Load Balancer (ALB)

  • Features:
      • Content-based routing
      • Host-based routing
      • WebSocket support
      • HTTP/2 support
  • Performance: Automatic scaling
  • Use Cases: Microservices, containerized apps

Network Load Balancer (NLB)

  • Features:
      • Ultra-low latency
      • Static IP addresses
      • Preserve source IP
      • Handle millions of requests/sec
  • Use Cases: High-performance applications, gaming

Gateway Load Balancer (GWLB)

  • Features:
      • Third-party virtual appliances
      • Transparent network gateway
      • Load balancing across appliances
  • Use Cases: Firewalls, intrusion detection

🌐 High-Performing Network Architectures

1. Amazon VPC Optimization

Network Design Best Practices

Region
├── VPC (10.0.0.0/16)
│   ├── Public Subnets (10.0.1.0/24, 10.0.2.0/24)
│   │   └── NAT Gateways, Load Balancers
│   ├── Private Subnets (10.0.11.0/24, 10.0.12.0/24)
│   │   └── Application Servers
│   └── Database Subnets (10.0.21.0/24, 10.0.22.0/24)
│       └── RDS, ElastiCache
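A layout like the one above can be sanity-checked before it is deployed. This is a minimal sketch using Python's standard `ipaddress` module; the subnet names are hypothetical labels, and the CIDR blocks are the ones from the diagram. It also applies the fact that AWS reserves five addresses in every subnet.

```python
import ipaddress
from itertools import combinations

VPC = ipaddress.ip_network("10.0.0.0/16")
SUBNETS = {
    "public-a": "10.0.1.0/24",
    "public-b": "10.0.2.0/24",
    "private-a": "10.0.11.0/24",
    "private-b": "10.0.12.0/24",
    "database-a": "10.0.21.0/24",
    "database-b": "10.0.22.0/24",
}

def validate(vpc, subnets):
    """Return a list of problems: subnets outside the VPC or overlapping each other."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    problems = []
    for name, net in nets.items():
        if not net.subnet_of(vpc):
            problems.append(f"{name} ({net}) is outside {vpc}")
    for (a, net_a), (b, net_b) in combinations(nets.items(), 2):
        if net_a.overlaps(net_b):
            problems.append(f"{a} overlaps {b}")
    return problems

def usable_hosts(net):
    """AWS reserves 5 addresses per subnet (first four and the last)."""
    return net.num_addresses - 5

print(validate(VPC, SUBNETS))
print(usable_hosts(ipaddress.ip_network("10.0.1.0/24")))
```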

Enhanced Networking

  • SR-IOV: Single Root I/O Virtualization
  • Placement Groups:
      • Cluster: Low latency, high throughput (instances packed close together in one AZ)
      • Partition: Isolates hardware failures to a single partition
      • Spread: Places each instance on distinct underlying hardware

2. Content Delivery & Caching

Amazon CloudFront

  • Global Edge Locations: 400+ locations
  • Caching Strategies:
      • Static content: Long TTL (24 hours)
      • Dynamic content: Short TTL (5 minutes)
      • API responses: Custom caching policies
  • Performance Features:
      • HTTP/2 and HTTP/3 support
      • Compression
      • WebSocket support

CloudFront Optimization

Cache Behaviors:
  Static Assets (*.css, *.js, *.png):
    TTL: 31536000 (1 year)
    Compression: Enabled

  API Endpoints (/api/*):
    TTL: 300 (5 minutes)
    Headers: Authorization, Content-Type
    Query Strings: All

  Dynamic Content (*.php, *.jsp):
    TTL: 0 (no caching)
    Forward: All headers

3. DNS Performance

Amazon Route 53

  • Routing Policies:
      • Latency-based: Route to lowest latency
      • Geolocation: Route based on user location
      • Weighted: Distribute traffic by percentage
      • Failover: Primary/secondary setup
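The routing policies differ mainly in how they choose among candidate records. The following is a local simulation of three of them, not a call into Route 53; the region names, latencies, and weights are hypothetical. For weighted routing it relies on the documented rule that a record is chosen with probability equal to its weight divided by the sum of all weights.

```python
import random

def latency_based(latencies_ms):
    """Pick the region with the lowest measured latency for this user."""
    return min(latencies_ms, key=latencies_ms.get)

def weighted(weights, rng):
    """Pick a record with probability weight / sum(weights)."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]

def failover(primary, secondary, primary_healthy):
    """Return the primary while its health check passes, otherwise the secondary."""
    return primary if primary_healthy else secondary

print(latency_based({"us-east-1": 80, "eu-west-1": 25, "ap-southeast-1": 190}))

rng = random.Random(0)  # seeded so the simulation is repeatable
picks = [weighted({"blue": 90, "green": 10}, rng) for _ in range(10_000)]
print("share sent to blue:", picks.count("blue") / len(picks))

print(failover("primary-alb", "secondary-alb", primary_healthy=False))
```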

DNS Optimization

  • Health Checks: Monitor endpoint health
  • Alias Records: No additional DNS lookup
  • Resolver: Recursive DNS service

💾 High-Performing Storage Solutions

1. Amazon S3 Performance

Request Patterns

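  • Prefix Scaling: S3 scales request rates per prefix automatically; randomized key prefixes are no longer required to avoid hotspots
  • Multiple Prefixes: Spread reads and writes across prefixes to multiply total throughput
  • Request Rate: At least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix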

S3 Performance Features

S3 Configurations:
  Transfer Acceleration:
    - Uses CloudFront edge locations
    - 50-500% faster uploads

  Multipart Upload:
    - Files > 100MB: Recommended
    - Files > 5GB: Required
    - Parallel uploads

  S3 Select:
    - Query data without retrieving entire object
    - Filter at S3 level
    - Up to 400% faster, up to 80% cheaper
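For multipart upload, the part size has to be chosen so the upload stays within S3's documented limits: parts between 5 MiB and 5 GiB (the last part may be smaller) and at most 10,000 parts per upload. A minimal planning sketch; `plan_parts` and the 100 MiB preferred part size are assumptions for illustration, and no upload is performed.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB      # minimum part size (except the last part)
MAX_PART = 5 * GIB      # maximum part size
MAX_PARTS = 10_000      # maximum number of parts per upload

def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)

def plan_parts(object_size, preferred_part_size=100 * MIB):
    """Return (part_size, part_count) in bytes/parts that satisfies the limits above."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size if the preferred size would need more than 10,000 parts.
    part_size = max(MIN_PART, preferred_part_size, ceil_div(object_size, MAX_PARTS))
    if part_size > MAX_PART:
        raise ValueError("object too large for a single multipart upload")
    return part_size, ceil_div(object_size, part_size)

print(plan_parts(1 * GIB))
print(plan_parts(5 * 1024 * GIB))  # a 5 TiB object
```

Parts can then be uploaded in parallel, which is where the throughput gain comes from.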

Storage Classes Performance

  • S3 Standard: Millisecond access
  • S3 IA: Millisecond access, retrieval fee
  • S3 Glacier Instant Retrieval: Millisecond access
  • S3 Glacier Flexible Retrieval: Minutes to 12 hours, depending on retrieval tier
  • S3 Glacier Deep Archive: Within 12 hours (standard) or 48 hours (bulk)

2. Amazon EBS Performance

Volume Types

EBS Volume Types:
  gp3 (General Purpose SSD):
    - Baseline: 3,000 IOPS, 125 MiB/s
    - Provisionable: Up to 16,000 IOPS, 1,000 MiB/s (independent of volume size, no burst credits)
    - Cost: Most cost-effective

  io2 (Provisioned IOPS SSD):
    - Up to 64,000 IOPS per volume
    - Up to 1,000 MiB/s throughput
    - 99.999% durability

  st1 (Throughput Optimized HDD):
    - Baseline: 40 MiB/s per TiB
    - Burst: 250 MiB/s per TiB (capped at 500 MiB/s per volume)
    - Big data, data warehouses
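Because st1 throughput scales with volume size, it helps to compute what a given size actually delivers. A small sketch, assuming the per-TB figures above are per TiB (as AWS documents them) and that both baseline and burst are capped at 500 MiB/s per volume; the function name is illustrative.

```python
ST1_MAX_MIBPS = 500  # per-volume throughput ceiling for st1

def st1_throughput(size_tib):
    """Return (baseline, burst) throughput in MiB/s for an st1 volume of the given size."""
    baseline = min(40 * size_tib, ST1_MAX_MIBPS)
    burst = min(250 * size_tib, ST1_MAX_MIBPS)
    return baseline, burst

for size in (1, 2, 16):
    print(size, "TiB ->", st1_throughput(size))
```

A 2 TiB volume already reaches the burst ceiling, so growing the volume beyond that only raises the baseline.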

EBS Optimization

  • EBS-Optimized Instances: Dedicated bandwidth
  • RAID Configurations:
      • RAID 0: Better performance
      • RAID 1: Better durability
  • Snapshot Strategies: Incremental backups

3. Amazon EFS Performance

Performance Modes

  • General Purpose: Lower latency, 7,000 operations/sec
  • Max I/O: Higher latency, >7,000 operations/sec

Throughput Modes

  • Bursting: Throughput scales with storage size; baseline 50 MiB/s per TiB stored, bursting to 100 MiB/s per TiB (at least 100 MiB/s)
  • Provisioned: Independent of storage size

🗄️ High-Performing Database Solutions

1. Amazon RDS Performance

Instance Optimization

RDS Performance:
  Instance Types:
    - Memory Optimized (R6g, R5, X1e)
    - Compute Optimized (C6g, C5)
    - Burstable (T4g, T3)

  Storage:
    - gp2: Baseline 3 IOPS/GB, burst to 3,000
    - gp3: Baseline 3,000 IOPS, configurable
    - io1: Up to 64,000 IOPS

  Read Replicas:
    - Up to 15 read replicas (limit varies by engine)
    - Cross-region replication
    - Can be promoted to a standalone instance
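The gp2 line above is the reason small gp2 volumes often underperform: baseline IOPS depend on size. A minimal sketch of that relationship, assuming the documented gp2 floor of 100 IOPS and ceiling of 16,000 IOPS; the function name is hypothetical.

```python
def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume: 3 IOPS per GiB, floor 100, ceiling 16,000."""
    return min(max(100, 3 * size_gib), 16_000)

def gp2_can_burst(size_gib):
    """Volumes whose baseline is below 3,000 IOPS can burst up to 3,000 using credits."""
    return gp2_baseline_iops(size_gib) < 3_000

for size in (20, 500, 1000, 6000):
    print(size, "GiB ->", gp2_baseline_iops(size), "IOPS, burst:", gp2_can_burst(size))
```

With gp3 the 3,000 IOPS baseline applies at any size, which is why it is usually the better default for small databases.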

RDS Proxy

  • Connection Pooling: Reduce database connections
  • Failover: Faster failover times
  • Security: IAM authentication

2. Amazon Aurora Performance

Aurora Features

  • Storage: Auto-scaling from 10 GB to 128 TiB
  • Read Replicas: Up to 15 Aurora replicas
  • Global Database: Cross-region replication with typically < 1 second lag
  • Serverless: Automatic scaling

Aurora Optimization

Aurora Performance:
  Writer Instance:
    - Single writer per cluster
    - Handles all writes; can also serve reads

  Reader Instances:
    - Up to 15 readers
    - Load balancing
    - Different instance sizes

  Aurora Serverless:
    - ACU (Aurora Capacity Units)
    - Auto pause/resume
    - Cost optimization

3. Amazon DynamoDB Performance

Performance Features

DynamoDB Performance:
  On-Demand Mode:
    - Pay per request
    - Automatic scaling
    - Handle traffic spikes

  Provisioned Mode:
    - Pre-configured capacity
    - Auto scaling available
    - Reserved capacity discounts

  Global Tables:
    - Multi-region replication
    - Typically < 1 second replication lag
    - Active-active setup
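Sizing provisioned mode comes down to capacity-unit arithmetic. The sketch below applies the documented rules: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half, transactional reads double), and one write capacity unit covers one write per second of an item up to 1 KB. Treating 1 KB as 1,024 bytes is an assumption of this example, and the function names are hypothetical.

```python
import math

READ_FACTOR = {"eventual": 0.5, "strong": 1, "transactional": 2}

def read_capacity_units(item_size_bytes, reads_per_second, consistency="eventual"):
    """RCUs needed: item size rounds up to the next 4 KB, then scales by consistency mode."""
    units_per_read = math.ceil(item_size_bytes / 4096)
    return math.ceil(units_per_read * reads_per_second * READ_FACTOR[consistency])

def write_capacity_units(item_size_bytes, writes_per_second, transactional=False):
    """WCUs needed: item size rounds up to the next 1 KB; transactional writes cost double."""
    units_per_write = math.ceil(item_size_bytes / 1024)
    return units_per_write * writes_per_second * (2 if transactional else 1)

# A 6 KB item read 100 times per second, and a 1.5 KB item written 50 times per second.
print(read_capacity_units(6144, 100, "strong"))
print(read_capacity_units(6144, 100, "eventual"))
print(write_capacity_units(1536, 50))
```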

DynamoDB Optimization

  • Partition Key Design: Even distribution
  • GSI (Global Secondary Index): Query flexibility
  • LSI (Local Secondary Index): Alternative sort key
  • DAX (DynamoDB Accelerator): Microsecond latency

⚡ Caching Strategies

1. Amazon ElastiCache

Redis vs Memcached

Redis:
  Features:
    - Data persistence
    - Complex data types
    - Pub/Sub messaging
    - Lua scripting
  Use Cases:
    - Session store
    - Real-time analytics
    - Gaming leaderboards

Memcached:
  Features:
    - Simple key-value store
    - Multi-threaded
    - Horizontal scaling
  Use Cases:
    - Database query caching
    - Object caching
    - Simple key-value operations

Caching Patterns

  1. Cache-Aside (Lazy Loading):
      • Application manages the cache
      • Cache misses load from the database
      • Good for read-heavy workloads

  2. Write-Through:
      • Write to cache and database in the same operation
      • Cache stays consistent with the database
      • Higher write latency

  3. Write-Behind (Write-Back):
      • Write to cache first
      • Asynchronous database updates
      • Better write performance
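The first two patterns can be sketched with plain dictionaries standing in for ElastiCache and the database. Everything here is a hypothetical stand-in (`FakeDatabase`, `CacheAside`, `WriteThrough`); a real implementation would also need TTLs and error handling.

```python
class FakeDatabase:
    """In-memory stand-in for a database that counts how often it is read."""
    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value

class CacheAside:
    """Lazy loading: the application checks the cache, then falls back to the database."""
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def read(self, key):
        if key in self.cache:
            return self.cache[key]          # cache hit
        value = self.db.get(key)            # cache miss: load from the database
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        self.db.put(key, value)
        self.cache.pop(key, None)           # invalidate so the next read reloads

class WriteThrough(CacheAside):
    """Every write updates the database and the cache together."""
    def write(self, key, value):
        self.db.put(key, value)
        self.cache[key] = value

db = FakeDatabase()
db.put("user:1", "Ann")
lazy = CacheAside(db)
lazy.read("user:1")
lazy.read("user:1")
print("database reads after two lookups:", db.reads)
```

Write-behind is omitted because it needs a queue and a background writer, which is where its consistency risk comes from.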

2. Application-Level Caching

Caching Layers

Client Browser Cache (TTL: 1 hour)
    ↓
CloudFront CDN (TTL: 24 hours)
    ↓
Application Load Balancer
    ↓
API Gateway Cache (TTL: 5 minutes)
    ↓
Application Server Cache (TTL: 1 minute)
    ↓
Database Query Cache (TTL: 30 seconds)
    ↓
Database
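Stacked caches reduce database load multiplicatively: each layer only sees the misses of the layer above it. A rough estimate, assuming each layer's hit ratio is independent of the others (a simplification) and using made-up hit ratios:

```python
def requests_reaching_database(total_requests, hit_ratios):
    """Apply each layer's miss ratio in order, from browser cache down to query cache."""
    remaining = total_requests
    for ratio in hit_ratios:
        remaining *= (1 - ratio)
    return remaining

# Hypothetical hit ratios for two layers, e.g. CDN and application cache.
print(requests_reaching_database(1000, [0.5, 0.5]))
```

Even modest hit ratios at each layer compound quickly, which is the argument for caching at several layers instead of tuning only one.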

🖥️ Compute Performance

1. Amazon EC2 Optimization

Instance Types Selection

Compute Optimized (C6g, C5, C4):
  Use Cases:
    - Web servers
    - Scientific computing
    - Gaming servers

Memory Optimized (R6g, R5, X1e):
  Use Cases:
    - In-memory databases
    - Real-time analytics
    - Big data processing

Storage Optimized (I4i, I3, D3):
  Use Cases:
    - NoSQL databases
    - Data warehousing
    - Search engines

EC2 Performance Features

  • Nitro System: Better performance, security
  • Enhanced Networking: SR-IOV, 100 Gbps
  • Placement Groups: Optimized network performance
  • Dedicated Hosts: Physical server dedication

2. AWS Lambda Performance

Lambda Optimization

Lambda Performance Factors:
  Memory Allocation:
    - 128MB - 10,240MB
    - CPU scales with memory
    - Network bandwidth scales with memory

  Cold Start Optimization:
    - Provisioned Concurrency
    - Keep functions warm
    - Minimize deployment package size

  Runtime Performance:
    - Choose appropriate runtime
    - Reuse connections
    - Cache computations
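"Reuse connections" in practice means creating clients outside the handler, so they are built once per execution environment and shared by every invocation that environment serves. A minimal sketch that runs locally; `create_client` is a hypothetical stand-in for an expensive setup step such as opening a database connection.

```python
import time

INIT_COUNT = 0

def create_client():
    """Stand-in for expensive setup work (for example, opening a database connection)."""
    global INIT_COUNT
    INIT_COUNT += 1
    return {"created_at": time.time()}

# Module-level code runs once per execution environment, during the cold start.
CLIENT = create_client()

def handler(event, context):
    """Each invocation reuses CLIENT instead of creating a new one."""
    return {"statusCode": 200, "client_created_at": CLIENT["created_at"]}

first = handler({}, None)
second = handler({}, None)
print("clients created:", INIT_COUNT)
```

Moving `create_client()` inside `handler` would repeat the setup cost on every request, warm or cold.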

3. Container Performance

Amazon ECS Performance

  • EC2 Launch Type: Full control over instances
  • Fargate Launch Type: Serverless containers
  • Task Placement: Optimize resource utilization

Amazon EKS Performance

  • Node Groups: Managed worker nodes
  • Fargate: Serverless Kubernetes pods
  • Horizontal Pod Autoscaler: Scale based on metrics

📊 Monitoring & Performance Analysis

1. Amazon CloudWatch

Performance Metrics

EC2 Metrics:
  - CPUUtilization
  - NetworkIn/Out
  - DiskReadOps/WriteOps
  - MemoryUtilization (custom)

RDS Metrics:
  - CPUUtilization
  - DatabaseConnections
  - ReadIOPS/WriteIOPS
  - ReadLatency/WriteLatency

Lambda Metrics:
  - Duration
  - Invocations
  - Errors
  - ConcurrentExecutions

2. AWS X-Ray

  • Distributed Tracing: End-to-end request tracking
  • Service Map: Visual representation
  • Performance Analysis: Identify bottlenecks

3. Performance Testing

  • Load Testing: Normal expected traffic
  • Stress Testing: Beyond normal capacity
  • Spike Testing: Sudden traffic increases
  • Volume Testing: Large amounts of data

🎯 Common Performance Scenarios

Scenario 1: E-commerce Website

Requirements: Handle Black Friday traffic

Solution:

  • CloudFront CDN for static content
  • Auto Scaling Groups with predictive scaling
  • RDS Read Replicas for database reads
  • ElastiCache for session management
  • API Gateway caching

Scenario 2: Video Streaming Platform

Requirements: Global content delivery

Solution:

  • CloudFront with multiple origins
  • S3 Transfer Acceleration
  • Elastic Transcoder for video processing
  • Lambda@Edge for content customization

Scenario 3: Real-time Gaming Backend

Requirements: Low latency, high throughput

Solution:

  • Network Load Balancer
  • Placement Groups for EC2 instances
  • ElastiCache Redis with cluster mode
  • DynamoDB with DAX
  • API Gateway WebSocket

📝 Performance Best Practices

Design Principles

  • [ ] Design for horizontal scaling
  • [ ] Use caching at multiple layers
  • [ ] Choose appropriate database solutions
  • [ ] Optimize network architecture
  • [ ] Monitor and analyze performance metrics

Architecture Review

  • [ ] Identify performance bottlenecks
  • [ ] Implement auto scaling
  • [ ] Use a CDN for static content
  • [ ] Database query optimization
  • [ ] Network latency optimization

Cost vs Performance

  • [ ] Right-size instances
  • [ ] Use spot instances where appropriate
  • [ ] Use Reserved Instances for predictable workloads
  • [ ] Monitor cost metrics

🔍 Practice Questions

  1. Auto Scaling: When should you use target tracking vs step scaling?
  2. CloudFront: How do you optimize the cache hit ratio?
  3. Database: When do you choose RDS vs DynamoDB for different workloads?
  4. EBS: When should you choose gp3 vs io2?
  5. Lambda: How do you minimize cold starts?

📖 Further Reading