Media Streaming Platform - AWS Architecture
🎬 Business Overview
Platform Requirements
- Video streaming: Support millions of concurrent users
- Content delivery: Global reach with low latency
- Live streaming: Real-time events and sports
- User management: Subscription and authentication
- Analytics: Viewer behavior and content performance
- Mobile/Web: Cross-platform compatibility
Scale Requirements
- Peak concurrent users: 5 million globally
- Video library: 100,000+ hours of content
- Upload volume: 1,000+ hours/day of new content
- Global regions: Support for 50+ countries
- Availability: 99.99% uptime requirement
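To make these targets concrete, the sketch below converts the availability target into a yearly downtime budget and estimates peak egress. The 5 Mbps average bitrate is an assumption for illustration (it matches the 1080p rendition used later in this document), not a measured figure.

```javascript
// Downtime budget implied by an availability target, in minutes per year.
function downtimeMinutesPerYear(availabilityPercent) {
  const minutesPerYear = 365 * 24 * 60; // 525,600 in a non-leap year
  return minutesPerYear * (1 - availabilityPercent / 100);
}

// Peak egress in Tbps for a given audience and average bitrate (assumed).
function peakEgressTbps(concurrentUsers, avgBitrateMbps) {
  return (concurrentUsers * avgBitrateMbps) / 1e6; // Mbps -> Tbps
}

console.log(downtimeMinutesPerYear(99.99).toFixed(2)); // 52.56
console.log(peakEgressTbps(5000000, 5)); // 25
```

At 99.99% the platform can be unavailable for under an hour per year, which is why the later sections lean on redundant pipelines and multi-region deployment.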
🏗️ High-Level Architecture
                        Global Users
                              |
            [CloudFront CDN with Edge Locations]
                              |
          ┌───────────────────┼───────────────────┐
          |                   |                   |
     [Route 53]         [API Gateway]        [MediaLive]
          |                   |                   |
    [WAF/Shield]        [Lambda@Edge]      [MediaPackage]
          |                   |                   |
      [ALB/NLB]        [Microservices]      [MediaStore]
          |                   |                   |
    [ECS Fargate]      [DynamoDB/RDS]       [S3 Storage]
          |                   |                   |
    [ElastiCache]      [Elasticsearch]    [Glacier Archive]
🎥 Content Ingestion & Processing
Video Upload Pipeline
Content Ingestion Flow:
1. Content Creator Upload:
- S3 Transfer Acceleration
- Large file multipart upload
- Pre-signed URLs for security
2. MediaConvert Processing:
- Multiple resolutions (360p to 4K)
- Adaptive bitrate streaming
- Thumbnail generation
- Subtitle extraction
3. Content Validation:
- Lambda functions for metadata
- AI/ML content moderation
- Quality assurance checks
4. Storage Distribution:
- S3 for master files
- CloudFront origin for delivery
- DynamoDB for metadata
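For the multipart upload in step 1, the client has to pick a part size that respects S3's limits (at least 5 MiB per part except the last, at most 10,000 parts). The sketch below shows one way to plan that; `planMultipartUpload` and the 64 MiB default are hypothetical choices for illustration.

```javascript
const MiB = 1024 * 1024;
const MIN_PART_SIZE = 5 * MiB; // S3 minimum for every part except the last
const MAX_PARTS = 10000;       // S3 maximum number of parts per upload

// Choose a part size that keeps the upload within the part-count limit.
function planMultipartUpload(fileSizeBytes, preferredPartSize = 64 * MiB) {
  const minForCount = Math.ceil(fileSizeBytes / MAX_PARTS);
  const partSize = Math.max(MIN_PART_SIZE, preferredPartSize, minForCount);
  const partCount = Math.max(1, Math.ceil(fileSizeBytes / partSize));
  return { partSize, partCount };
}

// A 50 GiB master file uploads as 800 parts of 64 MiB each.
console.log(planMultipartUpload(50 * 1024 * MiB));
```

Each part would then be uploaded through its own pre-signed URL, so a failed part can be retried without restarting the whole file.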
MediaConvert Configuration
{
  "Role": "arn:aws:iam::account:role/MediaConvertRole",
  "Settings": {
    "OutputGroups": [
      {
        "Name": "HLS",
        "OutputGroupSettings": {
          "Type": "HLS_GROUP_SETTINGS",
          "HlsGroupSettings": {
            "Destination": "s3://video-output-bucket/hls/",
            "SegmentLength": 6,
            "MinSegmentLength": 1
          }
        },
        "Outputs": [
          {
            "VideoDescription": {
              "CodecSettings": {
                "Codec": "H_264",
                "H264Settings": {
                  "Bitrate": 5000000,
                  "FramerateControl": "SPECIFIED",
                  "FramerateNumerator": 30,
                  "FramerateDenominator": 1
                }
              }
            },
            "AudioDescriptions": [
              {
                "CodecSettings": {
                  "Codec": "AAC",
                  "AacSettings": {
                    "Bitrate": 128000,
                    "SampleRate": 48000
                  }
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
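The configuration above defines a single 5 Mbps output, while the pipeline calls for renditions from 360p to 4K. A minimal sketch of generating one output per rung of a bitrate ladder follows. The ladder values are illustrative assumptions, and the extra fields (`NameModifier`, `ContainerSettings`, `Width`, `Height`, `RateControlMode`) should be checked against the current MediaConvert job settings reference before use.

```javascript
// Hypothetical ladder: resolutions and bitrates are illustrative only.
const LADDER = [
  { name: '360p', width: 640, height: 360, bitrate: 800000 },
  { name: '720p', width: 1280, height: 720, bitrate: 3000000 },
  { name: '1080p', width: 1920, height: 1080, bitrate: 5000000 },
  { name: '2160p', width: 3840, height: 2160, bitrate: 16000000 },
];

// Build one HLS output per rendition, mirroring the shape used above.
function buildHlsOutputs(ladder) {
  return ladder.map((r) => ({
    NameModifier: `_${r.name}`,
    ContainerSettings: { Container: 'M3U8' },
    VideoDescription: {
      Width: r.width,
      Height: r.height,
      CodecSettings: {
        Codec: 'H_264',
        H264Settings: {
          RateControlMode: 'CBR',
          Bitrate: r.bitrate,
          FramerateControl: 'SPECIFIED',
          FramerateNumerator: 30,
          FramerateDenominator: 1,
        },
      },
    },
    AudioDescriptions: [
      { CodecSettings: { Codec: 'AAC', AacSettings: { Bitrate: 128000, SampleRate: 48000 } } },
    ],
  }));
}

console.log(JSON.stringify(buildHlsOutputs(LADDER), null, 2));
```

The resulting array would replace the `Outputs` list of the HLS output group, giving the player several renditions to switch between.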
📡 Live Streaming Architecture
Live Event Pipeline
Live Streaming Flow:
1. Input Sources:
- RTMP/RTP streams
- SDI/HDMI cameras
- Mobile apps
2. MediaLive Processing:
- Real-time encoding
- Multi-bitrate outputs
- Redundant pipelines
3. MediaPackage Delivery:
- Just-in-time packaging
- DRM protection
- DVR functionality
4. CloudFront Distribution:
- Global edge caching
- Viewer geolocation
- Real-time metrics
MediaLive Channel Configuration
{
  "Name": "LiveSportsChannel",
  "InputSpecification": {
    "Codec": "AVC",
    "MaximumBitrate": "MAX_20_MBPS",
    "Resolution": "HD"
  },
  "EncoderSettings": {
    "VideoDescriptions": [
      {
        "Name": "HD1080",
        "CodecSettings": {
          "H264Settings": {
            "Bitrate": 5000000,
            "FramerateControl": "SPECIFIED",
            "FramerateNumerator": 30,
            "FramerateDenominator": 1
          }
        },
        "Height": 1080,
        "Width": 1920
      },
      {
        "Name": "HD720",
        "CodecSettings": {
          "H264Settings": {
            "Bitrate": 3000000,
            "FramerateControl": "SPECIFIED",
            "FramerateNumerator": 30,
            "FramerateDenominator": 1
          }
        },
        "Height": 720,
        "Width": 1280
      }
    ],
    "OutputGroups": [
      {
        "Name": "HLSOutput",
        "OutputGroupSettings": {
          "HlsGroupSettings": {
            "Destination": {
              "DestinationRefId": "hlsOutput"
            },
            "SegmentLength": 6,
            "PlaylistType": "EVENT"
          }
        }
      }
    ]
  }
}
🌐 Global Content Delivery
CloudFront Configuration
CloudFront Distribution:
Origins:
- S3 Bucket: Static video files
- MediaPackage: Live streams
- ALB: API endpoints
Behaviors:
/api/*:
- Origin: ALB
- Cache: None
- Compress: Yes
/live/*:
- Origin: MediaPackage
- Cache: Custom (30 seconds for segments; live manifests need a TTL shorter than the 6-second segment length)
- Viewer Protocol: HTTPS only
/videos/*:
- Origin: S3
- Cache: 1 year
- Compress: Yes
- Origin Shield: Yes
Security:
- WAF Integration
- Signed URLs for premium content
- Geo-restriction for licensing
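CloudFront evaluates cache behaviors in the order they are listed and applies the first path pattern that matches, falling back to the default behavior. The sketch below models that lookup for the three behaviors above. The default (`*`) entry and its TTL are assumptions added for completeness.

```javascript
// Convert a CloudFront-style path pattern (* and ? wildcards) to a RegExp.
function patternToRegExp(pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

// Behaviors in evaluation order; the last entry is the assumed default.
const behaviors = [
  { pattern: '/api/*', origin: 'ALB', ttlSeconds: 0 },
  { pattern: '/live/*', origin: 'MediaPackage', ttlSeconds: 30 },
  { pattern: '/videos/*', origin: 'S3', ttlSeconds: 31536000 },
  { pattern: '*', origin: 'S3', ttlSeconds: 86400 },
];

// Return the first behavior whose pattern matches the request path.
function matchBehavior(path) {
  return behaviors.find((b) => patternToRegExp(b.pattern).test(path));
}

console.log(matchBehavior('/videos/show-1/master.m3u8').origin); // S3
console.log(matchBehavior('/api/users').ttlSeconds); // 0
```

Because the first match wins, more specific patterns need to be listed before broader ones.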
Edge Computing with Lambda@Edge
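// Viewer request function
// Note: CloudFront adds CloudFront-Viewer-Country after the viewer request
// stage, so the geo check only takes effect when this handler runs on the
// origin request trigger with that header forwarded. It is skipped when the
// header is absent.
exports.handler = (event, context, callback) => {
  const request = event.Records[0].cf.request;
  const headers = request.headers;

  // Read a header value safely (CloudFront lowercases header names)
  const getHeader = (name) =>
    headers[name] && headers[name][0] ? headers[name][0].value : '';

  // Device detection
  const userAgent = getHeader('user-agent');
  const isMobile = /Mobile|Android|iPhone/.test(userAgent);

  // Geographic content restriction
  const country = getHeader('cloudfront-viewer-country');
  const restrictedCountries = ['CN', 'RU', 'IR'];
  if (restrictedCountries.includes(country)) {
    const response = {
      status: '403',
      statusDescription: 'Forbidden',
      body: 'Content not available in your region'
    };
    callback(null, response);
    return;
  }

  // Adaptive URL rewriting (path matches the /videos/* cache behavior)
  if (isMobile && request.uri.includes('/videos/')) {
    request.uri = request.uri.replace('/videos/', '/videos/mobile/');
  }

  callback(null, request);
};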
// Origin response function for caching
// (deployed as a separate Lambda@Edge function from the one above)
exports.handler = (event, context, callback) => {
  const response = event.Records[0].cf.response;
  const headers = response.headers;

  // Add custom cache headers: 1 day in browsers, 1 year at the edge
  headers['cache-control'] = [
    {
      key: 'Cache-Control',
      value: 'public, max-age=86400, s-maxage=31536000'
    }
  ];

  // Add security headers
  headers['strict-transport-security'] = [
    {
      key: 'Strict-Transport-Security',
      value: 'max-age=31536000; includeSubDomains'
    }
  ];

  callback(null, response);
};
👤 User Management & Authentication
Cognito User Pool Configuration
User Authentication:
Cognito User Pool:
- Email/Username login
- MFA support
- Social identity providers (Google, Facebook)
- Custom attributes (subscription_tier, preferences)
User Journey:
1. Registration/Login → Cognito
2. JWT Token → API Gateway
3. Authorization → Lambda
4. Content Access → Signed URLs
Subscription Management:
- Free tier: Ads supported
- Premium tier: Ad-free, 4K content
- Family tier: Multiple profiles
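Step 4 of the user journey hands the player a signed URL. The sketch below builds a CloudFront signed URL with a canned policy using only Node's `crypto` module. The function name, the example domain, and the key pair ID are placeholders; in practice the private key would be loaded from a secret store rather than passed around in code.

```javascript
const crypto = require('crypto');

// Build a CloudFront signed URL using a canned policy.
// keyPairId and privateKeyPem are placeholders supplied by the caller.
function signCloudFrontUrl(url, expiresEpochSeconds, keyPairId, privateKeyPem) {
  // Canned policy: compact JSON with no whitespace.
  const policy = JSON.stringify({
    Statement: [
      {
        Resource: url,
        Condition: { DateLessThan: { 'AWS:EpochTime': expiresEpochSeconds } },
      },
    ],
  });

  const signature = crypto
    .createSign('RSA-SHA1')
    .update(policy)
    .sign(privateKeyPem, 'base64')
    // CloudFront's URL-safe base64 substitutions
    .replace(/\+/g, '-')
    .replace(/=/g, '_')
    .replace(/\//g, '~');

  const sep = url.includes('?') ? '&' : '?';
  return `${url}${sep}Expires=${expiresEpochSeconds}&Signature=${signature}&Key-Pair-Id=${keyPairId}`;
}

// Example with a throwaway key generated on the spot (demo only).
const demoKey = crypto
  .generateKeyPairSync('rsa', { modulusLength: 2048 })
  .privateKey.export({ type: 'pkcs1', format: 'pem' });
const expires = Math.floor(Date.now() / 1000) + 3600; // valid for one hour
console.log(signCloudFrontUrl('https://cdn.example.com/videos/a/master.m3u8', expires, 'KEXAMPLE', demoKey));
```

A short expiry limits how long a leaked URL stays usable. The authorization Lambda would issue the URL only after checking the user's subscription tier.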
API Gateway with JWT Authorization
API Gateway Configuration:
Authorizers:
- Cognito User Pool
- Custom Lambda authorizer
Resources:
/users:
- GET: User profile
- PUT: Update preferences
- POST: Subscription management
/content:
- GET: Content catalog
- POST: Search and recommendations
/streaming:
- GET: Streaming URLs
- POST: Playback analytics
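A custom Lambda authorizer has to verify the token's signature and claims before allowing a request through. The sketch below shows the core check for an RS256 token using Node's `crypto` module. It assumes the public key has already been resolved (for Cognito, from the user pool's JWKS by the token's `kid`), and `verifyJwt` is a hypothetical helper, not a library API.

```javascript
const crypto = require('crypto');

// Decode one base64url-encoded JWT segment into an object.
const decodeSegment = (part) => JSON.parse(Buffer.from(part, 'base64url').toString('utf8'));

// Verify an RS256 JWT against a known public key and check basic claims.
function verifyJwt(token, publicKey, { issuer, nowSeconds = Math.floor(Date.now() / 1000) }) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Malformed token');
  const [headerB64, payloadB64, signatureB64] = parts;

  // Reject anything other than the expected algorithm.
  const header = decodeSegment(headerB64);
  if (header.alg !== 'RS256') throw new Error('Unexpected algorithm');

  // The signature covers "<header>.<payload>" exactly as transmitted.
  const valid = crypto.verify(
    'RSA-SHA256',
    Buffer.from(`${headerB64}.${payloadB64}`),
    publicKey,
    Buffer.from(signatureB64, 'base64url')
  );
  if (!valid) throw new Error('Bad signature');

  const claims = decodeSegment(payloadB64);
  if (claims.exp <= nowSeconds) throw new Error('Token expired');
  if (claims.iss !== issuer) throw new Error('Unexpected issuer');
  return claims;
}
```

When the built-in Cognito User Pool authorizer is used, API Gateway performs this validation itself. The custom authorizer is mainly useful for extra checks such as subscription tier.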
📊 Data Architecture
Database Strategy
DynamoDB Tables:
Users:
PK: UserID
Attributes: profile, subscription, preferences
GSI: email, subscription_tier
Content:
PK: ContentID
SK: Version
Attributes: metadata, encoding_status, analytics
GSI: genre, release_date, popularity
ViewingSessions:
PK: UserID
SK: SessionTimestamp
Attributes: content_id, duration, quality, location
TTL: 90 days
Recommendations:
PK: UserID
SK: ContentID
Attributes: score, generated_at, viewed
TTL: 30 days
RDS (PostgreSQL):
Content Management:
- Content metadata
- User subscriptions
- Financial transactions
- Reporting and analytics
Read Replicas:
- Analytics queries
- Reporting dashboards
- Data warehouse ETL
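DynamoDB TTL works off a Number attribute holding a Unix epoch timestamp in seconds, so the 90-day retention on ViewingSessions has to be computed when the item is written. A minimal sketch follows. The attribute name `expires_at` and the helper name are assumptions; the key names follow the table design above.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Build a ViewingSessions item whose TTL attribute is 90 days after start.
function buildViewingSession(userId, contentId, startedAt = new Date(), ttlDays = 90) {
  const startedEpoch = Math.floor(startedAt.getTime() / 1000);
  return {
    UserID: userId,                            // partition key
    SessionTimestamp: startedAt.toISOString(), // sort key
    content_id: contentId,
    expires_at: startedEpoch + ttlDays * SECONDS_PER_DAY, // hypothetical TTL attribute
  };
}

console.log(buildViewingSession('user-1', 'content-42', new Date('2024-01-01T00:00:00Z')));
```

TTL deletion runs in the background, so expired items can linger for a while after the timestamp passes. Queries that must exclude them should filter on the attribute as well.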
Real-time Analytics
Kinesis Data Streams:
Player Events:
- Play/pause/seek events
- Quality changes
- Buffering metrics
- Error tracking
Processors:
- Kinesis Analytics: Real-time metrics
- Lambda: Event processing
- Elasticsearch: Log aggregation
- Redshift: Data warehousing
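For the Lambda processor, Kinesis delivers each record's payload base64-encoded under `record.kinesis.data`. The sketch below decodes a batch and counts events by type. The event payload shape (`{ type: ... }`) is an assumption about what the player sends.

```javascript
// Decode a batch of Kinesis records and count player events by type.
function countEventTypes(records) {
  const counts = {};
  for (const record of records) {
    const json = Buffer.from(record.kinesis.data, 'base64').toString('utf8');
    const playerEvent = JSON.parse(json); // assumed shape: { type: 'play' | 'seek' | ... }
    counts[playerEvent.type] = (counts[playerEvent.type] || 0) + 1;
  }
  return counts;
}

// Lambda entry point for a Kinesis event source mapping.
exports.handler = async (event) => countEventTypes(event.Records);
```

A real processor would also handle malformed records, since an unhandled exception causes Lambda to retry the whole batch.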
🤖 AI/ML Integration
Content Recommendation Engine
Recommendation Pipeline:
Data Sources:
- Viewing history (DynamoDB)
- Content metadata (RDS)
- User preferences (Cognito)
- Real-time events (Kinesis)
ML Models:
- SageMaker: Collaborative filtering
- Personalize: Real-time recommendations
- Comprehend: Content categorization
- Rekognition: Video analysis
Recommendation Types:
- Trending content
- Similar content
- Personalized picks
- Continue watching
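As an illustration of the "similar content" idea, the sketch below ranks titles by cosine similarity between their viewer vectors (which users watched each title, and how much). This is a toy stand-in for item-based collaborative filtering, not the SageMaker or Personalize pipeline itself; the data shape and function names are assumptions.

```javascript
// Cosine similarity between two sparse vectors (userId -> watch score).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [user, score] of Object.entries(a)) {
    normA += score * score;
    if (user in b) dot += score * b[user];
  }
  for (const score of Object.values(b)) normB += score * score;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Rank every other title by similarity to the target title.
function similarContent(targetId, vectors, topN = 3) {
  return Object.keys(vectors)
    .filter((id) => id !== targetId)
    .map((id) => ({ id, score: cosineSimilarity(vectors[targetId], vectors[id]) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topN);
}

const vectors = {
  drama1: { u1: 1, u2: 1 },
  drama2: { u1: 1, u2: 1 },
  sports1: { u3: 1 },
};
console.log(similarContent('drama1', vectors));
```

Titles watched by the same audience score close to 1, while titles with no shared viewers score 0.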
Content Moderation
Automated Moderation:
Rekognition Video:
- Explicit content detection
- Violence and unsafe content
- Celebrity recognition
- Text in video analysis
Transcribe + Comprehend:
- Audio transcription
- Sentiment analysis
- Inappropriate language detection
- Content categorization
Custom Models:
- Brand-specific rules
- Cultural sensitivity
- Age-appropriate content
- Copyright detection
📈 Monitoring & Analytics
Business Intelligence Dashboard
CloudWatch Dashboards:
Real-time Metrics:
- Concurrent viewers
- Stream quality metrics
- Geographic distribution
- Device breakdown
Business KPIs:
- New subscriptions
- Churn rate
- Content popularity
- Revenue metrics
QuickSight Analytics:
- Executive dashboards
- Content performance reports
- User engagement analysis
- Financial reporting
Performance Monitoring
Application Monitoring:
- API Gateway metrics
- Lambda function performance
- DynamoDB throttling
- MediaLive stream health
Player Analytics:
- Startup time
- Buffering ratio
- Video quality distribution
- Error rates by region
Cost Monitoring:
- CloudFront bandwidth costs
- MediaConvert processing costs
- Storage costs by tier
- Compute cost optimization
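Buffering ratio is commonly computed as the time spent stalled divided by total session time (playing plus stalled). A minimal sketch under that definition follows. The session field names are assumptions about what the player reports.

```javascript
// Buffering ratio across a set of sessions:
// stalled time / (playing time + stalled time).
function bufferingRatio(sessions) {
  let stalledMs = 0;
  let totalMs = 0;
  for (const s of sessions) {
    stalledMs += s.bufferingMs;
    totalMs += s.bufferingMs + s.playingMs;
  }
  return totalMs === 0 ? 0 : stalledMs / totalMs;
}

console.log(bufferingRatio([{ playingMs: 9000, bufferingMs: 1000 }])); // 0.1
```

Computing the ratio from summed durations, rather than averaging per-session ratios, keeps very short sessions from skewing the regional figures.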
🔒 Security & DRM
Content Protection
DRM Implementation:
PlayReady: Microsoft ecosystem
Widevine: Google/Android
FairPlay: Apple ecosystem
Key Management:
- AWS KMS for encryption keys
- Secure key rotation
- Multi-tenancy support
License Server:
- Custom Lambda functions
- Integration with DRM providers
- User entitlement validation
Security Best Practices
Network Security:
- VPC with private subnets
- Security groups with restrictive rules
- WAF for application protection
- Shield Advanced for DDoS
Application Security:
- API rate limiting
- Input validation
- Signed URLs for content
- JWT token validation
Data Protection:
- Encryption at rest (S3, RDS, DynamoDB)
- Encryption in transit (TLS/SSL)
- Field-level encryption
- PII data anonymization
💰 Cost Optimization
Storage Optimization
S3 Lifecycle Policies:
- Standard: Active content (0-30 days)
- IA: Less popular content (30-90 days)
- Glacier: Archive content (90+ days)
- Deep Archive: Long-term storage
Content Optimization:
- Intelligent tiering
- Regional replication strategy
- Compression optimization
- Duplicate content detection
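The lifecycle tiers above map onto an S3 lifecycle configuration with one transition per tier. The sketch below follows the shape of the `PutBucketLifecycleConfiguration` request. The rule ID, the `videos/` prefix, and the 365-day Deep Archive threshold are assumptions, since the source does not give a day count for that tier.

```javascript
// Lifecycle rules mirroring the tiers listed above.
const lifecycleConfiguration = {
  Rules: [
    {
      ID: 'video-tiering',          // hypothetical rule name
      Status: 'Enabled',
      Filter: { Prefix: 'videos/' }, // assumed key prefix
      Transitions: [
        { Days: 30, StorageClass: 'STANDARD_IA' },
        { Days: 90, StorageClass: 'GLACIER' },
        { Days: 365, StorageClass: 'DEEP_ARCHIVE' }, // assumed threshold
      ],
    },
  ],
};

// Which storage class an object of a given age falls into under these rules.
function storageClassForAge(ageDays, transitions = lifecycleConfiguration.Rules[0].Transitions) {
  let current = 'STANDARD';
  for (const t of transitions) {
    if (ageDays >= t.Days) current = t.StorageClass;
  }
  return current;
}

console.log(storageClassForAge(45)); // STANDARD_IA
```

Age-based rules ignore popularity, which is where Intelligent-Tiering can be a better fit for titles whose demand is unpredictable.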
Compute Cost Management
ECS/Fargate Optimization:
- Fargate Spot for interruption-tolerant batch processing
- Compute Savings Plans for predictable workloads
- Auto-scaling based on demand
- Right-sizing containers
Lambda Optimization:
- Memory optimization
- Provisioned concurrency for critical functions
- Step Functions for orchestration
- EventBridge for event routing
🚀 Scalability Patterns
Auto-scaling Strategy
Application Scaling:
ECS Services:
- CPU/Memory based scaling
- Custom metrics (concurrent users)
- Scheduled scaling for events
Database Scaling:
- DynamoDB on-demand mode
- RDS read replicas
- ElastiCache cluster scaling
CDN Scaling:
- CloudFront automatic scaling
- Origin Shield optimization
- Edge location utilization
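Scaling on a custom metric such as concurrent users per task comes down to proportional math: grow or shrink the task count by the ratio of the observed metric to its target. The sketch below approximates that calculation. It is a simplification rather than the exact Application Auto Scaling algorithm, and the min/max bounds are assumed values.

```javascript
// Approximate target-tracking math: scale the task count in proportion to
// how far the observed per-task metric is from its target.
function desiredTaskCount(currentTasks, usersPerTask, targetUsersPerTask, { min = 2, max = 1000 } = {}) {
  const raw = Math.ceil(currentTasks * (usersPerTask / targetUsersPerTask));
  return Math.min(max, Math.max(min, raw));
}

console.log(desiredTaskCount(10, 1500, 1000)); // 15
console.log(desiredTaskCount(10, 100, 1000));  // 2 (clamped to the minimum)
```

For scheduled events such as live sports, raising the minimum ahead of time avoids waiting for the metric to breach its target.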
Global Expansion
Multi-Region Strategy:
Primary Region: us-east-1
Secondary Regions: eu-west-1, ap-southeast-1
Content Strategy:
- Global content replication
- Regional content libraries
- Localized recommendations
- Compliance with local regulations
📖 Key Takeaways
Architecture Principles
- Microservices: Independent scaling and deployment
- Event-driven: Asynchronous processing
- CDN-first: Global content delivery optimization
- Serverless: Reduced operational overhead
- Multi-region: Global availability and compliance
Technical Decisions
- Storage: S3 with Intelligent-Tiering for cost optimization
- Compute: A mix of ECS Fargate and Lambda for different workloads
- Database: DynamoDB for scale, RDS for complex queries
- CDN: CloudFront with edge computing capabilities
- Streaming: AWS Media Services for professional-grade video
This platform demonstrates comprehensive use of AWS services to build a scalable, global media streaming solution with enterprise-grade features.