MongoDB - Pros and Cons
✅ Advantages
1. 🚀 Schema Flexibility
- Dynamic Schema: No need to define a schema up front
- Easy Evolution: New fields can be added easily
- Mixed Data Types: Documents with different shapes and field types can live in the same collection
- Rapid Prototyping: Faster development
// Documents with different structures can be inserted into the same collection
db.users.insertMany([
{
"name": "John",
"email": "john@example.com"
},
{
"name": "Alice",
"email": "alice@example.com",
"age": 25,
"preferences": {
"newsletter": true,
"theme": "dark"
}
}
])
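Because documents in one collection can differ in shape, the code that reads them usually has to tolerate missing fields. A minimal sketch of that idea in plain JavaScript, assuming a hypothetical helper `normalizeUser` and default values picked only for illustration (neither comes from MongoDB itself):

```javascript
// Hypothetical defaults for fields that some documents may lack.
const DEFAULT_PREFERENCES = { newsletter: false, theme: "light" };

// Returns a copy of a user document with the optional fields filled in.
function normalizeUser(doc) {
  return {
    name: doc.name,
    email: doc.email,
    age: doc.age ?? null, // not every document has an age
    preferences: { ...DEFAULT_PREFERENCES, ...(doc.preferences ?? {}) },
  };
}

// The same two documents as in the insertMany call above.
const users = [
  { name: "John", email: "john@example.com" },
  {
    name: "Alice",
    email: "alice@example.com",
    age: 25,
    preferences: { newsletter: true, theme: "dark" },
  },
].map(normalizeUser);

console.log(users);
```

The flexibility moves validation from the database into the application unless schema validation rules are configured on the collection.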
2. 📄 Document-Oriented Storage
- Natural Object Mapping: Documents map directly to application objects
- Embedded Documents: Fewer joins, which often improves read performance
- Rich Data Types: Arrays, nested objects, dates
- JSON-like Structure: Easy to read and work with
// Complex nested structure
{
"_id": ObjectId("..."),
"name": "E-commerce Order",
"customer": {
"name": "John Doe",
"email": "john@example.com",
"address": {
"street": "123 Main St",
"city": "New York",
"zipCode": "10001"
}
},
"items": [
{
"productId": "prod1",
"name": "Laptop",
"price": 999.99,
"quantity": 1
},
{
"productId": "prod2",
"name": "Mouse",
"price": 29.99,
"quantity": 2
}
],
"totalAmount": 1059.97,
"orderDate": ISODate("2024-01-15T10:30:00Z")
}
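A derived field such as `totalAmount` is stored next to the `items` it is computed from, so the application has to keep the two consistent. A small sketch of that check, assuming a hypothetical helper `orderTotal` and summing in integer cents to sidestep floating-point drift:

```javascript
// Items copied from the order document above.
const items = [
  { productId: "prod1", name: "Laptop", price: 999.99, quantity: 1 },
  { productId: "prod2", name: "Mouse", price: 29.99, quantity: 2 },
];

// Sums price * quantity in integer cents, then converts back to currency units.
function orderTotal(orderItems) {
  const cents = orderItems.reduce(
    (sum, item) => sum + Math.round(item.price * 100) * item.quantity,
    0
  );
  return cents / 100;
}

console.log(orderTotal(items)); // 1059.97
```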
3. 🔍 Powerful Query Language
- Rich Query Operators: $and, $or, $in, $regex, etc.
- Aggregation Framework: Complex analytics and data processing
- Indexing Support: Compound, partial, text, geospatial indexes
- Full-text Search: Built-in text search capabilities
// Complex aggregation pipeline
db.sales.aggregate([
{
$match: {
"date": { $gte: ISODate("2024-01-01") },
"status": "completed"
}
},
{
$group: {
"_id": {
"month": { $month: "$date" },
"category": "$productCategory"
},
"totalSales": { $sum: "$amount" },
"averageOrder": { $avg: "$amount" }
}
},
{
$sort: { "totalSales": -1 }
}
])
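To make the stages concrete, here is roughly what the pipeline above computes, written as plain JavaScript over an in-memory array. The sample data and the function name `summarizeSales` are made up for illustration; the real work happens inside the server, not in application code.

```javascript
// Made-up sample data; field names follow the pipeline above.
const sales = [
  { date: new Date("2024-01-05T00:00:00Z"), status: "completed", productCategory: "books", amount: 20 },
  { date: new Date("2024-01-20T00:00:00Z"), status: "completed", productCategory: "books", amount: 40 },
  { date: new Date("2024-02-02T00:00:00Z"), status: "completed", productCategory: "games", amount: 100 },
  { date: new Date("2024-02-03T00:00:00Z"), status: "pending", productCategory: "games", amount: 500 },
  { date: new Date("2023-12-31T00:00:00Z"), status: "completed", productCategory: "books", amount: 70 },
];

function summarizeSales(docs, since) {
  const groups = new Map();
  for (const doc of docs) {
    // $match: keep completed sales on or after the cutoff date
    if (doc.date < since || doc.status !== "completed") continue;
    // $group key: $month works in UTC unless a timezone is given
    const month = doc.date.getUTCMonth() + 1;
    const key = `${month}|${doc.productCategory}`;
    const group = groups.get(key) ?? {
      _id: { month, category: doc.productCategory },
      totalSales: 0,
      count: 0,
    };
    group.totalSales += doc.amount; // $sum
    group.count += 1;
    groups.set(key, group);
  }
  return [...groups.values()]
    .map(({ _id, totalSales, count }) => ({
      _id,
      totalSales,
      averageOrder: totalSales / count, // $avg
    }))
    .sort((a, b) => b.totalSales - a.totalSales); // $sort: { totalSales: -1 }
}

const summary = summarizeSales(sales, new Date("2024-01-01T00:00:00Z"));
console.log(summary);
```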
4. 📈 Horizontal Scalability
- Automatic Sharding: Distributes data across multiple machines
- Linear Scaling: Add nodes to increase capacity (close to linear only when the shard key spreads load evenly)
- Load Distribution: Automatic balancing across shards
- High Availability: Replica sets with automatic failover
// Sharding setup
sh.enableSharding("ecommerce")
sh.shardCollection("ecommerce.orders", { "customerId": "hashed" })
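The reason for a hashed shard key can be sketched without a cluster. The simulation below assumes monotonically increasing numeric ids, four shards, and range chunks whose boundaries were fixed earlier; it also uses FNV-1a as a stand-in hash, which is not the function MongoDB uses internally. All names are hypothetical.

```javascript
const SHARDS = 4;

// FNV-1a (32-bit), used only as a stand-in. MongoDB's hashed indexes use a
// different internal hash function.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Range placement: chunk boundaries were set while ids ran from 1 to 1000,
// so every id above 1000 lands in the last chunk (no re-splitting simulated).
function rangeShard(id) {
  return Math.min(SHARDS - 1, Math.floor((id - 1) / 250));
}

// Hashed placement: the shard depends on the hash of the id, not its order.
function hashedShard(id) {
  return fnv1a(String(id)) % SHARDS;
}

function countPerShard(ids, place) {
  const counts = new Array(SHARDS).fill(0);
  for (const id of ids) counts[place(id)] += 1;
  return counts;
}

// 1000 new, monotonically increasing ids: 1001..2000
const newIds = Array.from({ length: 1000 }, (_, i) => 1001 + i);
const rangeCounts = countPerShard(newIds, rangeShard);
const hashedCounts = countPerShard(newIds, hashedShard);

console.log("range :", rangeCounts);
console.log("hashed:", hashedCounts);
```

With range placement all new writes hit one shard, while hashing spreads them out. The trade-off is that range queries on a hashed key have to be sent to every shard.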
5. ⚡ Performance
- Fast Reads: Optimized for read-heavy workloads
- In-Memory Cache: The WiredTiger storage engine keeps frequently used data in its cache and compresses data on disk
- Index Optimization: The query planner selects an index automatically
- Connection Pooling: Efficient connection management
6. 🛠️ Developer Experience
- Easy to Learn: Familiar JSON-like syntax
- Rich Ecosystem: Extensive drivers and tools
- Active Community: Large community support
- Cloud Integration: MongoDB Atlas, plus MongoDB-compatible services such as Amazon DocumentDB
7. 🔄 Replication & HA
- Replica Sets: Automatic failover and data redundancy
- Read Scaling: Reads can be served from secondary nodes
- Oplog: Powers change streams for real-time updates
- Geographic Distribution: Cross-datacenter replication
❌ Disadvantages
1. 💾 Memory Usage
- High Memory Requirements: The working set should fit in RAM for good performance
- Index Overhead: Indexes consume significant memory
- Document Overhead: The BSON format adds per-document overhead
- Memory Pressure: Large datasets can cause cache eviction pressure and slower queries
// Memory monitoring
db.serverStatus().mem
db.serverStatus().wiredTiger.cache
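The cache statistics are easier to read as percentages. A minimal sketch, assuming a hypothetical helper `cacheUsage` that receives the object returned by `db.serverStatus().wiredTiger.cache`; the sample numbers are made up, and the field names should be checked against the output of your server version:

```javascript
// Summarizes WiredTiger cache statistics as percentages of the configured maximum.
function cacheUsage(cache) {
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  const dirty = cache["tracked dirty bytes in the cache"];
  return {
    usedPct: (used / max) * 100,
    dirtyPct: (dirty / max) * 100,
  };
}

// Made-up numbers for illustration, not real measurements.
const sampleCache = {
  "bytes currently in the cache": 6 * 1024 ** 3, // 6 GiB
  "maximum bytes configured": 8 * 1024 ** 3, // 8 GiB
  "tracked dirty bytes in the cache": 512 * 1024 ** 2, // 512 MiB
};

console.log(cacheUsage(sampleCache)); // { usedPct: 75, dirtyPct: 6.25 }
// In mongosh: cacheUsage(db.serverStatus().wiredTiger.cache)
```

A cache that stays close to full while the dirty share keeps growing is a hint that the working set no longer fits in memory.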
2. 🔗 ACID Limitations
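- Single Document ACID: Single-document writes are atomic; multi-document transactions (available since MongoDB 4.0) have a performance cost
- Eventual Consistency on Secondaries: Replication is asynchronous, so reads from secondaries can return stale data
- Transaction Overhead: Multi-document transactions add latency and by default are aborted after 60 seconds
- Limited Isolation: The default read concern ("local") can return data that is not yet majority-committed and may later be rolled back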
// Transaction performance impact (mongosh)
const session = db.getMongo().startSession()
session.startTransaction() // Extra bookkeeping for the session and transaction state
// ... operations
session.commitTransaction() // Waits for the commit to be acknowledged (network round trips)
3. 🔀 Joins and Relationships
- Limited Joins: $lookup provides left outer joins but can be expensive on large collections
- Denormalization Encouraged: Data modeling often duplicates data across documents
- Complex Relationships: Difficult to model complex relations
- Referential Integrity: No foreign key constraints
// Expensive $lookup operation
db.orders.aggregate([
{
$lookup: {
from: "customers", // Expensive operation
localField: "customerId",
foreignField: "_id",
as: "customer"
}
}
])
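One common way to reduce the cost is to filter before joining, so that `$lookup` runs on fewer documents. A sketch of a pipeline builder, assuming a hypothetical `status` field on orders and a hypothetical function name `buildOrderPipeline`:

```javascript
// Builds a pipeline that filters orders first and joins customers afterwards.
// "status" is a hypothetical field, used only for illustration.
function buildOrderPipeline(status) {
  return [
    { $match: { status } }, // shrink the input before the join
    {
      $lookup: {
        from: "customers",
        localField: "customerId",
        foreignField: "_id", // _id always has an index
        as: "customer",
      },
    },
    { $unwind: "$customer" }, // one joined customer per order
  ];
}

const pipeline = buildOrderPipeline("completed");
console.log(JSON.stringify(pipeline, null, 2));
// In mongosh: db.orders.aggregate(pipeline)
```

When the join targets a field other than `_id`, an index on that foreign field matters even more, since the lookup is evaluated once per input document.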
4. 🔍 Query Limitations
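- Limited SQL Features: No native SQL; window functions exist only as the $setWindowFields stage (MongoDB 5.0+), and there are no CTEs
- Complex Analytics: Often a weaker fit for complex ad hoc reporting than a SQL data warehouse
- Aggregation Memory Limits: 100 MB of RAM per pipeline stage unless the stage is allowed to spill to disk (allowDiskUse)
- Collection Scans: Queries without a supporting index scan the whole collection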
5. 📏 Storage Overhead
- Field Name Duplication: Field names stored in every document
- BSON Overhead: Binary format adds size
- Fragmentation: Deletes and document growth can leave unused space in data files until it is reused or compacted
- Index Size: Multiple indexes increase storage requirements
// Storage overhead example
{
"very_long_field_name_that_gets_repeated": "value1", // Field name stored
"another_very_long_field_name": "value2", // in every document
"yet_another_long_field_name": "value3" // causing overhead
}
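The per-document cost of field names can be estimated from the BSON layout: a document is a 4-byte length, its elements, and a trailing zero byte, and a string element is a type byte, the field name as a null-terminated string, a 4-byte length, and the null-terminated value. The helper `bsonSizeOfStringDoc` below is a hypothetical sketch that handles only flat documents with string values and ignores on-disk compression:

```javascript
// Size in bytes of a flat BSON document whose values are all strings.
function bsonSizeOfStringDoc(doc) {
  let size = 4 + 1; // int32 document length + trailing 0x00
  for (const [key, value] of Object.entries(doc)) {
    size += 1; // element type byte
    size += Buffer.byteLength(key, "utf8") + 1; // field name + 0x00
    size += 4 + Buffer.byteLength(value, "utf8") + 1; // int32 length + value + 0x00
  }
  return size;
}

const longNames = {
  very_long_field_name_that_gets_repeated: "value1",
  another_very_long_field_name: "value2",
  yet_another_long_field_name: "value3",
};
const shortNames = { a: "value1", b: "value2", c: "value3" };

const longSize = bsonSizeOfStringDoc(longNames);
const shortSize = bsonSizeOfStringDoc(shortNames);
console.log({ longSize, shortSize, savedPerDocument: longSize - shortSize });
```

Short names trade readability for space, and WiredTiger block compression usually recovers part of the difference on disk, so the saving matters mostly for very large collections.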
6. 🔧 Operational Complexity
- Sharding Complexity: Shard key selection critical
- Balancing Issues: Chunk migration can impact performance
- Backup Challenges: Point-in-time recovery complexity
- Monitoring Requirements: Need specialized monitoring tools
7. 💰 Enterprise Costs
- Atlas Pricing: Managed service can be expensive
- Enterprise Features: Advanced features require paid license
- Scaling Costs: Horizontal scaling increases infrastructure costs
- Support Costs: Professional support is not cheap
📊 Comparison with Competitors
MongoDB vs MySQL
| Feature | MongoDB | MySQL |
| --- | --- | --- |
| Schema | Flexible | Fixed |
| Scalability | Horizontal | Vertical (mainly) |
| ACID | Single-document by default; multi-document transactions at a cost | Full |
| Joins | $lookup (expensive) | Native (efficient) |
| Learning Curve | Easy | Moderate |
| Analytics | Basic | Good |
MongoDB vs PostgreSQL
| Feature | MongoDB | PostgreSQL |
| --- | --- | --- |
| JSON Support | Native | Good (JSONB) |
| Schema Flexibility | High | Medium |
| Complex Queries | Limited | Excellent |
| Performance | Good reads | Balanced |
| Ecosystem | NoSQL focused | SQL focused |
MongoDB vs Cassandra
| Feature | MongoDB | Cassandra |
| --- | --- | --- |
| Data Model | Document | Column-family |
| Write Performance | Good | Excellent |
| Read Performance | Excellent | Good |
| Consistency | Strong on primary reads; eventual on secondaries; tunable via read/write concerns | Tunable |
| Query Language | Rich | Limited (CQL) |
🎯 When to Choose MongoDB
✅ Suitable For:
- Content Management: Blogs, CMS, catalogs
- Real-time Analytics: Event logging, user tracking
- IoT Applications: Sensor data, time-series data
- Mobile Applications: Offline-first apps
- Rapid Development: Prototyping, agile development
- Flexible Schema: Evolving data models
- Geographic Data: Location-based services
- Caching Layer: Session storage, temporary data
❌ Avoid When:
- Complex Transactions: Banking, financial systems
- Heavy Analytics: Business intelligence, reporting
- Fixed Schema: Well-defined, stable data structures
- Budget Constraints: Cost-sensitive applications
- Small Team: Limited NoSQL expertise
- Regulatory Compliance: Strict ACID requirements
- Complex Relationships: Heavy relational data
💡 Best Practices To Mitigate Weaknesses
1. Schema Design
// Embed related data to avoid joins
{
"_id": ObjectId("..."),
"order": {
"customer": { // Embedded customer info
"name": "John Doe",
"email": "john@example.com"
},
"items": [...], // Embedded order items
"shipping": {...} // Embedded shipping info
}
}
2. Indexing Strategy
// Compound indexes for query patterns
db.users.createIndex({
"status": 1, // Equality first
"createdAt": -1, // Sort second
"age": 1 // Range last
})
// Partial indexes to save space
db.users.createIndex(
{ "email": 1 },
{ partialFilterExpression: { "status": "active" }}
)
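The ordering used in the compound index above (equality, then sort, then range) can be captured in a small helper. `buildEsrIndex` is a hypothetical name; the sketch relies on JavaScript keeping insertion order for non-numeric string keys, which is what makes the resulting key order meaningful.

```javascript
// Builds a compound index specification in Equality, Sort, Range order.
function buildEsrIndex({ equality = [], sort = {}, range = [] }) {
  const keys = {};
  for (const field of equality) keys[field] = 1; // exact-match fields first
  for (const [field, direction] of Object.entries(sort)) keys[field] = direction; // then sort fields
  for (const field of range) keys[field] = 1; // range-filter fields last
  return keys;
}

const indexSpec = buildEsrIndex({
  equality: ["status"],
  sort: { createdAt: -1 },
  range: ["age"],
});

console.log(indexSpec); // { status: 1, createdAt: -1, age: 1 }
// In mongosh: db.users.createIndex(indexSpec)
```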
3. Connection Management
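// Connection pooling (Node.js driver)
const { MongoClient } = require('mongodb');
const uri = 'mongodb://localhost:27017'; // Assumed local deployment; replace with your connection string
const client = new MongoClient(uri, {
  maxPoolSize: 10, // Upper bound on pooled connections
  serverSelectionTimeoutMS: 5000, // Fail fast when no suitable server is found
  socketTimeoutMS: 45000, // Time out sockets that stay inactive for 45 seconds
})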
4. Memory Optimization
// Monitor memory usage
db.serverStatus().wiredTiger.cache
// Use projections to limit data transfer
db.users.find(
{ "status": "active" },
{ "name": 1, "email": 1, "_id": 0 } // Only return needed fields
)
5. Transaction Best Practices
// Keep transactions short (mongosh)
const session = db.getMongo().startSession()
// Collections must be reached through the session to take part in the transaction
const sessionDb = session.getDatabase(db.getName())
try {
  session.startTransaction()
  // Minimize operations in transaction
  sessionDb.collection1.updateOne({...}, {...})
  sessionDb.collection2.insertOne({...})
  session.commitTransaction()
} catch (error) {
  session.abortTransaction()
  throw error
} finally {
  session.endSession()
}
6. Monitoring Setup
// Enable profiling for slow operations (mongosh)
db.setProfilingLevel(1, { slowms: 100 })
// Monitor with mongostat (run from the operating system shell, not mongosh):
// mongostat --host localhost:27017
// Use MongoDB Compass for GUI monitoring
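Once profiling is on, slow operations are recorded in the `system.profile` collection. A minimal sketch of ranking them, assuming a hypothetical helper `slowestOperations` and made-up sample entries that carry the `op`, `ns`, `millis`, and `planSummary` fields:

```javascript
// Returns the n slowest profiler entries, keeping only the fields of interest.
function slowestOperations(profileEntries, n) {
  return [...profileEntries]
    .sort((a, b) => b.millis - a.millis)
    .slice(0, n)
    .map(({ op, ns, millis, planSummary }) => ({ op, ns, millis, planSummary }));
}

// Made-up entries for illustration, not real profiler output.
const sampleProfile = [
  { op: "query", ns: "ecommerce.orders", millis: 120, planSummary: "IXSCAN { customerId: 1 }" },
  { op: "query", ns: "ecommerce.users", millis: 950, planSummary: "COLLSCAN" },
  { op: "update", ns: "ecommerce.orders", millis: 300, planSummary: "IDHACK" },
];

const slowest = slowestOperations(sampleProfile, 2);
console.log(slowest);
// In mongosh: slowestOperations(db.system.profile.find().toArray(), 10)
```

Entries whose plan summary is `COLLSCAN` are usually the first candidates for a new index.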
Understanding MongoDB's strengths and limitations helps you use it effectively and answer interview questions well.