Understanding when to use a document database—and when not to—is perhaps the most valuable knowledge a database practitioner can possess. Technology choices are rarely black and white; the best choice depends on your specific requirements, constraints, and trade-offs.
Document databases have become the default choice for many modern applications, but this popularity has also led to misuse. Not every application benefits from schema flexibility. Not every dataset maps naturally to documents.
In this page, we'll develop a nuanced understanding of document database strengths, examine real-world use cases where they excel, identify anti-patterns to avoid, and build a decision framework for technology selection.
By the end of this page, you will understand: the core strengths that make document databases excel; detailed use cases with schema designs; anti-patterns and when to avoid documents; comparison with relational databases for specific scenarios; a decision framework for technology selection; and migration considerations.
Before examining specific use cases, let's establish the foundational strengths that make document databases compelling:
Schema flexibility: Evolve your data model without migrations. Add fields to new documents without updating existing ones, and store heterogeneous entities in the same collection.
Data locality: Related data embedded together is retrieved in a single read, so common access patterns need no joins.
Horizontal scalability: Sharding distributes data across machines transparently, so reads and writes scale by adding nodes.
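To make the schema-flexibility point concrete, here is a minimal sketch of application code reading two generations of the same entity from one collection. The `loyaltyTier` field and the customer documents are hypothetical, and plain Python dictionaries stand in for documents returned by a driver.

```python
# Two generations of the same entity living in one collection.
# "loyaltyTier" is a hypothetical field added after the first document was written.
customers = [
    {"_id": 1, "name": "Ada"},                           # written before the field existed
    {"_id": 2, "name": "Grace", "loyaltyTier": "gold"},  # written after
]


def loyalty_tier(doc):
    """Return the tier, treating a missing field as the default."""
    return doc.get("loyaltyTier", "none")


tiers = [loyalty_tier(c) for c in customers]
print(tiers)  # ['none', 'gold']
```

The older document is never rewritten; the reader supplies the default instead. The trade-off is that every reader has to know about both shapes.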
Content Management Systems (CMS) are perhaps the quintessential document database use case. The alignment between content and documents is natural and powerful.
1. Heterogeneous Content Types
A CMS manages diverse content: articles, pages, products, events, media. Each has different attributes:
// Article
{
"_id": "content_001",
"type": "article",
"title": "Introduction to MongoDB",
"author": { "name": "Jane Doe", "avatar": "..." },
"body": "...",
"tags": ["database", "tutorial"],
"metadata": {
"readTime": 8,
"wordCount": 2400
}
}
// Product
{
"_id": "content_002",
"type": "product",
"title": "MongoDB Certification",
"price": { "amount": 150, "currency": "USD" },
"variants": [
{ "level": "associate", "price": 150 },
{ "level": "professional", "price": 300 }
],
"features": ["Online proctored", "Valid 3 years"]
}
// Event
{
"_id": "content_003",
"type": "event",
"title": "MongoDB Conference 2024",
"dates": {
"start": ISODate("2024-06-01"),
"end": ISODate("2024-06-03")
},
"location": {
"venue": "Convention Center",
"address": "...",
"coordinates": { "type": "Point", "coordinates": [-122.4, 37.8] }
},
"speakers": ["..."],
"sessions": ["..."]
}
2. Flexible Custom Fields
Content creators need to add custom metadata without developer involvement:
{
"_id": "article_005",
"type": "article",
"customFields": {
"sponsoredContent": true,
"sponsorName": "Acme Corp",
"campaignId": "Q1_2024"
}
}
3. Embedded Rich Media
{
"_id": "article_006",
"body": [
{ "type": "paragraph", "content": "Introduction..." },
{ "type": "image", "src": "...", "caption": "...", "alt": "..." },
{ "type": "code", "language": "javascript", "content": "..." },
{ "type": "quote", "text": "...", "attribution": "..." }
]
}
The key to CMS success with documents:
• Use a type field to distinguish content kinds
• Create type-specific indexes for queries
• Embed content that's always displayed together
• Reference content that's shared (authors, categories)
• Leverage polymorphic collections for unified content feeds
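The `type` field and the unified content feed mentioned above can be sketched as a small dispatch table. This is an illustration only: the renderer functions and the card format are assumptions, and the documents are trimmed copies of the examples earlier in this section.

```python
def article_card(doc):
    return f"{doc['title']} ({doc['metadata']['readTime']} min read)"


def product_card(doc):
    price = doc["price"]
    return f"{doc['title']} from {price['amount']} {price['currency']}"


def event_card(doc):
    return f"{doc['title']} at {doc['location']['venue']}"


# One renderer per content kind, keyed by the document's "type" field.
RENDERERS = {
    "article": article_card,
    "product": product_card,
    "event": event_card,
}


def render_feed(docs):
    """Render a mixed list of content documents into one feed."""
    cards = []
    for doc in docs:
        renderer = RENDERERS.get(doc["type"])
        # Unknown types fall back to the title, which every kind shares.
        cards.append(renderer(doc) if renderer else doc["title"])
    return cards


feed = [
    {"type": "article", "title": "Introduction to MongoDB", "metadata": {"readTime": 8}},
    {"type": "product", "title": "MongoDB Certification",
     "price": {"amount": 150, "currency": "USD"}},
    {"type": "event", "title": "MongoDB Conference 2024",
     "location": {"venue": "Convention Center"}},
]
cards = render_feed(feed)
for card in cards:
    print(card)
```

Adding a new content kind means adding one renderer. Existing documents and the feed query stay as they are.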
E-commerce product catalogs demonstrate the power of document flexibility for varying product attributes.
Different product categories have entirely different attributes:
// Electronics product
{
"_id": "prod_001",
"sku": "LAPTOP-PRO-15",
"category": ["electronics", "computers", "laptops"],
"name": "ProBook 15 Laptop",
"brand": "TechCorp",
"price": {
"base": 999.99,
"sale": 899.99,
"currency": "USD"
},
"specs": {
"display": { "size": 15.6, "resolution": "1920x1080", "type": "IPS" },
"processor": { "brand": "Intel", "model": "i7-12700H", "cores": 14 },
"memory": { "ram": 16, "storage": 512, "storageType": "SSD" },
"battery": { "capacity": 72, "life": "10 hours" }
},
"variants": [
{ "sku": "LAPTOP-PRO-15-8GB", "memory": { "ram": 8 }, "priceModifier": -100 },
{ "sku": "LAPTOP-PRO-15-32GB", "memory": { "ram": 32 }, "priceModifier": 200 }
],
"inventory": {
"available": 45,
"warehouses": [
{ "location": "NYC", "qty": 20 },
{ "location": "LA", "qty": 25 }
]
}
}
// Clothing product
{
"_id": "prod_002",
"sku": "TSHIRT-CLASSIC",
"category": ["clothing", "tops", "t-shirts"],
"name": "Classic Cotton T-Shirt",
"brand": "BasicWear",
"price": { "base": 24.99, "currency": "USD" },
"specs": {
"material": "100% Cotton",
"weight": "180gsm",
"care": ["Machine wash cold", "Tumble dry low"],
"fit": "Regular"
},
"variants": [
{ "color": "white", "sizes": ["S", "M", "L", "XL"], "images": ["..."] },
{ "color": "black", "sizes": ["S", "M", "L", "XL"], "images": ["..."] },
{ "color": "navy", "sizes": ["M", "L", "XL"], "images": ["..."] }
]
}
// Shopping Cart (embedded items for atomic operations)
{
"_id": "cart_user123",
"userId": "user123",
"items": [
{
"productId": "prod_001",
"variantSku": "LAPTOP-PRO-15-16GB",
"name": "ProBook 15 Laptop",
"price": 899.99,
"quantity": 1,
"addedAt": ISODate("2024-01-15T10:30:00Z")
},
{
"productId": "prod_002",
"variantSku": "TSHIRT-CLASSIC-BLACK-L",
"name": "Classic Cotton T-Shirt (Black, L)",
"price": 24.99,
"quantity": 2,
"addedAt": ISODate("2024-01-15T10:32:00Z")
}
],
"totals": {
"subtotal": 949.97,
"tax": 76.00,
"shipping": 0,
"total": 1025.97
},
"updatedAt": ISODate("2024-01-15T10:32:00Z")
}
// Order (complete snapshot for historical accuracy)
{
"_id": "order_12345",
"orderNumber": "ORD-2024-12345",
"customerId": "user123",
"status": "shipped",
"items": [
{
"productId": "prod_001",
"sku": "LAPTOP-PRO-15-16GB",
"name": "ProBook 15 Laptop",
"priceAtPurchase": 899.99, // Captured at order time
"quantity": 1
}
],
"shipping": {
"address": { "street": "...", "city": "...", "..." },
"method": "express",
"trackingNumber": "1Z999AA10123456784",
"carrier": "UPS"
},
"payment": {
"method": "credit_card",
"last4": "4242",
"transactionId": "ch_1234567890"
},
"timeline": [
{ "status": "placed", "timestamp": ISODate("2024-01-15T11:00:00Z") },
{ "status": "paid", "timestamp": ISODate("2024-01-15T11:00:05Z") },
{ "status": "processing", "timestamp": ISODate("2024-01-15T14:00:00Z") },
{ "status": "shipped", "timestamp": ISODate("2024-01-16T09:00:00Z") }
]
}
• Embed a product snapshot in orders: prices change, and orders need historical accuracy
• Use category arrays: products belong to multiple categories for navigation
• Separate inventory: for high-frequency updates, consider a separate inventory collection
• Denormalize for product pages: embed review count, average rating, and images
• Reference for reports: cross-document queries for analytics (use aggregation)
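The order-snapshot idea can be sketched as a function that copies cart lines into order lines at checkout. The function name is hypothetical; the field names follow the cart and order documents shown earlier, and plain dictionaries stand in for stored documents.

```python
def snapshot_order_items(cart):
    """Copy each cart line into an order line, freezing the price at purchase time."""
    return [
        {
            "productId": item["productId"],
            "sku": item["variantSku"],
            "name": item["name"],
            "priceAtPurchase": item["price"],
            "quantity": item["quantity"],
        }
        for item in cart["items"]
    ]


cart = {
    "_id": "cart_user123",
    "items": [
        {
            "productId": "prod_001",
            "variantSku": "LAPTOP-PRO-15-16GB",
            "name": "ProBook 15 Laptop",
            "price": 899.99,
            "quantity": 1,
        }
    ],
}
order_items = snapshot_order_items(cart)

# A later price change in the cart or catalog does not touch the order.
cart["items"][0]["price"] = 949.99
print(order_items[0]["priceAtPurchase"])  # 899.99
```

Because the order carries its own copy of the name and price, it can be displayed years later without consulting the current catalog.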
Document databases excel at ingesting and querying time-series and event data, especially when combined with appropriate patterns.
The Bucketing Pattern:
Instead of one document per event (massive document count), bucket events:
// One document per sensor per hour
{
"_id": ObjectId("..."),
"sensorId": "sensor_001",
"bucket": ISODate("2024-01-15T10:00:00Z"),
"measurements": [
{ "t": ISODate("2024-01-15T10:00:05Z"), "temp": 22.5, "humidity": 45 },
{ "t": ISODate("2024-01-15T10:00:10Z"), "temp": 22.6, "humidity": 44 },
{ "t": ISODate("2024-01-15T10:00:15Z"), "temp": 22.4, "humidity": 46 },
// ... up to 720 readings per hour (5-second intervals)
],
"summary": {
"count": 720,
"temp": { "min": 21.8, "max": 23.2, "avg": 22.5 },
"humidity": { "min": 42, "max": 48, "avg": 45 }
}
}
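One way an ingestion path might maintain such a bucket is an upsert keyed by sensor and hour. The sketch below only builds the filter and update documents, so it runs without a database. The function name is an assumption; `$push`, `$inc`, `$min`, and `$max` are standard MongoDB update operators.

```python
from datetime import datetime, timezone


def bucket_upsert(sensor_id, t, temp, humidity):
    """Build the (filter, update) pair that appends one reading to its hourly bucket."""
    # Truncate the timestamp to the start of its hour to find the bucket.
    bucket_start = t.replace(minute=0, second=0, microsecond=0)
    filter_doc = {"sensorId": sensor_id, "bucket": bucket_start}
    update_doc = {
        "$push": {"measurements": {"t": t, "temp": temp, "humidity": humidity}},
        "$inc": {"summary.count": 1},
        "$min": {"summary.temp.min": temp, "summary.humidity.min": humidity},
        "$max": {"summary.temp.max": temp, "summary.humidity.max": humidity},
    }
    return filter_doc, update_doc


reading_time = datetime(2024, 1, 15, 10, 0, 5, tzinfo=timezone.utc)
filter_doc, update_doc = bucket_upsert("sensor_001", reading_time, 22.5, 45)
print(filter_doc["bucket"].isoformat())  # 2024-01-15T10:00:00+00:00
```

With a driver such as PyMongo, this pair would typically be passed to `update_one(filter_doc, update_doc, upsert=True)`, so the first reading of each hour creates the bucket. The average is left out because these operators cannot maintain it directly; one option is to keep a running sum and divide by the count at read time.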
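Benefits: far fewer documents and index entries than one document per event, and a precomputed summary that can answer dashboard queries without scanning individual measurements.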
// User activity stream
{
"_id": ObjectId("..."),
"userId": "user_123",
"date": ISODate("2024-01-15"),
"events": [
{
"type": "page_view",
"timestamp": ISODate("2024-01-15T10:30:00Z"),
"data": { "page": "/products/laptop", "duration": 45 }
},
{
"type": "add_to_cart",
"timestamp": ISODate("2024-01-15T10:31:00Z"),
"data": { "productId": "prod_001", "price": 899.99 }
},
{
"type": "checkout_start",
"timestamp": ISODate("2024-01-15T10:32:00Z"),
"data": { "cartValue": 899.99 }
}
],
"sessionCount": 1,
"totalEvents": 3
}
// IoT Device Events
{
"_id": ObjectId("..."),
"deviceId": "thermostat_living_room",
"type": "smart_thermostat",
"location": {
"building": "Office HQ",
"floor": 3,
"room": "Conference A"
},
"state": {
"currentTemp": 72,
"targetTemp": 70,
"mode": "cooling",
"fanSpeed": "auto"
},
"lastReading": ISODate("2024-01-15T10:35:00Z"),
"connectivity": {
"online": true,
"signalStrength": -45,
"lastHeartbeat": ISODate("2024-01-15T10:35:00Z")
},
"alerts": [
{
"type": "maintenance_due",
"triggered": ISODate("2024-01-10"),
"acknowledged": false
}
]
}
• Use MongoDB Time Series collections — Native time-series support since MongoDB 5.0
• Choose bucket granularity — Balance between document size and query patterns
• Index on time + device — { sensorId: 1, bucket: -1 } for efficient queries
• TTL indexes for retention — Automatically expire old data
• Pre-aggregate summaries — Compute min/max/avg at ingestion for fast dashboards
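Choosing bucket granularity is mostly arithmetic, and a rough sketch helps. The 16 MB figure is MongoDB's per-document size limit. The 60 bytes per reading is an assumed placeholder, not a measured value, and the helper names are hypothetical.

```python
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's per-document size limit


def readings_per_bucket(interval_s, bucket_s):
    """Number of readings that land in one bucket at a fixed sampling interval."""
    return bucket_s // interval_s


def fits_in_one_document(interval_s, bucket_s, bytes_per_reading=60):
    """Rough size check; bytes_per_reading is an assumption to replace with a measurement."""
    estimated = readings_per_bucket(interval_s, bucket_s) * bytes_per_reading
    return estimated < MAX_BSON_DOCUMENT_BYTES


HOUR, DAY, MONTH = 3600, 86400, 30 * 86400

print(readings_per_bucket(5, HOUR))    # 720
print(fits_in_one_document(5, DAY))    # True
print(fits_in_one_document(1, MONTH))  # False
```

Fitting under the limit is necessary but not sufficient. Large buckets also make every append rewrite a bigger document, so query patterns usually push toward the smallest bucket that still keeps the document count manageable.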
Mobile and gaming applications often require flexible data models, offline sync, and low-latency access—areas where document databases shine.
{
"_id": "user_12345",
"username": "gamer_pro",
"email": "user@example.com",
"profile": {
"displayName": "Pro Gamer",
"avatar": "https://...",
"bio": "Competitive gamer and streamer"
},
"settings": {
"notifications": {
"push": true,
"email": { "marketing": false, "updates": true }
},
"privacy": { "profilePublic": true, "showOnlineStatus": false },
"gameplay": {
"sensitivity": 0.8,
"keybindings": { "jump": "space", "crouch": "ctrl" }
}
},
"stats": {
"gamesPlayed": 1523,
"wins": 892,
"losses": 631,
"winRate": 0.586,
"hoursPlayed": 2450
},
"achievements": [
{ "id": "first_win", "unlockedAt": ISODate("2023-01-15") },
{ "id": "100_wins", "unlockedAt": ISODate("2023-03-20") },
{ "id": "master_rank", "unlockedAt": ISODate("2023-08-01") }
],
"inventory": [
{ "itemId": "skin_001", "acquiredAt": ISODate("2023-05-10"), "equipped": true },
{ "itemId": "weapon_005", "acquiredAt": ISODate("2023-06-15"), "equipped": false }
],
"subscriptions": {
"premium": {
"active": true,
"tier": "gold",
"expiresAt": ISODate("2024-12-31")
}
},
"lastLogin": ISODate("2024-01-15T10:30:00Z"),
"devices": [
{ "id": "device_001", "type": "mobile", "os": "iOS", "lastSeen": "..." },
{ "id": "device_002", "type": "desktop", "os": "Windows", "lastSeen": "..." }
]
}
// Game save document
{
"_id": "save_user123_game1",
"userId": "user_123",
"gameId": "adventure_quest",
"slot": 1,
"character": {
"name": "Aragorn",
"class": "warrior",
"level": 45,
"experience": 125000,
"health": { "current": 850, "max": 1000 },
"mana": { "current": 200, "max": 300 },
"stats": {
"strength": 85,
"dexterity": 45,
"intelligence": 30,
"vitality": 70
}
},
"inventory": {
"gold": 15420,
"items": [
{ "id": "sword_legendary", "slot": "weapon", "enchantments": ["fire", "speed"] },
{ "id": "potion_health", "quantity": 25 }
],
"capacity": { "used": 45, "max": 100 }
},
"progress": {
"currentQuest": "dragon_slayer",
"completedQuests": ["tutorial", "first_boss", "merchant_escort"],
"discoveredLocations": ["starting_village", "dark_forest", "castle_ruins"],
"unlockedAbilities": ["power_strike", "shield_bash", "battle_cry"]
},
"worldState": {
"dayNightCycle": "night",
"weather": "stormy",
"specialEvents": ["lunar_eclipse"]
},
"playTime": 125400, // seconds
"lastSaved": ISODate("2024-01-15T10:30:00Z"),
"version": "2.1.0" // Game version for compatibility
}
For mobile apps with offline support:
• MongoDB Realm provides device-side sync with automatic conflict resolution
• Version fields enable optimistic concurrency control
• Atomic updates on user documents prevent race conditions
• Field-level sync minimizes data transfer for limited connectivity
• Compound indexes on userId + lastModified for efficient sync queries
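The version-field idea can be illustrated with an in-memory stand-in for a collection. This is a sketch under assumptions: the `rev` field, the `lastDevice` field, and `StaleWriteError` are hypothetical names. `rev` is also unrelated to the game-version `version` field in the save document above.

```python
class StaleWriteError(Exception):
    """Raised when a writer's copy of the document is out of date."""


def save_if_unchanged(store, doc_id, expected_rev, changes):
    """Apply changes only if the stored revision matches what the writer read."""
    current = store[doc_id]
    if current["rev"] != expected_rev:
        raise StaleWriteError(f"{doc_id}: expected rev {expected_rev}, found {current['rev']}")
    updated = {**current, **changes, "rev": expected_rev + 1}
    store[doc_id] = updated
    return updated


store = {"user_12345": {"_id": "user_12345", "rev": 1}}

# Two devices both read the document at rev 1, then try to write.
save_if_unchanged(store, "user_12345", 1, {"lastDevice": "device_001"})
try:
    save_if_unchanged(store, "user_12345", 1, {"lastDevice": "device_002"})
    conflict = False
except StaleWriteError:
    conflict = True

print(conflict, store["user_12345"]["rev"])  # True 2
```

Against a real database, the same check is usually expressed as a single conditional update: filter on both `_id` and the expected revision, increment the revision in the update, and treat zero matched documents as a conflict. The losing writer then re-reads and retries.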
Understanding when document databases are the wrong choice is as important as knowing when they're right.
The Problem:
Students ←→ Courses ←→ Instructors ←→ Departments
↓
Assignments ←→ Grades
Highly interconnected data, queried by traversing relationships ("Find all instructors who taught students who failed assignments in courses from the CS department"), requires expensive cross-collection operations in a document database, typically several `$lookup` stages or joins performed in application code.
Better Choice: Relational database with proper JOINs, or graph database for relationship-heavy queries.
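For comparison, the example query above is a few JOINs in a relational schema. The sketch uses Python's built-in `sqlite3` module with an in-memory database. The table layout, sample rows, and the pass mark of 60 are all assumptions made for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE instructors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE courses (
    id INTEGER PRIMARY KEY,
    title TEXT,
    department_id INTEGER REFERENCES departments(id),
    instructor_id INTEGER REFERENCES instructors(id)
);
CREATE TABLE grades (
    student TEXT,
    course_id INTEGER REFERENCES courses(id),
    assignment TEXT,
    score INTEGER
);

INSERT INTO departments VALUES (1, 'CS'), (2, 'Math');
INSERT INTO instructors VALUES (1, 'Dr. Lee'), (2, 'Dr. Kim'), (3, 'Dr. Roy');
INSERT INTO courses VALUES
    (10, 'Databases', 1, 1), (11, 'Compilers', 1, 2), (20, 'Algebra', 2, 3);
INSERT INTO grades VALUES
    ('ana', 10, 'hw1', 42), ('ben', 11, 'hw1', 88), ('cy', 20, 'hw1', 30);
""")

FAILING_SCORE = 60  # assumed pass mark

# Instructors of CS courses in which at least one assignment was failed.
rows = db.execute(
    """
    SELECT DISTINCT i.name
    FROM instructors i
    JOIN courses c     ON c.instructor_id = i.id
    JOIN departments d ON d.id = c.department_id
    JOIN grades g      ON g.course_id = c.id
    WHERE d.name = 'CS' AND g.score < ?
    ORDER BY i.name
    """,
    (FAILING_SCORE,),
).fetchall()

instructors = [name for (name,) in rows]
print(instructors)  # ['Dr. Lee']
```

Each relationship is stored once and traversed by the query planner, which is the access pattern this anti-pattern describes.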
The Problem:
If a fact (e.g., company name) is embedded in 100,000 documents and changes frequently, every change requires updating 100,000 documents.
// Bad: Company name embedded everywhere
{ "employeeId": 1, "company": { "name": "Acme Corp", "logo": "..." } }
{ "employeeId": 2, "company": { "name": "Acme Corp", "logo": "..." } }
// ... 99,998 more employees
Solution: Use references for frequently-updated shared data, or choose relational for normalized models.
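A minimal sketch of the reference-based alternative, with dictionaries standing in for two collections. The `companyId` field, the `employee_view` helper, and the new company name are hypothetical.

```python
# Shared, frequently updated data is stored once...
companies = {"acme": {"_id": "acme", "name": "Acme Corp", "logo": "..."}}

# ...and each employee document holds only a reference to it.
employees = [{"employeeId": i, "companyId": "acme"} for i in range(1, 100_001)]


def employee_view(employee, companies):
    """Resolve the reference at read time, at the cost of one extra lookup."""
    company = companies[employee["companyId"]]
    return {"employeeId": employee["employeeId"], "companyName": company["name"]}


# Renaming the company is a single write instead of 100,000.
companies["acme"]["name"] = "Acme Holdings"

print(employee_view(employees[0], companies))
# {'employeeId': 1, 'companyName': 'Acme Holdings'}
```

The cost moves from writes to reads: every employee read now needs the company lookup, which is why embedding remains the better fit for data that rarely changes.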
The Problem:
Double-entry bookkeeping, ledger systems, and financial transactions require strict ACID guarantees across multiple records.
While MongoDB has supported multi-document transactions since version 4.0, relational databases have 40+ years of optimization for these use cases.
Better Choice: PostgreSQL, Oracle, or purpose-built financial databases.
The Problem:
-- Complex ad-hoc query
SELECT
region,
product_category,
SUM(revenue) as total_revenue,
COUNT(DISTINCT customer_id) as unique_customers,
AVG(order_value) as avg_order
FROM orders
JOIN customers ON ...
JOIN products ON ...
WHERE order_date BETWEEN ...
GROUP BY ROLLUP(region, product_category)
HAVING SUM(revenue) > 10000
ORDER BY total_revenue DESC;
Complex analytical queries with multiple JOINs, window functions, and grouping sets are SQL's strength.
Better Choice: Data warehouse (Snowflake, BigQuery) or OLAP-optimized database.
| Scenario | Document DB | Relational | Specialized |
|---|---|---|---|
| Flexible schema, varied content | ✅ Excellent | ⚠️ Possible (JSON columns) | — |
| Read-heavy with denormalization | ✅ Excellent | ⚠️ Possible | — |
| Complex many-to-many | ❌ Difficult | ✅ Excellent | Graph DB |
| Strong ACID transactions | ⚠️ Improved | ✅ Excellent | — |
| Ad-hoc analytics | ⚠️ Aggregation | ✅ Good | Data Warehouse |
| Time-series data | ✅ Good (with patterns) | ⚠️ Possible | Time-series DB |
| Full-text search | ⚠️ Basic | ⚠️ Basic | Elasticsearch |
| High-velocity writes | ✅ Excellent | ⚠️ Scaling challenges | — |
| Geographic distribution | ✅ Built-in sharding | ⚠️ Complex | — |
Use this framework to evaluate whether a document database fits your needs:
Ask:
Document-favorable answers: Yes to most of the above.
Ask:
Document-favorable answers: Yes to most of the above.
Ask:
Document-favorable answers: Yes to first two, No to last two.
Ask:
Document-favorable answers: Yes to any of the above (sharding is a core strength).
Modern applications often use multiple databases:
• MongoDB for user-facing application data
• Redis for caching and sessions
• PostgreSQL for financial/transactional data
• Elasticsearch for full-text search
• Snowflake for analytics and reporting
Don't force one database to do everything. Choose the right tool for each job.
We've explored when document databases shine and when to choose alternatives. Let's consolidate the key insights:
Module Complete:
Congratulations! You've completed the Document Databases module. You now understand the document model philosophy, JSON/BSON storage internals, MongoDB's architecture and operations, the complete query language, and when to use (and avoid) document databases.
This knowledge positions you to design, implement, and operate document database solutions for appropriate use cases while making informed technology choices for complex systems.