A database that performs beautifully at 1 million records may crumble at 100 million. A system that handles 1,000 writes per second might fail at 100,000. And a single-region deployment's simple architecture becomes untenable when users span the globe.
Scale is not just 'more of the same.' It introduces qualitative changes in how systems behave. Algorithms that were O(log n) effectively become O(n) when n exceeds memory. Network latencies that were negligible become dominant. Failure modes that were theoretical become daily realities.
Choosing a NoSQL database requires honest assessment of your scale requirements—not just current state, but realistic projections. The database that fits perfectly today may become an expensive migration project in two years if it can't scale with your growth.
By the end of this page, you will understand how to quantify your scale requirements across multiple dimensions (data volume, throughput, geographic distribution), how different NoSQL databases scale along each dimension, and how to anticipate scaling bottlenecks before they become production crises.
Scale isn't one-dimensional. Applications scale along multiple axes, each presenting different challenges and requiring different database capabilities.
Dimension 1: Data Volume
How much data will you store? This determines:
Dimension 2: Read Throughput
How many read operations per second? This determines:
Dimension 3: Write Throughput
How many write operations per second? This determines:
Dimension 4: Geographic Distribution
Where are your users? This determines:
| Dimension | Key-Value | Document | Wide-Column | Graph |
|---|---|---|---|---|
| Data Volume | Excellent (petabytes) | Good (terabytes) | Excellent (petabytes) | Moderate (limited by single-machine memory for traversals) |
| Read Throughput | Excellent (millions/sec) | Good (hundreds of thousands/sec) | Good (hundreds of thousands/sec) | Good (varies by query complexity) |
| Write Throughput | Excellent (millions/sec) | Good (hundreds of thousands/sec) | Excellent (millions/sec) | Moderate (relationship creation overhead) |
| Geographic Distribution | Excellent (simple partitioning) | Good (replica sets + sharding) | Excellent (multi-DC native) | Challenging (graph locality requirements) |
Most applications never need 'web-scale' databases. If your realistic 5-year projection is 10 million records with 1,000 requests/second, nearly any modern database will suffice. Don't over-engineer for hypothetical scale—but do choose a database with a clear scaling path if growth is genuinely anticipated.
Data volume is the most intuitive scale dimension: how much data will you store, and what happens as it grows?
Volume Thresholds and Their Implications:
| Volume | Infrastructure | Typical Challenges | Database Strategies |
|---|---|---|---|
| < 10 GB | Single node sufficient | Rarely limited by data volume | Any database works |
| 10-100 GB | Single node, SSD recommended | Index size, backup time | Optimize indexes, consider partitioning |
| 100 GB - 1 TB | High-memory nodes or sharding | Full scans become impractical, memory pressure | Sharding often becomes necessary |
| 1-10 TB | Distributed cluster | Backup/restore complexity, rebalancing | Native distributed databases essential |
| 10 TB - 1 PB | Large distributed clusters | Cost optimization, tiered storage | Purpose-built for scale (Cassandra, BigTable) |
| > 1 PB | Custom infrastructure often needed | Everything is hard at this scale | Specialized solutions, often custom |
Database Volume Capabilities:
Key-Value (Redis, DynamoDB)
Document (MongoDB)
Wide-Column (Cassandra, HBase)
Graph (Neo4j)
Calculating Your Volume Requirements:
// Estimation framework
const estimateStorage = {
  // Per-record size (average, including indexes)
  recordSizeBytes: 2 * 1024, // 2 KB average document

  // Growth projection
  initialRecords: 1_000_000,
  dailyNewRecords: 50_000,
  retentionDays: 365 * 3, // 3 years

  // Calculate
  totalRecords: function() {
    return this.initialRecords + (this.dailyNewRecords * this.retentionDays);
  },
  totalStorageGB: function() {
    return (this.totalRecords() * this.recordSizeBytes) / (1024 ** 3);
  },

  // With replication (3x for typical distributed setup)
  replicatedStorageGB: function() {
    return this.totalStorageGB() * 3;
  }
};

console.log(`3-year projection: ${estimateStorage.replicatedStorageGB().toFixed(0)} GB`);
// Output: 3-year projection: 319 GB (with 3x replication)
Storage isn't just data—it's data + indexes + replication + operational overhead (compaction, etc.). A 100 GB dataset with 5 indexes might consume 300 GB of storage. Factor in 2-3x overhead for realistic capacity planning. Additionally, many databases need indexes (and often the working set) to fit in RAM for good query performance, so memory requirements can exceed the raw data size.
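As a rough extension of the estimate above, you can apply an overhead multiplier up front. This is a sketch; the 2.5x factor is an assumption to replace with your own measured index and compaction overhead:
// Rough sketch: extend the volume estimate with index and operational overhead.
// The 2.5x overhead factor is an illustrative assumption, not a measured value.
const rawDataGB = 106;        // logical data size from the estimate above
const overheadFactor = 2.5;   // indexes + compaction/WAL overhead (assumed)
const replicationFactor = 3;  // typical distributed replication

const provisionedGB = rawDataGB * overheadFactor * replicationFactor;
console.log(`Provision roughly ${provisionedGB.toFixed(0)} GB of cluster storage`);
// Provision roughly 795 GB of cluster storage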
Throughput—the number of operations your database can handle per second—is often the binding constraint for growing applications. Unlike data volume (which grows predictably), throughput can spike unexpectedly with traffic patterns.
Understanding Throughput Metrics:
Throughput = Operations / Time
Read Throughput: Queries per second (QPS)
Write Throughput: Writes per second (WPS)
Key insight: Read and write throughput scale differently.
Most databases scale reads easily (replicas).
Writes are constrained by consistency/durability.
Throughput Scaling Strategies:
Read Scaling (Relatively Easy)
┌─────────────────┐
│ Application │
└────────┬────────┘
│
┌────────▼────────┐
│ Load Balancer │
└────────┬────────┘
┌──────────────┼──────────────┐
│ │ │
┌─────▼─────┐ ┌─────▼─────┐ ┌─────▼─────┐
│ Replica 1 │ │ Replica 2 │ │ Replica 3 │
│ (reads) │ │ (reads) │ │ (reads) │
└───────────┘ └───────────┘ └───────────┘
10,000 QPS requirement ÷ 3 replicas ≈ 3,333 QPS per replica
Most databases support read replicas. Adding replicas linearly increases read throughput (with eventual consistency caveats).
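A quick way to turn this into a replica count: divide the target read QPS by what one node sustains at your latency target, with some headroom. The per-replica figure below is a placeholder for your own load-test result, not a vendor number:
// Back-of-the-envelope replica sizing. perReplicaQPS is whatever one node
// sustains at your latency target in a load test (assumed value here).
function replicasNeeded(targetReadQPS, perReplicaQPS, headroom = 0.7) {
  // Run each replica at ~70% of measured capacity to absorb spikes.
  return Math.ceil(targetReadQPS / (perReplicaQPS * headroom));
}

console.log(replicasNeeded(10_000, 5_000)); // 3 replicas for 10,000 QPS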
Write Scaling (Harder)
Writes must propagate to all replicas, so adding replicas does not increase write capacity. Scaling writes generally means sharding: partitioning the data so each node accepts writes only for its own subset of keys (see the sizing sketch after the table below).
| Database | Read QPS (Single Node) | Write TPS (Single Node) | Horizontal Scaling |
|---|---|---|---|
| Redis | 100,000+ | 100,000+ | Redis Cluster: millions/sec |
| DynamoDB | N/A (managed) | N/A (managed) | Unlimited with on-demand capacity |
| MongoDB | 10,000-50,000 | 5,000-20,000 | Linear with sharding |
| Cassandra | 10,000-30,000 | 10,000-50,000 | Linear with nodes |
| Neo4j | 10,000-100,000 (varies) | 1,000-10,000 | Limited (read replicas only) |
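As referenced above, write scaling is usually a sharding calculation. A minimal sketch, using the kind of per-node figures in the table (assumed, load-tested numbers rather than guarantees):
// Sketch: how many write-accepting shards a target write rate implies.
// perNodeWriteTPS is an assumed, load-tested figure (see the table above).
function shardsNeeded(targetWriteTPS, perNodeWriteTPS, headroom = 0.6) {
  return Math.ceil(targetWriteTPS / (perNodeWriteTPS * headroom));
}

console.log(shardsNeeded(50_000, 15_000)); // 6 shards for 50,000 writes/sec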
Calculating Throughput Requirements:
// Throughput estimation
const throughputCalc = {
  // Peak active users
  peakConcurrentUsers: 100_000,

  // Actions per user per minute
  readsPerUserPerMinute: 30,
  writesPerUserPerMinute: 5,

  // Calculate per second (with 2x safety margin)
  peakReadQPS: function() {
    return (this.peakConcurrentUsers * this.readsPerUserPerMinute / 60) * 2;
  },
  peakWriteTPS: function() {
    return (this.peakConcurrentUsers * this.writesPerUserPerMinute / 60) * 2;
  }
};

console.log(`Peak reads: ${Math.round(throughputCalc.peakReadQPS()).toLocaleString()} QPS`);
console.log(`Peak writes: ${Math.round(throughputCalc.peakWriteTPS()).toLocaleString()} TPS`);
// Output: Peak reads: 100,000 QPS
// Output: Peak writes: 16,667 TPS
High throughput often comes at the cost of latency. Under load, queue depths increase and response times rise. A database might handle 50,000 QPS at p99 latency of 5ms, but only 20,000 QPS at p99 latency of 2ms. Understand both your throughput AND latency requirements—they constrain each other.
For applications serving global users, geographic distribution becomes a critical scaling dimension. Physics imposes hard limits: light travels about 200 km per millisecond through fiber, and real routes are longer than straight lines. A round-trip from New York to Tokyo adds roughly 200ms of latency in practice.
The Geographic Challenge:
┌─────────────────────────────────────────┐
│ Atlantic Ocean: 80ms │
└─────────────────────────────────────────┘
🗽 New York 🗼 London
┌──────────┐ ┌──────────┐
│ User A │ ────── If single datacenter ──────── │ Database │
└──────────┘ in London: 160ms RTT └──────────┘
🏯 Tokyo
┌──────────┐
│ User B │ ────── To London: 280ms RTT ────────
└──────────┘ Unacceptable for interactive apps
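The fiber figure above gives a quick lower bound on round-trip time; real routes and routing equipment add more on top. A rough sketch, with approximate great-circle distances as the inputs:
// Lower-bound RTT from fiber distance (~200 km per millisecond, one way).
// Real paths are longer than great-circle distance and add routing delay.
function minRttMs(distanceKm) {
  return (2 * distanceKm) / 200; // out and back at ~200 km/ms
}

console.log(minRttMs(5_570));  // New York -> London: ~56 ms theoretical floor
console.log(minRttMs(10_850)); // New York -> Tokyo: ~109 ms theoretical floor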
Geographic Distribution Strategies:
1. Read Replicas per Region
Most common approach: writes go to primary region, reads served from local replicas.
US-EAST (Primary) EU-WEST (Replica) AP-NORTHEAST (Replica)
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Primary │ ──replicate─▶│ Replica │ ◀─replicate─│ Replica │
│ (R+W) │ │ (R only) │ │ (R only) │
└────────────┘ └────────────┘ └────────────┘
▲ ▲ ▲
│ │ │
US Users EU Users Asia Users
(fast R+W) (fast R, slow W) (fast R, slow W)
Trade-off: Writes from non-primary regions incur cross-region latency.
2. Multi-Primary (Active-Active)
All regions accept writes, with async replication between them.
US-EAST (Primary) EU-WEST (Primary) AP-NORTHEAST (Primary)
┌────────────┐ ┌────────────┐ ┌────────────┐
│ Primary 1 │◀───sync───▶│ Primary 2 │◀───sync───▶│ Primary 3 │
│ (R+W) │ │ (R+W) │ │ (R+W) │
└────────────┘ └────────────┘ └────────────┘
All users get fast reads AND writes.
Trade-off: Conflict resolution required for concurrent writes.
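A common (if lossy) resolution strategy is last-write-wins, which DynamoDB Global Tables uses by default (see the table below). A minimal sketch of the idea:
// Minimal last-write-wins (LWW) merge: keep the version with the later
// timestamp and discard the other. Simple, but the losing write is silently lost.
function lwwMerge(versionA, versionB) {
  return versionA.updatedAt >= versionB.updatedAt ? versionA : versionB;
}

const fromUS = { value: 'shipped',   updatedAt: 1718000000500 };
const fromEU = { value: 'cancelled', updatedAt: 1718000000900 };
console.log(lwwMerge(fromUS, fromEU).value); // 'cancelled' wins; 'shipped' is dropped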
3. Data Locality (Shard by Region)
User data lives in their home region. Cross-region queries rare.
User in EU → Data stored in EU → Read/write in EU → No cross-region traffic
Trade-off: Users traveling see stale data or high latency.
Cross-regional features (EU user views US user profile) are slow.
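A sketch of how shard-by-region routing might look in application code; the homeRegion field and endpoint names are hypothetical:
// Sketch of shard-by-region routing. The homeRegion field and endpoint map
// are hypothetical; the point is that a user's requests are served entirely
// by their home region, and cross-region lookups are the explicit exception.
const regionEndpoints = {
  'us-east': 'db.us-east.internal',
  'eu-west': 'db.eu-west.internal',
  'ap-northeast': 'db.ap-northeast.internal',
};

function endpointForUser(user) {
  return regionEndpoints[user.homeRegion] ?? regionEndpoints['us-east']; // fallback region
}

console.log(endpointForUser({ id: 42, homeRegion: 'eu-west' })); // db.eu-west.internal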
| Database | Multi-Region Support | Global Strong Consistency | Typical Approach |
|---|---|---|---|
| Google Spanner | Native, first-class | Yes (TrueTime) | Globally distributed with synchronous replication |
| CockroachDB | Native multi-region | Yes (Raft) | Follow-the-workload, geo-partitioning |
| DynamoDB | Global Tables | Eventually consistent | Active-active with LWW conflict resolution |
| Cassandra | Multi-datacenter native | Tunable (LOCAL_QUORUM) | Multi-primary with configurable consistency |
| MongoDB | Zone sharding | With zone-aware reads | Primary region + read replicas |
True global strong consistency (Spanner-style) requires cross-region coordination for every write—adding 100-300ms to write latency. Most applications use region-local strong consistency with cross-region eventual consistency, accepting that a user in Tokyo may see a 2-second-old version of data written in London.
Before selecting a database for scale, understand the fundamental difference between vertical and horizontal scaling—and how different databases support each.
Vertical Scaling (Scale Up)
Add resources to a single node: more CPU, more RAM, more disk.
┌───────────────────┐ ┌─────────────────────────────┐
│ Small Server │ → │ Large Server │
│ 4 CPU, 16GB │ │ 64 CPU, 512GB │
│ 500 GB SSD │ │ 10 TB NVMe RAID │
└───────────────────┘ └─────────────────────────────┘
Advantages: Disadvantages:
- Simple - Hard limits (biggest available machine)
- No distributed logic - Single point of failure
- All data local - Expensive ($/GB increases)
- Strong consistency - Downtime for upgrades
Horizontal Scaling (Scale Out)
Add more nodes to a cluster, distributing data and load.
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ 1 Node │ → │ Node 1 │ │ Node 2 │ │ Node 3 │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
Advantages:                        Disadvantages:
- Theoretically unlimited capacity - Distributed system complexity
- Commodity hardware               - Network partitions possible
- No single point of failure       - Cross-node queries expensive
  (if replicated)                  - Consistency challenges
- Rolling upgrades                 - Operational complexity
The Scaling Decision Matrix:
| Your Situation | Recommended Approach |
|---|---|
| < 1 TB data, < 10K QPS | Vertical scaling (simpler) |
| > 1 TB data OR > 50K QPS | Horizontal scaling required |
| Unpredictable, bursty traffic | Managed services with auto-scaling |
| Steady, predictable growth | Self-managed with capacity planning |
| Multi-region latency requirements | Horizontally distributed from start |
| Strong consistency required | Vertical OR carefully designed horizontal |
Vertical scaling is simpler and often sufficient for longer than expected. A well-optimized single PostgreSQL instance handles remarkable workloads. Start with vertical scaling, but choose a database with a clear horizontal scaling path. Don't prematurely add distributed system complexity.
Beyond choosing a database, how you design your data model and access patterns profoundly impacts scaling behavior. Some patterns scale gracefully; others create bottlenecks.
Anti-Pattern 1: Hot Partitions
All traffic concentrates on one partition while others sit idle.
// Bad: Sequential IDs create time-based hot partitions
Partition Key: order_id (auto-increment)
99% of reads are for recent orders → all on latest partition
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ Partition │ │ Partition │ │ Partition │ │ Partition │
│ 1-1M │ │ 1M-2M │ │ 2M-3M │ │ 3M-4M │
│ 0% │ │ 0% │ │ 1% │ │ 99% │
└───────────┘ └───────────┘ └───────────┘ └───────────┘
Hot partition!
// Better: Hash-based partitioning
Partition Key: SHA256(order_id)
Even distribution across partitions
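A small sketch of the hashed approach, using Node's built-in crypto module; the partition count is arbitrary for illustration:
// Sketch: hash order IDs to partitions so consecutive (recent) IDs spread
// across all partitions instead of piling onto the newest one.
const crypto = require('crypto');

function partitionFor(orderId, partitionCount) {
  const digest = crypto.createHash('sha256').update(String(orderId)).digest();
  return digest.readUInt32BE(0) % partitionCount; // first 4 bytes -> partition index
}

// Recent, sequential order IDs land on different partitions:
[1000001, 1000002, 1000003, 1000004].forEach((id) =>
  console.log(`order ${id} -> partition ${partitionFor(id, 4)}`)
);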
Anti-Pattern 2: Large Partitions
Unbounded partition growth creates operational problems.
-- Bad: All events for a user in one partition
PRIMARY KEY (user_id)
A power user with 10 million events → 10 GB partition
Slow queries, backup issues, rebalancing problems
-- Better: Bucket by time
PRIMARY KEY ((user_id, year_month), event_time)
Partitions bounded to ~1 month of data each
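A sketch of constructing the bucketed key in application code; monthly buckets are an assumed granularity, to be tuned against your database's partition-size guidance:
// Sketch: compose a (user_id, year_month) partition key so no single user's
// partition grows without bound. Monthly buckets are an assumed granularity.
function partitionKeyFor(userId, eventTime) {
  const d = new Date(eventTime);
  const yearMonth = `${d.getUTCFullYear()}-${String(d.getUTCMonth() + 1).padStart(2, '0')}`;
  return { userId, yearMonth };
}

console.log(partitionKeyFor('user-123', '2024-01-15T10:30:00Z'));
// { userId: 'user-123', yearMonth: '2024-01' }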
Anti-Pattern 3: Scatter-Gather Queries
Queries that touch all partitions don't scale.
-- Bad: Unfiltered aggregations
SELECT AVG(amount) FROM orders; -- Touches every partition
-- Better: Pre-computed rollups or time-bounded queries
SELECT avg_amount FROM daily_order_stats WHERE date = '2024-01-15';
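One way to get there is to maintain the rollup at write time. A minimal in-memory sketch of the idea (a real system would use the database's own counters or an aggregation job):
// Minimal sketch of a write-time rollup: each order updates a per-day summary,
// so the daily average becomes a single-partition point read instead of a scan.
const dailyStats = new Map(); // date -> { count, total }

function recordOrder(dateKey, amount) {
  const stats = dailyStats.get(dateKey) ?? { count: 0, total: 0 };
  stats.count += 1;
  stats.total += amount;
  dailyStats.set(dateKey, stats);
}

recordOrder('2024-01-15', 40);
recordOrder('2024-01-15', 60);
const day = dailyStats.get('2024-01-15');
console.log(`avg for 2024-01-15: ${(day.total / day.count).toFixed(2)}`); // 50.00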
Hot partitions and scatter-gather queries often aren't apparent at small scale. They emerge suddenly when traffic exceeds a threshold: 'Yesterday it worked fine, today the database is on fire.' Invest in load testing with realistic data distributions before production. Synthetic uniform data hides partition-related problems.
How do you translate business projections into database scaling requirements? Use this systematic framework.
Step 1: Define Your Time Horizon
Typically plan for 2-3 years. Beyond that, re-evaluation is prudent.
Step 2: Project Growth Across Dimensions
| Metric | Current | 1 Year | 2 Years | 3 Years |
|---|---|---|---|---|
| Active users | 10K | 50K | 200K | 500K |
| Data volume (GB) | 20 | 100 | 400 | 1,000 |
| Peak read QPS | 500 | 2,500 | 10,000 | 25,000 |
| Peak write TPS | 100 | 500 | 2,000 | 5,000 |
| Regions needed | 1 | 1 | 2 | 3 |
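A simple helper for producing projections like these is to compound an assumed annual growth rate; the rate below is an illustrative input, not a prediction:
// Sketch: compound an assumed annual growth rate out to the planning horizon.
// The growth rate is an illustrative input, not a prediction.
function projectGrowth(currentValue, annualGrowthRate, years) {
  return Array.from({ length: years + 1 }, (_, year) =>
    Math.round(currentValue * (1 + annualGrowthRate) ** year)
  );
}

console.log(projectGrowth(500, 2.0, 3)); // peak read QPS growing 200% per year
// [ 500, 1500, 4500, 13500 ]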
Step 3: Identify Scaling Ceiling for Current Choice
For each candidate database, determine where you'll hit limits:
MongoDB single replica set:
✓ 20 GB → easily handled
✓ 100 GB → comfortable
✓ 400 GB → approaching practical limits
✗ 1 TB → sharding required
MongoDB sharded cluster:
✓ 1 TB → managed with 3-shard cluster
✓ 10 TB → managed with larger cluster
Verdict: Need to implement sharding by Year 2.
Choose MongoDB with sharding-ready key design from Day 1.
Step 4: Factor in Operational Requirements
Step 5: Choose Based on Ceiling, Not Floor
Don't choose based on what works today. Choose based on what works at 10x scale. The database that barely handles your 3-year projection is wrong—unexpected growth happens.
For scale-uncertain applications, managed services (DynamoDB, MongoDB Atlas, Cosmos DB) provide elastic scaling without operational burden. You pay more per-unit, but you avoid the operational cost of managing infrastructure at scale. For startups with unpredictable growth, managed services often make the most economic sense.
Scale requirements are the fourth essential lens for database selection. A database might fit your data model, support your query patterns, and offer acceptable consistency—but if it can't scale to your projected needs, it's wrong for your use case.
The core insight: scale is multi-dimensional. Consider data volume, read throughput, write throughput, and geographic distribution separately. Different databases have different ceilings along each dimension.
What's next:
With all four lenses understood—data model fit, query patterns, consistency requirements, and scale requirements—the final page presents a comprehensive decision framework that synthesizes these considerations into a systematic database selection process. This framework transforms what seems like an overwhelming choice into a structured evaluation with clear criteria.
You now understand how to evaluate scale requirements as a critical database selection factor. The next step is projecting your application's growth across all scale dimensions and validating that candidate databases have adequate headroom. Choose based on where you'll be, not where you are.