Every database, no matter how powerful, eventually hits a wall. This wall isn't a bug to fix or an optimization to discover—it's a fundamental limitation of running on a single machine. The most expensive server money can buy has finite CPU cores, finite memory, finite disk I/O, and finite network bandwidth. When your data grows beyond what one machine can handle, you face a critical architectural decision.
Sharding—also called horizontal partitioning—is the strategy of splitting a database across multiple machines, where each machine holds a subset of the data. Unlike vertical scaling (buying bigger hardware) or replication (copying data for redundancy), sharding actually divides your dataset, allowing you to scale storage and throughput almost linearly by adding more nodes.
By the end of this page, you will understand why sharding becomes necessary, the specific limitations it addresses, and the fundamental tradeoffs you accept when moving from a single database to a sharded architecture. This understanding is essential before diving into specific sharding strategies.
To understand why sharding is necessary, let's trace the journey of a growing application—a story that plays out in thousands of companies every year.
Phase 1: The Single Database (0 to 1 million users)
Your startup launches with a single PostgreSQL or MySQL instance. Everything works beautifully. Queries are fast, joins are simple, transactions are straightforward. You focus on building features, not infrastructure. This is the golden era.
Phase 2: Read Scaling with Replicas (1 to 10 million users)
As traffic grows, read queries start competing with write queries. You add read replicas. Now reads fan out across multiple replicas while writes go to a single primary. This buys time, but only for read-heavy workloads.
Phase 3: Vertical Scaling (10 to 50 million users)
Write traffic increases. You upgrade to increasingly expensive hardware—more RAM, faster SSDs, more CPU cores. Each upgrade is costly and provides diminishing returns. You're now paying $50,000/month for a database server.
Phase 4: The Breaking Point (50+ million users)
You hit hard limits that no amount of money can solve: storage beyond what any single machine can attach, write throughput beyond what one primary can absorb, and working sets too large to fit in any amount of RAM.
Many teams fall into the trap of continuously buying bigger servers, hoping the next upgrade will 'hold for another year.' This is expensive and ultimately futile. The largest available servers have known limits, and cloud providers charge premium prices at the top tier. Worse, you're building on a foundation that cannot support your future growth.
Let's put concrete numbers to these limits. Understanding the math helps you predict when sharding becomes necessary for your system.
Storage Growth Analysis
Consider a system that stores user activity data. Each user generates 10 events per day, each averaging 1KB. With 10 million daily active users, that's 100GB of new raw data every day, or roughly 36TB per year.
With retention requirements and indexes (which often double or triple your data size), you're looking at 100TB+ within two years. No single database instance handles this gracefully.
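The storage arithmetic above can be sketched in a few lines. This is a back-of-envelope sketch using the figures from the text (10M DAU, 10 events/day, 1KB/event, 2-3x index overhead); the constants are illustrative, not measurements.

```python
# Storage growth sketch. Assumptions (from the text): 10M DAU, 10 events
# per user per day, 1 KB per event, indexes/overhead multiply raw data
# by 2-3x, and a two-year retention window.
DAU = 10_000_000
EVENTS_PER_DAY = 10
EVENT_SIZE_KB = 1

raw_gb_per_day = DAU * EVENTS_PER_DAY * EVENT_SIZE_KB / 1_000_000  # KB -> GB
raw_tb_per_year = raw_gb_per_day * 365 / 1_000                     # GB -> TB

print(f"raw data: {raw_gb_per_day:,.0f} GB/day, {raw_tb_per_year:.1f} TB/year")
for overhead in (2, 3):
    total_tb = raw_tb_per_year * 2 * overhead  # two years, with index overhead
    print(f"{overhead}x overhead: {total_tb:,.0f} TB after two years")
```

Even the optimistic 2x-overhead case lands well past 100TB, which is why the two-year horizon matters more than the day-one number.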
Write Throughput Analysis
A well-tuned PostgreSQL instance on high-end hardware might sustain 30,000-50,000 write transactions per second. Sounds like a lot? Let's check: at 10 million DAU generating 10 events each, that's 100 million writes per day, an average of only about 1,200 writes per second.
This seems safe, but consider growth. At 100M DAU, the average climbs to roughly 11,500 writes per second, and daily traffic patterns push peaks several times higher.
You're now at the edge. Factor in spikes (viral content, Black Friday) that can 10x traffic, and you've exceeded capacity.
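This throughput check is easy to reproduce. A minimal sketch, assuming 10 events per user per day and a peak factor of roughly 3x the daily average (the peak factor is an assumption; real diurnal patterns vary):

```python
# Back-of-envelope write throughput check. Assumptions: 10 events per
# user per day (from the text) and peak traffic ~3x the daily average
# (a common rule of thumb, not a measured value).
SECONDS_PER_DAY = 86_400
EVENTS_PER_USER_PER_DAY = 10

def write_tps(daily_active_users: int, peak_factor: float = 3.0):
    """Return (average TPS, estimated peak TPS) for a given DAU."""
    avg = daily_active_users * EVENTS_PER_USER_PER_DAY / SECONDS_PER_DAY
    return avg, avg * peak_factor

for dau in (10_000_000, 100_000_000):
    avg, peak = write_tps(dau)
    print(f"{dau:>11,} DAU: avg {avg:>8,.0f} TPS, peak {peak:>8,.0f} TPS")
```

At 100M DAU the estimated peak sits right inside the 30,000-50,000 TPS band a single well-tuned instance can sustain, before accounting for any viral or seasonal spike.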
| Resource | Practical Limit | Hard Limit | Consequence of Exceeding |
|---|---|---|---|
| Storage | 20-50TB | ~100TB | Cannot store more data; system halts |
| Write TPS | 10K-50K | ~100K | Writes queue indefinitely; timeouts cascade |
| Working Set (RAM) | 256GB-1TB | ~4TB | Every query hits disk; latency explodes |
| Connections | 1,000-5,000 | ~10,000 | New connections refused; cascading failures |
| Index Size | 10-50GB/table | RAM limit | Index scans become table scans; queries time out |
The Latency Tax
As you approach these limits, performance degrades non-linearly. A database at 50% capacity might have p99 latency of 50ms. At 80%, it might be 200ms. At 95%, you're looking at multi-second latencies and timeout storms.
This is why experienced architects plan for sharding before hitting limits—the transition is much smoother when you're not fighting fires.
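The non-linear degradation has a textbook explanation: in a simple M/M/1 queueing model, mean response time grows as 1/(1 - utilization) times the base service time. A toy illustration of the shape of the curve (the text's p99 figures will differ, since tail latencies degrade even faster than the mean):

```python
# M/M/1 queueing intuition for why latency explodes near capacity:
# mean response time scales as 1 / (1 - utilization) times service time.
def latency_multiplier(utilization: float) -> float:
    """Mean response-time multiplier for an M/M/1 queue at given utilization."""
    return 1.0 / (1.0 - utilization)

for u in (0.50, 0.80, 0.95):
    print(f"{u:.0%} utilized -> {latency_multiplier(u):.0f}x base latency")
```

Going from 50% to 95% utilization is not a 2x slowdown but a 10x one, which is why "we still have headroom" at 80% is usually an illusion.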
Sharding addresses the fundamental bottlenecks we've discussed by distributing data across multiple independent database instances. Each shard is a complete database that handles a subset of your data. Let's examine how sharding solves each limit:
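The core mechanism can be sketched in a few lines: a deterministic function maps each record's key to one shard. This is a minimal hash-based sketch (`shard_for` and `NUM_SHARDS` are hypothetical names, not any specific library's API; production systems layer on consistent hashing, directories, or virtual shards):

```python
import hashlib

NUM_SHARDS = 8  # each shard is an independent database instance

def shard_for(user_id: int) -> int:
    """Deterministically map a user_id to one of NUM_SHARDS shards."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Every read and write for a given user routes to exactly one shard,
# so each shard only sees ~1/NUM_SHARDS of the data and write traffic.
print(f"user 42 lives on shard {shard_for(42)}")
```

Because the mapping is deterministic, the application (or a routing tier) can compute the target shard without any lookup, and adding shards divides both storage and write load.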
Sharding is not a silver bullet. It introduces significant complexity and constraints that must be understood before adoption. Many teams have sharded prematurely and regretted it. Others have sharded too late and suffered outages. Understanding tradeoffs helps you make the right decision.
Fundamental Tradeoffs:
| Aspect | Single Database | Sharded Database |
|---|---|---|
| Cross-entity Queries | Join any tables freely | Cross-shard joins are expensive or impossible |
| Transactions | ACID across all data | ACID within shard; distributed transactions complex |
| Schema Changes | Single migration | Coordinate across all shards |
| Operational Complexity | One database to manage | N databases to manage |
| Application Complexity | Query any data | Determine correct shard for every query |
| Cost Model | One expensive server | Many cheaper servers (often lower total cost) |
| Failure Domain | Total outage if down | Partial outage (only affected shard) |
The Cross-Shard Query Problem
This is often the most painful tradeoff. In a single database, you can join users with orders with products with inventory in a single query. With sharding, if users and orders are on different shards (or different rows are on different shards), you need to query each shard involved, pull intermediate results over the network, and perform the join or merge in application code.
This is why shard key selection (covered later in this module) is so critical. A good shard key minimizes cross-shard queries. A bad shard key turns every query into an expensive scatter-gather operation.
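A scatter-gather query looks roughly like this sketch, using in-memory lists to stand in for real shards (all names are hypothetical; in production each `query_shard` call is a network round-trip to a separate database):

```python
from concurrent.futures import ThreadPoolExecutor

# Fake in-memory "shards": order rows partitioned by user_id.
shards = [
    [{"user_id": 3, "total": 120}, {"user_id": 6, "total": 80}],
    [{"user_id": 1, "total": 50}],
    [{"user_id": 2, "total": 200}, {"user_id": 5, "total": 30}],
]

def query_shard(rows, min_total):
    # In a real system this is a network round-trip to one database.
    return [r for r in rows if r["total"] >= min_total]

def scatter_gather(min_total):
    # A query that lacks the shard key must fan out to EVERY shard;
    # overall latency is that of the slowest shard, and the application
    # must merge (and here, sort) the partial results itself.
    with ThreadPoolExecutor() as pool:
        partials = pool.map(query_shard, shards, [min_total] * len(shards))
    return sorted((r for part in partials for r in part),
                  key=lambda r: r["total"], reverse=True)

print(scatter_gather(80))  # rows with total >= 80, merged across all shards
```

Contrast this with the single-shard case: a query keyed by the shard key touches one node, while this one touches all of them, which is exactly what a good shard key is chosen to avoid.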
A well-designed sharding strategy routes 80% of queries to a single shard. The remaining 20% may require cross-shard coordination, but these are typically background jobs, analytics, or rare user flows. If your access patterns require constant cross-shard queries, you've chosen the wrong shard key—or sharding may not be the right solution for your use case.
Sharding is a major architectural decision. Implementing it too early adds unnecessary complexity. Implementing it too late leads to painful migrations under pressure. Here's a framework for making this decision:
I've seen teams shard at 1TB 'for future scale' and spend years managing complexity they didn't need. I've also seen teams refuse to shard at 50TB and suffer constant outages. The key is honest assessment of current limits and growth trajectory. Plan for sharding when you're 12-18 months from hitting limits—enough time to implement well, but not so early that you're solving imaginary problems.
Every major internet company uses sharding. Understanding how they approach it illuminates the patterns and challenges you'll face.
Facebook/Meta: User-Based Sharding
Facebook shards primarily by user_id. Each user's data—posts, photos, messages, friendships—lives together on a shard. This makes the common case (show me my news feed) hit a single shard. Cross-shard queries are needed for interactions between users on different shards, but these are handled asynchronously.
Stripe: Customer-Based Sharding
Stripe shards by merchant (customer_id). All of a merchant's transactions, subscriptions, and payment methods live on the same shard. This ensures transactional integrity for the operations that matter most—processing payments for a single merchant.
Instagram: User and Media Sharding
Instagram uses a combination approach. User data is sharded by user_id, but media (photos/videos) is sharded separately using a different strategy optimized for large binary storage.
Uber: Geographic Sharding
Uber shards by city/region. A ride in San Francisco only needs data from the SF shard. This also provides data locality—EU data in EU shards, addressing regulatory requirements.
| Company | Primary Shard Key | Rationale | Challenges Addressed |
|---|---|---|---|
| Facebook/Meta | user_id | User operations access their own data | News feed, timeline, notifications |
| Stripe | customer_id | Payment operations are per-merchant | Transaction integrity, reporting |
| Slack | workspace_id | All channel data by workspace | Message history, search within workspace |
| Discord | guild_id | Server-centric access patterns | Messages, roles, member data |
| Shopify | shop_id | Merchant operations isolated | Product catalog, orders, customers |
Notice how many successful sharding implementations use a tenant identifier (customer_id, workspace_id, shop_id). Multi-tenant SaaS naturally aligns with sharding—each tenant's data is isolated, very few operations cross tenant boundaries, and the shard key is obvious. If you're building multi-tenant SaaS, sharding by tenant_id is often the right answer.
Here's a practical framework for deciding when and how to approach sharding in your organization:
If you're at 30% of single-node capacity and growing 10%+ monthly, start planning sharding now. You won't implement immediately, but you'll design your schema, identify shard keys, and instrument for the transition. When you hit 70% capacity, you'll execute a well-planned migration instead of a panicked one.
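The timing in that rule of thumb follows from compound growth. A small sketch (the 30%/70%/10% figures come from the text; the growth model assumes steady month-over-month compounding):

```python
import math

def months_until(current: float, target: float, monthly_growth: float) -> float:
    """Months until a system at `current` fraction of capacity, compounding
    at `monthly_growth` per month, reaches the `target` fraction."""
    return math.log(target / current) / math.log(1 + monthly_growth)

# At 30% of capacity and 10% monthly growth, 70% is under a year away:
print(f"{months_until(0.30, 0.70, 0.10):.1f} months")  # ≈ 8.9 months
```

Under nine months from "comfortable" to the 70% execution threshold is why the planning has to start at 30%: design, shard-key selection, and migration tooling easily consume that window.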
We've covered the fundamental case for sharding: single machines hit hard storage and throughput limits, sharding divides data across independent nodes to scale past them, and the price is cross-shard queries, distributed transactions, and operational complexity.
What's Next:
Now that you understand why sharding is necessary, the limits of single-node databases, and the framework for deciding when to shard, we'll explore horizontal partitioning: how data is actually divided across shards. This builds the conceptual foundation for specific sharding strategies like range-based and hash-based sharding.