Every DynamoDB implementation rises or falls on a single decision: partition key design.
This isn't hyperbole. I've witnessed systems handling 100,000 requests per second with rock-solid sub-5ms latency—and I've seen systems throttle at 1,000 requests per second while burning money on unused capacity. The difference? Not hardware, not configuration, not code optimization. The difference was partition key selection.
The partition key is DynamoDB's fundamental distribution mechanism. It determines which physical partition stores your data, how load is distributed across the cluster, and ultimately whether your system can scale. Choose wisely, and DynamoDB scales linearly with your traffic. Choose poorly, and you create hot partitions that become bottlenecks no amount of provisioned capacity can fix.
By the end of this page, you will understand how partition keys determine data distribution, the mathematics of partition capacity limits, strategies for designing high-cardinality keys, techniques for handling low-cardinality data, and advanced patterns like write sharding that unlock unlimited scale.
Before we can design effective partition keys, we must understand what partitions are and how they work.
What is a Partition?
A partition is the fundamental unit of data storage and throughput in DynamoDB. Physically, a partition is a slice of an SSD managed by a storage node, but conceptually, it's simpler to think of it as a container with hard limits:

- 10 GB of storage
- 3,000 RCU of read throughput
- 1,000 WCU of write throughput
When you create a table and start writing data, DynamoDB allocates partitions based on your throughput settings (for provisioned mode) or automatically (for on-demand mode). As data grows or throughput needs increase, DynamoDB automatically splits partitions.
```
┌──────────────────────────────────────────────────────────────────────┐
│ How Partition Keys Map to Partitions                                 │
└──────────────────────────────────────────────────────────────────────┘

Partition Key Value ───► Hash Function ───► Partition Hash (Internal)
                                                    │
                                                    ▼
              ┌──────────────────────────────────────────┐
              │ Hash Range: 0 to 2^128                   │
              │                                          │
 Partition 1  │ ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░   │ (0 - X)
 Partition 2  │ ░░░░░░░░████████░░░░░░░░░░░░░░░░░░░░░░   │ (X - Y)
 Partition 3  │ ░░░░░░░░░░░░░░░░████████░░░░░░░░░░░░░░   │ (Y - Z)
 Partition N  │ ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░████████ │ (Z - Max)
              │                                          │
              └──────────────────────────────────────────┘

Example: userId = "U-12345"
  Hash("U-12345") = 0x7A3F...  (falls in Partition 2's range)
  All items with userId="U-12345" stored in Partition 2

Example: userId = "U-67890"
  Hash("U-67890") = 0x1B2C...  (falls in Partition 1's range)
  All items with userId="U-67890" stored in Partition 1
```

Key Insight: Partition Key = Partition Affinity
All items with the same partition key value are stored in the same partition. This has profound implications:

- Every read and write for a given key value is served by a single partition.
- A single key's throughput is capped by that partition's limits, no matter how much capacity the table has.
- All items sharing a partition key live together, so one very popular key concentrates both storage and traffic.
This is why partition key design is so critical. The hash function distributes partition key values across partitions uniformly, but if your application doesn't distribute traffic across partition key values uniformly, you'll have hot partitions.
DynamoDB allocates throughput per partition, not per table. If you provision 10,000 WCU but all traffic goes to one partition, you effectively have only 1,000 WCU available. The other 9,000 WCU sit unused on other partitions. This is why throwing more capacity at a hot partition problem doesn't work—you need to fix the partition key design.
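To make the distribution mechanics concrete, here is a small simulation—a sketch using FNV-1a as a stand-in hash, since DynamoDB's internal hash function is not public—that buckets partition key values into partitions and compares a high-cardinality key against a single hot key value:

```typescript
// Sketch: FNV-1a over UTF-16 code units as an illustrative stand-in
// for DynamoDB's internal (non-public) partition hash.
function bucketFor(key: string, partitions: number): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < key.length; i++) {
    h ^= key.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) % partitions;
}

// Count how a stream of partition key values lands across partitions.
function partitionLoad(keys: string[], partitions: number): number[] {
  const load = new Array(partitions).fill(0);
  for (const k of keys) load[bucketFor(k, partitions)]++;
  return load;
}

// High cardinality: 10,000 distinct userIds over 10 partitions.
const uniform = partitionLoad(
  Array.from({ length: 10_000 }, (_, i) => `U-${i}`),
  10
);

// Low cardinality: every request uses the same "active" status value.
const skewed = partitionLoad(new Array(10_000).fill("active"), 10);

console.log(Math.max(...uniform)); // hottest partition carries ~1/10 of traffic
console.log(Math.max(...skewed));  // 10000 — every request on one partition
```

The exact counts depend on the stand-in hash, but the shape of the result is the point: the hash spreads distinct key values evenly, yet it cannot spread traffic that all targets one value.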
Designing effective partition keys requires understanding the properties that lead to uniform data and traffic distribution. Let's examine these characteristics systematically.
| Candidate | Cardinality | Distribution | Verdict |
|---|---|---|---|
| userId (UUID) | Very High (billions) | Usually uniform | ✅ Excellent |
| orderId (UUID) | Very High | Uniform by design | ✅ Excellent |
| deviceId (IoT) | High (millions) | Depends on device activity | ✅ Good |
| customerId | Medium-High | May have power users | ⚠️ Watch for hot keys |
| date (YYYY-MM-DD) | Low (365/year) | Today's date is always hot | ❌ Poor |
| status (enum) | Very Low (5-10) | Concentrated on 'active' | ❌ Very Poor |
| country | Low (~200) | Concentrated on populous countries | ❌ Very Poor |
| constant value | 1 | All traffic to one partition | ❌ Catastrophic |
The Cardinality Principle
The minimum number of partitions your table can effectively use is bounded by the cardinality of your partition key. If your partition key has only 100 distinct values, you can never have more than 100 partitions actively receiving traffic—even if DynamoDB creates more for storage.
Calculation example:
```
Table: User Sessions
Partition Key: status (5 values: active, expired, pending, suspended, deleted)

Even with 1 million sessions:
- Maximum effective partitions for load distribution: 5
- If 80% of sessions are 'active': that partition handles 80% of all traffic
- Maximum usable write throughput: ~5 × 1,000 WCU = 5,000 WCU
- But 80% of traffic hits one partition: effective limit = 1,000 / 0.8 = 1,250 WCU
```
This is why status fields, booleans, and enums should never be partition keys.
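The calculation above generalizes to a single helper: usable table-wide write throughput is the per-partition limit divided by the hottest partition's traffic share. A sketch, using the 1,000 WCU per-partition figure from this page:

```typescript
// Per-partition write limit (WCU) that bounds any single partition key value.
const PARTITION_WCU_LIMIT = 1_000;

// Usable table-wide write throughput before the hottest partition throttles.
// hottestShare: fraction of total traffic hitting the busiest partition (0..1].
function effectiveWriteThroughput(hottestShare: number): number {
  return PARTITION_WCU_LIMIT / hottestShare;
}

// The "status" example: 80% of sessions are 'active'.
console.log(effectiveWriteThroughput(0.8)); // 1250

// Perfectly uniform traffic over 10 partitions (each carries a 0.1 share).
console.log(effectiveWriteThroughput(0.1)); // 10000
```

The lever is the denominator: halving the hottest key's share doubles usable throughput, which is exactly what high-cardinality key design achieves.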
Learning from common mistakes is often more instructive than studying ideal designs. Let's examine partition key anti-patterns that cause real-world failures.
Typical remedies for these anti-patterns:

- For time-series data, move the date into the sort key, or use a composite partition key (e.g., `deviceId#2024-06-15`)
- For multi-tenant tables, use `tenantId#entityId` or shard large tenants
```typescript
// ❌ ANTI-PATTERN: Date as Partition Key
// All writes for today go to one partition
const badDesign = {
  tableName: "SensorReadings",
  keySchema: [
    { attributeName: "date", keyType: "HASH" },      // Only 365 values/year!
    { attributeName: "readingId", keyType: "RANGE" }
  ]
};

// One day = all traffic to one partition = throttled at 1,000 WCU
// Query: "Get all readings for date" → Full table scan of partition

// ✅ CORRECT: High-cardinality partition key with date in sort key
const goodDesign = {
  tableName: "SensorReadings",
  keySchema: [
    { attributeName: "sensorId", keyType: "HASH" },  // Millions of sensors = millions of partitions
    { attributeName: "timestamp", keyType: "RANGE" }
  ]
};

// Traffic distributed across all sensors
// Query: "Get readings for sensor X between dates" → Efficient range query

// ✅ ALTERNATIVE: Composite partition key for high-write scenarios
const compositeDesign = {
  tableName: "SensorReadings",
  // Partition key is a composite: sensorId#YYYY-MM
  // Ensures no single partition grows beyond 10 GB
  keySchema: [
    { attributeName: "pk", keyType: "HASH" },        // "SENSOR-001#2024-06"
    { attributeName: "timestamp", keyType: "RANGE" }
  ]
};
```

One of the most powerful techniques in DynamoDB design is the composite primary key: a partition key combined with a sort key. This pattern enables rich querying while maintaining excellent distribution.
Anatomy of a Composite Key
Primary Key = Partition Key (PK) + Sort Key (SK)
The Power of Sort Keys
Sort keys enable:
- Range queries: `SK BETWEEN '2024-01-01' AND '2024-06-30'`
- Prefix queries: `SK begins_with 'ORDER#'`
- Multi-entity modeling under one partition key (e.g., `USER#123`, `ORDER#456`, `PROFILE#789`)
```typescript
// ============================================
// EXAMPLE 1: E-Commerce Order History
// ============================================
// Access Patterns:
// 1. Get all orders for a customer
// 2. Get orders for a customer in a date range
// 3. Get a specific order by orderId

const orderTable = {
  // PK: customerId (high cardinality - millions of customers)
  // SK: orderTimestamp#orderId (enables time-based queries)
  items: [
    {
      PK: "CUST#C-12345",
      SK: "ORDER#2024-06-15T10:30:00Z#ORD-789",
      orderId: "ORD-789",
      total: 149.99,
      status: "SHIPPED"
    },
    {
      PK: "CUST#C-12345",
      SK: "ORDER#2024-07-22T14:15:00Z#ORD-891",
      orderId: "ORD-891",
      total: 89.50,
      status: "DELIVERED"
    }
  ]
};

// Query: Get all orders for customer
// PK = "CUST#C-12345"

// Query: Get orders for customer in June 2024
// PK = "CUST#C-12345" AND SK BETWEEN "ORDER#2024-06-01" AND "ORDER#2024-06-30"

// Query: Get most recent orders
// PK = "CUST#C-12345" AND ScanIndexForward = false LIMIT 10

// ============================================
// EXAMPLE 2: Single-Table Design (Multi-Entity)
// ============================================
// One table stores Users, Orders, and Products using SK prefixes

const singleTable = {
  items: [
    // User Profile
    {
      PK: "USER#U-12345",
      SK: "PROFILE",
      name: "Alice Smith",
      email: "alice@example.com",
      createdAt: "2023-01-15"
    },
    // User's Orders (multiple items, same PK)
    {
      PK: "USER#U-12345",
      SK: "ORDER#2024-06-15#ORD-789",
      orderId: "ORD-789",
      total: 149.99
    },
    {
      PK: "USER#U-12345",
      SK: "ORDER#2024-07-22#ORD-891",
      orderId: "ORD-891",
      total: 89.50
    },
    // User's Payment Methods
    {
      PK: "USER#U-12345",
      SK: "PAYMENT#PM-001",
      type: "VISA",
      last4: "4242"
    }
  ]
};

// Query: Get user profile
// PK = "USER#U-12345" AND SK = "PROFILE"

// Query: Get all user's orders
// PK = "USER#U-12345" AND SK begins_with "ORDER#"

// Query: Get user with everything (profile, orders, payments)
// PK = "USER#U-12345" (returns all items for this user)
```

DynamoDB experts often use a single-table design where multiple entity types share one table, differentiated by sort key prefixes. This reduces the number of tables to manage, enables transactions across entity types, and can reduce costs by consolidating indexes. However, it requires careful planning and is more complex to understand than multi-table designs.
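A single-table query like "all of a user's orders" is expressed as a key condition on PK plus an SK prefix. Here is a sketch of the request parameters in DynamoDB's DocumentClient shape—`begins_with` and `KeyConditionExpression` are real DynamoDB query syntax, while the table name and helper are illustrative:

```typescript
// Build query parameters for "get all orders for a user" in a
// single-table layout. "AppTable" is a hypothetical table name.
function userOrdersQuery(userId: string) {
  return {
    TableName: "AppTable",
    // Equality on the partition key, prefix match on the sort key.
    KeyConditionExpression: "PK = :pk AND begins_with(SK, :sk)",
    ExpressionAttributeValues: {
      ":pk": `USER#${userId}`,
      ":sk": "ORDER#",
    },
  };
}

console.log(userOrdersQuery("U-12345"));
// { TableName: 'AppTable',
//   KeyConditionExpression: 'PK = :pk AND begins_with(SK, :sk)',
//   ExpressionAttributeValues: { ':pk': 'USER#U-12345', ':sk': 'ORDER#' } }
```

Swapping the `:sk` value to `"PAYMENT#"` or dropping the `begins_with` clause entirely gives the other access patterns without touching the table design.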
Sometimes your access pattern inherently creates hot partitions. A global counter, a trending topic, or a viral piece of content naturally concentrates traffic. Write sharding is the technique for handling these scenarios.
The Problem: Unavoidable Concentration
Consider a view counter for a viral video:
- Video `VID-12345` goes viral, generating 50,000 view-count writes per second at peak

With a simple partition key of videoId, all 50,000 writes/second hit the same partition—far exceeding the 1,000 WCU limit.
The Solution: Shard the Partition Key
Instead of a single partition key value, we create multiple by appending a random suffix:
Original: videoId = "VID-12345"
Sharded: videoId = "VID-12345#shard-0" through "VID-12345#shard-9"
Now writes are distributed across 10 partitions, giving us ~10,000 WCU of write throughput.
```typescript
// ============================================
// WRITE SHARDING FOR HOT KEYS
// ============================================

const SHARD_COUNT = 10; // Number of shards (tune based on throughput needs)

// Write: Increment view count (distributed across shards)
async function incrementViewCount(videoId: string): Promise<void> {
  // Random shard selection distributes writes evenly
  const shardId = Math.floor(Math.random() * SHARD_COUNT);
  const shardedKey = `${videoId}#shard-${shardId}`;

  await dynamodb.updateItem({
    TableName: "VideoStats",
    Key: {
      PK: { S: shardedKey },
      SK: { S: "VIEW_COUNT" }
    },
    UpdateExpression: "ADD #count :inc",
    ExpressionAttributeNames: { "#count": "count" },
    ExpressionAttributeValues: { ":inc": { N: "1" } }
  });
}

// Read: Get total view count (aggregate across all shards)
async function getTotalViewCount(videoId: string): Promise<number> {
  const shardKeys = Array.from(
    { length: SHARD_COUNT },
    (_, i) => ({
      PK: { S: `${videoId}#shard-${i}` },
      SK: { S: "VIEW_COUNT" }
    })
  );

  const results = await dynamodb.batchGetItem({
    RequestItems: {
      VideoStats: { Keys: shardKeys }
    }
  });

  // Sum counts from all shards
  return results.Responses?.VideoStats?.reduce(
    (sum, item) => sum + parseInt(item.count?.N || "0", 10),
    0
  ) || 0;
}

// ============================================
// ALTERNATIVE: Calculated Shard (Deterministic)
// ============================================
// Use when you need predictable shard placement (e.g., for caching)

function getShardForRequest(videoId: string, requestId: string): string {
  // Hash requestId to get deterministic shard
  const hash = simpleHash(requestId);
  const shardId = hash % SHARD_COUNT;
  return `${videoId}#shard-${shardId}`;
}

function simpleHash(str: string): number {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    const char = str.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash; // Convert to 32-bit integer
  }
  return Math.abs(hash);
}
```

| Aspect | Without Sharding | With Sharding |
|---|---|---|
| Write Throughput | ≤1,000 WCU per key | N × 1,000 WCU (N shards) |
| Read Complexity | Single GetItem | BatchGetItem + aggregation |
| Read Cost | 1 RCU | N RCU (one per shard) |
| Code Complexity | Simple | Moderate (shard logic required) |
| Eventual Consistency | N/A | Aggregated reads may be slightly stale |
| Use Cases | Normal traffic items | Counters, trending items, viral content |
Write sharding adds complexity and read overhead. Use it only when you have proven hot keys that exceed partition limits. Signs you need sharding: throttling on specific keys despite high table capacity, CloudWatch showing uneven partition utilization, or known viral/trending access patterns.
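If you do shard, the shard count has to be sized. A minimal sketch, assuming the ~1,000 WCU per-partition limit used throughout this page; the headroom factor compensating for the unevenness of random shard selection is an illustrative choice, not an AWS recommendation:

```typescript
// Per-partition write limit used throughout this page.
const WCU_PER_PARTITION = 1_000;

// Pick a shard count for a hot key: enough shards that the peak write
// rate, spread across them, stays under the per-partition limit, with
// headroom because random shard selection is never perfectly even.
function shardCountFor(peakWcu: number, headroom = 1.5): number {
  return Math.max(1, Math.ceil((peakWcu * headroom) / WCU_PER_PARTITION));
}

// The viral-video example: 50,000 writes/second at peak.
console.log(shardCountFor(50_000)); // 75

// A key comfortably under the limit needs no sharding at all.
console.log(shardCountFor(500)); // 1
```

Remember that every shard you add is one more key in the aggregating read, so size for realistic peaks rather than worst-case fantasies.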
While good partition key design is essential, DynamoDB provides a safety mechanism called Adaptive Capacity that helps mitigate hot partition issues. Understanding this feature helps you design more resilient tables.
How Adaptive Capacity Works
Traditionally, DynamoDB distributed provisioned capacity evenly across partitions. If you provisioned 10,000 WCU and had 10 partitions, each partition got 1,000 WCU. If one partition needed 2,000 WCU while others used only 500, the hot partition would throttle.
Adaptive Capacity changes this:

- Throughput is no longer divided rigidly: a partition receiving disproportionate traffic can consume unused capacity from the rest of the table.
- The boost is applied automatically by the service, with no configuration required.
- DynamoDB can also isolate a frequently accessed item by splitting the hot partition.
Limits of Adaptive Capacity
Adaptive Capacity is a safety net, not a design strategy. It helps smooth out unexpected traffic variations but cannot fix fundamentally poor partition key choices. Always design for uniform distribution first, then rely on Adaptive Capacity as insurance against real-world imperfection.
Let's synthesize everything into a practical framework for selecting partition keys. Follow this decision process when designing new tables.
| Use Case | Recommended Partition Key | Sort Key | Reasoning |
|---|---|---|---|
| User Profiles | userId | None (or PROFILE) | One profile per user, high cardinality |
| User's Orders | userId | orderDate#orderId | Query orders by user, sort by time |
| IoT Sensor Data | sensorId | timestamp | Each sensor writes independently |
| Game Leaderboard | leaderboardId#shard-N | score | Write sharding for hot leaderboards |
| Multi-tenant SaaS | tenantId#entityType#entityId | None or timestamp | Prevent large tenant hot spots |
| Session Storage | sessionId | None | Random IDs distribute perfectly |
| Product Catalog | productId | None or category#name | Products accessed independently |
| Chat Messages | conversationId | timestamp | Messages grouped by conversation |
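A few rows of the table above can be expressed as small key-builder helpers. The `#` delimiter and the function names are illustrative conventions, not a DynamoDB API:

```typescript
// User's Orders row: PK = userId, SK = orderDate#orderId
function orderKey(userId: string, orderDate: string, orderId: string) {
  return { PK: userId, SK: `${orderDate}#${orderId}` };
}

// Game Leaderboard row: PK = leaderboardId#shard-N (write sharding)
function leaderboardKey(leaderboardId: string, shardCount: number) {
  const shard = Math.floor(Math.random() * shardCount);
  return { PK: `${leaderboardId}#shard-${shard}` };
}

// Multi-tenant SaaS row: PK = tenantId#entityType#entityId
function tenantKey(tenantId: string, entityType: string, entityId: string) {
  return { PK: `${tenantId}#${entityType}#${entityId}` };
}

console.log(orderKey("C-12345", "2024-06-15", "ORD-789"));
// { PK: 'C-12345', SK: '2024-06-15#ORD-789' }
```

Centralizing key construction like this keeps the delimiter convention in one place, which matters once sort-key prefixes are relied on for `begins_with` queries.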
Partition key design is the foundation of DynamoDB success. Let's consolidate the essential principles:

- Choose high-cardinality partition keys (user IDs, device IDs, UUIDs) that receive roughly uniform traffic.
- Never use low-cardinality attributes—status fields, dates, countries, booleans—as partition keys.
- Use composite keys (partition key + sort key) to get range and prefix queries without sacrificing distribution.
- Apply write sharding only for proven hot keys that exceed the per-partition write limit.
- Treat Adaptive Capacity as insurance against imperfection, not as a substitute for good key design.
What's Next
With partition keys mastered, we turn to Global Secondary Indexes (GSIs)—the mechanism that enables querying DynamoDB by attributes other than the primary key. GSIs unlock access patterns that the base table cannot support, but they come with their own design considerations, cost implications, and partition key challenges. Understanding GSIs is essential for building flexible, query-rich DynamoDB applications.
You now understand partition key design—the most critical factor in DynamoDB success. You can identify good and bad partition key candidates, use composite keys for rich querying, implement write sharding for hot keys, and apply a systematic framework for key selection. This knowledge prevents the throttling disasters that plague poorly designed DynamoDB tables.