In 2007, Amazon faced a crisis that would define the future of modern databases. Despite their best efforts with traditional relational databases, the infrastructure supporting amazon.com was buckling under the weight of peak shopping seasons. Database nodes were failing, transactions were timing out, and every second of downtime translated to millions in lost revenue.
Out of this crucible emerged Dynamo—and later its commercial evolution, Amazon DynamoDB—a database designed from first principles to solve the problems that bring traditional databases to their knees. Today, DynamoDB handles over 10 trillion requests per day, powering not just Amazon's retail operations but also services like Alexa, Twitch, IMDb, and thousands of the world's most demanding applications.
This page explores DynamoDB as a fully managed NoSQL service: what it means, how it differs from self-managed databases, and why this operational model represents a fundamental shift in how we think about database infrastructure at scale.
By the end of this page, you will understand the architecture and philosophy behind DynamoDB's managed service model, how serverless databases eliminate operational overhead, the SLA guarantees that make DynamoDB enterprise-ready, and the fundamental trade-offs baked into its design that every system designer must understand.
To understand DynamoDB, we must first understand its intellectual lineage. The story begins with the influential Dynamo paper, published by Amazon in 2007, which outlined a highly available key-value store designed to support Amazon's e-commerce platform during peak load.
The original Dynamo was an internal Amazon system that pioneered several groundbreaking techniques:
These techniques have influenced an entire generation of distributed databases, including Cassandra, Riak, and Voldemort.
Despite sharing a name, Amazon DynamoDB is not the same as the Dynamo system from the 2007 paper. DynamoDB is a commercial managed service that incorporates lessons from Dynamo but with significant architectural differences. Most notably, DynamoDB uses B-trees for storage (not consistent hashing alone) and offers strong consistency as an option—something the original Dynamo could not guarantee.
The Evolution to DynamoDB
By 2012, Amazon had accumulated years of operational experience running distributed databases at unprecedented scale. They observed several critical patterns:
DynamoDB emerged as Amazon's answer: a database designed to eliminate operational overhead entirely while providing the scale and availability guarantees that Amazon had spent a decade learning to deliver.
| Characteristic | Original Dynamo (2007) | Amazon DynamoDB (2012+) |
|---|---|---|
| Deployment | Internal Amazon only | Public AWS service |
| Management | Self-managed by teams | Fully managed by AWS |
| Consistency | Eventually consistent only | Eventual and strong consistency options |
| Storage Engine | Consistent hashing + logs | B-trees on SSDs |
| Query Model | Key-value only | Key-value + rich query with indexes |
| Scaling | Manual node addition | Automatic, transparent scaling |
| Cost Model | Internal infrastructure cost | Pay-per-request or provisioned capacity |
The term "managed database" has become ubiquitous in cloud marketing, but the degree of management varies dramatically between services. DynamoDB represents the far end of the spectrum: a truly serverless database where virtually all operational concerns are handled by AWS.
To appreciate what this means, let's examine the operational responsibilities that disappear when you choose DynamoDB:
While DynamoDB eliminates infrastructure operations, it does NOT eliminate responsibility for data modeling, access pattern design, capacity mode selection, cost optimization, or application-level error handling. These remain critical engineering decisions that can make or break your system.
The Serverless Database Model
DynamoDB pioneered the serverless database model, where:
This model fundamentally changes how teams approach database design. Instead of planning for peak capacity and leaving servers idle during off-peak hours, you design for actual access patterns and let the infrastructure adapt.
While AWS intentionally abstracts DynamoDB's internals, understanding its architecture is essential for effective data modeling and performance optimization. Based on public information, patents, and AWS re:Invent presentations, we can construct a comprehensive picture of how DynamoDB works under the hood.
```
┌─────────────────────────────────────────────────────────────────────┐
│                        DynamoDB Architecture                        │
└─────────────────────────────────────────────────────────────────────┘

                        Client Request
                              │
                              ▼
                     ┌─────────────────┐
                     │ Request Router  │──── IAM Authentication & Authorization
                     │   (Stateless)   │──── Request Validation
                     └────────┬────────┘
                              │
                              │ Partition Key Hash → Partition Location
                              ▼
┌─────────────────────────────────────────────────────────────────────┐
│                           Storage Layer                             │
│                                                                     │
│   ┌───────────────┐    ┌───────────────┐    ┌───────────────┐       │
│   │  Partition 1  │    │  Partition 2  │    │  Partition N  │       │
│   │   (Leader)    │    │   (Leader)    │    │   (Leader)    │       │
│   └───────┬───────┘    └───────┬───────┘    └───────┬───────┘       │
│           │                    │                    │               │
│   ┌───────┴───────┐    ┌───────┴───────┐    ┌───────┴───────┐       │
│   │ Replica AZ-A  │    │ Replica AZ-A  │    │ Replica AZ-A  │       │
│   │ Replica AZ-B  │    │ Replica AZ-B  │    │ Replica AZ-B  │       │
│   │ Replica AZ-C  │    │ Replica AZ-C  │    │ Replica AZ-C  │       │
│   └───────────────┘    └───────────────┘    └───────────────┘       │
│                                                                     │
│   Each partition: ≤10 GB storage, ≤3000 RCU, ≤1000 WCU              │
└─────────────────────────────────────────────────────────────────────┘
```

Write Path
When a write request arrives:
Read Path
DynamoDB's data model is elegantly simple yet powerful enough to support virtually any access pattern. Understanding this model is a prerequisite to effective DynamoDB design.
| Key Type | Structure | Use Case | Query Capability |
|---|---|---|---|
| Simple Primary Key | Partition Key only | Lookup by unique ID (user by userId) | Point reads only (GetItem) |
| Composite Primary Key | Partition Key + Sort Key | One-to-many relationships (orders by customerId + orderDate) | Range queries within partition (Query) |
```javascript
// DynamoDB Table: Orders
// Primary Key: Composite (customerId + orderTimestamp)

// Example Items:
const orderItems = [
  {
    // Partition Key
    customerId: "C-12345",
    // Sort Key
    orderTimestamp: "2024-06-15T10:30:00Z",
    // Attributes (no fixed schema)
    orderId: "ORD-789456",
    status: "SHIPPED",
    total: 149.99,
    items: [
      { productId: "PROD-001", quantity: 2, price: 49.99 },
      { productId: "PROD-042", quantity: 1, price: 50.01 }
    ],
    shippingAddress: {
      street: "123 Main St",
      city: "Seattle",
      state: "WA",
      zip: "98101"
    }
  },
  {
    customerId: "C-12345",
    orderTimestamp: "2024-07-22T14:15:00Z",
    orderId: "ORD-891234",
    status: "DELIVERED",
    total: 89.50,
    // Different attributes - totally valid!
    giftMessage: "Happy Birthday!",
    expeditedShipping: true
  }
];

// Query: Get all orders for customer C-12345 in July 2024
// Uses partition key (customerId) + sort key range (orderTimestamp BETWEEN)

// Query: Get customer's most recent order
// Uses partition key + sort key with ScanIndexForward=false, Limit=1
```

While DynamoDB doesn't enforce a schema at the database level, your application absolutely should. Best practice is to define item schemas in your application code (e.g., using TypeScript interfaces, Zod schemas, or AWS's own AttributeValue types) and validate all data before writes. This gives you flexibility AND safety.
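The two queries described in the comments above can be expressed as Query parameters as accepted by the AWS SDK (DocumentClient-style plain values; the table name "Orders" comes from the example). No network call is made here; this only shows the shape of the request:

```javascript
// Query 1: all orders for customer C-12345 placed in July 2024.
// The partition key pins the customer; the sort key range narrows to July.
const julyOrdersParams = {
  TableName: "Orders",
  KeyConditionExpression:
    "customerId = :cid AND orderTimestamp BETWEEN :start AND :end",
  ExpressionAttributeValues: {
    ":cid": "C-12345",
    ":start": "2024-07-01T00:00:00Z",
    ":end": "2024-07-31T23:59:59Z"
  }
};

// Query 2: the customer's single most recent order.
// Results come back sorted by sort key ascending by default;
// ScanIndexForward=false reverses that, and Limit=1 keeps only the newest.
const latestOrderParams = {
  TableName: "Orders",
  KeyConditionExpression: "customerId = :cid",
  ExpressionAttributeValues: { ":cid": "C-12345" },
  ScanIndexForward: false,
  Limit: 1
};
```

Note how both queries lean entirely on the composite key: because ISO-8601 timestamps sort lexicographically, the sort key gives you time-range queries for free.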
DynamoDB offers two capacity modes that represent different trade-offs between cost predictability and operational simplicity. Choosing the right mode is one of the most impactful decisions you'll make.
| Metric | Provisioned | On-Demand |
|---|---|---|
| Write Cost | $0.00065 per WCU-hour ($0.47/month) | $1.25 per million writes |
| Read Cost | $0.00013 per RCU-hour ($0.09/month) | $0.25 per million reads |
| Break-even Point | — | ~14.4% utilization |
| Reserved Capacity | Up to 77% discount (1 or 3 year) | Not available |
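The break-even figure in the table can be derived directly from the listed prices. A single WCU sustains one write per second, i.e. 3,600 writes per hour, so we can compute what a fully utilized provisioned WCU costs per million writes and compare it with the On-Demand rate:

```javascript
// Deriving the ~14.4% break-even point from the prices in the table above.
const wcuHourPrice = 0.00065;     // Provisioned: $ per WCU-hour
const onDemandWritePrice = 1.25;  // On-Demand: $ per million writes

// One WCU does 3,600 writes/hour at 100% utilization, so the provisioned
// cost per million writes (fully utilized) is:
const provisionedPerMillion = (wcuHourPrice / 3600) * 1_000_000;
// ≈ $0.18 per million writes

// On-Demand is cheaper below this utilization; Provisioned wins above it.
const breakEvenUtilization = provisionedPerMillion / onDemandWritePrice;
console.log((breakEvenUtilization * 100).toFixed(1) + "%"); // "14.4%"
```

In other words: if your provisioned capacity would sit idle more than about 85% of the time, On-Demand is the cheaper mode, which is why spiky or unpredictable workloads favor it.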
Capacity Units Explained
Decision Framework:
On-Demand mode can instantly scale to 2x the previous peak traffic. For brand-new tables with no history, the initial limit is 4,000 WCU and 12,000 RCU. If you expect sudden massive traffic to a new table (e.g., product launch), consider pre-warming with Provisioned capacity first, then switching to On-Demand.
Enterprise systems require predictable reliability. DynamoDB's SLA guarantees are among the strongest in the industry, backed by a decade of operational experience at Amazon scale.
How DynamoDB Achieves These Guarantees
Synchronous Multi-AZ Replication — Every write is committed to at least 2 of 3 replicas before acknowledgment. This ensures durability survives any single AZ failure.
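The 2-of-3 acknowledgment rule can be sketched as a small quorum check: the write commits as soon as any two replicas confirm, without waiting for the slowest one. The replica behavior below is simulated; this only illustrates the counting logic, not DynamoDB's internal replication protocol:

```javascript
// Minimal quorum-write sketch: resolve once `quorum` replicas ack,
// reject only if so many replicas fail that the quorum is unreachable.
function quorumWrite(replicaAcks, quorum = 2) {
  return new Promise((resolve, reject) => {
    let acks = 0;
    let failures = 0;
    const maxFailures = replicaAcks.length - quorum;
    for (const ack of replicaAcks) {
      ack.then(() => {
        acks += 1;
        if (acks === quorum) resolve("committed"); // quorum reached
      }).catch(() => {
        failures += 1;
        if (failures > maxFailures) reject(new Error("quorum lost"));
      });
    }
  });
}

// Two healthy replicas commit the write even if the third AZ fails late:
const replicas = [
  Promise.resolve(),                            // AZ-A acks
  Promise.resolve(),                            // AZ-B acks
  new Promise((_, rej) => setTimeout(rej, 50))  // AZ-C fails
];
quorumWrite(replicas).then(console.log); // "committed"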
Automatic Partition Management — When a partition approaches its limits (10 GB storage or throughput ceiling), DynamoDB automatically splits it—transparently, without downtime.
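The per-partition limits quoted earlier (≤10 GB, ≤3000 RCU, ≤1000 WCU) imply a minimum partition count for any table. The formula below mirrors the sizing approximation AWS has historically presented in its documentation and talks; the service itself may allocate more partitions than this floor:

```javascript
// Approximate minimum partition count for a table, based on the
// per-partition limits (≤10 GB storage, ≤3000 RCU, ≤1000 WCU).
// A historical AWS approximation; the actual allocation can be higher.
function minPartitions(sizeGB, rcu, wcu) {
  const bySize = Math.ceil(sizeGB / 10);                 // storage floor
  const byThroughput = Math.ceil(rcu / 3000 + wcu / 1000); // throughput floor
  return Math.max(bySize, byThroughput);
}

// A 25 GB table needs at least 3 partitions for storage alone,
// even at modest throughput (1000 RCU, 500 WCU):
console.log(minPartitions(25, 1000, 500)); // 3
```

This is why partition splits matter for data modeling: as the table grows, your provisioned throughput is divided across more partitions, so a key distribution that concentrates traffic on a few keys gets less headroom per key over time.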
Request Router Redundancy — Stateless request routers are deployed across multiple AZs. Failed routers are replaced in seconds.
Continuous Health Monitoring — AWS operates DynamoDB with dedicated engineering teams monitoring 24/7/365. Issues are detected and mitigated before customers notice.
Blast Radius Minimization — DynamoDB's architecture ensures that issues with one partition or one customer's table don't cascade to others.
| Aspect | DynamoDB | Self-Managed (e.g., Cassandra) |
|---|---|---|
| Time to recover from node failure | Automatic, seconds | Manual/automatic, minutes to hours |
| Time to add capacity | Instant (On-Demand) | Add nodes, rebalance: hours to days |
| Ops team required | None (managed) | 24/7 on-call required at scale |
| Multi-region setup | Toggle Global Tables | Complex configuration, weeks of work |
| Disaster recovery | Built-in PITR, backups | Custom backup solutions required |
The true value of DynamoDB's SLA isn't just uptime—it's the engineering hours NOT spent on database operations. A self-managed database cluster at similar scale requires dedicated DBAs, on-call rotations, runbooks for every failure mode, and months of operational expertise. DynamoDB lets you redirect that engineering capacity to building features that differentiate your product.
DynamoDB is not a universal database—but in its sweet spot, it is virtually unmatched. Understanding where DynamoDB excels helps you make informed architectural decisions.
Common Patterns in DynamoDB Success Stories
Applications that thrive on DynamoDB typically share these characteristics:
During Prime Day 2023, DynamoDB processed 126 million requests per second at peak—with single-digit millisecond response times. This wasn't the result of months of capacity planning. It was DynamoDB's normal operation, automatically scaling to meet demand. This is the promise of a truly managed service: planetary scale without proportional operational effort.
We've explored Amazon DynamoDB's foundational identity as a fully managed NoSQL service. Let's consolidate the key insights:
What's Next
Understanding that DynamoDB is managed is just the beginning. The next page dives deep into partition key design—the single most important factor determining whether your DynamoDB implementation succeeds or fails. Poor partition key choices lead to hot partitions, throttling, and wasted capacity. Excellent partition key design enables linear scaling to any traffic level.
We'll explore partition key selection strategies, common anti-patterns, and techniques for handling access patterns that don't naturally fit DynamoDB's model.
You now understand DynamoDB's identity as a fully managed NoSQL service—its origins, architecture, capacity modes, and SLA guarantees. This foundation prepares you for the critical design decisions ahead: partition key selection, indexing strategies, and consistency trade-offs that determine real-world success.