A user transfers $1,000 from their checking account to savings. The transfer completes successfully. They refresh the page and see... the old balance. Panic sets in. Where is the money? Is it lost?

Of course, it's not lost—it's replicating. The user's read request hit a replica that hasn't received the update yet. But to the user, the system appears broken. Their trust is shaken.

This is the consistency trade-off in action. The system chose to provide a fast response (availability) over a guaranteed-accurate response (consistency). Was it the right choice for a banking application? Almost certainly not.

Understanding consistency trade-offs is perhaps the most important skill in distributed database design. Every system makes these trade-offs—explicitly or implicitly. Systems that make them explicitly and appropriately build user trust. Systems that make them accidentally create confusion and failures.
By the end of this page, you will understand the consistency spectrum from strong to eventual consistency, the CAP theorem and its implications, the PACELC theorem for more nuanced analysis, practical consistency models and their guarantees, and how to select appropriate consistency levels for different application requirements.
Consistency in distributed systems exists on a spectrum, not as a binary choice. Understanding this spectrum helps you select the appropriate level for your application's needs:
| Consistency Level | Guarantee | Cost | Use Case |
|---|---|---|---|
| Linearizability | Operations appear atomic and in real-time order | Highest latency; requires coordination | Financial transactions, leader election |
| Sequential Consistency | Operations appear in some global order consistent with program order | High latency; relaxed real-time | Distributed locks, counters |
| Causal Consistency | Causally related operations appear in order; unrelated may differ | Moderate latency; tracks dependencies | Social media, collaborative editing |
| Read-Your-Writes | A user always sees their own writes | Low-moderate; session tracking | User profile updates, shopping carts |
| Monotonic Reads | Once a value is read, future reads won't return older values | Low; version tracking | Newsfeed, dashboards |
| Eventual Consistency | Replicas will converge eventually; no timing guarantee | Lowest latency; no coordination | DNS, caching, logging |
Understanding the Trade-offs:

Stronger consistency levels require more coordination between nodes:

- Linearizability often requires consensus protocols (Paxos, Raft) or distributed locking
- Causal consistency requires tracking and propagating dependency information
- Session guarantees require sticky sessions or client-side version tracking
- Eventual consistency requires only background anti-entropy processes

Each step down the spectrum reduces latency and increases availability, but also increases the complexity of application logic required to handle potential inconsistencies.
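To make "client-side version tracking" concrete, here is a minimal sketch of the monotonic-reads session guarantee. The replica objects and version scheme are invented for illustration: the client remembers the highest version it has observed and refuses to read from a replica that would take its session backwards in time.

```javascript
// Monotonic-reads sketch (hypothetical client; replica objects invented
// for illustration). Each replica exposes a monotonically increasing version.
const replicas = [
  { name: 'r1', version: 5, value: 'new' }, // caught up
  { name: 'r2', version: 3, value: 'old' }, // lagging behind
];

class MonotonicClient {
  constructor() {
    this.highestSeen = 0; // highest version this session has observed
  }

  read() {
    // Skip any replica older than what this session has already seen.
    for (const r of replicas) {
      if (r.version >= this.highestSeen) {
        this.highestSeen = r.version;
        return r.value;
      }
    }
    throw new Error('no sufficiently fresh replica; retry or fall back to primary');
  }
}

const client = new MonotonicClient();
console.log(client.read()); // served by r1; session is now pinned at version 5
console.log(client.read()); // later reads can never return r2's older state
```

The same bookkeeping, keyed by object rather than by session, underlies the read-your-writes implementation shown later on this page.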
The CAP theorem (Brewer's theorem) is the foundational result for understanding consistency trade-offs. It states that a distributed system can provide at most two of the following three guarantees simultaneously:

- Consistency (C): Every read sees the most recent write (or an error)
- Availability (A): Every request receives a non-error response, even if it may not contain the most recent write
- Partition tolerance (P): The system continues operating despite network failures that drop or delay messages between nodes
Why Only Two?

Consider a network partition separating two database nodes. A client sends a write to Node A. The partition prevents Node A from communicating with Node B.

- If we choose Consistency (CP): Node A must reject the write (or wait indefinitely) because it cannot confirm that Node B will also have the update. The system sacrifices availability.

- If we choose Availability (AP): Node A accepts the write and responds immediately. Node B continues serving reads with stale data. When the partition heals, the nodes must reconcile potentially conflicting states.

Partition tolerance is not optional. Networks partition. Hardware fails. Cables get unplugged. Any distributed system must handle partitions. Therefore, the real choice is between CP (consistent during partitions, unavailable) and AP (available during partitions, inconsistent).
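The two branches of that choice can be sketched as a toy model. This is written for this page, not modeled on any real database: a CP node rejects the write it cannot confirm on its peer, while an AP node accepts it locally and lets the peer serve stale reads until the partition heals.

```javascript
// Toy CP-vs-AP model during a partition (illustrative only).
function makeNode(name) {
  return { name, data: { balance: 100 }, peerReachable: true };
}

// CP behavior: refuse writes that cannot be confirmed on the peer.
function writeCP(node, key, value) {
  if (!node.peerReachable) {
    return { ok: false, error: 'partition: cannot confirm replication' };
  }
  node.data[key] = value;
  return { ok: true };
}

// AP behavior: accept the write locally; reconcile after the partition heals.
function writeAP(node, key, value) {
  node.data[key] = value; // no coordination required
  return { ok: true };
}

const a = makeNode('A');
const b = makeNode('B');
a.peerReachable = false; // the partition strikes

console.log(writeCP(a, 'balance', 50).ok); // false: sacrifices availability
console.log(writeAP(a, 'balance', 50).ok); // true: sacrifices consistency...
console.log(b.data.balance);               // ...Node B still serves the stale 100
```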
CAP doesn't mean you sacrifice C or A permanently—only during network partitions. When the network is healthy, you can have both C and A. CAP is about the choice you make when things go wrong. Most systems are CP or AP by default but can be tuned. The real question is: what does YOUR system do when a partition occurs?
CAP only addresses behavior during partitions. But distributed systems make trade-offs even when the network is healthy. The PACELC theorem extends CAP to capture the full picture:
PACELC States:

> If there is a Partition (P), choose between Availability (A) and Consistency (C). Else (E), when operating normally, choose between Latency (L) and Consistency (C).

Breaking It Down:

- PAC: During a partition, trade off Availability vs. Consistency (same as CAP)
- ELC: During normal operation, trade off Latency vs. Consistency

This captures the reality that even without partitions, providing strong consistency requires coordination that adds latency. You can have fast responses OR consistent responses—not always both.
| System | If Partition (PAC) | Else Normal (ELC) | Classification |
|---|---|---|---|
| DynamoDB | Availability | Latency | PA/EL |
| Cassandra | Availability | Latency (tunable) | PA/EL |
| MongoDB | Consistency | Consistency | PC/EC |
| HBase | Consistency | Consistency | PC/EC |
| CockroachDB | Consistency | Latency | PC/EL |
| VoltDB | Consistency | Consistency | PC/EC |
| Spanner | Consistency | Consistency | PC/EC |
| Riak | Availability | Latency | PA/EL |
Interpreting PACELC Classifications:

- PA/EL (Dynamo-style systems): Prioritize availability and speed. Accept that data may be stale or conflicting. Best for high-throughput, eventually consistent workloads.

- PC/EC (Traditional databases): Prioritize correctness at all times. Accept higher latency as the cost. Best for financial, transactional, and correctness-critical workloads.

- PC/EL (Modern distributed SQL): Consistent during partitions (sacrifice availability), but optimized for latency during normal operation. A balance that works for many applications.

Why PACELC Matters More Than CAP:

Network partitions are rare—most systems experience them for only minutes per year. But every single request experiences the ELC trade-off. Choosing a PC/EC system for a workload that needs low latency (and should therefore be PC/EL or PA/EL) will hurt performance 100% of the time, not just during rare partitions.
Many databases allow per-request tuning of ELC trade-offs. MongoDB's read/write concerns, Cassandra's consistency levels, and DynamoDB's consistent read option let you choose latency vs. consistency on a per-operation basis. Critical operations can demand consistency; bulk reads can accept staleness.
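The per-operation idea can be illustrated with a mock, since each driver spells it differently (MongoDB read concerns, Cassandra consistency levels, DynamoDB's `ConsistentRead` flag). The `read` function and its `consistency` option below are invented for this sketch, not a real driver API:

```javascript
// Mock of per-operation consistency tuning (not a real driver API).
const primary = { balance: 50 };         // has the latest write
const laggingReplica = { balance: 100 }; // hasn't applied it yet

function read(key, { consistency = 'eventual' } = {}) {
  // 'strong' pays the latency cost of routing to the primary;
  // 'eventual' takes whatever the nearest replica currently has.
  const source = consistency === 'strong' ? primary : laggingReplica;
  return source[key];
}

console.log(read('balance', { consistency: 'strong' })); // 50: fresh, slower
console.log(read('balance'));                            // 100: possibly stale, fast
```

The design point is that the default is the cheap path; callers opt in to the expensive guarantee only where the operation demands it.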
Beyond the theoretical spectrum, practical systems implement specific consistency models with concrete guarantees. Understanding these helps you reason about application behavior:
Read-Your-Writes (RYW) Consistency:

After a user performs a write, all subsequent reads by that same user (session, device, token) will reflect that write. Other users may see stale data.

Why It Matters:
RYW prevents the confusing scenario where a user submits a form and the page appears unchanged. It's the minimum consistency most users expect.

Implementation Approaches:

1. Route to Primary: After a write, direct the user's reads to the primary until replicas catch up.

2. Timestamp-Based: Include a logical timestamp with writes; replicas wait until they've reached that timestamp before serving reads.

3. Sticky Sessions: Route all requests from a user to the same replica (which received their writes).

```javascript
// Implementation: return a version token with writes
const result = await db.update(user.id, newData);
setCookie('lastWriteVersion', result.version);

// Subsequent reads include the token
const data = await db.read(user.id, {
  minVersion: getCookie('lastWriteVersion')
});
```
Eventual consistency is the weakest consistency model, but also the most available and performant. It deserves detailed examination because it's ubiquitous in modern distributed systems:
Definition:

> If no new updates are made to a data item, eventually all reads will return the last updated value. The convergence time is not bounded.

Key Points:

1. "Eventually" is unbounded: The system provides no guarantee about when convergence will occur. Could be milliseconds, could be minutes.

2. During convergence, anything goes: Different replicas may return different values. The same replica may return different values for the same key in successive reads.

3. Conflicts must be resolved: If concurrent writes occur, the system must eventually pick a winner or merge. Common strategies: last-write-wins, merge functions, or CRDTs.

4. No coordination required: Nodes can accept writes without talking to each other, then exchange updates in the background. This enables high availability and low latency.
Last-write-wins (LWW) is the simplest conflict resolution: the write with the highest timestamp wins. But LWW silently drops data. If two users update the same record simultaneously, one update is lost without notification. For append-like operations (adding items to a list), LWW is particularly dangerous. Consider merge semantics or CRDTs for such use cases.
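To see what merge semantics buy you, here is a minimal grow-only counter (G-Counter), one of the simplest CRDTs. Each node increments only its own slot, and merge takes the per-node maximum, so two concurrent increments survive reconciliation, whereas LWW would silently drop one of them.

```javascript
// Minimal G-Counter CRDT sketch: a counter is a map of nodeId -> count.
function increment(counter, nodeId, by = 1) {
  return { ...counter, [nodeId]: (counter[nodeId] || 0) + by };
}

// Merge takes the per-node maximum, so no increment is ever lost.
function merge(c1, c2) {
  const merged = { ...c1 };
  for (const [node, count] of Object.entries(c2)) {
    merged[node] = Math.max(merged[node] || 0, count);
  }
  return merged;
}

// The observed value is the sum across all nodes.
function value(counter) {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas increment concurrently during a partition...
const a = increment({}, 'nodeA'); // { nodeA: 1 }
const b = increment({}, 'nodeB'); // { nodeB: 1 }

// ...then exchange state. LWW would keep one increment; the CRDT keeps both.
console.log(value(merge(a, b))); // 2
```

Merge is commutative, associative, and idempotent, so replicas can exchange state in any order, any number of times, and still converge.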
Measuring Eventual Consistency:

While eventual consistency provides no timing guarantee, you can measure convergence time empirically:

- Replication Lag Metrics: Monitor lag across replicas (99th percentile, max)
- Staleness Probability: Calculate P(stale read) given the lag distribution and read patterns
- Convergence SLOs: Set and measure targets like "99% of reads see writes within 500ms"

These measurements help you understand actual consistency behavior and make informed design decisions.
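The staleness-probability idea can be estimated directly from monitoring data: given sampled replication lags and the age of the write being read, the fraction of samples whose lag exceeds that age approximates P(stale read). The lag numbers below are made up for illustration.

```javascript
// Estimate P(stale read) from sampled replication lags (numbers invented).
// A read issued `ageMs` after a write is stale on any replica whose lag > ageMs.
function pStale(lagSamplesMs, ageMs) {
  const stale = lagSamplesMs.filter((lag) => lag > ageMs).length;
  return stale / lagSamplesMs.length;
}

const lags = [5, 10, 20, 40, 80, 160, 320, 640, 1280, 2560]; // observed lag, ms

console.log(pStale(lags, 500)); // 0.3: 3 of 10 samples exceed 500ms
console.log(pStale(lags, 50));  // 0.6: reads shortly after a write are riskier
```

Against an SLO of "99% of reads see writes within 500ms", this sample's 30% staleness would be a clear violation, signaling that replication lag needs attention.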
When applications require strong consistency (linearizability), specific mechanisms must be employed. Each comes with its own trade-offs:
```sql
-- PostgreSQL: Force synchronous replication for critical tables
-- postgresql.conf: synchronous_commit = on
-- Ensures writes are durable on the replica before acknowledgment

-- Session-level control for critical transactions
BEGIN;
SET LOCAL synchronous_commit = 'remote_apply'; -- Wait for replica to apply
UPDATE accounts SET balance = balance - 1000 WHERE id = 123;
UPDATE accounts SET balance = balance + 1000 WHERE id = 456;
COMMIT; -- Returns only after the replica has applied

-- For reads requiring absolute freshness (route to primary)
-- Application logic: critical reads go to the primary connection
```

Strong consistency typically adds 1-2 network round-trips to each operation. For same-datacenter deployments, this adds 1-5ms. For cross-region deployments, this adds 50-200ms. Always measure and understand the latency impact before requiring strong consistency globally.
Selecting appropriate consistency levels requires analyzing your application's requirements. Here's a framework for making these decisions:
Step 1: Classify Your Operations

| Operation Type | Consistency Need | Example |
|---|---|---|
| Financial transactions | Strong (linearizable) | Transfers, payments, trades |
| User-generated content | Eventually consistent | Posts, comments, likes |
| User's own data writes | Read-your-writes | Profile updates, settings |
| Configuration/ACL | Strong or causal | Permission changes |
| Analytics/Reporting | Eventually consistent | Dashboards, aggregations |
| Inventory/Stock | Strong for decrements | Purchase, reservation |
| Session state | Read-your-writes | Login state, cart contents |
Step 2: Evaluate Business Impact of Stale Reads

For each operation class, ask:

1. What happens if a user sees stale data?
   - Confused user (annoying but recoverable)
   - Incorrect business decision (serious)
   - Regulatory violation (critical)
   - Safety issue (unacceptable)

2. How long can staleness be tolerated?
   - Seconds: Eventual consistency often sufficient
   - Milliseconds: Need low-lag or synchronous replication
   - Zero: Need strong consistency

3. Can the application compensate?
   - Optimistic UI with correction
   - Retry/reconciliation loops
   - Explicit refresh options for users
Most applications have ~20% of operations requiring strong consistency and ~80% tolerating eventual consistency. Identify and protect the critical 20%. Let the other 80% enjoy the performance and scalability benefits of eventual consistency. Over-applying strong consistency is as much a mistake as under-applying it.
We've explored the consistency trade-offs inherent in distributed database systems. Let's consolidate the key insights:
Module Complete: Replication

You've now completed a comprehensive exploration of data replication in distributed database systems. You understand:

- Fundamental concepts: Why we replicate, the challenges it introduces
- Synchronous replication: Strong guarantees, latency costs, failure handling
- Asynchronous replication: Performance benefits, lag implications, data loss risks
- Replication strategies: Primary-replica, multi-primary, consensus, quorum
- Consistency trade-offs: CAP, PACELC, practical consistency models, selection criteria

This knowledge forms the foundation for designing and operating distributed database systems that balance durability, availability, performance, and consistency according to your application's specific requirements.
Congratulations! You've mastered the principles of replication in distributed databases. You can now design replication architectures that provide appropriate consistency, availability, and performance trade-offs for your specific use cases. Remember: consistency is not a binary—it's a spectrum of trade-offs that must be matched to application semantics.