Imagine a banking system where you transfer $1,000 from your savings account to your checking account. You execute the transfer, receive a confirmation, and then immediately check your checking account balance—only to find the $1,000 hasn't arrived. You panic. Is it lost? No—it's simply that you're reading from a different database node that hasn't received the update yet.
This scenario illustrates one of the most fundamental challenges in distributed systems: consistency. When data is replicated across multiple machines, ensuring that all nodes reflect the same state at the same time becomes extraordinarily difficult—and in some cases, mathematically impossible without sacrificing other properties.
Consistency, in the context of the CAP theorem, is the guarantee that every read receives the most recent write. It's the promise that a distributed system behaves as if it were a single machine with a single copy of the data.
By the end of this page, you will understand the precise definition of consistency in distributed systems, the spectrum of consistency models from strongest to weakest, implementation techniques for achieving different consistency levels, and the profound trade-offs that make consistency one of the three pillars of the CAP theorem.
The term "consistency" is notoriously overloaded in computer science. Before we can discuss CAP theorem consistency, we must disambiguate it from other uses:
ACID Consistency (Database Transactions): In traditional database theory, consistency means that transactions bring the database from one valid state to another, respecting all defined integrity constraints (foreign keys, unique constraints, check constraints). This is a local property—it applies to a single database instance.
CAP Consistency (Distributed Systems): In distributed systems, consistency refers to data agreement across nodes. Specifically, it guarantees that all nodes in the distributed system see the same data at the same time. If you write a value and immediately read it from any node, you get the value you just wrote.
Eventual Consistency: A weaker form where, if no new updates are made, all nodes will eventually converge to the same value—but there's no guarantee about when.
CAP consistency is NOT the same as ACID consistency. Eric Brewer (who proposed CAP) uses consistency to mean 'atomic, or linearizable, consistency'—a single, consistent view of the data across all nodes. This is fundamentally about distributed state synchronization, not constraint enforcement.
The Formal Definition:
In CAP theorem terms, a system provides consistency if and only if every read operation that begins after a write operation completes returns that write's value (or the value of a later write), regardless of which node serves the read.
This is equivalent to what theoreticians call linearizability—the strongest form of consistency guarantee in distributed computing.
| Consistency Type | Context | Guarantee | Scope |
|---|---|---|---|
| ACID Consistency | Transaction processing | Integrity constraints maintained | Single database |
| CAP Consistency | Distributed systems | All nodes see same data | Multiple nodes |
| Eventual Consistency | Distributed systems | Nodes converge over time | Multiple nodes |
| Sequential Consistency | Memory models | Operations appear in some sequential order | Multiple processors |
| Causal Consistency | Distributed systems | Causally related operations ordered correctly | Multiple nodes |
Linearizability is the consistency model that CAP theorem refers to as 'C'. It's the formally rigorous definition of what it means for a distributed system to behave like a single-copy system.
Definition: A system is linearizable if every operation appears to take effect atomically at some instant between its invocation and its response, and the resulting total order of operations is consistent with the real-time order of non-overlapping operations.
The Key Insight: Linearizability creates the illusion that there is only one copy of the data, and all operations act on it atomically. Even though data may be replicated across many nodes, the system behaves as if reads and writes happen at a single point.
```
Timeline of Operations (Linearizable Execution):

Client A: |---Write(x=1)---|
Client B:                    |---Read()---|  → Must return 1
                                             (Write completed before Read started)
Client C: |---Write(x=2)---|
Client D:                    |---Read()---|  → Must return 2

Non-Linearizable (INVALID) Execution:

Client A: |---Write(x=1)---|
Client B:                    |---Read()---|  → Returns 0 (stale!)
                                             ✗ VIOLATION (should have seen x=1)

Concurrent Operations (Both Valid):

Client A: |---Write(x=1)---------|
Client B:       |---Read()---|       → Can return 0 OR 1
                                       (Operations overlap, either is valid)

Linearization Point: the conceptual instant when an operation "takes effect."
It must fall within the operation's duration.
```

Understanding Linearization Points:
Every operation in a linearizable system has a linearization point, the instant when it conceptually takes effect. This point must:

- fall somewhere within the operation's duration, between its invocation and its response, and
- be ordered consistently with the linearization points of all other operations, since these points define the system's single total order.
The beauty and challenge of linearizability is that the system doesn't need to explicitly timestamp operations. It just needs to behave as if such points exist.
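One way to make "behave as if such points exist" concrete is to check a small history by brute force. The sketch below uses an assumed encoding of operations on a single register as `(kind, value, start, end)` tuples, and searches for any total order that respects both real-time precedence and register semantics; it is illustrative only and exponential in history size.

```python
from itertools import permutations

def is_linearizable(history, initial=0):
    """Brute-force check: does some total order of the operations respect
    real-time precedence and single-register read/write semantics?
    history: list of (kind, value, start, end), kind is "W" or "R"."""
    for order in permutations(history):
        ok = True
        # Real-time rule: if b finished before a started, b cannot be
        # ordered after a.
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                if b[3] < a[2]:
                    ok = False
        if not ok:
            continue
        # Register semantics: every read returns the latest preceding write.
        current = initial
        for kind, value, _, _ in order:
            if kind == "W":
                current = value
            elif value != current:
                ok = False
                break
        if ok:
            return True
    return False

is_linearizable([("W", 1, 0, 2), ("R", 1, 3, 4)])  # True: read after write sees 1
is_linearizable([("W", 1, 0, 2), ("R", 0, 3, 4)])  # False: stale read, as in the
                                                   # "INVALID" timeline above
is_linearizable([("W", 1, 0, 5), ("R", 0, 2, 3)])  # True: overlapping ops, either
                                                   # value is a valid result
```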
Think of linearizability like a shared whiteboard in a meeting. When someone writes on the whiteboard, everyone in the room immediately sees the change. There's no delay, no inconsistency—everyone has the same view. Linearizability makes a distributed system behave like that single whiteboard, even though data is actually stored on multiple machines.
While linearizability is the strongest consistency model, it's not the only one. In practice, systems operate across a spectrum of consistency guarantees, each with different properties, performance characteristics, and use cases.
Understanding this spectrum is crucial because stronger models cost latency and availability, while weaker models shift correctness burdens onto the application. Choosing well requires knowing exactly what each model does and does not guarantee.
| Model | Guarantee | Latency | Availability | Use Case |
|---|---|---|---|---|
| Linearizability | All operations ordered as if on single node | Highest | Lowest | Financial transactions, locks |
| Sequential Consistency | All processes see same order (not real-time) | High | Low | Coordination services |
| Causal Consistency | Causally related operations ordered correctly | Medium | Medium | Social media, collaboration |
| Read-Your-Writes | Client sees own writes immediately | Low-Medium | Medium-High | User profiles, shopping carts |
| Monotonic Reads | Reads never go backwards in time | Low | High | News feeds, timelines |
| Eventual Consistency | All replicas converge eventually | Lowest | Highest | DNS, caches, analytics |
Sequential Consistency:
Slightly weaker than linearizability. All operations appear in some sequential order that all processes agree on, but this order doesn't have to respect real-time. A read might return a value from the 'future' in real-time, as long as the sequential order is maintained.
Causal Consistency:
Preserves the ordering of causally related operations. If operation A happens before operation B (and there's a causal relationship), then all nodes see A before B. Operations that aren't causally related can appear in any order.
Example: In a social media app, a reply must appear after the original post on all nodes. But two independent posts can appear in different orders on different nodes.
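Causal ordering is commonly tracked with vector clocks. The sketch below is a minimal illustration with an assumed two-node cluster and hand-picked clock values: an event causally precedes another only if its clock is component-wise less than or equal, and not identical.

```python
def happens_before(vc_a, vc_b):
    """True if the event stamped vc_a causally precedes the one stamped vc_b."""
    return all(a <= b for a, b in zip(vc_a, vc_b)) and vc_a != vc_b

post  = (1, 0)  # original post: node 0 increments its own entry
reply = (1, 1)  # reply: node 1 merged the post's clock, then incremented
other = (0, 1)  # an independent post on node 1 that never saw `post`

happens_before(post, reply)   # True: every node must order the reply after the post
happens_before(post, other)   # False --+
happens_before(other, post)   # False --+-> concurrent: any order is allowed
```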
Session Guarantees (Read-Your-Writes, Monotonic Reads):
These are per-session guarantees that provide some consistency within a client's session without requiring global consistency:

- Read-Your-Writes: after a client writes a value, that client's subsequent reads reflect the write (other clients may not see it yet).
- Monotonic Reads: once a client has seen a value, later reads never return an older one; time never appears to move backwards within the session.
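Read-your-writes can be enforced client-side with a session token. This is a sketch under simplified assumptions (one logical timestamp per replica, and an invented `Replica` stub): the session remembers the timestamp of its last write and only reads from replicas that have caught up to it.

```python
class Replica:
    """Toy replica: a key-value map plus the timestamp it has applied up to."""
    def __init__(self):
        self.data = {}
        self.applied_ts = 0

    def write(self, key, value):
        self.applied_ts += 1
        self.data[key] = value
        return self.applied_ts

    def read(self, key):
        return self.data.get(key)

class Session:
    """Client-side session token enforcing read-your-writes."""
    def __init__(self):
        self.last_write_ts = 0

    def write(self, replica, key, value):
        ts = replica.write(key, value)
        self.last_write_ts = max(self.last_write_ts, ts)
        return ts

    def read(self, replicas, key):
        # Only accept a read from a replica at least as fresh as our last write.
        for r in replicas:
            if r.applied_ts >= self.last_write_ts:
                return r.read(key)
        raise RuntimeError("no replica fresh enough for this session")

fresh, stale = Replica(), Replica()
session = Session()
session.write(fresh, "cart", 42)
session.read([stale, fresh], "cart")  # skips the stale replica, returns 42
```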
Eventual Consistency:
The weakest practical guarantee. If updates stop, all replicas will eventually converge to the same value. No guarantee about when, and reads during updates may return any version.
Pitfall: 'Eventually' has no time bound. In theory, convergence could take seconds or hours. In practice, it's usually fast, but there are no guarantees.
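Convergence under eventual consistency is typically driven by anti-entropy (gossip) plus a deterministic merge rule. A common, though lossy, choice is last-writer-wins by timestamp, sketched here with invented replica states:

```python
def merge(local, incoming):
    """Last-writer-wins: keep whichever (value, timestamp) pair is newer."""
    return incoming if incoming[1] > local[1] else local

# Three replicas after a partially propagated update (assumed states).
replicas = [("old", 1), ("new", 2), ("old", 1)]

# One anti-entropy round: each replica merges the freshest state it hears about.
freshest = max(replicas, key=lambda state: state[1])
replicas = [merge(state, freshest) for state in replicas]
# All three replicas now hold ("new", 2): they have converged.
```

Last-writer-wins silently discards concurrent updates, which is exactly why "eventual" systems must choose their merge rule to match what the application can tolerate.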
Achieving linearizability in a distributed system is challenging because messages between nodes can be delayed or lost, nodes can fail at any moment, and there is no perfectly synchronized global clock to order events.
Despite these challenges, several techniques enable strong consistency: consensus protocols such as Paxos and Raft, single-leader replication with synchronous acknowledgment, and quorum-based reads and writes.
The Quorum Approach in Detail:
Quorum-based systems are fundamental to understanding how consistency is achieved. The key insight is that if both reads and writes contact overlapping sets of nodes, consistency is guaranteed.
For a system with N nodes, let W be the number of nodes that must acknowledge a write and R the number of nodes that must respond to a read.
Quorum Condition: W + R > N
If this condition holds, every read quorum overlaps with every write quorum, so reads always see at least one node with the latest value.
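The overlap claim can be verified exhaustively for small N. This sketch enumerates every possible write quorum and read quorum and checks that each pair shares at least one node:

```python
from itertools import combinations

def quorums_always_overlap(N, W, R):
    """True if every size-W subset of nodes intersects every size-R subset."""
    nodes = range(N)
    return all(set(w) & set(r)
               for w in combinations(nodes, W)
               for r in combinations(nodes, R))

quorums_always_overlap(5, 3, 3)  # True:  W+R = 6 > 5, overlap guaranteed
quorums_always_overlap(5, 2, 3)  # False: W+R = 5, quorums {0,1} and {2,3,4}
                                 # can miss each other entirely
```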
```python
# Quorum-based consistency in a distributed key-value store
from typing import Any, List, Tuple

class QuorumKVStore:
    """
    Implements linearizable reads and writes using quorums.
    For N=5 nodes, with W=3 and R=3, we guarantee W+R > N (6 > 5).
    """

    def __init__(self, nodes: List[Node], write_quorum: int, read_quorum: int):
        self.nodes = nodes
        self.N = len(nodes)
        self.W = write_quorum  # Nodes that must ack writes
        self.R = read_quorum   # Nodes that must respond to reads

        # Verify quorum condition for linearizability
        assert self.W + self.R > self.N, "Quorum condition not met!"
        assert self.W > self.N // 2, "Write quorum must be majority for durability"

    def write(self, key: str, value: Any, timestamp: int) -> bool:
        """
        Write to W nodes with a monotonic timestamp.
        All writes carry a timestamp to resolve conflicts.
        """
        acks = 0
        write_msg = WriteMessage(key, value, timestamp)

        for node in self.nodes:
            try:
                if node.accept_write(write_msg):
                    acks += 1
            except NodeUnavailable:
                continue

        # Write succeeds if W nodes acknowledge
        if acks >= self.W:
            return True
        else:
            # Not enough acks - write failed
            # In practice, we might retry or return an error
            raise WriteQuorumNotReached(f"Only {acks}/{self.W} acks received")

    def read(self, key: str) -> Tuple[Any, int]:
        """
        Read from R nodes and return the value with the highest timestamp.
        Since W + R > N, at least one node has the latest write.
        """
        responses = []
        for node in self.nodes:
            try:
                value, timestamp = node.read(key)
                responses.append((value, timestamp))
                if len(responses) >= self.R:
                    break
            except NodeUnavailable:
                continue

        if len(responses) < self.R:
            raise ReadQuorumNotReached(f"Only {len(responses)}/{self.R} responses")

        # Return value with highest timestamp (most recent write)
        return max(responses, key=lambda x: x[1])

    def linearizable_read(self, key: str) -> Any:
        """
        Fully linearizable read with read-repair.
        After reading, write the latest value back to ensure
        all read quorum nodes are up-to-date.
        """
        value, timestamp = self.read(key)
        # Read repair: propagate latest value to any stale nodes
        # This ensures subsequent reads see this value
        self.repair(key, value, timestamp)
        return value

    def repair(self, key: str, value: Any, timestamp: int):
        """
        Write the latest value to nodes that might be stale.
        This 'repairs' inconsistencies and aids convergence.
        """
        for node in self.nodes:
            try:
                # Node only accepts if timestamp is newer than what it has
                node.accept_repair(key, value, timestamp)
            except NodeUnavailable:
                continue

# Example: N=5, W=3, R=3
#
# Scenario: Client A writes x=42 at t=100
# - Nodes 1, 2, 3 receive the write (W=3 satisfied)
# - Nodes 4, 5 are temporarily slow/partitioned
#
# Scenario: Client B reads x
# - Contacts nodes 2, 3, 4 for read (R=3)
# - Nodes 2, 3: return x=42, t=100
# - Node 4: returns x=41, t=99 (stale)
# - Client B sees t=100 is highest, returns x=42 ✓
#
# Key: At least one node (2 or 3) has the latest value because
# write quorum (1,2,3) overlaps with read quorum (2,3,4).
```

You can adjust W and R for different needs:
- W=N, R=1: All nodes must ack writes, but reads are fast. Good for read-heavy workloads.
- W=1, R=N: Writes are fast, but reads must check all nodes. Good for write-heavy workloads.
- W=R=(N+1)/2: Balanced approach; tolerates up to ⌊N/2⌋ failures for both reads and writes.
Each configuration trades write latency against read latency while maintaining consistency.
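A quick way to compare configurations is to compute how many node failures each operation type can absorb while still assembling its quorum. This helper is a simplified sketch; it ignores durability concerns such as the majority-write requirement.

```python
def fault_tolerance(N, W, R):
    """How many failed nodes writes and reads can each tolerate while
    still forming their quorum, and whether W + R > N holds."""
    return {"writes": N - W, "reads": N - R, "consistent": W + R > N}

fault_tolerance(5, 5, 1)  # writes stall after any failure; reads survive 4
fault_tolerance(5, 3, 3)  # balanced: both operations survive 2 failures
```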
Strong consistency doesn't come free. There are fundamental costs that cannot be engineered away—only traded against other properties:
Latency Cost: For linearizability, every operation must reach agreement across nodes. This means every write waits for acknowledgments from a quorum, latency is governed by the slowest required responder, and geo-replicated systems pay wide-area round-trip times on the critical path.
Example: A write that must synchronously replicate to a datacenter 100ms away adds at least 100ms to every write operation.
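As a back-of-envelope model, a quorum write completes when the W fastest replicas have acknowledged, so its latency is roughly the W-th smallest round-trip time. The RTT values below are illustrative assumptions:

```python
def quorum_write_latency(rtts_ms, W):
    """A write finishes once the W fastest replicas ack, so its latency
    is approximately the W-th smallest round-trip time."""
    return sorted(rtts_ms)[W - 1]

# Assumed RTTs: local node (1ms), same-region replica (5ms), remote DC (100ms)
quorum_write_latency([1, 5, 100], W=2)  # 5: a majority acks without the remote DC
quorum_write_latency([1, 5, 100], W=3)  # 100: waiting on the remote DC dominates
```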
| Replication Type | Write Latency | Consistency | Data Loss Risk |
|---|---|---|---|
| Synchronous to all DCs | Highest (sum of RTTs) | Strongest (linearizable) | None if quorum succeeds |
| Synchronous to local DC only | Low (~1-5ms) | Local only | Up to last sync interval |
| Async to all DCs | Lowest | Eventual | Up to replication lag |
| Semi-sync (wait for 1 remote) | Medium | Strong within 2 DCs | One DC worth of data |
Throughput Cost:
Coordination limits parallelism. If all writes must be serialized through a single leader, that leader becomes a bottleneck. More nodes don't help throughput for writes—they may even hurt it due to coordination overhead.
Availability Cost:
This is the crux of CAP: during a network partition, you must choose between refusing some requests to keep all nodes consistent (sacrificing availability) or serving requests from nodes that may hold stale data (sacrificing consistency).
Strong consistency systems are by design unavailable during partitions. This is not a bug—it's the mathematically proven trade-off that CAP describes.
You cannot have perfect consistency, perfect availability, and partition tolerance simultaneously. This is not a limitation of current technology—it's a mathematical impossibility proven by the CAP theorem. Every distributed system must choose its trade-offs based on business requirements.
Let's examine how production systems navigate consistency trade-offs:
Google Spanner: Achieves external consistency (stronger than linearizability!) using TrueTime—GPS clocks and atomic clocks synchronized across all datacenters. Transactions are committed with timestamps that reflect real-time ordering globally. The cost: significant infrastructure investment and latency (commits wait for clock uncertainty to pass).
Apache ZooKeeper: Provides linearizable writes and sequential consistency for reads. Used for coordination, configuration, and leader election. Sacrifices throughput for correctness—intended for small, critical data, not high-volume operations.
Amazon DynamoDB: Offers a choice: eventually consistent reads (default, cheaper, faster) or strongly consistent reads (costs more, higher latency). Applications choose per-read based on their needs.
| System | Default Consistency | Strongest Available | Trade-off Made |
|---|---|---|---|
| Google Spanner | External consistency | External consistency | Latency for correctness |
| ZooKeeper | Sequential (reads) | Linearizable (sync reads) | Throughput for coordination |
| DynamoDB | Eventual | Strongly consistent | Configurable per-operation |
| Cassandra | Eventual | Linearizable (LWT) | Performance unless critical |
| CockroachDB | Serializable | Serializable | Latency for ACID guarantees |
| MongoDB | Eventual | Linearizable (majority) | Configurable write/read concern |
Multi-Level Consistency Strategies:
Modern systems often use different consistency levels for different data types: strong consistency for money movement, inventory, and access control; causal or session consistency for user-facing state such as profiles and feeds; eventual consistency for caches, metrics, and analytics.
This polyglot consistency approach lets systems optimize each use case appropriately.
Most applications don't need linearizability for all operations. The art of distributed systems design is identifying which operations require strong consistency (and paying the cost) versus which can tolerate eventual consistency (and gaining performance). Over-specifying consistency is as harmful as under-specifying it—both lead to systems that don't serve their users well.
We've explored the 'C' in CAP theorem in depth. Let's consolidate the key insights: CAP consistency means linearizability, not ACID constraint enforcement; consistency models form a spectrum from linearizable down to eventual; quorum systems achieve consistency when W + R > N; and strong consistency costs latency, throughput, and availability during partitions.
What's Next:
Consistency is just one corner of the CAP triangle. In the next page, we'll explore Availability—the guarantee that every request receives a response. You'll learn how availability is defined, what it means for a system to be 'highly available,' and why maintaining availability during partitions forces you to sacrifice consistency.
You now understand consistency in the context of the CAP theorem—its formal definition as linearizability, the spectrum of weaker consistency models, implementation techniques, and the fundamental trade-offs involved. This foundation is essential for understanding why CAP forces a choice between consistency and availability during partitions.