Imagine you're designing a system where multiple users can simultaneously bid on an auction item. User A places a bid of $100 from New York. At almost the same instant, User B places a bid of $105 from Tokyo. Which bid wins? When each user refreshes their screen, what should they see?
In a world of distributed systems—where data is replicated across continents for availability and performance—answering this seemingly simple question becomes extraordinarily complex. Without strong guarantees, User A might see their $100 bid as the current highest, while User B simultaneously sees $105 as the highest. Worse, a third user might see an older bid of $95 as the current highest.
This is where linearizability enters the picture. It's the strongest practical consistency model in distributed systems, providing a guarantee that makes a distributed system appear to behave as if there's only a single copy of the data, with all operations happening atomically at some point in time.
By the end of this page, you will deeply understand what linearizability means, how it differs from other consistency models, the formal definition and intuition behind it, and why it's considered the strongest practical consistency guarantee. You'll also understand the concept of 'linearization points' and how to reason about whether a system is linearizable.
Linearizability (also known as atomic consistency or strong consistency) is a consistency model that provides the strongest guarantee about how operations on shared data appear to clients. At its core, linearizability makes a powerful promise:
Once a write operation completes, all subsequent reads (from any client, on any replica) will return that written value or a more recent value. The system behaves as if there is only one copy of the data, and all operations are atomic.
This might sound straightforward, but in a distributed system where data exists on multiple machines across the globe, achieving this guarantee is remarkably challenging and comes with significant trade-offs.
Think of linearizability as creating an illusion: even though your data is stored on dozens of servers across multiple continents, the system behaves as if there's only one server holding one copy of the data. Every client sees the same data at the same logical time, and every operation appears to happen instantaneously at some point between its invocation and completion.
Formally, a system is linearizable if there exists a total order of all operations such that: (1) the order respects real-time ordering — if operation A completes before operation B begins, A precedes B in the order; and (2) every operation behaves according to the object's sequential specification — each read returns the value written by the most recent preceding write in that order.
In simpler terms: if operation A completes before operation B starts (in real time), then A must appear to have happened before B in the total order. However, if A and B overlap in time, the system can order them in either way—as long as it picks one order consistently.
Consider a simple key-value store with operations:
- `write(x, v)` — writes value `v` to key `x`
- `read(x)` — reads the current value of key `x`

Let's trace through a scenario with three clients:
| Time | Client A | Client B | Client C | Value of x |
|---|---|---|---|---|
| T0 | | | | x = 0 (initial state) |
| T1 | write(x, 1) starts | | | x = 0 |
| T2 | | read(x) starts | | x = 0 |
| T3 | write(x, 1) completes | | | x = 1 (linearization point) |
| T4 | | read(x) returns 1 | | x = 1 |
| T5 | | | read(x) starts & completes, returns 1 | x = 1 |
In this example:

- Client A's write takes effect at its linearization point, T3.
- Client B's read overlaps the write in time, so it could legally return either 0 or 1; here it returns 1.
- Client C's read starts after the write completed, so it must return 1.
The key insight: linearizability ensures that once you observe a value, you can never "go back in time" and see an older value.
The concept of linearization points is central to understanding and proving linearizability. Every operation in a linearizable system has a linearization point—a specific instant in time when the operation "logically takes effect."
Between invocation and response: The linearization point must occur after the operation is invoked and before the response is returned to the client.
Atomic: All effects of the operation happen instantaneously at this point. There's no intermediate state.
Ordered: Linearization points of all operations form a total order that respects real-time ordering — if operation A completes before operation B begins, A's linearization point precedes B's.
When operations overlap in time, the system has flexibility in ordering their linearization points. However, this flexibility comes with constraints:
Legal orderings must be consistent with sequential semantics. If a read overlaps with a write, the read can see either the old or new value. But once any read returns the new value, all subsequent reads must also return the new value (or an even newer one).
This is what distinguishes linearizability from weaker models:
Linearizability guarantees monotonic reads: once you observe a value, you (and everyone else) will never observe an older value. This property is violated by many eventually consistent systems, leading to confusing user experiences where data appears to 'jump backward in time.'
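This monotonicity property can be checked mechanically. The sketch below is illustrative (the function name is hypothetical, not from any particular system): given the sequence of data versions a set of clients observed, it flags any read that moved backward in time.

```python
def violates_monotonic_reads(observed_versions):
    """Return True if any read observed an older version than one
    already observed -- i.e. the data 'jumped backward in time'."""
    high_water = float("-inf")
    for version in observed_versions:
        if version < high_water:
            return True  # a stale value surfaced after a newer one
        high_water = version
    return False

print(violates_monotonic_reads([1, 2, 2, 3]))  # False: versions only advance
print(violates_monotonic_reads([1, 3, 2]))     # True: version 2 seen after 3
```

A linearizable system can never produce a history for which this check returns `True`; many eventually consistent systems can.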
Understanding linearizability requires comparing it with other consistency models. Each model makes different trade-offs between strength of guarantees and system performance/availability.
Consistency models form a spectrum from strongest to weakest:
| Model | Guarantee | Intuition | Cost |
|---|---|---|---|
| Strict Consistency | Instantaneous global visibility | Every write visible everywhere the instant it happens | N/A (theoretically impossible in distributed systems) |
| Linearizability | Single-copy illusion with real-time order | As if one copy exists, operations atomic | Highest practical cost |
| Sequential Consistency | Global order respects program order | Same order seen by all, but timing not guaranteed | High cost |
| Causal Consistency | Cause-effect relationships preserved | Related operations ordered, unrelated may differ | Medium cost |
| Eventual Consistency | Replicas eventually converge | Given enough time, all see same data | Lowest cost |
The distinction between linearizability and sequential consistency is subtle but crucial:
Sequential Consistency requires that all processes see the same order of operations, and this order respects the program order within each process. However, it says nothing about real-time ordering between processes.
Linearizability adds the real-time constraint: if operation A completes before B starts (in wall-clock time), then A must appear before B in the global order.
Example of the difference:
```
Process 1: write(x, 1)   [completes at T1]
Process 2: read(x)       [starts at T2, where T2 > T1]

Sequentially consistent: read(x) could return 0 (if ordered before the write)
Linearizable:            read(x) MUST return 1 (the write completed before the read started)
```
This real-time guarantee makes linearizability much stronger—and much more expensive to implement.
The real-time guarantee is what makes linearizability intuitive for programmers. If you write data and then tell your colleague to read it (via phone, email, etc.), linearizability guarantees they'll see your write. Sequential consistency does not provide this guarantee—your colleague might see stale data even though your write completed before they started reading.
These terms are often confused because they sound similar and relate to ordering guarantees:
Serializability is a property of transactions (groups of operations). It guarantees that the outcome of executing concurrent transactions is equivalent to some serial (one-at-a-time) execution.
Linearizability is a property of single operations on single objects. It guarantees that each operation appears atomic.
They operate at different levels: linearizability constrains individual operations on individual objects, while serializability constrains entire transactions spanning multiple objects.
Strict Serializability combines both: transactions appear to execute in some serial order that respects real-time ordering. This is what systems like Google Spanner provide.
| Aspect | Linearizability | Serializability |
|---|---|---|
| Scope | Single operations on single objects | Transactions on multiple objects |
| Guarantee | Real-time ordering of operations | Equivalent to serial execution |
| Composability | Operations on different objects are independent | Transaction encompasses all operations |
| Origin | Distributed systems / concurrent objects | Database theory |
| Example System | Linearizable key-value store | Traditional RDBMS |
Linearizability isn't an academic curiosity—it's essential for implementing many real-world systems correctly. Without linearizability, certain applications become impossible or extremely difficult to build correctly.
1. Distributed Locking
Distributed locks coordinate access to shared resources across multiple processes. Without linearizability, two processes could simultaneously believe they hold the lock:
Process A: acquireLock() returns SUCCESS
Process B: acquireLock() returns SUCCESS // DISASTER!
If the lock service isn't linearizable, both processes might see a state where the lock is available, leading to concurrent access to the protected resource.
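To make this concrete, here is a minimal sketch in Python of a lock service whose `acquire` is a single atomic test-and-set. The class and method names are hypothetical; a process-local mutex stands in for a linearizable store, so two concurrent clients can never both observe "available" and both succeed:

```python
import threading

class LockService:
    """Toy lock service. The mutex makes acquire() one atomic
    test-and-set step -- its linearization point -- so at most one
    client can ever hold the lock."""
    def __init__(self):
        self._mu = threading.Lock()
        self._holder = None

    def acquire(self, client_id):
        with self._mu:                  # linearization point of acquire()
            if self._holder is None:
                self._holder = client_id
                return True
            return False

    def release(self, client_id):
        with self._mu:
            if self._holder == client_id:
                self._holder = None
                return True
            return False

svc = LockService()
results = []
threads = [threading.Thread(target=lambda i=i: results.append(svc.acquire(i)))
           for i in range(2)]
for t in threads: t.start()
for t in threads: t.join()
print(sorted(results))  # [False, True]: exactly one client wins
```

A non-linearizable store offers no such single atomic step, which is exactly how both processes in the scenario above can return SUCCESS.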
2. Leader Election
In systems with a single leader (databases, message queues, coordination services), exactly one node must be elected leader at any time. Non-linearizable systems can result in a "split-brain" scenario:
Node A: checkLeader() returns 'A is leader'
Node B: checkLeader() returns 'B is leader' // SPLIT-BRAIN!
Both nodes believe they're the leader, potentially causing data corruption as both accept writes.
3. Unique Constraint Enforcement
Enforcing uniqueness (usernames, email addresses, transaction IDs) requires linearizability. Consider username registration:
```
// Without linearizability, this can fail:

// Process A (New York)          // Process B (Tokyo)
exists = check("alice")          exists = check("alice")
// returns false                 // returns false (stale!)
insert("alice")                  insert("alice")
// SUCCESS                       // SUCCESS — DUPLICATE USERNAME!

// Both processes saw the username as available because
// their reads were not linearized with respect to each other's writes.
```

4. Compare-and-Swap (CAS) Operations
Many distributed algorithms rely on atomic compare-and-swap:
cas(key, expectedValue, newValue)
This operation should atomically: read the current value, compare it to the expected value, and only write the new value if they match. Without linearizability, the "read" portion might see stale data, breaking the atomicity guarantee.
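A minimal sketch of an atomic register with CAS, in Python (the class is illustrative; a mutex plays the role of the linearizable system, fusing read-compare-write into one indivisible step):

```python
import threading

class AtomicRegister:
    """Minimal linearizable register with compare-and-swap. The mutex
    makes read-compare-write a single atomic step, which is the
    operation's linearization point."""
    def __init__(self, value=None):
        self._mu = threading.Lock()
        self._value = value

    def read(self):
        with self._mu:
            return self._value

    def cas(self, expected, new):
        with self._mu:          # compare and write happen together
            if self._value == expected:
                self._value = new
                return True
            return False

reg = AtomicRegister(0)
print(reg.cas(0, 1))  # True: value was 0, now 1
print(reg.cas(0, 2))  # False: value is now 1, swap rejected
print(reg.read())     # 1
```

In a distributed, non-linearizable store, the comparison could run against a stale value, so the second `cas` might incorrectly succeed.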
5. Financial Systems
Bank transfers, trading systems, and payment processing require linearizability to prevent double-spending, incorrect balances, and duplicate or lost transactions.
Without linearizability in critical coordination paths, systems experience split-brain scenarios, data loss, duplicate operations, and cascading failures. The 2018 Jepsen tests revealed that many databases claiming 'strong consistency' actually violated linearizability under partition conditions, leading to real data loss.
Achieving linearizability in a distributed system requires careful design. There are several fundamental approaches, each with different trade-offs:
The simplest approach: route all reads and writes through a single node.
How it works: every read and write is routed to one designated node, which applies operations one at a time in arrival order. With a single copy of the data and a single total order of operations, linearizability holds trivially.
Trade-offs: the node is a single point of failure and a throughput bottleneck, and clients far from it pay high latency. This approach does not scale, which is why replicated approaches exist.
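The single-node approach can be sketched in a few lines of Python. This is a toy in-memory store (all names hypothetical): every operation is funneled through one worker thread, so there is literally one copy of the data and one total order of operations.

```python
import queue
import threading

class SingleNodeKV:
    """All reads and writes pass through a single worker thread and are
    applied one at a time -- trivially linearizable, because there is
    only one copy of the data and one total order."""
    def __init__(self):
        self._ops = queue.Queue()
        self._data = {}
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            op, key, value, done = self._ops.get()
            if op == "write":
                self._data[key] = value
                done.put(None)
            else:
                done.put(self._data.get(key))

    def write(self, key, value):
        done = queue.Queue()
        self._ops.put(("write", key, value, done))
        done.get()                    # return only after the write is applied

    def read(self, key):
        done = queue.Queue()
        self._ops.put(("read", key, None, done))
        return done.get()

kv = SingleNodeKV()
kv.write("x", 1)
print(kv.read("x"))  # 1: the write completed before the read started
```

The trade-offs above fall out directly: every client in the world must talk to this one thread.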
Consensus protocols (Paxos, Raft, Viewstamped Replication) achieve linearizability through coordinated agreement among a quorum of nodes.
How it works: one node is elected leader; writes are appended to the leader's log, replicated to followers, and acknowledged to the client only after a majority (quorum) of nodes has durably accepted them.
Why it's linearizable: all writes flow through a single leader into a single, totally ordered log; an operation's linearization point is the moment its log entry is committed, and any future leader is guaranteed to contain every committed entry.
```
// Simplified Raft write flow for linearizability
function write(key, value):
    entry = createLogEntry(key, value)

    // Leader appends to its log
    leader.log.append(entry)

    // Replicate to followers
    for follower in followers:
        sendAppendEntries(follower, entry)

    // Wait for majority acknowledgment
    waitForQuorum(entry.index)

    // Commit and apply
    commit(entry.index)
    applyToStateMachine(entry)

    // Linearization point: when entry is committed
    return SUCCESS
```

In quorum-based systems, achieving linearizable reads is subtle. Simply reading from a quorum is not sufficient—you might read from a quorum that hasn't yet seen the latest write.
Techniques for linearizable reads:

- Read from the leader under a leader lease, which rules out a stale leader as long as clock drift is bounded.
- ReadIndex: the leader records its commit index and confirms leadership with a quorum heartbeat before serving the read locally.
- Route the read through the consensus log itself as a no-op command, paying the full write path.

Each technique has latency implications: lease reads are cheapest but depend on clock assumptions, ReadIndex costs one quorum round trip, and log reads are the most expensive but simplest to reason about.
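The ReadIndex idea can be sketched as follows. This is a deliberately simplified, hypothetical model (the `Leader` and `Peer` classes are illustrative, not Raft's actual implementation): before answering a read locally, the leader snapshots its commit index and verifies with a quorum that it is still the leader, so the read is guaranteed to reflect every previously committed write.

```python
class Peer:
    """Stand-in for a follower; acks_leader models whether it still
    recognizes this leader."""
    def __init__(self, acks_leader=True):
        self.acks_leader = acks_leader

class Leader:
    """Sketch of a ReadIndex-style linearizable read (simplified)."""
    def __init__(self, peers, data):
        self.peers = peers
        self.data = data
        self.commit_index = 0

    def heartbeat_quorum(self):
        # Count self plus peers that still acknowledge this leader.
        acks = 1 + sum(1 for p in self.peers if p.acks_leader)
        return acks > (len(self.peers) + 1) // 2

    def linearizable_read(self, key):
        read_index = self.commit_index      # snapshot the commit point
        if not self.heartbeat_quorum():
            raise RuntimeError("lost leadership; a newer leader may exist")
        # Real systems wait until the state machine has applied up to
        # read_index; here applied == committed, so read locally.
        return self.data.get(key)

leader = Leader([Peer(), Peer()], {"x": 1})
print(leader.linearizable_read("x"))  # 1: quorum confirmed leadership

leader.peers = [Peer(False), Peer(False)]  # partition: followers moved on
try:
    leader.linearizable_read("x")
except RuntimeError as e:
    print(e)  # read refused rather than risk returning stale data
```

The key point: when the quorum check fails, the leader refuses to answer instead of risking a stale read, which is exactly the availability cost the CAP discussion below describes.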
The CAP theorem states that during a network partition, a system must choose between Consistency (C) and Availability (A). Linearizability is the 'C' in CAP. If you require linearizability, your system will be unavailable during partitions. This is a fundamental trade-off, not an implementation limitation.
Claiming linearizability is easy; proving it is hard. How do we verify that a system actually provides linearizable operations?
Given a history of operations (invocations and responses with timestamps), we check whether there exists a valid linearization: a total order of the operations that respects real-time ordering and in which every operation satisfies the object's sequential specification.
The challenge: This is an NP-complete problem in general. Checking linearizability of arbitrary histories is computationally hard.
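A brute-force checker for a single register makes the definition concrete, and shows exactly where the blow-up comes from: it tries every permutation of the operations. All names here are illustrative; practical checkers (Jepsen's Knossos, the WGL algorithm below) rely on aggressive pruning instead.

```python
from itertools import permutations

def is_linearizable(history, initial=0):
    """history: list of (op, value, invoke_time, return_time) tuples for a
    single register. Searches for a total order that (1) respects real-time
    order and (2) makes every read return the latest written value."""
    def respects_realtime(order):
        # If b returned before a was even invoked, a cannot precede b.
        return all(not (history[b][3] < history[a][2])
                   for i, a in enumerate(order) for b in order[i + 1:])

    def legal(order):
        value = initial
        for idx in order:
            op, arg, _, _ = history[idx]
            if op == "write":
                value = arg
            elif arg != value:      # read must see the latest write
                return False
        return True

    return any(respects_realtime(o) and legal(o)
               for o in permutations(range(len(history))))

# (op, value, invoke_time, return_time)
ok  = [("write", 1, 0, 2), ("read", 1, 1, 3)]   # read overlaps the write
bad = [("write", 1, 0, 1), ("read", 0, 2, 3)]   # stale read after the write
print(is_linearizable(ok), is_linearizable(bad))  # True False
```

Note how the overlapping read in `ok` could legally return either 0 or 1, while the read in `bad` started after the write returned and therefore must see 1. With n operations this checks n! orderings, which is why the problem is intractable in general.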
1. Jepsen Testing Framework
Jepsen is the industry-standard tool for testing distributed systems' consistency claims:
```clojure
; Example Jepsen test checking linearizability of a register
(deftest register-test
  (jepsen/run!
    (assoc tests/noop-test
      :name      "etcd-register"
      :client    (client/client etcd-client)
      :nemesis   (nemesis/partition-random-halves)
      :generator (gen/phases
                   (->> (independent/concurrent-generator
                          10 (range)
                          (fn [k] (gen/mix [r w cas])))
                        (gen/nemesis
                          (cycle [(gen/sleep 5)
                                  {:type :info :f :start}
                                  (gen/sleep 5)
                                  {:type :info :f :stop}]))
                        (gen/time-limit 120)))
      :checker   (checker/compose
                   {:linear (checker/linearizable {:model (model/register)})
                    :perf   (checker/perf)}))))

; This test:
; 1. Runs read/write/CAS operations concurrently
; 2. Randomly partitions the network
; 3. Checks if the resulting history is linearizable
```

2. WGL Algorithm
The Wing & Gong algorithm provides a practical approach: it searches for a valid linearization depth-first, at each step picking a pending operation that could linearize next, applying it to a model of the object, and backtracking when no choice works. Caching already-explored states prunes the search dramatically in practice.
3. Formal Verification
For critical systems, formal methods can prove linearizability: specifying the protocol in a language like TLA+ and model-checking it, or mechanically proving that every operation's linearization point satisfies the object's sequential specification.
Passing Jepsen tests doesn't prove a system is linearizable—it only means no violations were found in the tested scenarios. Many linearizability bugs only manifest under specific timing or failure conditions. Formal verification provides stronger guarantees but is more expensive to apply.
We've explored linearizability—the gold standard of distributed systems consistency. The key concepts:

- Linearizability makes a replicated system behave as if there were a single copy of the data, with each operation taking effect atomically at a linearization point between its invocation and response.
- It adds a real-time ordering guarantee that sequential consistency lacks, and it concerns single operations on single objects, unlike serializability, which concerns transactions.
- It is essential for distributed locks, leader election, uniqueness constraints, CAS, and financial systems.
- It is typically implemented via a single node or a consensus protocol such as Raft or Paxos, and tested with tools like Jepsen.
What's Next:
Now that we understand what linearizability guarantees, the next page explores the implementation costs—what it takes to build a linearizable system and why these costs make linearizability impractical for many use cases.
You now understand linearizability—the strongest practical consistency model in distributed systems. You can reason about linearization points, distinguish linearizability from other consistency models, and identify use cases where linearizability is essential. Next, we'll examine the costs of implementing this powerful guarantee.