Imagine a scenario that seems deceptively simple: three computers need to agree on a single value. Perhaps they're deciding which server should be the leader, or committing a transaction, or choosing a configuration setting. With reliable communication, this would be trivial—everyone broadcasts their vote, and the majority wins.
But what happens when messages can be delayed, lost, or delivered out of order? What if computers can crash and restart at any moment? What if the network partitions, leaving different groups of computers unable to communicate with each other?
These conditions transform a simple voting problem into one of the most profound challenges in computer science: the distributed consensus problem. And the algorithm that first solved this problem—with mathematical rigor and correctness proofs—has shaped every distributed system built in the past three decades.
That algorithm is Paxos.
This page introduces Paxos at a conceptual level. You will understand the historical context that led to Paxos, why achieving consensus is fundamentally hard, what problem Paxos actually solves, and the key insights that make it work. By the end, you'll have a solid foundation for understanding the detailed protocol mechanics in subsequent pages.
Understanding Paxos requires appreciating the intellectual context from which it emerged. By the late 1980s, distributed systems researchers faced a daunting reality: building reliable systems from unreliable components was extraordinarily difficult, and most approaches either sacrificed correctness or availability.
Leslie Lamport, a computer scientist then at Digital Equipment Corporation's Systems Research Center (and later at Microsoft Research), had already made groundbreaking contributions to distributed systems theory, including the foundational work on logical clocks and the happens-before relationship. In 1989, he turned his attention to the consensus problem.
Lamport originally described Paxos through an allegory about the Greek island of Paxos, whose fictional ancient parliament had to pass laws even though its part-time legislators kept wandering in and out of the chamber. This whimsical presentation style was considered unconventional by academic standards, and the paper—'The Part-Time Parliament'—languished for years before finally being published in ACM Transactions on Computer Systems in 1998, nearly a decade after its conception.
Why the long delay mattered:
The delayed publication created an interesting phenomenon. During the 1990s, distributed systems practitioners struggled with consensus problems without access to Paxos. Systems were built with ad-hoc solutions that often had subtle bugs—bugs that only manifested under rare failure conditions that were nearly impossible to test.
When 'The Part-Time Parliament' finally appeared in 1998, many readers found it impenetrable, complaining about the Greek parliament allegory obscuring the technical content. Lamport eventually published a simplified version, 'Paxos Made Simple' (2001), which presents the algorithm more directly.
The impact of Paxos:
Despite—or perhaps because of—its complexity, Paxos has become the theoretical foundation for virtually every production consensus system. Google's Chubby and Spanner, Apache ZooKeeper (via ZAB), and numerous proprietary systems all implement variants of Paxos. Even systems that use different algorithms (like Raft or Viewstamped Replication) are, at their core, solving the same problem with mechanisms that closely mirror Paxos's key ideas.
Lamport himself received the Turing Award in 2013, with Paxos cited as one of his seminal contributions.
Before diving into how Paxos works, we must precisely define the problem it solves. Consensus in distributed systems means getting a collection of independent processes to agree on a single value, despite failures and communication uncertainties.
The formal specification:
A correct consensus algorithm must satisfy three properties:

- Agreement: no two processes decide on different values.
- Validity: the decided value must have been proposed by some process.
- Termination: every non-faulty process eventually decides on a value.
In 1985, Fischer, Lynch, and Paterson proved that in a purely asynchronous system (where message delays are unbounded), no deterministic algorithm can guarantee all three properties if even a single process can fail. This is the famous FLP impossibility result. Paxos works around this by relying on eventual synchrony—assuming that the network will eventually behave well enough to make progress.
Why consensus is hard:
The difficulty of consensus stems from the interplay of failures and asynchrony. Consider these challenges:
| Challenge | Description | Why It Complicates Consensus |
|---|---|---|
| Message Loss | Network packets can be dropped without notification | A process cannot distinguish between a crashed peer and a slow/unreachable one |
| Message Delay | Messages can be delivered after arbitrary delays | A 'late' vote might arrive after a decision has already been made |
| Message Reordering | Messages may arrive in different order than sent | Two processes might see the same events in different sequences |
| Process Crashes | Processes can fail at any point during execution | A process might crash after sending some messages but before others |
| Process Recovery | Crashed processes can restart with persistent state | A recovering process might have outdated information or duplicate messages |
| Network Partitions | Groups of processes become mutually unreachable | Each partition might try to make independent decisions |
The fundamental tension:
Consensus algorithms must navigate a fundamental tension between two requirements:

- Safety: never allow two different values to be chosen, no matter what failures occur.
- Liveness: eventually choose some value, so that the system makes progress.
Under adversarial conditions (arbitrary message delays, arbitrary process failures), these requirements conflict. If we're too cautious (always waiting for confirmation), we risk deadlock. If we're too aggressive (making decisions quickly), conflicting decisions can be made.
Paxos resolves this tension by ensuring safety unconditionally (under all failure scenarios) while providing liveness under reasonable assumptions about eventual network behavior.
It's easy to have a vague sense that Paxos 'solves consensus.' But understanding precisely what Paxos guarantees—and what it does not guarantee—is essential for using it correctly.
Paxos solves single-value consensus:
The basic Paxos algorithm agrees on a single value. Given a set of processes, possibly proposing different values, Paxos ensures that all processes that complete the protocol agree on exactly one of the proposed values.
Once a value is chosen, it is chosen forever. No subsequent execution of the protocol can change it. This immutability is fundamental—it's what allows replicated state machines to maintain consistent histories.
Basic Paxos decides on exactly one value. For practical systems that need to decide on a sequence of values (like a replicated log), Multi-Paxos runs multiple instances of the protocol, one for each log entry. We'll cover Multi-Paxos in a later page.
The system model assumptions:
Paxos operates under specific assumptions about the environment:

- Processes fail by crashing and may later recover; they do not behave maliciously or send corrupted messages (no Byzantine faults).
- Messages can be lost, delayed, duplicated, or reordered, but are not undetectably corrupted.
- Each process has stable storage that survives crashes, so acceptors can remember their promises and accepted values.
- The network is asynchronous, but is assumed to eventually behave well enough for progress (the eventual synchrony mentioned above).
What Paxos does NOT solve:
Understanding Paxos's limitations is as important as understanding its guarantees:

- It assumes crash (non-Byzantine) failures; it does not tolerate malicious or arbitrarily misbehaving nodes.
- It cannot guarantee termination in a fully asynchronous system; per the FLP result, liveness depends on the network eventually behaving well.
- Basic Paxos chooses only a single value; agreeing on a sequence of values (a replicated log) requires Multi-Paxos, covered later.
Before examining the precise mechanics of Paxos, let's build intuition about its core ideas. At its heart, Paxos is remarkably elegant—it's the composition of a few key insights.
Key Insight #1: Majorities Always Overlap
If you have 5 nodes and any decision requires a majority (3 nodes), then any two majorities must share at least one node in common. This overlapping node guarantees information transfer between decisions.
This insight is fundamental: even if the network partitions, at most one partition can contain a majority. And any two majorities will have at least one node that participated in both, carrying information from one decision to the next.
For any two quorums Q1 and Q2 of a group of N nodes, if each quorum contains more than N/2 nodes, then Q1 ∩ Q2 ≠ ∅. This simple mathematical property is the foundation upon which Paxos's safety is built.
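To see this concretely, here is a minimal Python sketch (purely illustrative, not part of any Paxos implementation) that enumerates every majority quorum of a 5-node cluster and checks that each pair overlaps:

```python
from itertools import combinations

def majority_quorums(n):
    """Return every subset of {0..n-1} whose size is a strict majority."""
    size = n // 2 + 1
    return [set(q) for q in combinations(range(n), size)]

# Every pair of majority quorums in a 5-node cluster shares at least one node.
quorums = majority_quorums(5)
assert all(q1 & q2 for q1 in quorums for q2 in quorums)
print(f"{len(quorums)} majority quorums of 5 nodes; every pair overlaps")
```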
Key Insight #2: Proposal Numbers Create Total Ordering
In a distributed system, it's difficult to establish a global notion of time. Paxos sidesteps this by using proposal numbers (also called ballot numbers)—unique, ordered identifiers that establish a total ordering among proposals.
Each proposer generates proposal numbers that are guaranteed to be unique (typically by incorporating the proposer's ID) and monotonically increasing. This ordering allows Paxos to reason about 'newer' and 'older' proposals without relying on synchronized clocks.
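One common scheme, shown here as an illustrative sketch rather than the only possible one, is to build proposal numbers as (round, proposer_id) pairs that compare lexicographically; the round counter supplies monotonic growth and the proposer ID guarantees uniqueness:

```python
class ProposalNumberGenerator:
    """Proposal numbers as (round, proposer_id) pairs.

    Tuples compare lexicographically, so all proposers' numbers fall into a
    single total order, and embedding proposer_id makes every number unique.
    """

    def __init__(self, proposer_id):
        self.proposer_id = proposer_id
        self.round = 0

    def next(self, highest_seen=None):
        # Jump past any higher-numbered proposal observed from other proposers.
        if highest_seen is not None:
            self.round = max(self.round, highest_seen[0])
        self.round += 1
        return (self.round, self.proposer_id)


gen = ProposalNumberGenerator(proposer_id=2)
print(gen.next())            # (1, 2)
print(gen.next((7, 0)))      # (8, 2) -- strictly greater than the observed (7, 0)
```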
Key Insight #3: Promise Before Accept
Paxos uses a two-phase approach:
Phase 1 (Prepare): A proposer asks acceptors to promise not to accept any proposal numbered lower than its current proposal number. This 'locks out' lower-numbered proposals.
Phase 2 (Accept): If the proposer receives promises from a majority, it asks acceptors to accept a value. Importantly, the proposer may be required to propose a value that was previously accepted (discovered in Phase 1) rather than its own preferred value.
This two-phase structure ensures that once a value is accepted by a majority, all future proposals will discover and propagate that value.
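The sketch below shows a simplified acceptor for the two phases (message handling and crash-recovery persistence omitted): it promises in Phase 1 only for proposal numbers higher than any it has already promised, and accepts in Phase 2 only if that promise still stands.

```python
class Acceptor:
    def __init__(self):
        self.promised = None       # highest proposal number promised so far
        self.accepted_num = None   # proposal number of the accepted proposal
        self.accepted_val = None   # the accepted value itself

    def on_prepare(self, n):
        """Phase 1: promise not to accept proposals numbered below n."""
        if self.promised is None or n > self.promised:
            self.promised = n
            # Report any previously accepted proposal so the proposer can adopt it.
            return ("promise", n, self.accepted_num, self.accepted_val)
        return ("reject", n)

    def on_accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise has been made."""
        if self.promised is None or n >= self.promised:
            self.promised = n
            self.accepted_num = n
            self.accepted_val = value
            return ("accepted", n, value)
        return ("reject", n)
```

A real acceptor must write its promised and accepted state to stable storage before replying; otherwise a crash and restart could let it break an earlier promise.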
Key Insight #4: Values are 'Stickier' Than They Appear
Here's the crucial subtlety: once a value has been accepted by a majority of acceptors in round N, any proposer attempting a higher-numbered round N+k will discover that value during Phase 1 (because it must contact a majority, which overlaps with the previous accepting majority) and will be forced to propose that same value.
This creates a kind of 'value stickiness'—values, once accepted by a majority, propagate forward to all future proposals, even if the original proposer has crashed.
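Here is a minimal sketch of the proposer-side rule that produces this stickiness, assuming each Phase 1 promise reports the acceptor's highest accepted proposal (as in the acceptor sketch above): adopt the value of the highest-numbered accepted proposal among the promises, and fall back to your own value only if no acceptor has accepted anything.

```python
def choose_value_to_propose(promises, my_value):
    """promises: (accepted_num, accepted_val) pairs from a majority of acceptors,
    where accepted_num is None if that acceptor has not accepted anything yet."""
    accepted = [(num, val) for num, val in promises if num is not None]
    if accepted:
        # Forced to adopt the value of the highest-numbered accepted proposal.
        _, value = max(accepted, key=lambda pair: pair[0])
        return value
    return my_value  # free to propose our own value

# No acceptor has accepted anything yet -> we may propose our own value.
assert choose_value_to_propose([(None, None), (None, None)], "X") == "X"
# Some acceptor already accepted "Y" at proposal 4 -> we must propose "Y".
assert choose_value_to_propose([(None, None), (4, "Y"), (2, "Z")], "X") == "Y"
```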
Paxos doesn't rely on locking, a designated coordinator, or a blocking two-phase commit. Instead, it uses the combination of majority quorums, proposal ordering, and the prepare-accept pattern to achieve consensus. The algorithm is so simple in its core logic that many engineers, upon first understanding it, wonder how it took so long to discover.
Paxos defines three logical roles that participants can play. In practice, the same physical node often plays multiple roles, but understanding the roles separately clarifies the algorithm's structure.
| Role | Responsibility | Key Behaviors |
|---|---|---|
| Proposer | Proposes values to be chosen | Generates unique proposal numbers; runs the two-phase protocol; may learn of previously accepted values and adopt them |
| Acceptor | Votes on proposals; remembers accepted values | Promises not to accept lower-numbered proposals; accepts values from valid proposals; persists state to stable storage |
| Learner | Learns the chosen value once consensus is reached | Receives notification when a value is chosen; does not participate in the voting process itself |
Role interactions:
Proposers ↔ Acceptors: Proposers send Prepare and Accept messages to acceptors. Acceptors respond with Promise and Accepted messages.
Acceptors ↔ Learners: Once an acceptor has accepted a value, it can inform learners. When learners receive accepted messages from a majority of acceptors for the same proposal number and value, they know the value is chosen.
Proposer = Learner (often): In many implementations, the proposer also acts as a learner, discovering whether its proposal was successful.
In typical deployments, each server runs all three roles. For example, with 5 servers, each server is a proposer (can propose values), an acceptor (votes on proposals), and a learner (learns the outcome). This collocation simplifies deployment while maintaining logical separation.
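As a rough illustration of the learner rule described above (the message format here is hypothetical), a learner can simply count Accepted messages per (proposal number, value) pair and declare the value chosen once a majority of acceptors report the same pair:

```python
from collections import Counter

class Learner:
    def __init__(self, num_acceptors):
        self.majority = num_acceptors // 2 + 1
        self.votes = Counter()   # (proposal_num, value) -> number of acceptors
        self.seen = set()        # (acceptor_id, proposal_num, value) already counted

    def on_accepted(self, acceptor_id, proposal_num, value):
        """Count Accepted messages; return the value once a majority agrees."""
        key = (acceptor_id, proposal_num, value)
        if key in self.seen:
            return None          # ignore duplicate deliveries of the same message
        self.seen.add(key)
        self.votes[(proposal_num, value)] += 1
        if self.votes[(proposal_num, value)] >= self.majority:
            return value         # a majority accepted this proposal: chosen
        return None


learner = Learner(num_acceptors=3)
assert learner.on_accepted("a1", 5, "X") is None   # only 1 of 3 acceptors so far
assert learner.on_accepted("a2", 5, "X") == "X"    # 2 of 3 acceptors: chosen
```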
The cluster size question:
Paxos requires a majority of acceptors to make progress. For a cluster of N acceptors, a majority is ⌊N/2⌋ + 1 nodes, so the cluster can continue operating as long as no more than N − (⌊N/2⌋ + 1) acceptors have failed.
Common cluster sizes:
| Cluster Size | Majority Required | Tolerated Failures | Notes |
|---|---|---|---|
| 3 nodes | 2 | 1 | Minimum useful size; survives single failure |
| 5 nodes | 3 | 2 | Common production choice; good balance of safety and overhead |
| 7 nodes | 4 | 3 | Higher fault tolerance; increased message overhead |
| 2F+1 nodes | F+1 | F | General formula for tolerating F failures |
A 4-node cluster requires 3 nodes for a majority and tolerates only 1 failure—the same fault tolerance as a 3-node cluster. Even-numbered clusters provide no additional fault tolerance while increasing communication overhead. This is why production Paxos deployments almost always use odd numbers of nodes.
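The table's arithmetic can be captured in a tiny helper (an illustrative sketch): a cluster of N acceptors needs ⌊N/2⌋ + 1 nodes for a majority and tolerates the remaining N − (⌊N/2⌋ + 1) failures, which is why 3-node and 4-node clusters tolerate the same single failure.

```python
def quorum_math(n):
    """Return (majority size, tolerated failures) for an n-acceptor cluster."""
    majority = n // 2 + 1
    return majority, n - majority

for n in (3, 4, 5, 7):
    majority, tolerated = quorum_math(n)
    print(f"{n} nodes: majority={majority}, tolerates {tolerated} failure(s)")
# 3 nodes: majority=2, tolerates 1 failure(s)
# 4 nodes: majority=3, tolerates 1 failure(s)   <- no gain over 3 nodes
# 5 nodes: majority=3, tolerates 2 failure(s)
# 7 nodes: majority=4, tolerates 3 failure(s)
```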
Understanding Paxos is not merely academic—it's fundamentally practical for system design. Here's why it matters:
Real-world systems built on Paxos:
| System | Algorithm | Use Case |
|---|---|---|
| Google Chubby | Paxos | Distributed lock service, name service |
| Google Spanner | Multi-Paxos + TrueTime | Globally-distributed relational database |
| Google Megastore | Paxos | Structured storage for Google applications |
| Apache ZooKeeper | ZAB (Paxos-inspired atomic broadcast) | Coordination service for distributed applications |
| etcd | Raft (Paxos equivalent) | Distributed key-value store, Kubernetes state store |
| CockroachDB | Raft | Distributed SQL database |
| TiDB/TiKV | Raft | Distributed HTAP database |
| Consul | Raft | Service mesh, configuration management |
You'll notice many systems use Raft rather than Paxos directly. Raft, introduced in 2014, was designed to be more understandable than Paxos while solving the same problem. Raft makes specific design choices (e.g., leader-based log replication) that Paxos leaves open. We'll compare these algorithms in later modules.
Lamport's original Paxos has spawned a family of variants, each optimizing for different use cases. Understanding this landscape helps you choose the right approach for your system.
Basic Paxos (Single-Decree Paxos):
This is the foundational algorithm we'll study in detail. It achieves consensus on a single value. Every other Paxos variant is built upon or derived from this core algorithm.
Multi-Paxos:
Extends Basic Paxos to agree on a sequence of values (a log). This is what production systems actually implement. Multi-Paxos introduces optimizations that make it much more efficient than running separate instances of Basic Paxos.
While these variants exist, mastering Basic Paxos is essential before exploring them. The principles of Basic Paxos—quorum intersection, proposal ordering, the prepare-accept pattern—appear in all variants. Once you deeply understand Basic Paxos, the variants become natural extensions rather than new algorithms.
We've covered substantial ground, establishing the foundation for understanding Paxos. Let's consolidate the key takeaways:

- Paxos, introduced by Leslie Lamport in 'The Part-Time Parliament', is the theoretical foundation for virtually every production consensus system.
- Consensus means getting independent processes to agree on a single value despite message loss, delays, reordering, crashes, and partitions; the FLP result shows no deterministic algorithm can guarantee this in a purely asynchronous system.
- Paxos guarantees safety unconditionally and provides liveness under reasonable assumptions about eventual network behavior.
- Its core ideas are overlapping majority quorums, totally ordered proposal numbers, and the two-phase prepare-accept pattern that makes chosen values 'sticky'.
- The protocol defines three logical roles (proposer, acceptor, learner) that are usually collocated on each server.
- A cluster of 2F+1 acceptors tolerates F failures, which is why production deployments use odd sizes such as 3, 5, or 7.
What's next:
Now that we understand what Paxos is and why it matters, we'll examine the roles in detail. The next page explores Proposers, Acceptors, and Learners—the three actors in the Paxos protocol—and their precise responsibilities, state management, and interactions.
You now understand the historical context, problem definition, and core intuitions behind Paxos. This foundation prepares you for the detailed protocol mechanics ahead.