Loading learning content...
The elegance of Paxos lies in its remarkably simple core structure: two phases. Phase 1 establishes the proposer's right to make a proposal and discovers any previously accepted values. Phase 2 actually proposes a value and gets it accepted. This two-phase structure is what enables Paxos to achieve consensus safely in the face of failures.
Understanding these phases in complete detail—including every edge case, failure mode, and recovery scenario—is essential for implementing Paxos correctly. Many bugs in production implementations stem from subtle misunderstandings of the protocol mechanics.
In this page, we'll walk through both phases with the precision required for a production implementation.
By the end of this page, you will understand the exact message flows and state transitions in both phases, why each step is necessary for correctness, what happens under various failure scenarios, and how the two phases work together to guarantee safety. You'll be able to trace through Paxos executions and reason about their correctness.
Before diving into the details, let's establish a high-level mental model of what each phase accomplishes:
Phase 1: Prepare/Promise ("Establish Right to Propose")
The proposer asks acceptors for permission to make a proposal with number N. Acceptors that agree promise not to accept any proposal numbered less than N. They also reveal any value they've previously accepted, allowing the proposer to discover if consensus has already been reached or is in progress.
Phase 2: Accept/Accepted ("Propose and Accept")
If the proposer receives promises from a majority, it sends an Accept request with a value. If any acceptor reported a previously accepted value in Phase 1, the proposer must use the value from the highest-numbered acceptance. Otherwise, it can propose any value.
When an acceptor receives an Accept request, it accepts the value (persisting it) unless it has since promised a higher number. Once a majority of acceptors have accepted the same proposal, a value is chosen.
The two-phase structure appears in many distributed algorithms: Two-Phase Commit (2PC), Two-Phase Locking (2PL), and Paxos. In each case, the first phase establishes coordination, and the second phase acts on that coordination. The key difference is Paxos's ability to make progress despite failures.
Phase 1, Step 1: Proposer Sends Prepare
A proposer that wishes to propose a value begins by selecting a proposal number N that is:
The proposer then sends a Prepare(N) message to all acceptors (or at least a majority).
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182
async def phase1_prepare(self, initial_value: Any) -> Phase1Result: """ Execute Phase 1 of the Paxos protocol. Args: initial_value: The value we'd like to propose (may be overridden) Returns: Phase1Result containing either: - Success with the value to propose in Phase 2 - Failure requiring retry with higher proposal number """ # Generate a new, unique proposal number proposal_number = self.generate_proposal_number() # Send Prepare(n) to all acceptors prepare_request = PrepareRequest(proposal_number=proposal_number) responses = await asyncio.gather(*[ self._send_prepare(acceptor, prepare_request) for acceptor in self.acceptors ], return_exceptions=True) # Collect responses, counting promises and rejections promises = [] rejections = [] highest_rejection_number = 0 for response in responses: if isinstance(response, Exception): # Network error or timeout - treat as no response continue elif isinstance(response, PromiseResponse): promises.append(response) elif isinstance(response, RejectionResponse): rejections.append(response) highest_rejection_number = max( highest_rejection_number, response.highest_promised ) # Need majority of promises to proceed if len(promises) >= self.majority: # Determine which value to propose in Phase 2 value_to_propose = self._determine_value(initial_value, promises) return Phase1Success( proposal_number=proposal_number, value_to_propose=value_to_propose ) else: # Failed to get majority - need to retry with higher number return Phase1Failure( retry_above=highest_rejection_number ) def _determine_value(self, initial_value: Any, promises: List[PromiseResponse]) -> Any: """ Determine which value to propose in Phase 2. If any acceptor reported a previously accepted value, we MUST propose the value from the highest-numbered acceptance. Otherwise, we can propose our initial value. This is the critical "adoption rule" that ensures safety. """ # Find the highest-numbered previously accepted value highest_accepted = None highest_accepted_number = -1 for promise in promises: if promise.accepted_proposal is not None: accepted_number, accepted_value = promise.accepted_proposal if accepted_number > highest_accepted_number: highest_accepted_number = accepted_number highest_accepted = accepted_value if highest_accepted is not None: # Must adopt the previously-accepted value return highest_accepted else: # Free to propose our own value return initial_valuePhase 1, Step 2: Acceptor Handles Prepare
When an acceptor receives Prepare(N), it must decide whether to promise:
12345678910111213141516171819202122232425262728293031323334353637383940
def handle_prepare(self, request: PrepareRequest) -> PrepareResponse: """ Handle a Prepare(n) request from a proposer. Invariant: We will never accept a proposal numbered < our promised number. Args: request: PrepareRequest containing proposal_number Returns: Either Promise (with any previously accepted value) or Rejection """ proposal_number = request.proposal_number # Check if we can make this promise if self._can_promise(proposal_number): # Update our promised number (persist BEFORE responding!) self.promised_number = proposal_number self._persist_state() # CRITICAL: must be durable before response # Respond with our previous acceptance, if any return PromiseResponse( proposal_number=proposal_number, accepted_proposal=self.accepted_proposal # May be None ) else: # Reject - we've already promised a higher number return RejectionResponse( highest_promised=self.promised_number ) def _can_promise(self, proposal_number: int) -> bool: """ Check if we can promise to not accept proposals below this number. We can promise if proposal_number > previously promised number. """ if self.promised_number is None: return True return proposal_number > self.promised_numberThe acceptor MUST persist its new promised_number to stable storage BEFORE sending the Promise response. If it responds first and then crashes, on restart it would have no record of the promise and might violate it.
The Promise Response Payload:
The Promise response includes not just the acknowledgment, but critically, any value the acceptor has previously accepted. This allows the proposer to discover in-progress or completed consensus.
| Field | Description | Why It's Needed |
|---|---|---|
| proposal_number | The proposal number being promised | Confirms which proposal this promise applies to |
| accepted_proposal | (prior_number, value) or None | Reveals any value the acceptor has previously accepted, enabling the adoption rule |
The adoption rule is the key insight that makes Paxos safe. Let's understand why it's necessary and how it works.
The Rule:
If any acceptor in the Phase 1 majority reports having previously accepted a value, the proposer MUST propose the value from the highest-numbered prior acceptance. The proposer cannot propose its own value if any prior acceptance exists.
Why This Is Critical:
Consider what would happen without this rule:
Suppose Proposer P1 successfully gets value V1 accepted by acceptors A1 and A2 (a majority of 3). Value V1 is chosen! Now Proposer P2 starts a new proposal. P2's Phase 1 contacts A2 and A3. A2 reports it accepted V1, but if P2 is allowed to ignore this and propose V2, it might get A2 (who now promises) and A3 to accept V2. Now we have two 'chosen' values — a safety violation!
How the Adoption Rule Prevents This:
With the adoption rule, P2's Phase 1 discovers that A2 previously accepted V1. P2 is now forced to propose V1 (the value from the highest-numbered prior acceptance). When P2 completes Phase 2, it's proposing V1, the same value that was already chosen. Safety is preserved!
Visualizing the Adoption Rule:
The Mathematical Invariant:
The adoption rule maintains this key invariant:
If a value V is chosen at proposal number N, then for all proposal numbers M > N, any proposer that completes Phase 1 with proposal M will discover V and propose V.
This invariant is proven by induction on proposal numbers and relies on the quorum intersection property (any two majorities share at least one acceptor).
If V1 was accepted by a majority M1, and proposer P2 contacts a majority M2 in Phase 1, then M1 ∩ M2 ≠ ∅. At least one acceptor is in both groups. This acceptor will report the V1 acceptance, forcing P2 to adopt V1.
Phase 2, Step 1: Proposer Sends Accept
Once a proposer has received Promise responses from a majority of acceptors in Phase 1, it can proceed to Phase 2. The proposer sends Accept(N, V) to all acceptors, where:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657
async def phase2_accept( self, proposal_number: int, value: Any) -> Phase2Result: """ Execute Phase 2 of the Paxos protocol. Args: proposal_number: The proposal number from successful Phase 1 value: The value to propose (from Phase 1 determination) Returns: Phase2Result indicating success (value chosen) or failure """ # Send Accept(n, v) to all acceptors accept_request = AcceptRequest( proposal_number=proposal_number, value=value ) responses = await asyncio.gather(*[ self._send_accept(acceptor, accept_request) for acceptor in self.acceptors ], return_exceptions=True) # Count acceptances and rejections acceptances = [] rejections = [] highest_rejection_number = 0 for response in responses: if isinstance(response, Exception): # Network error or timeout continue elif isinstance(response, AcceptedResponse): acceptances.append(response) elif isinstance(response, RejectionResponse): rejections.append(response) highest_rejection_number = max( highest_rejection_number, response.highest_promised ) # Check if we got a majority of acceptances if len(acceptances) >= self.majority: # Value is CHOSEN! return Phase2Success( chosen_value=value, proposal_number=proposal_number ) else: # Failed - another proposer may have preempted us # Need to restart from Phase 1 with higher number return Phase2Failure( retry_above=highest_rejection_number )Phase 2, Step 2: Acceptor Handles Accept
When an acceptor receives Accept(N, V), it accepts the value if it hasn't made a promise that would prevent it:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647
def handle_accept(self, request: AcceptRequest) -> AcceptResponse: """ Handle an Accept(n, v) request from a proposer. Accept if n >= promised_number (we haven't promised to ignore this proposal). Args: request: AcceptRequest containing proposal_number and value Returns: Either Accepted (success) or Rejection """ proposal_number = request.proposal_number value = request.value if self._can_accept(proposal_number): # Accept this value # Update both our accepted value and promise # (accepting also implies promising this number) self.accepted_proposal = (proposal_number, value) self.promised_number = proposal_number # CRITICAL: Persist before responding! self._persist_state() # Respond with success return AcceptedResponse( proposal_number=proposal_number, value=value ) else: # Reject - we've promised a higher number return RejectionResponse( highest_promised=self.promised_number ) def _can_accept(self, proposal_number: int) -> bool: """ Check if we can accept a proposal with this number. We can accept if proposal_number >= promised_number. Note: >= not >, because the proposer who did Prepare(n) should be able to Accept(n, v). """ if self.promised_number is None: return True return proposal_number >= self.promised_numberNote that we use >= (not >) for the accept condition. This allows a proposer who completed Prepare(N) to subsequently send Accept(N, V) and have it accepted. Some formulations of Paxos use different conventions here, but the key is consistency between prepare and accept conditions.
When a Value is Chosen:
A value V is chosen when a majority of acceptors have sent Accepted(N, V) responses for the same proposal number N. At this point:
Understanding how Paxos handles failures is crucial. Let's examine each failure type and trace through the recovery.
Failure Type 1: Proposer Crashes During Phase 1
Scenario: Proposer P1 sends Prepare(5) to acceptors, some respond with Promise(5), then P1 crashes.
| What Happens | Result | Recovery |
|---|---|---|
| Acceptors that promised 5 are waiting | Those acceptors will reject Prepare(4) or lower | As designed - promises are honored |
| No Accept sent for proposal 5 | Value is not chosen | Another proposer can start with higher number |
| Proposer P2 starts Prepare(6) | Acceptors promise 6 (higher than 5) | Protocol proceeds normally |
| P1 restarts and retries | P1 must use proposal number > 5 | Its old promises are superseded by P2's |
Failure Type 2: Proposer Crashes During Phase 2
Scenario: Proposer P1 sends Accept(5, V) to acceptors, some accept, then P1 crashes before determining outcome.
Notice that in both scenarios, any value that has been accepted by a majority is preserved and eventually confirmed. Paxos never loses a chosen value, regardless of proposer failures.
Failure Type 3: Acceptor Crashes
| Timing | Impact | Recovery |
|---|---|---|
| Before any persistence | Acceptor is as if it never received the message | No problem - must still reach a majority of other acceptors |
| After Prepare persist, before Prepare response | Acceptor restarts with the promise recorded | On restart, honors the promise; protocol continues |
| After Accept persist, before Accept response | Acceptor restarts with acceptance recorded | On restart, reports acceptance in future Phase 1; value preserved |
| Minority of acceptors crash | Majority still available | Protocol completes with remaining acceptors |
| Majority of acceptors crash | Cannot reach quorum | Protocol stalls until acceptors recover (liveness affected, not safety) |
Failure Type 4: Network Partition
Network partitions are particularly interesting because they can create scenarios where different groups of nodes attempt to make progress independently.
One of the most subtle aspects of Paxos is handling concurrent proposals from multiple proposers. Let's trace through several scenarios.
Scenario 1: Non-Conflicting Proposals
Two proposers start nearly simultaneously, but one completes Phase 1 before the other begins.
Scenario 2: Conflicting Proposals — Preemption
Two proposers compete, and one preempts the other mid-protocol.
If two proposers continually preempt each other, neither makes progress. P1 prepares 1, P2 prepares 2, P1 prepares 3, P2 prepares 4... This is called livelock. Paxos guarantees safety but not liveness under adversarial scheduling. Practical implementations use leader election and exponential backoff to avoid this.
Scenario 3: Concurrent Proposals, Same Value Chosen
Even when proposals conflict, the adoption rule ensures the same value is eventually chosen.
Let's formally state and understand the properties Paxos provides.
Safety Properties (Always Hold):
Liveness Properties (Conditional):
Paxos doesn't violate the FLP impossibility result. It achieves safety unconditionally but only guarantees liveness under partial synchrony assumptions. In a purely asynchronous system where message delays are unbounded and proposers continuously preempt each other, Paxos might never terminate — but it will never violate safety.
Why Safety is Unconditional:
The safety proof relies on:
These four ingredients, combined correctly, mathematically guarantee that if a value V is chosen at proposal N, every higher-numbered proposal that completes Phase 1 will discover V and be bound to propose V.
We've completed a detailed examination of the Paxos two-phase protocol. Let's consolidate the key takeaways:
What's next:
Basic Paxos agrees on a single value, but real systems need to agree on a sequence of values (a replicated log). The next page explores Multi-Paxos for Log Replication — how to efficiently run multiple Paxos instances and the optimizations that make this practical.
You now have a detailed understanding of the Paxos two-phase protocol, including the exact message flows, failure handling, concurrent proposal resolution, and the critical adoption rule that ensures safety. This prepares you for understanding Multi-Paxos and practical implementations.