Distributed Coordination

Level: Advanced

Duration: 90 mins

Topic: Distributed Coordination

Paxos Basics

The First Practical Consensus Protocol

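Leslie Lamport devised Paxos in the late 1980s; the paper describing it, "The Part-Time Parliament," was submitted in 1990 but not published until 1998. Paxos guarantees safety under full asynchrony and makes progress once the network behaves synchronously for long enough (partial synchrony). Its ideas run through most later consensus protocols: Raft and ZAB build on them, and Viewstamped Replication, developed independently at around the same time, shares the same core structure.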

Paxos is also infamous for being difficult to understand. Lamport's original paper presented the algorithm through an allegory about a Greek parliament on the fictional island of Paxos, which many found more confusing than clarifying. He later wrote "Paxos Made Simple" (2001), which opens with: "The Paxos algorithm, when presented in plain English, is very simple."

Our approach:

We'll build Paxos from first principles, understanding why each rule exists rather than memorizing the protocol mechanically. By the end, you'll understand how Paxos achieves the seemingly impossible: getting distributed nodes to agree on a value despite arbitrary message delays and node failures.

What You Will Learn

By the end of this page, you will understand the three roles in Paxos (proposer, acceptor, learner), the two-phase protocol structure, why each phase is necessary for safety, how Paxos handles competing proposals, and the connection between single-decree Paxos and Multi-Paxos for replicated log implementations.

The Problem Paxos Solves

Paxos solves single-decree consensus: getting a group of nodes to agree on exactly one value. The value, once chosen, cannot be changed.

The system model:

N nodes that can communicate via message passing
Asynchronous network: messages may be delayed, duplicated, or lost (but not corrupted)
Crash-recovery failures: nodes may crash and stop responding, and may later restart, but never send incorrect messages
Durable storage: nodes persist their state so that it survives a crash and restart
No Byzantine failures: all nodes follow the protocol honestly

The scenario:

Multiple nodes may propose values concurrently. The algorithm must:

Choose exactly one value (agreement)
Choose a proposed value (validity)
Eventually terminate if a majority of nodes are correct and messages are eventually delivered in a timely manner (liveness)

Why is this hard?

Consider a simple approach: "If you receive a proposal and haven't seen one before, accept it."

Naive Approach Fails

Node A proposes value X
Nodes 1 and 2 accept X
Before node 3 sees X, node B proposes Y
Node 3 (and maybe 4, 5) accepts Y
No majority for either value

Worse: If nodes 1 and 2 fail, only Y might survive. But clients who received acknowledgment for X think it was chosen.
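The split vote described above can be reproduced in a few lines. This is a minimal sketch of the naive rule only, not of Paxos: `NaiveAcceptor` and `tally` are hypothetical names, the delivery order is hard-coded to mimic the delays in the scenario, and node 5 is assumed to be unreachable so that neither value reaches three acceptors.

```python
from collections import Counter
from typing import Dict, List, Optional


class NaiveAcceptor:
    """Accepts the first value it sees and ignores everything after it."""

    def __init__(self) -> None:
        self.accepted: Optional[str] = None

    def receive(self, value: str) -> None:
        if self.accepted is None:
            self.accepted = value


def tally(acceptors: List[NaiveAcceptor]) -> Dict[str, int]:
    """Count how many acceptors hold each value."""
    return dict(Counter(a.accepted for a in acceptors if a.accepted is not None))


# Five acceptors, so a majority is 3. Node 5 never receives anything.
acceptors = [NaiveAcceptor() for _ in range(5)]

# Proposer A's value X reaches nodes 1 and 2 first.
for a in acceptors[:2]:
    a.receive("X")

# Proposer B's value Y reaches nodes 3 and 4 first.
for a in acceptors[2:4]:
    a.receive("Y")

# The delayed messages arrive later, but every node has already decided.
for a in acceptors[:4]:
    a.receive("X")
    a.receive("Y")

votes = tally(acceptors)
majority = len(acceptors) // 2 + 1
winner = next((v for v, count in votes.items() if count >= majority), None)
```

With this delivery order the votes split two against two and `winner` stays `None`: the naive rule has no way to break the tie, because an acceptor can never change its mind.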

Paxos Solution

Numbered proposals: Each proposal has a unique, ordered number
Prepare phase: Before proposing, check what's already accepted
Promise: Acceptors promise to reject older proposals
Adopt: If someone else's value was accepted, adopt it

Result: At most one value can achieve majority acceptance.

The key insight:

Paxos prevents conflicting acceptances by having proposers first check what acceptors have already accepted. If a value has been (or might have been) chosen, the proposer adopts that value rather than proposing something new. This ensures that once a value might be chosen, all future proposals carry that same value.

This check-then-propose pattern is the two-phase structure of Paxos.

The Three Roles in Paxos

Paxos defines three logical roles. In practice, a single node often acts in all three roles simultaneously.

1. Proposers:

Proposers are nodes that propose values. A proposer:

Generates unique proposal numbers
Drives the two-phase protocol
Sends Prepare and Accept requests
May abandon its current proposal and start anew with a higher number

2. Acceptors:

Acceptors are the "voters" that constitute the durable memory of the system. An acceptor:

Responds to Prepare requests with promises and previously accepted values
Responds to Accept requests by accepting (or rejecting) values
Persists its state so it survives crashes
The collective state of acceptors determines what value was chosen

3. Learners:

Learners discover what value was chosen. A learner:

Receives notifications when acceptors accept values
Detects when a value has been accepted by a majority
May be the same node as the proposer or a separate client

Paxos Roles Summary
Role	Responsibility	State to Maintain	Messages Sent/Received
Proposer	Initiate consensus rounds	Current proposal number	Sends: Prepare, Accept; Receives: Promise, Accepted
Acceptor	Vote on proposals, remember votes	Highest promised number, accepted number and value	Receives: Prepare, Accept; Sends: Promise, Accepted
Learner	Discover chosen value	Set of accepted values per acceptor	Receives: Accepted notifications
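One way to make the table concrete is to write the four message types and the acceptor's state as plain data classes. This is only an illustrative sketch: the class and field names are assumptions chosen for this example, and proposal numbers are modelled as simple tuples.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

# Assumed encoding of a proposal number; any totally ordered type would do.
ProposalNumber = Tuple[int, int]


@dataclass(frozen=True)
class Prepare:
    """Proposer -> acceptor, Phase 1."""
    number: ProposalNumber


@dataclass(frozen=True)
class Promise:
    """Acceptor -> proposer, Phase 1 reply with any previously accepted value."""
    number: ProposalNumber
    accepted_number: Optional[ProposalNumber] = None
    accepted_value: Any = None


@dataclass(frozen=True)
class Accept:
    """Proposer -> acceptor, Phase 2."""
    number: ProposalNumber
    value: Any


@dataclass(frozen=True)
class Accepted:
    """Acceptor -> proposer and learners, Phase 2 reply."""
    number: ProposalNumber
    value: Any


@dataclass
class AcceptorState:
    """Everything an acceptor must remember across crashes."""
    highest_promised: Optional[ProposalNumber] = None
    accepted_number: Optional[ProposalNumber] = None
    accepted_value: Any = None
```

The messages are frozen because they are values that travel over the network, while `AcceptorState` is mutable because the acceptor updates it as it makes promises and accepts proposals.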

Proposal Numbers:

Proposal numbers must be:

Unique across all proposers: No two proposals have the same number
Totally ordered: Any two numbers can be compared
Unbounded: Proposers can always generate a higher number

Common implementation:

Combine a local sequence number with the proposer's node ID:

proposal_number = (sequence_number, node_id)

Compare by sequence number first, then by node ID. This ensures uniqueness and ordering without coordination.
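The comparison rule can be sketched with a Python `NamedTuple`, which compares field by field in declaration order. The helper `next_proposal` is a hypothetical name introduced here; it assumes a proposer bumps the sequence number past the highest proposal number it knows about.

```python
from typing import NamedTuple


class ProposalNumber(NamedTuple):
    sequence_number: int  # compared first
    node_id: int          # breaks ties between proposers


def next_proposal(last_seen: ProposalNumber, node_id: int) -> ProposalNumber:
    """Return a number strictly greater than `last_seen`, owned by `node_id`."""
    return ProposalNumber(last_seen.sequence_number + 1, node_id)


a = ProposalNumber(sequence_number=1, node_id=1)
b = ProposalNumber(sequence_number=1, node_id=2)
c = ProposalNumber(sequence_number=2, node_id=1)

# Same sequence number: the node ID decides. Higher sequence number always wins.
ordered = sorted([c, a, b])
```

Because two proposers never share a node ID, two proposals can never compare equal, and no coordination is needed to keep the numbers unique.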

Roles are Logical, Not Physical

In typical deployments, every node acts as proposer, acceptor, and learner. The separation is conceptual—it helps understand the algorithm, but nodes don't need to be partitioned by role. A 3-node Paxos cluster has 3 proposers, 3 acceptors, and 3 learners—all colocated.

Phase 1: Prepare

The Prepare phase achieves two goals:

Establish leadership: Block older proposals from being accepted
Learn history: Discover if a value has already been (or might be) chosen

Proposer actions:

Choose a proposal number n higher than any previously used
Send Prepare(n) to all acceptors (or at least a majority)

Acceptor actions upon receiving Prepare(n):

If n > highest_promised_number:
- Update highest_promised_number = n
- Respond with Promise(n, accepted_value, accepted_number)
- Promise: "I will not accept any proposal numbered less than n"
If n <= highest_promised_number:
- Ignore or respond with Reject(highest_promised_number)
- "I've already promised to a higher proposal"

The promise is crucial:

By collecting promises from a majority, the proposer learns:

What values have been accepted by anyone in that majority
That no proposal < n can now be accepted by that majority

This combination lets the proposer safely propose a value.

paxos_phase1.py
Python
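# ============================================
# PHASE 1: PREPARE
# ============================================
from typing import List, Optional

# ProposalNumber, Promise, Reject, Response, PrepareResult, AcceptedValue,
# DurableStorage and NetworkTimeout are assumed to be defined elsewhere.
 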
class PaxosProposer:
    """
    Paxos proposer handling Phase 1 (Prepare).
    """
    
    def __init__(self, node_id: int, acceptors: List[str]):
        self.node_id = node_id
        self.acceptors = acceptors
        self.sequence_number = 0  # Incremented for each new proposal
        
    def generate_proposal_number(self) -> ProposalNumber:
        """
        Generate a unique proposal number higher than any this proposer has used.
        
        Format: (sequence, node_id) - lexicographic ordering.
        Sequence ensures our proposals increase.
        Node ID breaks ties for uniqueness.
        """
        self.sequence_number += 1
        return ProposalNumber(self.sequence_number, self.node_id)
        
    def prepare(self, proposal_number: ProposalNumber) -> Optional[PrepareResult]:
        """
        Phase 1a: Send Prepare request to all acceptors.
        
        Returns PrepareResult if majority respond, None otherwise.
        """
        promises: List[Promise] = []
        
        for acceptor in self.acceptors:
            try:
                response = self._send_prepare(acceptor, proposal_number)
                
                if response.type == "PROMISE":
                    promises.append(response)
                else:
                    # Rejected - acceptor has promised to higher proposal
                    # We might want to bump our number and retry
                    pass
                    
            except NetworkTimeout:
                # Acceptor unreachable - continue with others
                pass
                
        # Did we get a majority?
        majority_threshold = len(self.acceptors) // 2 + 1
        
        if len(promises) >= majority_threshold:
            # Success! Extract highest accepted value from promises
            highest_accepted = self._find_highest_accepted(promises)
            return PrepareResult(
                success=True,
                promises=promises,
                highest_accepted=highest_accepted
            )
        else:
            # Failed to get majority - might retry with higher number
            return None
            
    def _find_highest_accepted(self, promises: List[Promise]) -> Optional[AcceptedValue]:
        """
        Among all promises, find the value accepted with highest proposal number.
        
        This is the value we must adopt if it exists.
        """
        highest = None
        
        for promise in promises:
            if promise.accepted_number is not None:
                if highest is None or promise.accepted_number > highest.number:
                    highest = AcceptedValue(
                        number=promise.accepted_number,
                        value=promise.accepted_value
                    )
                    
        return highest
 
 
class PaxosAcceptor:
    """
    Paxos acceptor handling Phase 1 (Prepare).
    """
    
    def __init__(self, node_id: str, storage: DurableStorage):
        self.node_id = node_id
        self.storage = storage
        
        # Recover state from storage (survives crashes)
        self.highest_promised = storage.get("highest_promised", None)
        self.accepted_number = storage.get("accepted_number", None)
        self.accepted_value = storage.get("accepted_value", None)
        
    def handle_prepare(self, proposal_number: ProposalNumber) -> Response:
        """
        Phase 1b: Handle Prepare request from proposer.
        
        If proposal number is higher than any we've seen,
        promise to reject lower-numbered proposals.
        """
        if self.highest_promised is None or proposal_number > self.highest_promised:
            # Accept this prepare - make a promise
            
            # CRITICAL: Persist before responding
            self.highest_promised = proposal_number
            self.storage.put("highest_promised", proposal_number)
            self.storage.flush()
            
            # Respond with our promise and any accepted value
            return Promise(
                proposal_number=proposal_number,
                accepted_number=self.accepted_number,
                accepted_value=self.accepted_value
            )
        else:
            # Reject - we've promised to a higher proposal
            return Reject(
                proposal_number=proposal_number,
                highest_promised=self.highest_promised
            )

Why Persistence Matters

An acceptor MUST persist highest_promised before responding with a Promise. If it crashes and restarts, forgetting its promise, it might later accept a lower-numbered proposal—violating the promise and potentially breaking consensus safety. Persistence is not optional; it's fundamental to correctness.
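The "persist before responding" rule can be sketched with nothing more than a JSON file and `os.fsync`. This is an assumption-laden illustration, not a production storage engine: `FileBackedState` and `handle_prepare` are hypothetical names, proposal numbers are stored as `[sequence, node_id]` lists, and a real implementation would typically also fsync the containing directory.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


class FileBackedState:
    """Stores acceptor state as JSON and forces it to disk before returning."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> Dict[str, Any]:
        if not os.path.exists(self.path):
            return {"highest_promised": None,
                    "accepted_number": None,
                    "accepted_value": None}
        with open(self.path, "r", encoding="utf-8") as f:
            return json.load(f)

    def save(self, state: Dict[str, Any]) -> None:
        # Write a temporary file, fsync it, then atomically swap it into place,
        # so a crash never leaves a half-written state file behind.
        directory = os.path.dirname(self.path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)


def handle_prepare(store: FileBackedState, n: List[int]) -> bool:
    """Return True (promise) only if n beats every earlier promise."""
    state = store.load()
    promised = state["highest_promised"]
    if promised is None or n > promised:
        state["highest_promised"] = n
        store.save(state)  # durable BEFORE the caller sends the Promise
        return True
    return False


# Usage: promise proposal [2, 1], "restart", then refuse the older [1, 2].
workdir = tempfile.mkdtemp()
store = FileBackedState(os.path.join(workdir, "acceptor.json"))
first = handle_prepare(store, [2, 1])
restarted = FileBackedState(store.path)  # a fresh object simulates a restart
second = handle_prepare(restarted, [1, 2])
```

After the simulated restart the acceptor still refuses the lower-numbered Prepare, because the promise was read back from disk and not from memory.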

Phase 2: Accept

The Accept phase proposes a value for consensus. After receiving promises from a majority, the proposer knows:

No proposal numbered less than n can now be accepted by a majority
What value (if any) was previously accepted with the highest number

Choosing what value to propose:

This is the critical rule that makes Paxos safe:

If any promise included an accepted value, the proposer must propose the value with the highest accepted number. Only if no promise included an accepted value may the proposer propose its own value.

Why this rule?

If a value was accepted by a majority with proposal number k, then any majority of acceptors will include at least one that accepted it. The proposer must continue with that value to avoid conflicting with a potentially-chosen value.
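The value-selection rule is small enough to state as a single function. In this sketch a promise is reduced to an `(accepted_number, accepted_value)` pair, with `accepted_number` set to `None` when the acceptor has accepted nothing; `choose_value` is a hypothetical helper name and proposal numbers are plain integers for brevity.

```python
from typing import Any, List, Optional, Tuple

# (accepted_number, accepted_value); accepted_number is None if nothing was accepted.
PromiseInfo = Tuple[Optional[int], Any]


def choose_value(promises: List[PromiseInfo], my_value: Any) -> Any:
    """Apply the Paxos adoption rule to a majority of promises."""
    accepted = [(n, v) for (n, v) in promises if n is not None]
    if not accepted:
        # No acceptor in the quorum has accepted anything: free choice.
        return my_value
    # Otherwise adopt the value attached to the highest accepted number.
    return max(accepted, key=lambda pair: pair[0])[1]
```

For example, `choose_value([(1, "X"), (3, "Z"), (None, None)], "Y")` adopts `"Z"`, the value accepted under the highest proposal number, and the proposer's own `"Y"` is dropped.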

Proposer actions (Phase 2a):

Send Accept(n, v) to all acceptors, where:

n is the proposal number from Phase 1
v is the highest accepted value from promises, or our own value if none

Acceptor actions (Phase 2b):

Upon receiving Accept(n, v):

If n >= highest_promised_number:
- Accept the proposal: set accepted_number = n, accepted_value = v
- Persist and respond Accepted(n, v)
If n < highest_promised_number:
- Reject (we promised not to accept proposals below highest_promised)

paxos_phase2.py
Python
# ============================================
# PHASE 2: ACCEPT
# ============================================
import random
import time
from typing import Any, Dict, Optional, Tuple

# ProposalNumber, AcceptResult, Accepted, Reject, Response, DurableStorage and
# NetworkTimeout are assumed to be defined elsewhere, as in paxos_phase1.py.
 
class PaxosProposer:
    """
    Paxos proposer handling Phase 2 (Accept).

    Continues the Phase 1 class: generate_proposal_number() and prepare()
    are the methods shown in paxos_phase1.py.
    """
    
    def propose_value(self, my_value: Any, max_attempts: int = 10) -> Optional[Any]:
        """
        Complete Paxos protocol to propose a value.
        
        Returns the chosen value (might be ours or someone else's).
        Returns None if no round succeeded within max_attempts.
        """
        for attempt in range(max_attempts):
            # Phase 1: Prepare
            n = self.generate_proposal_number()
            prepare_result = self.prepare(n)
            
            if prepare_result is not None:
                # Determine value to propose
                if prepare_result.highest_accepted is not None:
                    # MUST use the highest accepted value
                    value_to_propose = prepare_result.highest_accepted.value
                else:
                    # Free to propose our own value
                    value_to_propose = my_value
                    
                # Phase 2: Accept
                accept_result = self.accept(n, value_to_propose)
                
                if accept_result.success:
                    # Value was chosen!
                    return value_to_propose
                    
            # Round failed (no majority, or someone sent a higher Prepare).
            # Wait a random interval before retrying with a higher number,
            # so competing proposers do not collide forever.
            time.sleep(random.uniform(0, 0.05 * (attempt + 1)))
            
        return None
                
                
    def accept(self, proposal_number: ProposalNumber, value: Any) -> AcceptResult:
        """
        Phase 2a: Send Accept request to all acceptors.
        
        Returns success if a majority accept.
        """
        accepted_count = 0
        
        for acceptor in self.acceptors:
            try:
                response = self._send_accept(acceptor, proposal_number, value)
                
                if response.type == "ACCEPTED":
                    accepted_count += 1
                    # Notify learners
                    self._notify_learners(acceptor, proposal_number, value)
                # Otherwise the acceptor has promised a higher proposal.
                # A single rejection does not doom the round: the remaining
                # acceptors may still form a majority, so keep going.
                        
            except NetworkTimeout:
                pass  # Continue with other acceptors
                
        majority_threshold = len(self.acceptors) // 2 + 1
        
        return AcceptResult(
            success=(accepted_count >= majority_threshold),
            accepted_count=accepted_count
        )
 
 
class PaxosAcceptor:
    """
    Paxos acceptor handling Phase 2 (Accept).
    """
    
    def handle_accept(self, proposal_number: ProposalNumber, value: Any) -> Response:
        """
        Phase 2b: Handle Accept request from proposer.
        
        Accept if we haven't promised to a higher proposal.
        """
        if self.highest_promised is None or proposal_number >= self.highest_promised:
            # Accept this proposal
            
            # Update our state
            self.highest_promised = proposal_number
            self.accepted_number = proposal_number
            self.accepted_value = value
            
            # CRITICAL: Persist before responding
            self.storage.put("highest_promised", proposal_number)
            self.storage.put("accepted_number", proposal_number)
            self.storage.put("accepted_value", value)
            self.storage.flush()
            
            return Accepted(
                proposal_number=proposal_number,
                value=value
            )
        else:
            # Reject - we've promised to a higher proposal
            return Reject(
                proposal_number=proposal_number,
                highest_promised=self.highest_promised
            )
 
 
class PaxosLearner:
    """
    Paxos learner discovering the chosen value.
    """
    
    def __init__(self, acceptor_count: int):
        self.acceptor_count = acceptor_count
        self.accepted_values: Dict[str, Tuple[ProposalNumber, Any]] = {}
        self.chosen_value: Optional[Any] = None
        
    def on_accepted(self, acceptor_id: str, proposal_number: ProposalNumber, value: Any):
        """
        Receive notification that an acceptor accepted a value.
        
        Check if this value has now been chosen (majority accepted).
        """
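        # Keep only the highest-numbered acceptance per acceptor, so a delayed
        # or duplicated notification cannot overwrite newer information
        previous = self.accepted_values.get(acceptor_id)
        if previous is None or proposal_number >= previous[0]:
            self.accepted_values[acceptor_id] = (proposal_number, value)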
        
        # Count how many acceptors have accepted this exact (number, value) pair
        count = sum(
            1 for (n, v) in self.accepted_values.values()
            if n == proposal_number and v == value
        )
        
        majority_threshold = self.acceptor_count // 2 + 1
        
        if count >= majority_threshold and self.chosen_value is None:
            self.chosen_value = value
            print(f"Value chosen: {value}")
            
    def get_chosen_value(self) -> Optional[Any]:
        """Return the chosen value, or None if not yet known."""
        return self.chosen_value

The Safety Invariant

Paxos maintains this invariant: If a value v is chosen (accepted by a majority with number n), then every proposal with number > n will also propose v. Why? Any majority of acceptors includes one who accepted v, and Phase 1 forces the proposer to adopt v. This is the key insight that makes Paxos safe.
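The invariant rests on the fact that any two majorities share at least one acceptor. For a small cluster this can be checked exhaustively. The sketch below enumerates every minimal majority of five acceptors; `majorities` and `all_pairs_intersect` are hypothetical helper names.

```python
from itertools import combinations
from typing import FrozenSet, List


def majorities(acceptors: List[str]) -> List[FrozenSet[str]]:
    """All subsets of exactly majority size (larger quorums are supersets of these)."""
    size = len(acceptors) // 2 + 1
    return [frozenset(c) for c in combinations(acceptors, size)]


def all_pairs_intersect(quorums: List[FrozenSet[str]]) -> bool:
    """True if every pair of quorums shares at least one member."""
    return all(q1 & q2 for q1, q2 in combinations(quorums, 2))


cluster = ["A1", "A2", "A3", "A4", "A5"]
quorums = majorities(cluster)
safe = all_pairs_intersect(quorums)

# Quorums of size 2 are NOT majorities, and some pairs are disjoint.
too_small = [frozenset(c) for c in combinations(cluster, 2)]
unsafe = all_pairs_intersect(too_small)
```

The check succeeds for majorities and fails for quorums of size 2, for example `{A1, A2}` and `{A3, A4}`. The general argument is a counting one: two sets that each contain more than half of the acceptors cannot be disjoint.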

Worked Example: Paxos in Action

Let's trace through a complete Paxos execution with competing proposals to see how the protocol ensures safety.

Setup:

5 acceptors: A1, A2, A3, A4, A5
2 proposers: P1 (wants to propose "X") and P2 (wants to propose "Y")
Majority quorum: 3 acceptors

Scenario: P1 starts first, but messages to A4 and A5 are delayed

Paxos Timeline
Step	Action	A1	A2	A3	A4	A5
1	Initial state
2	P1: Prepare(1)	Promise(1,-)	Promise(1,-)	Promise(1,-)	(delayed)	(delayed)
3	P1 gets 3 promises, proceeds to Accept
4	P1: Accept(1,X)	Accept(1,X)✓	Accept(1,X)✓	Accept(1,X)✓	(delayed)	(delayed)
5	X is chosen (majority A1,A2,A3)	X	X	X
6	P2: Prepare(2)	Promise(2,1,X)	Promise(2,1,X)	Promise(2,1,X)	Promise(2,-)	Promise(2,-)
7	P2 sees X was accepted at n=1, must adopt X
8	P2: Accept(2,X) [not Y!]	Accept(2,X)✓	Accept(2,X)✓	Accept(2,X)✓	Accept(2,X)✓	Accept(2,X)✓
9	X confirmed again with higher number	X	X	X	X	X

Key observations:

Step 5: X is chosen when 3 acceptors (A1, A2, A3) accept it with proposal number 1.
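Step 6-7: When P2 runs Prepare(2), it receives promises from all 5 acceptors. Three of them (A1, A2, A3) report having accepted X at proposal number 1. P2 must propose X, not Y, because at least one promise reported an accepted value. P2 cannot tell whether X was actually chosen or merely accepted by a few acceptors, so it adopts X in either case.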
Step 8-9: P2's Accept(2, X) succeeds and reinforces X as the chosen value.
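The timeline above can be replayed with a small in-memory simulation. This is a simplified sketch under stated assumptions: there is no real network or persistence, proposal numbers are plain integers with 0 meaning "nothing promised yet", and `Acceptor` and `run_round` are hypothetical names. Delayed messages are modelled by simply leaving A4 and A5 out of P1's reachable set.

```python
from typing import Any, List, Optional, Tuple


class Acceptor:
    def __init__(self) -> None:
        self.promised: int = 0  # 0 means no promise has been made yet
        self.accepted: Optional[Tuple[int, Any]] = None

    def on_prepare(self, n: int) -> Optional[Tuple[int, Optional[Tuple[int, Any]]]]:
        """Return (n, previously accepted pair) as a promise, or None to reject."""
        if n > self.promised:
            self.promised = n
            return (n, self.accepted)
        return None

    def on_accept(self, n: int, value: Any) -> bool:
        """Accept unless a higher-numbered promise has been made."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def run_round(n: int, my_value: Any, reachable: List[Acceptor], total: int) -> Optional[Any]:
    """Run both phases against the reachable acceptors; return the chosen value or None."""
    majority = total // 2 + 1
    promises = [p for p in (a.on_prepare(n) for a in reachable) if p is not None]
    if len(promises) < majority:
        return None
    # Adoption rule: take the value with the highest accepted number, if any.
    prior = [acc for (_, acc) in promises if acc is not None]
    value = max(prior, key=lambda pair: pair[0])[1] if prior else my_value
    acks = sum(1 for a in reachable if a.on_accept(n, value))
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(5)]  # A1..A5

# Steps 2-5: P1 proposes X but only reaches A1, A2, A3.
p1_result = run_round(1, "X", acceptors[:3], total=5)

# Steps 6-9: P2 wants Y, reaches all five acceptors, and is forced to adopt X.
p2_result = run_round(2, "Y", acceptors, total=5)
```

Both rounds end with X: P1 because it was free to choose, and P2 because the promises from A1, A2 and A3 carried `(1, "X")`.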

Even if P2 had reached A4, A5 before P1:

Alternate Timeline: P2 Reaches A4, A5 First
Step	Action	A1	A2	A3	A4	A5
1	P2: Prepare(2) reaches A4, A5 first				Promise(2,-)	Promise(2,-)
2	P1: Prepare(1)	Promise(1,-)	Promise(1,-)	Promise(1,-)	(already promised 2)	(already promised 2)
3	P1 has 3 promises, proceeds
4	P1: Accept(1,X) to A1,A2,A3	Accept(1,X)✓	Accept(1,X)✓	Accept(1,X)✓
5	P1: Accept(1,X) to A4,A5				Reject (promised 2)	Reject (promised 2)
6	P2: Prepare(2) to A1,A2,A3	Promise(2,1,X)	Promise(2,1,X)	Promise(2,1,X)
7	P2 sees X accepted, adopts X
8	P2: Accept(2,X)	Accept(2,X)✓	Accept(2,X)✓	Accept(2,X)✓	Accept(2,X)✓	Accept(2,X)✓

The crucial point:

Even in this alternate timeline, P2 learns about X when it collects promises from A1, A2, A3. Because X was accepted by a majority, any majority P2 consults will include at least one acceptor that knows about X.

When does Paxos not terminate?

Paxos can fail to terminate if proposers keep "stepping on each other":

P1 sends Prepare(1), gets promises from majority
Before P1 can send Accept, P2 sends Prepare(2)
Majority promises to P2, blocking P1's Accept
Before P2 can send Accept, P1 sends Prepare(3)
...and so on forever

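This livelock, often called dueling proposers, is consistent with the FLP impossibility result: no deterministic consensus protocol can guarantee termination in a fully asynchronous system if even one node may crash. In practice, randomized timeouts make the problem unlikely: proposers wait a random interval before retrying, so continuous collisions become improbable.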
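A randomized retry delay might look like the sketch below. The exponential growth and the specific defaults are assumptions added for illustration (the text only calls for a random interval), and `backoff_delay` is a hypothetical helper name.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base: float = 0.05, cap: float = 2.0,
                  rng: Optional[random.Random] = None) -> float:
    """Return a random delay in [0, min(cap, base * 2**attempt)] seconds.

    The upper bound grows with each failed attempt, and the random draw
    keeps two competing proposers from retrying in lockstep.
    """
    source = rng if rng is not None else random
    ceiling = min(cap, base * (2 ** attempt))
    return source.uniform(0.0, ceiling)
```

A proposer would sleep for `backoff_delay(attempt)` seconds after each failed round before issuing a Prepare with a higher number. Passing a seeded `random.Random` as `rng` makes the delays reproducible in tests.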

Why Each Step is Necessary

Each element of Paxos exists to prevent specific failure modes. Let's examine why we can't simplify the protocol.

Why Phase 1 (Prepare)?

Without Phase 1, we can't prevent conflicting acceptances:

Imagine acceptors just accept the first proposal they see. With network delays, different acceptors see different proposals first. A1 might accept X while A2 accepts Y. Neither achieves majority. Now what? We're stuck, or worse, both might eventually get accepted by different majorities in succession.

Phase 1's promises ensure that once a proposer starts phase 2, old proposals can't interfere.

Why must we adopt the highest accepted value?

Without this rule, we can overwrite a chosen value:

What If We Don't Adopt Accepted Values?
Step	Without Adoption Rule	Consequence
1	P1 gets promises from A1, A2, A3	No one has accepted anything yet
2	P1 sends Accept(1, X); A1, A2, A3 accept	X is chosen (3 of 5 acceptors)
3	P2 sends Prepare(2) to A3, A4, A5	A3 reports (1, X); A4, A5 report nothing
4	P2 ignores A3's report and sends Accept(2, Y)	Nothing stops it: all three promised 2
5	A3, A4, A5 accept Y	A3 overwrites X with Y
6	Y is chosen (3 acceptors)	But X was already chosen in step 2!

Steps 4-6 violate agreement: a learner that observed step 2 has learned X, while a learner that observes step 6 learns Y. Promises alone cannot prevent this, because they only block proposals numbered lower than 2, and P2's own proposal is numbered 2.

The adoption rule closes the gap. Because any two majorities overlap, P2's Phase 1 quorum must contain at least one acceptor (here A3) that accepted X, so P2 is forced to send Accept(2, X). Conversely, suppose P1 had crashed after reaching only A1 and A2. P2's quorum A3, A4, A5 would then report nothing and P2 could safely propose Y: X was never chosen, and it never can be under proposal number 1, because A3 has promised to reject anything numbered below 2.

Why unique, ordered proposal numbers?

Without ordering, we can't determine which promise takes precedence.

If proposals are unordered, an acceptor that promised to "proposal A" has no way to determine if "proposal B" is higher or lower. The promise mechanism only works because proposals are totally ordered.

Minimal Viable Consensus

Paxos is surprisingly minimal. Every element exists to prevent a specific failure. Remove the Prepare phase, and conflicting values can be accepted. Remove the adoption rule, and chosen values can be overwritten. Remove proposal ordering, and promises become meaningless. Paxos is consensus stripped to its essence.

From Single-Decree to Multi-Paxos

Single-decree Paxos chooses one value. For practical systems like replicated databases, we need Multi-Paxos: a sequence of Paxos instances, each choosing one log entry.

Conceptually:

Run separate Paxos instances for log positions 1, 2, 3, ...
Instance i agrees on the value at log position i
Apply log entries in order to replicate state machine
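The "apply in order" step can be sketched as a loop that stops at the first gap in the log. The function name `apply_chosen_entries` and the dictionary-based log are assumptions made for this example; `chosen` maps a log position to the value its Paxos instance decided.

```python
from typing import Any, Callable, Dict


def apply_chosen_entries(chosen: Dict[int, Any], last_applied: int,
                         apply: Callable[[Any], None]) -> int:
    """Apply consecutive chosen entries after `last_applied`; stop at the first gap.

    Returns the index of the last entry that was applied.
    """
    index = last_applied + 1
    while index in chosen:
        apply(chosen[index])
        index += 1
    return index - 1


# Positions 1, 2 and 4 are decided; position 3 is still in progress.
state_machine_input = []
log = {1: "set x=1", 2: "set y=2", 4: "set z=3"}
last = apply_chosen_entries(log, 0, state_machine_input.append)

# Once instance 3 decides, entries 3 and 4 can both be applied.
log[3] = "no-op"
last = apply_chosen_entries(log, last, state_machine_input.append)
```

Entry 4 is held back until entry 3 is chosen, which is what keeps every replica's state machine executing the same commands in the same order.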

The leader optimization:

Running full two-phase Paxos for every log entry is expensive. Multi-Paxos optimizes:

A stable leader completes Phase 1 for many log positions in advance
Only Phase 2 (Accept) is needed for each new entry
If the leader changes (failure or partition), the new leader runs Phase 1 for pending positions

This reduces steady-state operation from 4 message delays (Prepare → Promise → Accept → Accepted) to 2 (Accept → Accepted).
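The saving adds up quickly over many entries. The back-of-the-envelope sketch below counts message delays under two simplifying assumptions that are not in the text: entries are replicated one after another with no pipelining or batching, and the leader never changes. `message_delays` is a hypothetical helper name.

```python
def message_delays(entries: int, stable_leader: bool) -> int:
    """Count sequential message delays needed to replicate `entries` log entries."""
    if stable_leader:
        # One Phase 1 round trip up front, then one Phase 2 round trip per entry.
        return 2 + 2 * entries
    # Full two-phase Paxos for every entry.
    return 4 * entries


full = message_delays(100, stable_leader=False)
optimized = message_delays(100, stable_leader=True)
```

For 100 entries this gives 400 delays with full Paxos per entry against 202 with a stable leader, so the cost per entry approaches half as the log grows.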

multi_paxos.py
Python
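from typing import Any, Dict, List, Optional, Tuple

# ProposalNumber, Promise, LogEntry, PrepareResponse, AcceptResponse,
# DurableStorage, NetworkTimeout and NotLeaderError are assumed to be
# defined elsewhere.


class MultiPaxosLeader:
    """
    Multi-Paxos leader using leader optimization.
    
    Once leader is established, only Phase 2 is needed
    for each new log entry.
    """
    
    def __init__(self, node_id: int, acceptors: List[str]):
        self.node_id = node_id
        self.acceptors = acceptors
        self.sequence_number = 0  # Incremented for each leadership attempt
        self.current_proposal_number: Optional[ProposalNumber] = None
        self.prepared_up_to: int = -1  # Highest log index we've prepared
        self.is_leader = False
        
    def generate_proposal_number(self) -> ProposalNumber:
        """Same (sequence, node_id) scheme as the single-decree proposer."""
        self.sequence_number += 1
        return ProposalNumber(self.sequence_number, self.node_id)
        
    def majority_size(self) -> int:
        """Smallest number of acceptors that forms a majority."""
        return len(self.acceptors) // 2 + 1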
        
    def become_leader(self) -> bool:
        """
        Establish leadership by running Phase 1 for all pending log positions.
        
        Returns True if we successfully became leader.
        """
        n = self.generate_proposal_number()
        
        # Prepare for "infinity" - all future log positions
        # In practice, send highest known log index to acceptors
        promises = []
        
        for acceptor in self.acceptors:
            try:
                response = self._send_prepare(acceptor, n)
                if response.type == "PROMISE":
                    promises.append(response)
            except NetworkTimeout:
                pass
                
        if len(promises) < self.majority_size():
            return False
            
        # We're now the leader for this proposal number
        self.current_proposal_number = n
        self.is_leader = True
        
        # Recover any in-progress log entries from promises
        self._recover_log_entries(promises)
        
        return True
        
    def replicate_entry(self, log_index: int, value: Any) -> bool:
        """
        Replicate a log entry (Phase 2 only, since we're leader).
        
        Only works if we're the established leader.
        """
        if not self.is_leader:
            raise NotLeaderError("Must be leader to replicate")
            
        # With leader optimization, skip Phase 1
        # Go directly to Phase 2 (Accept)
        
        accepted_count = 0
        for acceptor in self.acceptors:
            try:
                response = self._send_accept(
                    acceptor, 
                    self.current_proposal_number,
                    log_index,
                    value
                )
                if response.type == "ACCEPTED":
                    accepted_count += 1
                elif response.type == "REJECT":
                    # Someone has a higher proposal number
                    # We're no longer the leader
                    self.is_leader = False
                    return False
            except NetworkTimeout:
                pass
                
        return accepted_count >= self.majority_size()
        
    def _recover_log_entries(self, promises: List[Promise]):
        """
        Recover uncommitted log entries from Phase 1 promises.
        
        For each log position, we must adopt the highest-numbered
        accepted value, if any.
        """
        # Collect all accepted entries from promises
        accepted_entries: Dict[int, Tuple[ProposalNumber, Any]] = {}
        
        for promise in promises:
            for log_index, (n, v) in promise.accepted_entries.items():
                if log_index not in accepted_entries:
                    accepted_entries[log_index] = (n, v)
                elif n > accepted_entries[log_index][0]:
                    accepted_entries[log_index] = (n, v)
                    
        # For each recovered entry, complete Phase 2
        for log_index, (n, v) in sorted(accepted_entries.items()):
            # Must re-propose this value to ensure it's committed
            success = self.replicate_entry(log_index, v)
            if not success:
                # Lost leadership during recovery
                return
 
 
class MultiPaxosFollower:
    """
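    Multi-Paxos follower (acceptor).
    """
    
    def __init__(self, node_id: str, storage: DurableStorage):
        self.node_id = node_id
        self.storage = storage
        
        # Per-log-position state (accepted entries would be reloaded
        # from storage in the same way as the promise below)
        self.log: Dict[int, LogEntry] = {}
        # Recover the promise from storage so it survives crashes
        self.highest_promised: Optional[ProposalNumber] = storage.get("highest_promised", None)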
        
    def handle_prepare(self, proposal_number: ProposalNumber) -> PrepareResponse:
        """
        Handle Prepare from aspiring leader.
        
        For Multi-Paxos, this covers all log positions.
        """
        if self.highest_promised is None or proposal_number > self.highest_promised:
            # Update promise
            self.highest_promised = proposal_number
            self.storage.put("highest_promised", proposal_number)
            self.storage.flush()
            
            # Return all accepted but uncommitted log entries
            return PrepareResponse(
                type="PROMISE",
                proposal_number=proposal_number,
                accepted_entries=self._get_accepted_entries()
            )
        else:
            return PrepareResponse(
                type="REJECT",
                highest_promised=self.highest_promised
            )
            
    def handle_accept(self, proposal_number: ProposalNumber, 
                     log_index: int, value: Any) -> AcceptResponse:
        """
        Handle Accept from leader.
        
        For multi-Paxos, each Accept is for a specific log position.
        """
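        if self.highest_promised is None or proposal_number >= self.highest_promised:
            # Accepting n also implies promising n, so that a later,
            # lower-numbered Prepare or Accept cannot overwrite this entry
            self.highest_promised = proposal_number
            self.log[log_index] = LogEntry(
                index=log_index,
                proposal_number=proposal_number,
                value=value
            )
            
            # CRITICAL: Persist before responding
            self.storage.put("highest_promised", proposal_number)
            self.storage.put(f"log_{log_index}", self.log[log_index])
            self.storage.flush()
            
            return AcceptResponse(type="ACCEPTED")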
        else:
            return AcceptResponse(
                type="REJECT",
                highest_promised=self.highest_promised
            )

The Leader Optimization in Practice

In steady state, Multi-Paxos with a stable leader achieves 2-message-delay commits. The leader sends Accept, waits for Accepted responses from a majority, and then knows the entry is committed. Raft's normal operation, in which a stable leader replicates entries with AppendEntries, follows the same pattern. The Prepare phase only runs during leader election or when the current leader is challenged.

Summary: Paxos Protocols

We've built a deep understanding of Paxos, the foundational consensus algorithm. Let's consolidate the essential insights:

Key Takeaways

• Three roles: Proposers drive consensus, acceptors vote and remember, learners discover the result. In practice, a single node often plays all three roles.
• Two phases: Prepare establishes leadership and learns history. Accept proposes a value and collects votes.
• The adoption rule: If any promise reports an accepted value, the proposer must propose the value with the highest accepted number. This prevents overwriting a value that may already be chosen.
• Quorum intersection: Any two majorities share at least one acceptor, so any chosen value is visible to every later majority. This enables safe handoff between proposers.
• Persistence before response: Acceptors must persist promises and acceptances before responding, so a recovering acceptor never breaks an earlier promise or forgets an accepted value.
• Multi-Paxos optimization: A stable leader runs Phase 1 once for all log positions, reducing steady-state replication to Phase 2 only.
• Liveness via timeouts: Randomized retry intervals make indefinite proposer collisions unlikely; they do not guarantee termination in a fully asynchronous network.
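
Two of these takeaways, quorum intersection and the adoption rule, are small enough to check directly. The sketch below is illustrative only: the function names are hypothetical, proposal numbers are plain integers for brevity, and a promise is modelled as an `(accepted_number, accepted_value)` pair.

```python
from itertools import combinations
from typing import Any, FrozenSet, List, Optional, Tuple

Promise = Tuple[Optional[int], Any]  # (accepted_number, accepted_value)


def majorities(acceptors: List[str]) -> List[FrozenSet[str]]:
    """All subsets of exactly majority size (larger quorums contain one)."""
    size = len(acceptors) // 2 + 1
    return [frozenset(c) for c in combinations(acceptors, size)]


def all_quorums_intersect(acceptors: List[str]) -> bool:
    """True if every pair of majorities shares at least one acceptor."""
    quorums = majorities(acceptors)
    return all(q1 & q2 for q1 in quorums for q2 in quorums)


def value_to_propose(promises: List[Promise], own_value: Any) -> Any:
    """Adoption rule: use the value with the highest accepted number, if any."""
    accepted = [(n, v) for (n, v) in promises if n is not None]
    if not accepted:
        return own_value
    return max(accepted, key=lambda pair: pair[0])[1]


if __name__ == "__main__":
    nodes = ["A1", "A2", "A3", "A4", "A5"]
    print(all_quorums_intersect(nodes))                       # True
    # One acceptor in the quorum already accepted X at number 1
    print(value_to_propose([(1, "X"), (None, None), (None, None)], "Y"))  # X
    # Nobody in the quorum accepted anything: free to propose Y
    print(value_to_propose([(None, None)] * 3, "Y"))          # Y
```

Because a value is chosen only once a majority has accepted it, and any later Prepare also needs a majority, at least one promise always reports it. That is the link between the two rules.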

What's next:

While Paxos is mathematically elegant, its understandability has been a persistent challenge. Raft was designed from the ground up to be more comprehensible while providing the same guarantees. In the next section, we'll explore Raft's approach and see how it makes consensus more accessible to implementers.

Page Complete

You now understand Paxos: why each phase exists, how the safety invariants work, and how Multi-Paxos extends single-decree consensus to replicated logs. This knowledge provides the foundation for understanding all modern consensus protocols, including Raft, which we'll explore next.
