Two-Phase Commit's Achilles heel is blocking—when the coordinator fails at a critical moment, participants can be left in limbo indefinitely, holding locks on precious resources. In 1983, Dale Skeen proposed Three-Phase Commit (3PC) as a solution, introducing an additional phase to ensure that no single failure can block the protocol.
The insight behind 3PC is elegant: insert an intermediate state between 'prepared' and 'committed' so that participants can always make progress, even without the coordinator. But as we'll see, this elegance comes with its own costs, and the protocol fails under conditions that are surprisingly common in real distributed systems.
By the end of this page, you will understand the Three-Phase Commit protocol, how the pre-commit phase prevents blocking, why 3PC fails under network partitions, its relationship to the FLP impossibility result, and why modern systems rarely use 3PC.
3PC extends 2PC by splitting the second phase into two separate phases, creating a buffer state that allows recovery without blocking.
Phase 1: Prepare (CanCommit)
Identical to 2PC's first phase:
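- The coordinator sends PREPARE to all participants.
- Each participant replies with VOTE_COMMIT or VOTE_ABORT.
- If any participant votes to abort (or does not reply in time), the coordinator sends ABORT to everyone.

Phase 2: Pre-Commit (New Phase)
This is the critical addition:
- If every participant voted VOTE_COMMIT, the coordinator sends PRE_COMMIT to all participants.
- Each participant records the pre-committed state and replies with an ACK.

Phase 3: Commit (DoCommit)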
- Once every participant has acknowledged PRE_COMMIT, the coordinator sends COMMIT.
- Each participant commits, releases its locks, and acknowledges.

```
┌────────────────────────────────────────────────────────────────┐
│                       THREE-PHASE COMMIT                       │
│                                                                │
│ Coordinator          Participant 1         Participant 2       │
│      │                     │                     │             │
│      │──── PREPARE ───────►│                     │             │
│      │──── PREPARE ─────────────────────────────►│             │
│      │                     │                     │             │
│      │◄─── VOTE_COMMIT ────│                     │             │
│      │◄─── VOTE_COMMIT ──────────────────────────│             │
│      │                     │                     │             │
│      │──── PRE_COMMIT ────►│                     │  Phase 2    │
│      │──── PRE_COMMIT ──────────────────────────►│  (NEW)      │
│      │                     │                     │             │
│      │◄─── ACK ────────────│                     │             │
│      │◄─── ACK ──────────────────────────────────│             │
│      │                     │                     │             │
│      │──── COMMIT ────────►│                     │             │
│      │──── COMMIT ──────────────────────────────►│             │
│      │                     │                     │             │
│      │◄─── ACK ────────────│                     │             │
│      │◄─── ACK ──────────────────────────────────│             │
└────────────────────────────────────────────────────────────────┘
```
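```typescript
// Three-Phase Commit Protocol Implementation

type ParticipantState3PC =
  | 'WORKING'
  | 'PREPARED'
  | 'PRE_COMMITTED' // New state in 3PC
  | 'COMMITTED'
  | 'ABORTED';

type VoteResult = 'VOTE_COMMIT' | 'VOTE_ABORT';

// The supporting types are not shown on this page; these minimal shapes
// are assumptions so that the listing type-checks.
interface Transaction { id: string; }
interface TransactionLog { write(txId: string, record: string): Promise<void>; }
interface ParticipantLog { write(txId: string, record: string): Promise<void>; }
interface Participant3PC {
  prepare(tx: Transaction): Promise<VoteResult>;
  preCommit(txId: string): Promise<void>;
  commit(txId: string): Promise<void>;
  abort(txId: string): Promise<void>;
  queryState(txId: string): Promise<ParticipantState3PC>;
}

class ThreePhaseCommitCoordinator {
  private timeout: number = 10000;

  constructor(
    private transactionLog: TransactionLog,
    private participants: Participant3PC[]
  ) {}

  async executeTransaction(transaction: Transaction): Promise<boolean> {
    const txId = transaction.id;

    // ========== PHASE 1: PREPARE ==========
    console.log(`[${txId}] Phase 1: Sending PREPARE`);
    const votes = await this.collectVotes(transaction);

    if (!votes.allCommit) {
      await this.broadcastAbort(txId);
      return false;
    }

    // ========== PHASE 2: PRE-COMMIT ==========
    // This is the key addition in 3PC
    console.log(`[${txId}] Phase 2: Sending PRE_COMMIT`);
    await this.transactionLog.write(txId, 'PRE_COMMITTING');

    const preCommitAcks = await Promise.all(
      this.participants.map(async (p) => {
        try {
          await this.withTimeout(p.preCommit(txId), this.timeout);
          return true;
        } catch (error) {
          return false;
        }
      })
    );

    // If any participant failed to pre-commit, we can still abort because
    // no COMMIT has been sent yet. This relies on the ABORT arriving before
    // the participants' longer autonomous-commit timeout fires.
    if (!preCommitAcks.every(ack => ack)) {
      await this.broadcastAbort(txId);
      return false;
    }

    // ========== PHASE 3: COMMIT ==========
    console.log(`[${txId}] Phase 3: Sending COMMIT`);
    await this.transactionLog.write(txId, 'COMMITTING');

    await Promise.all(
      this.participants.map(p => this.retryUntilSuccess(() => p.commit(txId)))
    );

    await this.transactionLog.write(txId, 'COMMITTED');
    return true;
  }

  private async collectVotes(tx: Transaction): Promise<{ allCommit: boolean }> {
    // A participant that errors or times out counts as VOTE_ABORT
    const votes = await Promise.all(
      this.participants.map(p =>
        this.withTimeout(p.prepare(tx), this.timeout, 'VOTE_ABORT')
      )
    );
    return { allCommit: votes.every(v => v === 'VOTE_COMMIT') };
  }

  // Record the abort, then tell every participant (best effort)
  private async broadcastAbort(txId: string): Promise<void> {
    await this.transactionLog.write(txId, 'ABORTED');
    await Promise.all(
      this.participants.map(p => p.abort(txId).catch(() => undefined))
    );
  }

  // Resolve with `fallback` on timeout or error if one is given; otherwise reject
  private withTimeout<T>(work: Promise<T>, ms: number, fallback?: T): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const fail = (error: Error) => {
        if (fallback !== undefined) {
          resolve(fallback);
        } else {
          reject(error);
        }
      };
      const timer = setTimeout(() => fail(new Error(`Timed out after ${ms} ms`)), ms);
      work.then(
        (value) => { clearTimeout(timer); resolve(value); },
        (error) => { clearTimeout(timer); fail(error); }
      );
    });
  }

  // Keep retrying with a fixed delay; the decision to commit is already final
  private async retryUntilSuccess(op: () => Promise<void>, delayMs: number = 1000): Promise<void> {
    for (;;) {
      try {
        await op();
        return;
      } catch (error) {
        await new Promise<void>(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
}

class ThreePhaseCommitParticipant {
  private state: ParticipantState3PC = 'WORKING';
  private timeoutMs: number = 15000;

  constructor(private participantLog: ParticipantLog) {}

  async prepare(transaction: Transaction): Promise<VoteResult> {
    // Validate and prepare transaction
    const canCommit = await this.validateAndPrepare(transaction);

    if (canCommit) {
      await this.participantLog.write(transaction.id, 'PREPARED');
      this.state = 'PREPARED';
      return 'VOTE_COMMIT';
    } else {
      this.state = 'ABORTED';
      return 'VOTE_ABORT';
    }
  }

  async preCommit(txId: string): Promise<void> {
    // Transition to pre-committed state
    await this.participantLog.write(txId, 'PRE_COMMITTED');
    this.state = 'PRE_COMMITTED';

    // Start timeout - if no COMMIT received, we can commit autonomously
    this.startCommitTimeout(txId);
  }

  async commit(txId: string): Promise<void> {
    await this.participantLog.write(txId, 'COMMITTED');
    this.state = 'COMMITTED';
    // Release locks, apply changes
  }

  async abort(txId: string): Promise<void> {
    await this.participantLog.write(txId, 'ABORTED');
    this.state = 'ABORTED';
    // Release locks, discard changes
  }

  async queryState(_txId: string): Promise<ParticipantState3PC> {
    return this.state;
  }

  // Application-specific validation and lock acquisition (placeholder)
  protected async validateAndPrepare(_transaction: Transaction): Promise<boolean> {
    return true;
  }

  private startCommitTimeout(txId: string): void {
    setTimeout(async () => {
      if (this.state === 'PRE_COMMITTED') {
        // KEY 3PC INSIGHT: If we're pre-committed and time out, we KNOW
        // every participant voted to commit (the coordinator only sends
        // PRE_COMMIT after a unanimous vote), so under the fail-stop,
        // no-partition assumption we can commit autonomously.
        console.log(`Timeout: Committing ${txId} autonomously`);
        await this.commit(txId);
      }
    }, this.timeoutMs);
  }
}
```

The genius of 3PC lies in ensuring that no participant state leads directly to both the COMMIT and the ABORT outcome, so a participant never has to guess the decision while the coordinator is unavailable. Let's analyze why this works.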
The 2PC Problem:
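In 2PC, a participant in PREPARED state faces a dilemma if the coordinator fails: it cannot commit, because the coordinator may have decided to abort, and it cannot abort, because the coordinator may already have told another participant to commit. Its only safe option is to block, holding its locks, until the coordinator recovers.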
The 3PC Solution:
The PRE_COMMITTED state creates a clear boundary:
| If I'm in... | And I learn another is in... | Safe Decision |
|---|---|---|
| PREPARED | WORKING | ABORT (they never prepared) |
| PREPARED | PREPARED | ABORT (no PRE_COMMIT sent yet) |
| PREPARED | PRE_COMMITTED | Wait or COMMIT (decision was commit) |
| PRE_COMMITTED | PREPARED | COMMIT (PRE_COMMIT is only sent after a unanimous commit vote; the other participant has not received it yet) |
| PRE_COMMITTED | PRE_COMMITTED | COMMIT (safe to proceed) |
| PRE_COMMITTED | COMMITTED | COMMIT (obviously) |
3PC maintains the invariant: No participant can be in COMMITTED state while another is in WORKING or PREPARED state. The PRE_COMMITTED state acts as a buffer, ensuring all participants cross into 'ready to commit' territory before any participant actually commits.
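One way to see the difference is to write down each protocol's participant transitions and look for states that lead directly to both outcomes. The sketch below does that, assuming transition tables that model only the failure-free message flow described on this page; `twoPhase`, `threePhase`, and `ambiguousStates` are illustrative names, not part of any library.

```typescript
type State = 'WORKING' | 'PREPARED' | 'PRE_COMMITTED' | 'COMMITTED' | 'ABORTED';
type Transitions = Partial<Record<State, State[]>>;

// Participant transitions in 2PC: PREPARED can go straight to either outcome.
const twoPhase: Transitions = {
  WORKING: ['PREPARED', 'ABORTED'],
  PREPARED: ['COMMITTED', 'ABORTED'],
};

// Participant transitions in 3PC: PRE_COMMITTED sits between PREPARED and COMMITTED.
const threePhase: Transitions = {
  WORKING: ['PREPARED', 'ABORTED'],
  PREPARED: ['PRE_COMMITTED', 'ABORTED'],
  PRE_COMMITTED: ['COMMITTED'],
};

// States from which a single step can reach both COMMITTED and ABORTED.
function ambiguousStates(transitions: Transitions): State[] {
  return (Object.keys(transitions) as State[]).filter((state) => {
    const next = transitions[state] || [];
    return next.includes('COMMITTED') && next.includes('ABORTED');
  });
}

console.log('2PC ambiguous states:', ambiguousStates(twoPhase));
console.log('3PC ambiguous states:', ambiguousStates(threePhase));
```

Under these assumed tables, only 2PC's PREPARED state is one step away from both outcomes, which is the state in which a 2PC participant blocks.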
```typescript
// 3PC Termination Protocol - How participants recover without coordinator

class ThreePhaseTerminationProtocol {
  constructor(
    private localState: ParticipantState3PC,
    private otherParticipants: Participant3PC[]
  ) {}

  async electNewCoordinatorAndDecide(txId: string): Promise<'COMMIT' | 'ABORT'> {
    // Collect states from all reachable participants
    const states = await this.collectParticipantStates(txId);

    // Apply termination rules
    return this.decide(states);
  }

  private decide(states: ParticipantState3PC[]): 'COMMIT' | 'ABORT' {
    // Rule 1: If anyone committed, we must commit
    if (states.includes('COMMITTED')) {
      return 'COMMIT';
    }

    // Rule 2: If anyone aborted, we must abort
    if (states.includes('ABORTED')) {
      return 'ABORT';
    }

    // Rule 3: If anyone is still WORKING, abort
    // (coordinator would have sent ABORT, not PRE_COMMIT)
    if (states.includes('WORKING')) {
      return 'ABORT';
    }

    // Rule 4: If everyone is PREPARED (no one PRE_COMMITTED)
    // Coordinator hadn't sent PRE_COMMIT yet → abort
    if (states.every(s => s === 'PREPARED')) {
      return 'ABORT';
    }

    // Rule 5: If at least one is PRE_COMMITTED
    // Coordinator decided to commit → commit
    if (states.includes('PRE_COMMITTED')) {
      return 'COMMIT';
    }

    // Should never reach here
    throw new Error('Unexpected state combination');
  }

  private async collectParticipantStates(txId: string): Promise<ParticipantState3PC[]> {
    const states: ParticipantState3PC[] = [this.localState];

    for (const participant of this.otherParticipants) {
      try {
        const state = await participant.queryState(txId);
        states.push(state);
      } catch (error) {
        // Participant unreachable - skip
        // This is where network partitions cause problems!
      }
    }

    return states;
  }
}
```

While 3PC solves the fail-stop (crash) problem, it fails catastrophically under network partitions, when nodes can't communicate but are still running.
The Partition Scenario:
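Imagine three participants: A, B, and C. Partway through Phase 2, A and B have received PRE_COMMIT and are in PRE_COMMITTED state, while C has not yet received it and is still PREPARED. At that moment the coordinator becomes unreachable and a network partition separates {A, B} from {C}.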
What happens:
1. A and B time out waiting for the coordinator.
2. They run the termination protocol and see each other in PRE_COMMITTED.
3. Per the 3PC rules, they COMMIT.
4. Meanwhile, C times out.
5. C can't reach A or B.
6. C sees only itself, in PREPARED state.
7. Per the 3PC rules, C ABORTS.
Result: A and B committed while C aborted. We've violated the fundamental safety guarantee of atomic commitment.
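The scenario can be replayed in a few lines. The sketch below is a simplified, standalone restatement of the termination rules described on this page, applied separately to each side of the partition; `decide`, `states`, and `partitions` are illustrative names, and the model ignores timing entirely.

```typescript
type State3PC = 'WORKING' | 'PREPARED' | 'PRE_COMMITTED' | 'COMMITTED' | 'ABORTED';
type Decision = 'COMMIT' | 'ABORT';

// Termination rules, evaluated only over the states a node can actually reach.
function decide(reachable: State3PC[]): Decision {
  if (reachable.includes('COMMITTED')) return 'COMMIT';
  if (reachable.includes('ABORTED')) return 'ABORT';
  if (reachable.includes('WORKING')) return 'ABORT';
  if (reachable.includes('PRE_COMMITTED')) return 'COMMIT';
  return 'ABORT'; // every reachable node is PREPARED
}

// State of each participant at the moment the partition forms.
const states: Record<string, State3PC> = {
  A: 'PRE_COMMITTED',
  B: 'PRE_COMMITTED',
  C: 'PREPARED',
};

// Each inner array is one side of the partition.
const partitions: string[][] = [['A', 'B'], ['C']];

const outcome: Record<string, Decision> = {};
for (const side of partitions) {
  const decision = decide(side.map((name) => states[name]));
  for (const name of side) {
    outcome[name] = decision;
  }
}

const diverged = outcome['A'] !== outcome['C'];
console.log(outcome, diverged ? 'atomicity violated' : 'consistent');
```

Each side follows the rules correctly given what it can see; the divergence comes from treating unreachable nodes as if they had no say in the outcome.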
3PC assumes a fail-stop model where crashed nodes don't come back with stale state. In real networks, a 'failed' node might just be partitioned, then rejoin later with a different decision. This makes 3PC unsafe for any system that might experience network partitions—which is essentially every distributed system.
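The underlying difficulty is that a timeout observes only silence. The following sketch, with a hypothetical `Peer` record and `suspectedFailed` helper, shows that a crashed node and a partitioned-but-running node look identical to a timeout-based check; the timestamps are made-up values for illustration.

```typescript
interface Peer {
  name: string;
  alive: boolean;      // ground truth, which other nodes cannot observe
  lastHeardMs: number; // when the last message from this peer arrived
}

// A timeout-based check can only use the time since the last message.
function suspectedFailed(peer: Peer, nowMs: number, timeoutMs: number): boolean {
  return nowMs - peer.lastHeardMs > timeoutMs;
}

const crashedPeer: Peer = { name: 'C (crashed)', alive: false, lastHeardMs: 1000 };
const partitionedPeer: Peer = { name: 'C (partitioned)', alive: true, lastHeardMs: 1000 };

const nowMs = 20000;
const commitTimeoutMs = 15000;

for (const peer of [crashedPeer, partitionedPeer]) {
  const suspected = suspectedFailed(peer, nowMs, commitTimeoutMs);
  console.log(`${peer.name}: suspected=${suspected}, actually alive=${peer.alive}`);
}
```

Both peers are suspected, yet only one has actually stopped; the other may still be making its own decision.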
The FLP Impossibility:
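This isn't just a flaw in 3PC's design; it reflects a fundamental limit. The Fischer-Lynch-Paterson (FLP) theorem (1985) proves that no deterministic protocol can guarantee that consensus always terminates in a fully asynchronous system where even one process might crash. In such a system, a slow or partitioned node is indistinguishable from a crashed one, which is exactly the situation in which 3PC's timeout-driven decisions go wrong.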
Implications:
| Property | 2PC | 3PC | Paxos/Raft |
|---|---|---|---|
| Safety (No conflicts) | ✓ Always | ✗ Fails under partition | ✓ Always |
| Liveness (Progress) | ✗ Blocking | ✓ Non-blocking (no partition) | ✓ Eventually (with leader) |
| Partition Tolerance | N/A (blocks) | ✗ Unsafe | ✓ Safe |
| Message Complexity | 4N | 6N | O(N) with a stable leader; O(N²) with all-to-all broadcast |
| Latency (rounds) | 2 | 3 | 2+ (with leader) |
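The 4N and 6N figures follow from counting one request and one reply per participant in each round. A small sketch of that arithmetic, assuming N participants and no retries or aborts (`messagesPerTransaction` is an illustrative helper):

```typescript
type CommitProtocol = '2PC' | '3PC';

// Coordinator-to-participant round trips on the successful path.
const ROUND_TRIPS: Record<CommitProtocol, number> = { '2PC': 2, '3PC': 3 };

// Each round trip costs one request and one reply per participant.
function messagesPerTransaction(protocol: CommitProtocol, participants: number): number {
  return ROUND_TRIPS[protocol] * 2 * participants;
}

const n = 5;
console.log(`2PC with ${n} participants: ${messagesPerTransaction('2PC', n)} messages`);
console.log(`3PC with ${n} participants: ${messagesPerTransaction('3PC', n)} messages`);
```

With five participants this gives 20 messages for 2PC and 30 for 3PC, so the extra phase adds 50% more traffic and one more round of latency to every successful transaction.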
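Despite solving the blocking problem, 3PC has largely been abandoned in favor of other approaches: it adds a full message round to every transaction, it is unsafe under network partitions, and consensus-based protocols offer non-blocking progress without giving up safety.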
3PC is an important contribution to distributed systems theory—it proved that non-blocking atomic commitment is possible (under certain assumptions). But the assumptions (no partitions, synchronous timing) don't hold in practice. For real systems, use 2PC (accepting blocking) or consensus-based approaches (Paxos Commit).
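Consensus-based approaches avoid the split decision from the partition scenario by letting only a majority decide. The sketch below shows just that quorum rule for a three-node group; `hasMajority` is a hypothetical helper, and real protocols such as Paxos or Raft add much more (ballots or terms, logs, leader election).

```typescript
// A side of a partition may decide only if it holds a strict majority.
function hasMajority(reachableNodes: number, clusterSize: number): boolean {
  return reachableNodes > Math.floor(clusterSize / 2);
}

const clusterSize = 3;
const sides: Record<string, string[]> = {
  majoritySide: ['A', 'B'],
  minoritySide: ['C'],
};

for (const label of Object.keys(sides)) {
  const canDecide = hasMajority(sides[label].length, clusterSize);
  console.log(`${label} (${sides[label].join(', ')}): can decide = ${canDecide}`);
}
```

Because two disjoint sides can never both hold a majority, at most one of them can decide; the minority side waits instead of guessing, trading availability for safety.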
What's Next:
The next page explores consensus-based coordination, specifically Paxos and Raft. These protocols take a fundamentally different approach: instead of relying on timeouts to infer what other nodes decided, they require a majority of nodes to agree before any decision takes effect, so safety holds even under partitions and progress resumes once a majority can communicate.
You now understand Three-Phase Commit—its mechanics, why it prevents blocking, and critically, why it's unsafe under network partitions. This understanding is essential for appreciating why consensus protocols like Paxos and Raft were developed.