Loading content...
Imagine you're building a banking system where a funds transfer must debit one account and credit another—atomically. In a single database, this is trivial: wrap both operations in a transaction. But what happens when those accounts live in different databases, on different servers, possibly in different data centers?
This is the atomic commitment problem: ensuring that a set of distributed participants either all commit or all abort a transaction, even in the presence of failures. It's one of the most fundamental challenges in distributed computing, and Two-Phase Commit (2PC) is the classical solution that has been deployed in production systems for over four decades.
Understanding 2PC isn't just historical knowledge—it's the foundation for understanding why modern distributed systems make the architectural choices they do, and why alternatives like Saga patterns and eventual consistency exist.
By the end of this page, you will understand the Two-Phase Commit protocol mechanics, its formal guarantees, failure scenarios and recovery procedures, why it's considered a 'blocking' protocol, its performance characteristics, and where it remains appropriate in modern systems.
Two-Phase Commit, as the name implies, operates in two distinct phases. A designated coordinator orchestrates the protocol, while multiple participants (also called resource managers) execute the actual work.
Phase 1: Prepare (Voting Phase)
PREPARE message to all participantsVOTE_COMMIT (ready to commit) or VOTE_ABORT (cannot commit)Phase 2: Commit/Abort (Decision Phase)
GLOBAL_COMMITGLOBAL_ABORT123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104
// Two-Phase Commit Protocol Implementation interface Participant { id: string; prepare(transaction: Transaction): Promise<VoteResult>; commit(transactionId: string): Promise<void>; abort(transactionId: string): Promise<void>;} type VoteResult = 'VOTE_COMMIT' | 'VOTE_ABORT';type TransactionState = 'INITIATED' | 'PREPARING' | 'PREPARED' | 'COMMITTING' | 'COMMITTED' | 'ABORTING' | 'ABORTED'; class TwoPhaseCommitCoordinator { private transactionLog: TransactionLog; // Durable log for recovery private participants: Participant[]; private timeout: number = 30000; // 30 seconds constructor(participants: Participant[], log: TransactionLog) { this.participants = participants; this.transactionLog = log; } async executeTransaction(transaction: Transaction): Promise<boolean> { const txId = transaction.id; // Log transaction initiation (for recovery) await this.transactionLog.write(txId, 'INITIATED', this.participants.map(p => p.id)); try { // ========== PHASE 1: PREPARE ========== await this.transactionLog.write(txId, 'PREPARING'); const votes = await Promise.all( this.participants.map(async (participant) => { try { const vote = await this.withTimeout( participant.prepare(transaction), this.timeout, 'VOTE_ABORT' // Timeout = abort vote ); await this.transactionLog.writeVote(txId, participant.id, vote); return { participantId: participant.id, vote }; } catch (error) { await this.transactionLog.writeVote(txId, participant.id, 'VOTE_ABORT'); return { participantId: participant.id, vote: 'VOTE_ABORT' as VoteResult }; } }) ); const allCommit = votes.every(v => v.vote === 'VOTE_COMMIT'); // ========== PHASE 2: COMMIT OR ABORT ========== if (allCommit) { // Critical: Log decision BEFORE sending commits await this.transactionLog.write(txId, 'COMMITTING'); await Promise.all( this.participants.map(p => this.retryUntilSuccess(() => p.commit(txId))) ); await this.transactionLog.write(txId, 'COMMITTED'); return true; } else { await this.transactionLog.write(txId, 'ABORTING'); await Promise.all( this.participants.map(p => this.retryUntilSuccess(() => p.abort(txId))) ); await this.transactionLog.write(txId, 'ABORTED'); return false; } } catch (error) { // Recovery will handle incomplete transactions throw new TransactionError(`2PC failed for ${txId}: ${error}`); } } private async withTimeout<T>(promise: Promise<T>, ms: number, fallback: T): Promise<T> { const timeout = new Promise<T>((_, reject) => setTimeout(() => reject(new Error('Timeout')), ms) ); try { return await Promise.race([promise, timeout]); } catch { return fallback; } } private async retryUntilSuccess(fn: () => Promise<void>): Promise<void> { while (true) { try { await fn(); return; } catch (error) { await this.sleep(1000); // Retry after 1 second } } } private sleep(ms: number): Promise<void> { return new Promise(resolve => setTimeout(resolve, ms)); }}Both coordinator and participants MUST write their state to durable storage before sending messages. If a participant votes COMMIT, it has promised to commit—even if it crashes. Upon recovery, it must honor that promise. This durability requirement is what makes 2PC safe but also contributes to its overhead.
The correctness of 2PC stems from its carefully designed state machine. Understanding these states and transitions is essential for implementing the protocol correctly.
Coordinator State Machine:
INITIATED
│
▼ (send PREPARE to all)
PREPARING
│
├──→ all VOTE_COMMIT ──→ COMMITTING ──→ COMMITTED
│ │
│ ▼ (send GLOBAL_COMMIT)
│
└──→ any VOTE_ABORT ──→ ABORTING ──→ ABORTED
│
▼ (send GLOBAL_ABORT)
Participant State Machine:
WORKING
│
▼ (receive PREPARE)
PREPARED ─────────────────────────┐
│ │
├──→ receive GLOBAL_COMMIT ──→ COMMITTED
│
└──→ receive GLOBAL_ABORT ──→ ABORTED
| State | Durably Logged? | Locks Held? | Can Unilaterally Decide? |
|---|---|---|---|
| WORKING | No | No | Yes (can abort) |
| PREPARED | Yes | Yes | No (must wait for coordinator) |
| COMMITTED | Yes | Released | N/A (final) |
| ABORTED | Yes | Released | N/A (final) |
Once a participant enters the PREPARED state, it has surrendered its autonomy. It cannot unilaterally commit or abort—it MUST wait for the coordinator's decision. This is called the 'uncertainty period' and is the source of 2PC's blocking behavior.
The true complexity of 2PC lies in handling failures. The protocol must maintain safety guarantees even when coordinators crash, participants fail, networks partition, or messages are lost.
Scenario 1: Participant Fails Before Voting
If a participant crashes before sending its vote, the coordinator will timeout waiting and treat this as a VOTE_ABORT. The transaction aborts. Upon recovery, the participant finds no PREPARED record and simply discards any incomplete work.
Scenario 2: Participant Fails After Voting COMMIT
This is more complex. The participant has durably logged PREPARED and may have sent VOTE_COMMIT. Upon recovery:
Scenario 3: Coordinator Fails After Collecting Votes
This is the blocking scenario. If the coordinator crashes after collecting votes but before broadcasting the decision:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101
// Recovery Protocol for Coordinator and Participants class CoordinatorRecovery { constructor( private transactionLog: TransactionLog, private participants: Map<string, Participant> ) {} async recover(): Promise<void> { const incompleteTransactions = await this.transactionLog.getIncomplete(); for (const tx of incompleteTransactions) { await this.recoverTransaction(tx); } } private async recoverTransaction(tx: TransactionRecord): Promise<void> { switch (tx.state) { case 'INITIATED': case 'PREPARING': // Decision not made yet - abort await this.abortTransaction(tx.id); break; case 'COMMITTING': // Decision was COMMIT - complete it await this.completeCommit(tx.id, tx.participantIds); break; case 'ABORTING': // Decision was ABORT - complete it await this.completeAbort(tx.id, tx.participantIds); break; case 'COMMITTED': case 'ABORTED': // Already complete - no action needed break; } } private async completeCommit(txId: string, participantIds: string[]): Promise<void> { await Promise.all( participantIds.map(id => this.retryUntilAck(() => this.participants.get(id)!.commit(txId)) ) ); await this.transactionLog.write(txId, 'COMMITTED'); }} class ParticipantRecovery { constructor( private localLog: ParticipantLog, private coordinator: CoordinatorClient ) {} async recover(): Promise<void> { const preparedTransactions = await this.localLog.getPrepared(); for (const tx of preparedTransactions) { // Must ask coordinator for the decision const decision = await this.coordinator.queryDecision(tx.id); if (decision === 'COMMIT') { await this.localLog.commit(tx.id); } else if (decision === 'ABORT') { await this.localLog.abort(tx.id); } else { // Coordinator doesn't know yet - keep waiting // This is the BLOCKING scenario console.log(`Transaction ${tx.id} still in doubt - waiting...`); } } }} // Cooperative Termination Protocol// Allows participants to contact each other when coordinator is down class CooperativeTermination { async queryOtherParticipants( txId: string, otherParticipants: Participant[] ): Promise<'COMMIT' | 'ABORT' | 'UNKNOWN'> { for (const participant of otherParticipants) { try { const state = await participant.queryState(txId); if (state === 'COMMITTED') return 'COMMIT'; if (state === 'ABORTED') return 'ABORT'; if (state === 'WORKING') return 'ABORT'; // They never prepared } catch (error) { continue; // Try next participant } } // All participants in PREPARED state - truly blocked return 'UNKNOWN'; }}If the coordinator fails after all participants vote COMMIT but before broadcasting the decision, AND at least one participant also fails, the remaining participants cannot determine the outcome. They must wait indefinitely, holding locks on resources. This is why 2PC is called a 'blocking' protocol.
Understanding 2PC's performance profile is crucial for deciding when to use it. The protocol has inherent costs that cannot be optimized away.
Message Complexity:
For N participants:
Latency:
The minimum latency is 4 network round-trips:
Plus the time for each durable write (typically 1-10ms for SSD, 10-100ms for spinning disk).
Resource Holding:
Locks are held from PREPARE until COMMIT/ABORT. In a WAN setting with ~100ms round-trips, a transaction holds locks for at least 400ms. Under high concurrency, this causes significant contention.
| Scenario | Round-Trip Time | Min Transaction Time | Practical TPS |
|---|---|---|---|
| Same datacenter | 0.5ms | ~10ms | ~1000 TPS |
| Cross-region (same continent) | 20ms | ~100ms | ~100 TPS |
| Global (cross-continent) | 100ms | ~500ms | ~20 TPS |
Despite its limitations, 2PC remains valuable in specific contexts. The key is understanding where its guarantees are worth its costs.
2PC is still widely used internally by databases (e.g., PostgreSQL for distributed queries, MySQL Cluster) and within single-datacenter microservices. For cross-datacenter transactions, consider alternatives like Saga patterns, or consensus-based protocols like Paxos Commit or Raft.
What's Next:
The next page explores Three-Phase Commit (3PC), an attempt to address the blocking problem of 2PC by adding an additional phase. We'll examine why 3PC reduces but doesn't eliminate blocking, and why it's rarely used in practice.
You now understand the Two-Phase Commit protocol—its mechanics, guarantees, failure handling, performance characteristics, and appropriate use cases. This foundational knowledge is essential for understanding why modern distributed systems often choose alternative approaches.