Data Sync Patterns - Learning Module

Two-Phase Commit (2PC)

The Atomic Commitment Problem

Imagine you're building a banking system where a funds transfer must debit one account and credit another—atomically. In a single database, this is trivial: wrap both operations in a transaction. But what happens when those accounts live in different databases, on different servers, possibly in different data centers?

This is the atomic commitment problem: ensuring that a set of distributed participants either all commit or all abort a transaction, even in the presence of failures. It's one of the most fundamental challenges in distributed computing, and Two-Phase Commit (2PC) is the classical solution that has been deployed in production systems for over four decades.

Understanding 2PC isn't just historical knowledge—it's the foundation for understanding why modern distributed systems make the architectural choices they do, and why alternatives like Saga patterns and eventual consistency exist.

What You Will Learn

By the end of this page, you will understand the Two-Phase Commit protocol mechanics, its formal guarantees, failure scenarios and recovery procedures, why it's considered a 'blocking' protocol, its performance characteristics, and where it remains appropriate in modern systems.

The Protocol Mechanics

Two-Phase Commit, as the name implies, operates in two distinct phases. A designated coordinator orchestrates the protocol, while multiple participants (also called resource managers) execute the actual work.

Phase 1: Prepare (Voting Phase)

1. The coordinator sends a PREPARE message to all participants.
2. Each participant executes the transaction up to the point of committing.
3. Each participant writes its changes to durable storage (but doesn't commit).
4. Each participant responds with either VOTE_COMMIT (ready to commit) or VOTE_ABORT (cannot commit).

Phase 2: Commit/Abort (Decision Phase)

1. The coordinator collects all votes.
2. If all participants voted COMMIT, the coordinator sends GLOBAL_COMMIT.
3. If any participant voted ABORT (or timed out), the coordinator sends GLOBAL_ABORT.
4. Participants execute the final commit or abort and acknowledge.

two-phase-commit.ts
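// Two-Phase Commit Protocol Implementation

// Supporting types assumed by this example (not part of any library).
interface Transaction {
    id: string;
}

interface TransactionLog {
    write(txId: string, state: TransactionState, participantIds?: string[]): Promise<void>;
    writeVote(txId: string, participantId: string, vote: VoteResult): Promise<void>;
}

class TransactionError extends Error {}
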
interface Participant {
    id: string;
    prepare(transaction: Transaction): Promise<VoteResult>;
    commit(transactionId: string): Promise<void>;
    abort(transactionId: string): Promise<void>;
}
 
type VoteResult = 'VOTE_COMMIT' | 'VOTE_ABORT';
type TransactionState = 'INITIATED' | 'PREPARING' | 'PREPARED' | 'COMMITTING' | 'COMMITTED' | 'ABORTING' | 'ABORTED';
 
class TwoPhaseCommitCoordinator {
    private transactionLog: TransactionLog; // Durable log for recovery
    private participants: Participant[];
    private timeout: number = 30000; // 30 seconds
 
    constructor(participants: Participant[], log: TransactionLog) {
        this.participants = participants;
        this.transactionLog = log;
    }
 
    async executeTransaction(transaction: Transaction): Promise<boolean> {
        const txId = transaction.id;
        
        // Log transaction initiation (for recovery)
        await this.transactionLog.write(txId, 'INITIATED', this.participants.map(p => p.id));
 
        try {
            // ========== PHASE 1: PREPARE ==========
            await this.transactionLog.write(txId, 'PREPARING');
            
            const votes = await Promise.all(
                this.participants.map(async (participant) => {
                    try {
                        const vote = await this.withTimeout(
                            participant.prepare(transaction),
                            this.timeout,
                            'VOTE_ABORT' // Timeout = abort vote
                        );
                        await this.transactionLog.writeVote(txId, participant.id, vote);
                        return { participantId: participant.id, vote };
                    } catch (error) {
                        await this.transactionLog.writeVote(txId, participant.id, 'VOTE_ABORT');
                        return { participantId: participant.id, vote: 'VOTE_ABORT' as VoteResult };
                    }
                })
            );
 
            const allCommit = votes.every(v => v.vote === 'VOTE_COMMIT');
            
            // ========== PHASE 2: COMMIT OR ABORT ==========
            if (allCommit) {
                // Critical: Log decision BEFORE sending commits
                await this.transactionLog.write(txId, 'COMMITTING');
                
                await Promise.all(
                    this.participants.map(p => this.retryUntilSuccess(() => p.commit(txId)))
                );
                
                await this.transactionLog.write(txId, 'COMMITTED');
                return true;
            } else {
                await this.transactionLog.write(txId, 'ABORTING');
                
                await Promise.all(
                    this.participants.map(p => this.retryUntilSuccess(() => p.abort(txId)))
                );
                
                await this.transactionLog.write(txId, 'ABORTED');
                return false;
            }
        } catch (error) {
            // Recovery will handle incomplete transactions
            throw new TransactionError(`2PC failed for ${txId}: ${error}`);
        }
    }
 
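    private async withTimeout<T>(promise: Promise<T>, ms: number, fallback: T): Promise<T> {
        let timer: ReturnType<typeof setTimeout> | undefined;
        // The timer resolves with the fallback value instead of rejecting
        const timeout = new Promise<T>((resolve) => {
            timer = setTimeout(() => resolve(fallback), ms);
        });
        try {
            return await Promise.race([promise, timeout]);
        } catch {
            return fallback; // A failed call is treated like a timeout
        } finally {
            clearTimeout(timer); // Don't leave a live timer behind once the race is settled
        }
    }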
 
    private async retryUntilSuccess(fn: () => Promise<void>): Promise<void> {
        while (true) {
            try {
                await fn();
                return;
            } catch (error) {
                await this.sleep(1000); // Retry after 1 second
            }
        }
    }
 
    private sleep(ms: number): Promise<void> {
        return new Promise(resolve => setTimeout(resolve, ms));
    }
}

The Durability Requirement

Both coordinator and participants MUST write their state to durable storage before sending messages. If a participant votes COMMIT, it has promised to commit—even if it crashes. Upon recovery, it must honor that promise. This durability requirement is what makes 2PC safe but also contributes to its overhead.
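
The listing above only shows the coordinator's side of this rule. As a rough sketch of the participant's side, the following uses a hypothetical in-memory log (`InMemoryParticipantLog`) in place of real durable storage, and a `canCommit` flag in place of local validation; both are assumptions for illustration. The point it tries to show is the ordering: the PREPARED record is written before the VOTE_COMMIT is returned.

```typescript
type Vote = 'VOTE_COMMIT' | 'VOTE_ABORT';
type LoggedState = 'PREPARED' | 'COMMITTED' | 'ABORTED';

// Hypothetical stand-in for a durable log; a real one would fsync before returning.
class InMemoryParticipantLog {
    private records: { txId: string; state: LoggedState }[] = [];

    append(txId: string, state: LoggedState): void {
        this.records.push({ txId, state });
    }

    // Returns the most recent state recorded for the transaction, if any.
    lastState(txId: string): LoggedState | undefined {
        let last: LoggedState | undefined;
        for (const record of this.records) {
            if (record.txId === txId) last = record.state;
        }
        return last;
    }
}

class SketchParticipant {
    constructor(private log: InMemoryParticipantLog) {}

    // canCommit stands in for local checks (constraints, lock acquisition, disk space).
    prepare(txId: string, canCommit: boolean): Vote {
        if (!canCommit) {
            // No promise has been made yet, so the participant may abort on its own.
            this.log.append(txId, 'ABORTED');
            return 'VOTE_ABORT';
        }
        // Write PREPARED first; only then is it safe to let the vote leave the node.
        this.log.append(txId, 'PREPARED');
        return 'VOTE_COMMIT';
    }

    // After a restart, a trailing PREPARED record means the promise is still owed.
    isInDoubt(txId: string): boolean {
        return this.log.lastState(txId) === 'PREPARED';
    }
}
```

If the order were reversed (vote first, log second), a crash between the two steps would leave a participant that has promised to commit but has no record of the promise.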

State Machine and Formal Guarantees

The correctness of 2PC stems from its carefully designed state machine. Understanding these states and transitions is essential for implementing the protocol correctly.

Coordinator State Machine:

    INITIATED
        │
        ▼ (send PREPARE to all)
    PREPARING
        │
        ├──→ all VOTE_COMMIT ──→ COMMITTING ──→ COMMITTED
        │                            │
        │                            ▼ (send GLOBAL_COMMIT)
        │
        └──→ any VOTE_ABORT ──→ ABORTING ──→ ABORTED
                                    │
                                    ▼ (send GLOBAL_ABORT)

Participant State Machine:

    WORKING
        │
        ├──→ receive PREPARE, vote VOTE_ABORT ──→ ABORTED
        │
        ▼ (receive PREPARE, vote VOTE_COMMIT)
    PREPARED
        │
        ├──→ receive GLOBAL_COMMIT ──→ COMMITTED
        │
        └──→ receive GLOBAL_ABORT ──→ ABORTED
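
One way to make the participant state machine concrete is a transition function that rejects anything the protocol does not allow. This is a minimal sketch; the event names (`PREPARE_VOTE_COMMIT`, `LOCAL_ABORT`, and so on) are labels invented for this example, and treating a repeated decision message as a no-op is an assumption made because the coordinator retries until acknowledged.

```typescript
type ParticipantState = 'WORKING' | 'PREPARED' | 'COMMITTED' | 'ABORTED';

type ParticipantEvent =
    | 'PREPARE_VOTE_COMMIT' // received PREPARE and voted VOTE_COMMIT
    | 'PREPARE_VOTE_ABORT'  // received PREPARE and voted VOTE_ABORT
    | 'LOCAL_ABORT'         // participant gives up on its own
    | 'GLOBAL_COMMIT'
    | 'GLOBAL_ABORT';

function nextState(state: ParticipantState, event: ParticipantEvent): ParticipantState {
    switch (state) {
        case 'WORKING':
            if (event === 'PREPARE_VOTE_COMMIT') return 'PREPARED';
            if (event === 'PREPARE_VOTE_ABORT') return 'ABORTED';
            if (event === 'LOCAL_ABORT') return 'ABORTED';
            if (event === 'GLOBAL_ABORT') return 'ABORTED';
            break;
        case 'PREPARED':
            // Only the coordinator's decision can move a prepared participant.
            if (event === 'GLOBAL_COMMIT') return 'COMMITTED';
            if (event === 'GLOBAL_ABORT') return 'ABORTED';
            break;
        case 'COMMITTED':
            if (event === 'GLOBAL_COMMIT') return 'COMMITTED'; // duplicate delivery
            break;
        case 'ABORTED':
            if (event === 'GLOBAL_ABORT') return 'ABORTED'; // duplicate delivery
            break;
    }
    throw new Error(`Illegal transition: ${event} in state ${state}`);
}
```

Note that `LOCAL_ABORT` is accepted in WORKING but throws in PREPARED, which is the code-level form of a prepared participant having no unilateral choice.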

2PC Formal Guarantees

•Safety (Agreement) — No two participants reach different final decisions. If one commits, all commit. If one aborts, all abort.
•Safety (Validity) — If all participants vote COMMIT and no failures occur, the transaction commits. If any participant votes ABORT, the transaction aborts.
•Liveness (Termination) — In the absence of failures, all participants eventually reach a final decision. However, 2PC does NOT guarantee termination under all failure scenarios—this is its Achilles heel.
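
The two safety properties can be written as simple predicates, which is handy when testing an implementation against simulated failures. The sketch below is only a checking aid under assumed names (`satisfiesAgreement`, `satisfiesValidity`); it is not part of the protocol itself.

```typescript
type Vote = 'VOTE_COMMIT' | 'VOTE_ABORT';
type FinalState = 'COMMITTED' | 'ABORTED';

// Agreement: every participant that has decided reached the same outcome.
function satisfiesAgreement(outcomes: FinalState[]): boolean {
    return new Set(outcomes).size <= 1;
}

// Validity (safety half): a commit is only legal if every vote was VOTE_COMMIT.
// An abort is always permitted, since a timeout or crash may force one.
function satisfiesValidity(votes: Vote[], outcome: FinalState): boolean {
    const allCommit = votes.every(v => v === 'VOTE_COMMIT');
    return outcome === 'ABORTED' || allCommit;
}

console.log(satisfiesAgreement(['COMMITTED', 'COMMITTED']));                 // true
console.log(satisfiesAgreement(['COMMITTED', 'ABORTED']));                   // false
console.log(satisfiesValidity(['VOTE_COMMIT', 'VOTE_ABORT'], 'COMMITTED'));  // false
```

Termination cannot be expressed as a predicate over a finished run in the same way, because its violation is a run that never finishes.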

2PC State Responsibilities
State	Durably Logged?	Locks Held?	Can Unilaterally Decide?
WORKING	No	Yes (acquired as operations execute)	Yes (can abort)
PREPARED	Yes	Yes	No (must wait for coordinator)
COMMITTED	Yes	Released	N/A (final)
ABORTED	Yes	Released	N/A (final)

The Point of No Return

Once a participant enters the PREPARED state, it has surrendered its autonomy. It cannot unilaterally commit or abort; it MUST wait for the coordinator's decision. The window between voting and learning that decision is called the 'uncertainty period', and it is the source of 2PC's blocking behavior.

Failure Scenarios and Recovery

The true complexity of 2PC lies in handling failures. The protocol must maintain safety guarantees even when coordinators crash, participants fail, networks partition, or messages are lost.

Scenario 1: Participant Fails Before Voting

If a participant crashes before sending its vote, the coordinator will time out waiting and treat this as a VOTE_ABORT. The transaction aborts. Upon recovery, the participant finds no PREPARED record and simply discards any incomplete work.

Scenario 2: Participant Fails After Voting COMMIT

This is more complex. The participant has durably logged PREPARED and may have sent VOTE_COMMIT. Upon recovery:

1. It finds a PREPARED record in its log.
2. It must contact the coordinator (or other participants) to learn the outcome.
3. It then commits or aborts based on the global decision.

Scenario 3: Coordinator Fails After Collecting Votes

This is the blocking scenario. If the coordinator crashes after collecting votes but before broadcasting the decision:

Participants who voted COMMIT are stuck in PREPARED state
They cannot commit (don't know if others voted ABORT)
They cannot abort (others might have committed)
They must hold locks and wait for coordinator recovery

two-phase-commit-recovery.ts
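// Recovery Protocol for Coordinator and Participants
// Builds on the Participant and TransactionState types from two-phase-commit.ts.

// Supporting types assumed by this example (not part of any library).
interface TransactionRecord {
    id: string;
    state: TransactionState;
    participantIds: string[];
}

interface TransactionLog {
    getIncomplete(): Promise<TransactionRecord[]>;
    write(txId: string, state: TransactionState): Promise<void>;
}

interface ParticipantLog {
    getPrepared(): Promise<{ id: string }[]>;
    commit(txId: string): Promise<void>;
    abort(txId: string): Promise<void>;
}

interface CoordinatorClient {
    queryDecision(txId: string): Promise<'COMMIT' | 'ABORT' | 'UNKNOWN'>;
}
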
class CoordinatorRecovery {
    constructor(
        private transactionLog: TransactionLog,
        private participants: Map<string, Participant>
    ) {}
 
    async recover(): Promise<void> {
        const incompleteTransactions = await this.transactionLog.getIncomplete();
        
        for (const tx of incompleteTransactions) {
            await this.recoverTransaction(tx);
        }
    }
 
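    private async recoverTransaction(tx: TransactionRecord): Promise<void> {
        switch (tx.state) {
            case 'INITIATED':
            case 'PREPARING':
            case 'PREPARED':
                // No decision was logged, so no participant can have committed - abort
                await this.transactionLog.write(tx.id, 'ABORTING');
                await this.completeAbort(tx.id, tx.participantIds);
                break;

            case 'COMMITTING':
                // Decision was COMMIT - complete it
                await this.completeCommit(tx.id, tx.participantIds);
                break;

            case 'ABORTING':
                // Decision was ABORT - complete it
                await this.completeAbort(tx.id, tx.participantIds);
                break;

            case 'COMMITTED':
            case 'ABORTED':
                // Already complete - no action needed
                break;
        }
    }

    private async completeCommit(txId: string, participantIds: string[]): Promise<void> {
        await Promise.all(
            participantIds.map(id =>
                this.retryUntilAck(() => this.participants.get(id)!.commit(txId))
            )
        );
        await this.transactionLog.write(txId, 'COMMITTED');
    }

    private async completeAbort(txId: string, participantIds: string[]): Promise<void> {
        await Promise.all(
            participantIds.map(id =>
                this.retryUntilAck(() => this.participants.get(id)!.abort(txId))
            )
        );
        await this.transactionLog.write(txId, 'ABORTED');
    }

    // Re-send the decision until the participant acknowledges it
    private async retryUntilAck(fn: () => Promise<void>): Promise<void> {
        while (true) {
            try {
                await fn();
                return;
            } catch {
                await new Promise(resolve => setTimeout(resolve, 1000)); // Retry after 1 second
            }
        }
    }
}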
 
class ParticipantRecovery {
    constructor(
        private localLog: ParticipantLog,
        private coordinator: CoordinatorClient
    ) {}
 
    async recover(): Promise<void> {
        const preparedTransactions = await this.localLog.getPrepared();

        for (const tx of preparedTransactions) {
            // Must ask coordinator for the decision; an unreachable coordinator
            // leaves the transaction in doubt, just like an 'UNKNOWN' answer
            let decision: 'COMMIT' | 'ABORT' | 'UNKNOWN';
            try {
                decision = await this.coordinator.queryDecision(tx.id);
            } catch {
                decision = 'UNKNOWN';
            }

            if (decision === 'COMMIT') {
                await this.localLog.commit(tx.id);
            } else if (decision === 'ABORT') {
                await this.localLog.abort(tx.id);
            } else {
                // No decision available - the transaction stays PREPARED and keeps its locks
                // This is the BLOCKING scenario; recover() must be run again later
                console.log(`Transaction ${tx.id} still in doubt - waiting...`);
            }
        }
    }
}
 
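// Cooperative Termination Protocol
// Allows participants to contact each other when coordinator is down

// queryState is an assumed extension of Participant that reports a peer's local state
interface QueryableParticipant extends Participant {
    queryState(txId: string): Promise<'WORKING' | 'PREPARED' | 'COMMITTED' | 'ABORTED'>;
}

class CooperativeTermination {
    async queryOtherParticipants(
        txId: string,
        otherParticipants: QueryableParticipant[]
    ): Promise<'COMMIT' | 'ABORT' | 'UNKNOWN'> {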
        for (const participant of otherParticipants) {
            try {
                const state = await participant.queryState(txId);
                
                if (state === 'COMMITTED') return 'COMMIT';
                if (state === 'ABORTED') return 'ABORT';
                if (state === 'WORKING') return 'ABORT'; // They never prepared
            } catch (error) {
                continue; // Try next participant
            }
        }
        
        // All participants in PREPARED state - truly blocked
        return 'UNKNOWN';
    }
}

The Blocking Problem

If the coordinator fails after participants have voted COMMIT but before any reachable participant has learned the decision, the remaining participants cannot determine the outcome, even by asking each other: every peer they can reach is also PREPARED, and any peer that did receive the decision may itself have failed. They must wait indefinitely, holding locks on resources. This is why 2PC is called a 'blocking' protocol.
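
To see concretely which situations block, the cooperative termination rule can be restated as a pure function over the states a stuck participant observes in its peers. This is a simplified, synchronous sketch; the `UNREACHABLE` marker and the function name are assumptions made for the example.

```typescript
type ObservedState = 'WORKING' | 'PREPARED' | 'COMMITTED' | 'ABORTED' | 'UNREACHABLE';
type Resolution = 'COMMIT' | 'ABORT' | 'BLOCKED';

// Called by a participant that is itself PREPARED and cannot reach the coordinator.
function resolveWithoutCoordinator(peers: ObservedState[]): Resolution {
    // A peer that committed proves the global decision was COMMIT.
    if (peers.some(s => s === 'COMMITTED')) return 'COMMIT';
    // A peer that aborted, or never voted, proves COMMIT cannot have been decided.
    if (peers.some(s => s === 'ABORTED' || s === 'WORKING')) return 'ABORT';
    // Every reachable peer is also in doubt: nothing can be concluded.
    return 'BLOCKED';
}

console.log(resolveWithoutCoordinator(['PREPARED', 'COMMITTED']));   // COMMIT
console.log(resolveWithoutCoordinator(['PREPARED', 'WORKING']));     // ABORT
console.log(resolveWithoutCoordinator(['PREPARED', 'PREPARED']));    // BLOCKED
console.log(resolveWithoutCoordinator(['PREPARED', 'UNREACHABLE'])); // BLOCKED
```

The last case is the uncomfortable one: the unreachable peer might have received GLOBAL_COMMIT before it went silent, so guessing ABORT would risk violating agreement.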

Performance Characteristics

Understanding 2PC's performance profile is crucial for deciding when to use it. The protocol has inherent costs that cannot be optimized away.

Message Complexity:

For N participants:

Phase 1: N PREPARE messages + N VOTE responses = 2N messages
Phase 2: N COMMIT/ABORT messages + N acknowledgments = 2N messages
Total: 4N messages (3N for aborted transactions under the presumed abort optimization, which drops the acknowledgments)
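
The arithmetic above is small enough to write down directly. The helper below is an illustrative sketch (`countMessages` and its `acksRequired` flag are names chosen for this example); it counts one message of each kind per participant.

```typescript
interface MessageBreakdown {
    prepare: number;
    votes: number;
    decisions: number;
    acks: number;
    total: number;
}

// acksRequired = false models the case where acknowledgments are skipped.
function countMessages(participants: number, acksRequired: boolean = true): MessageBreakdown {
    if (!Number.isInteger(participants) || participants < 1) {
        throw new RangeError('participants must be a positive integer');
    }
    const prepare = participants;   // Phase 1: coordinator -> participants
    const votes = participants;     // Phase 1: participants -> coordinator
    const decisions = participants; // Phase 2: coordinator -> participants
    const acks = acksRequired ? participants : 0;
    return { prepare, votes, decisions, acks, total: prepare + votes + decisions + acks };
}

console.log(countMessages(3).total);        // 12
console.log(countMessages(3, false).total); // 9
```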

Latency:

The minimum latency is four one-way message delays, i.e. two network round-trips:

Coordinator → Participant: PREPARE
Participant → Coordinator: VOTE
Coordinator → Participant: COMMIT
Participant → Coordinator: ACK

Plus the time for each durable write (typically 1-10ms for SSD, 10-100ms for spinning disk).
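
A back-of-the-envelope model combines the four message delays listed above with the durable writes. The sketch below assumes three forced writes on the critical path (participant PREPARED, coordinator decision, participant commit), that all participants work in parallel, and that processing time is negligible; real systems may log more or less than this.

```typescript
interface LatencyEstimate {
    decisionMs: number;   // until the coordinator has durably decided
    inDoubtMs: number;    // how long a participant waits between voting and learning the outcome
    completionMs: number; // until the coordinator has received the ACKs
}

function estimateLatency(oneWayMs: number, logWriteMs: number): LatencyEstimate {
    return {
        // PREPARE out, participant logs PREPARED, vote back, coordinator logs decision
        decisionMs: 2 * oneWayMs + 2 * logWriteMs,
        // vote travels to coordinator, coordinator logs decision, decision travels back
        inDoubtMs: 2 * oneWayMs + logWriteMs,
        // decision out, participant logs the outcome, ACK back
        completionMs: 4 * oneWayMs + 3 * logWriteMs,
    };
}

// 50 ms one-way delay (100 ms round trip) and 5 ms per durable write
const wanEstimate = estimateLatency(50, 5);
console.log(wanEstimate.decisionMs, wanEstimate.inDoubtMs, wanEstimate.completionMs); // 110 105 215
```

The `inDoubtMs` figure is the one that matters for contention, because it is a lower bound on how long a prepared participant must keep its locks.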

Resource Holding:

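Locks must be held at least from PREPARE until the COMMIT/ABORT decision arrives. In a WAN setting with ~100ms round-trips, each participant holds its locks for at least one full round trip (~100ms) between voting and receiving the decision, and the protocol as a whole takes at least two round trips (~200ms), before counting log writes. Under high concurrency, this causes significant contention.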

2PC Performance Impact by Deployment
Scenario	Round-Trip Time	Min Transaction Time	Practical TPS
Same datacenter	0.5ms	~10ms	~1000 TPS
Cross-region (same continent)	20ms	~100ms	~100 TPS
Global (cross-continent)	100ms	~500ms	~20 TPS

Performance Optimizations

•Presumed Abort — Coordinator doesn't log aborted transactions. On recovery, unknown transactions are presumed aborted. Saves one disk write for aborted transactions.
•Presumed Commit — Inverse of presumed abort. Better when most transactions commit.
•Read-Only Optimization — Participants with no writes vote 'READ_ONLY' and don't participate in Phase 2.
•Early Prepare — Participants begin preparing before receiving explicit PREPARE (piggyback on last operation).
•Group Commit — Batch multiple transactions' log writes together to amortize disk I/O cost.
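
As an illustration of the presumed abort idea, the sketch below models a coordinator that durably records only commit decisions. The class and method names are hypothetical, and two in-memory sets stand in for the durable log and the coordinator's volatile state. The key behavior is in `queryDecision`: a transaction with no record at all is answered with ABORT.

```typescript
type DecisionAnswer = 'COMMIT' | 'ABORT' | 'UNDECIDED';

class PresumedAbortCoordinator {
    private durableCommits = new Set<string>(); // survives a crash (stands in for the log)
    private inFlight = new Set<string>();       // volatile, lost on a crash

    begin(txId: string): void {
        this.inFlight.add(txId);
    }

    decideCommit(txId: string): void {
        // The commit decision must be logged before any GLOBAL_COMMIT is sent.
        this.durableCommits.add(txId);
        this.inFlight.delete(txId);
    }

    decideAbort(txId: string): void {
        // Nothing is logged: the absence of a record already means "aborted".
        this.inFlight.delete(txId);
    }

    crashAndRestart(): void {
        this.inFlight.clear();
    }

    queryDecision(txId: string): DecisionAnswer {
        if (this.durableCommits.has(txId)) return 'COMMIT';
        if (this.inFlight.has(txId)) return 'UNDECIDED';
        return 'ABORT'; // no record, so presume abort
    }
}
```

This is safe only because a transaction that was undecided at crash time can never commit afterwards: the restarted coordinator has forgotten it and will answer ABORT to every participant that asks.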

When to Use 2PC in Modern Systems

Despite its limitations, 2PC remains valuable in specific contexts. The key is understanding where its guarantees are worth its costs.

Good Fit For 2PC

•Transactions within a single datacenter
•Low-latency networks with reliable connectivity
•Small number of participants (2-5)
•Strong consistency is non-negotiable
•Transaction throughput is moderate
•Coordinator high-availability is ensured

Poor Fit For 2PC

•Geo-distributed transactions (high latency)
•Many participants (coordination overhead)
•High throughput requirements
•Availability is more important than consistency
•Long-running transactions
•Untrusted or unreliable networks

Modern 2PC Usage

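2PC is still widely used by databases (e.g., PostgreSQL exposes it through PREPARE TRANSACTION so an external transaction manager can coordinate several servers, and MySQL Cluster uses it internally) and within single-datacenter microservices. For cross-datacenter transactions, consider alternatives like Saga patterns, or commit protocols built on consensus (for example Paxos Commit, or 2PC whose coordinator and participants are each replicated with Paxos or Raft).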

What's Next:

The next page explores Three-Phase Commit (3PC), an attempt to address the blocking problem of 2PC by adding an additional phase. We'll examine why 3PC reduces but doesn't eliminate blocking, and why it's rarely used in practice.

Page Complete

You now understand the Two-Phase Commit protocol—its mechanics, guarantees, failure handling, performance characteristics, and appropriate use cases. This foundational knowledge is essential for understanding why modern distributed systems often choose alternative approaches.