
Distributed Transactions

Level: Advanced

Duration: 75 mins

Topic: Distributed Transactions


Three-Phase Commit (3PC) Overview

Beyond the Blocking Problem

The Two-Phase Commit protocol has a fundamental weakness: it can block indefinitely when the coordinator fails while participants are in the PREPARED state. This blocking problem motivated the development of the Three-Phase Commit (3PC) protocol, introduced by Dale Skeen in 1981.

3PC adds an additional intermediate phase between PREPARE and COMMIT, creating a buffer that allows participants to make progress even when the coordinator becomes unreachable. The key insight is that by introducing a pre-commit phase, participants gain enough information to safely resolve the transaction without the coordinator.

However, 3PC is not a silver bullet. While it eliminates blocking under certain failure assumptions, it introduces additional complexity, latency, and—critically—it cannot handle network partitions correctly. Understanding 3PC's design, benefits, and limitations provides essential insight into the fundamental trade-offs in distributed transaction protocols.

What You Will Learn

By the end of this page, you will understand the Three-Phase Commit protocol's motivation and design. You'll comprehend its three phases in detail, understand why it's non-blocking under certain conditions, recognize its limitations (especially with network partitions), and be able to compare 2PC and 3PC for different scenarios.

Motivation: Why Three-Phase Commit?

To understand 3PC, we must first deeply analyze why 2PC blocks and what properties would be needed to avoid blocking.

The Root Cause of 2PC Blocking:

In 2PC, when a participant enters the PREPARED state:

It has voted VOTE_COMMIT
It has promised to follow the coordinator's decision
It doesn't know if other participants voted COMMIT or ABORT
It doesn't know the coordinator's decision

If the coordinator fails, the participant is stuck. It cannot:

Commit unilaterally: Other participants might have voted ABORT, and the coordinator decided ABORT
Abort unilaterally: Other participants might have received GLOBAL_COMMIT and already committed

The problem is asymmetric information: from the PREPARED state, the participant cannot distinguish between a future COMMIT and a future ABORT.

The Key Observation

In 2PC, there's a direct transition from PREPARED to COMMITTED. A participant can be in PREPARED while another is in COMMITTED—an inconsistent situation where resolution requires the coordinator. If we insert an intermediate state that ALL participants must pass through before COMMITTED, no participant can be COMMITTED while others are still uncertain.

The 3PC Solution: Add a Pre-Commit Phase

3PC inserts a PRE-COMMIT phase between PREPARED and COMMITTED:

Phase 1 (Can Commit?): Coordinator asks all participants if they can commit
Phase 2 (Pre-Commit): If all can commit, coordinator tells all to prepare to commit (PRE-COMMIT)
Phase 3 (Do Commit): Coordinator tells all to actually commit

The crucial property: Before any participant receives DO_COMMIT (and enters COMMITTED), all participants have received PRE_COMMIT (and entered PRE-COMMITTED state).

This means if the coordinator fails and any participant is in PRE-COMMITTED state, that participant knows:

All participants voted to commit (otherwise we wouldn't have reached PRE-COMMIT)
No participant can have committed unless every participant first acknowledged PRE_COMMIT
Commit is a safe outcome, provided the surviving participants agree on it through the termination protocol

This additional knowledge enables non-blocking termination.

Blocking Vulnerability: 2PC vs 3PC
Scenario	2PC Behavior	3PC Behavior
Coordinator fails during vote collection	Participants that have not voted abort; those that voted block until recovery	Participants time out and abort via the termination protocol
Coordinator fails after decision, before any notification	All PREPARED participants blocked	All participants are still UNCERTAIN; survivors abort via the termination protocol
Coordinator fails after some but not all notified	Some committed, others blocked	Survivors commit if any of them is PRE-COMMITTED; otherwise they abort
Network partition during commit phase	Blocked	May commit or abort incorrectly (see limitations)

The Three Phases in Detail

Let's examine each phase of the 3PC protocol in rigorous detail.

Phase 1: Can Commit (Voting Phase)

This phase is essentially identical to 2PC's prepare phase:

Coordinator sends CAN_COMMIT message to all participants
Each participant determines if it can commit:
- Checks constraints, resources, locks
- If yes: replies YES_VOTE and enters UNCERTAIN state
- If no: replies NO_VOTE and aborts locally
Coordinator collects all votes:
- If all YES_VOTE: proceed to Phase 2
- If any NO_VOTE or timeout: send ABORT to all and terminate

Important: At the end of Phase 1, if the coordinator hasn't received all YES_VOTEs, it aborts immediately. There is no ambiguity—the coordinator knows the transaction will abort.
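
The coordinator's Phase 1 rule can be sketched as a small function. This is a minimal illustration, not a full coordinator: the names `Vote` and `phase1_decision` are hypothetical, and a timed-out participant is modeled simply as a missing entry in the vote map.

```python
from enum import Enum


class Vote(Enum):
    YES = "YES_VOTE"
    NO = "NO_VOTE"


def phase1_decision(votes, expected):
    """Return the coordinator's next message after vote collection.

    votes:    dict mapping participant id -> Vote; a participant that
              timed out is simply absent from the dict (an assumption
              of this sketch).
    expected: ids of every participant that was sent CAN_COMMIT.
    """
    for pid in expected:
        # A NO_VOTE and a missing vote are treated the same way: abort.
        if votes.get(pid) is not Vote.YES:
            return "ABORT"
    return "PRE_COMMIT"
```

For example, `phase1_decision({"a": Vote.YES}, ["a", "b"])` yields `"ABORT"` because participant `b` never answered.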


Phase 2: Pre-Commit (Prepare to Commit Phase)

This is the new phase that distinguishes 3PC from 2PC:

Coordinator sends PRE_COMMIT message to all participants
Each participant receives PRE_COMMIT:
- Writes durable log record indicating readiness
- Enters PRE-COMMITTED state
- Replies with ACK_PRE_COMMIT
Coordinator collects all acknowledgments:
- If all acknowledge: proceed to Phase 3
- If timeout: can still safely abort (no one has committed yet)

Critical Property: Receiving PRE_COMMIT tells a participant that:

All participants voted YES (unanimous consent)
The coordinator intends to commit
But no participant has committed yet

This is the key insight: the PRE-COMMITTED state indicates consensus to commit without irrevocable commitment.


Phase 3: Do Commit (Commit Phase)

The final phase mirrors 2PC's commit phase:

Coordinator sends DO_COMMIT message to all participants
Each participant receives DO_COMMIT:
- Makes changes permanent
- Enters COMMITTED state
- Replies with DONE
Coordinator collects confirmations:
- Transaction complete when all DONE received

The Complete Message Flow:

3PC Message Sequence (Successful Commit)
Phase	Coordinator Message	Participant Response	State Transition
Phase 1	CAN_COMMIT	YES_VOTE	INITIAL → UNCERTAIN
Phase 2	PRE_COMMIT	ACK_PRE_COMMIT	UNCERTAIN → PRE-COMMITTED
Phase 3	DO_COMMIT	DONE	PRE-COMMITTED → COMMITTED
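
To make the message sequence concrete, here is a small in-memory sketch of the failure-free path. It assumes direct method calls instead of a network, no logging, and no timeouts; the class `Participant` and the function `run_3pc` are illustrative names, not part of any real library.

```python
class Participant:
    """Toy participant that follows the message table above."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INITIAL"

    def on_can_commit(self):
        if self.can_commit:
            self.state = "UNCERTAIN"
            return "YES_VOTE"
        self.state = "ABORTED"
        return "NO_VOTE"

    def on_pre_commit(self):
        self.state = "PRE-COMMITTED"
        return "ACK_PRE_COMMIT"

    def on_do_commit(self):
        self.state = "COMMITTED"
        return "DONE"

    def on_abort(self):
        self.state = "ABORTED"


def run_3pc(participants):
    """Drive all three phases in order; return the final outcome."""
    # Phase 1: collect votes.
    votes = [p.on_can_commit() for p in participants]
    if any(v != "YES_VOTE" for v in votes):
        for p in participants:
            p.on_abort()
        return "ABORTED"
    # Phase 2: every participant must acknowledge PRE_COMMIT.
    acks = [p.on_pre_commit() for p in participants]
    if any(a != "ACK_PRE_COMMIT" for a in acks):
        for p in participants:
            p.on_abort()
        return "ABORTED"
    # Phase 3: only now may anyone commit.
    for p in participants:
        p.on_do_commit()
    return "COMMITTED"
```

Because Phase 3 starts only after the Phase 2 loop finishes, no participant in this sketch can reach COMMITTED while another is still UNCERTAIN.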

State Machines for 3PC

Understanding the state machines for both coordinator and participant is essential for reasoning about 3PC's non-blocking properties.

Coordinator State Machine:

INITIAL: Transaction executing
WAITING: Sent CAN_COMMIT, waiting for votes
PRE-COMMITTING: All YES_VOTEs received, sent PRE_COMMIT, waiting for ACKs
COMMITTING: All PRE_COMMIT ACKs received, sent DO_COMMIT
COMMITTED: All DONE received, transaction complete
ABORTED: Abort decision made at any point


Participant State Machine:

INITIAL: Executing local operations
UNCERTAIN: Voted YES_VOTE, waiting for Phase 2 message
PRE-COMMITTED: Received PRE_COMMIT, ready to commit, waiting for DO_COMMIT
COMMITTED: Received DO_COMMIT, transaction completed
ABORTED: Received ABORT or decided to abort
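
The participant states listed above can be written down as an explicit transition table. The sketch below is one possible encoding; the event names `vote_yes` and `vote_no` are assumptions introduced here for the participant's own decisions, while the other events are the coordinator messages used on this page.

```python
from enum import Enum, auto


class PState(Enum):
    INITIAL = auto()
    UNCERTAIN = auto()
    PRE_COMMITTED = auto()
    COMMITTED = auto()
    ABORTED = auto()


# (current state, event) -> next state
TRANSITIONS = {
    (PState.INITIAL, "vote_yes"): PState.UNCERTAIN,
    (PState.INITIAL, "vote_no"): PState.ABORTED,
    (PState.INITIAL, "ABORT"): PState.ABORTED,
    (PState.UNCERTAIN, "PRE_COMMIT"): PState.PRE_COMMITTED,
    (PState.UNCERTAIN, "ABORT"): PState.ABORTED,
    (PState.PRE_COMMITTED, "DO_COMMIT"): PState.COMMITTED,
    (PState.PRE_COMMITTED, "ABORT"): PState.ABORTED,
}


def step(state, event):
    """Apply one event; reject anything the table does not allow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event} in {state.name}") from None
```

Note that the table has no entry for `(UNCERTAIN, "DO_COMMIT")`: the only path to COMMITTED runs through PRE_COMMITTED.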

The Key Non-Blocking Property:

Notice that in 3PC, before any participant reaches COMMITTED, all participants must pass through PRE-COMMITTED. This means:

If coordinator fails and ANY participant is PRE-COMMITTED, all survivors know the intent was to commit
If coordinator fails and NO participant is PRE-COMMITTED, all can safely abort
There's no scenario where one participant is COMMITTED while another is only UNCERTAIN


Timeout Behavior Difference

In 3PC, timeout behavior differs by state: In UNCERTAIN state, timeout leads to abort (safe because no one committed). In PRE-COMMITTED state, timeout can lead to commit (safe because all voted yes and coordinator intended to commit). This asymmetry is what enables non-blocking termination.
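
The per-state timeout rule can be expressed as a lookup. This is a simplified sketch of the rule as stated here; a fuller implementation would consult the other survivors before acting. The function name `on_timeout` is hypothetical, and the INITIAL case (abort, since the participant has not voted yet) is an assumption not spelled out in the text.

```python
def on_timeout(state):
    """Return the action a participant takes when the coordinator goes silent."""
    if state == "INITIAL":
        # Assumption: a participant that has not voted may abort freely.
        return "ABORT"
    if state == "UNCERTAIN":
        # Nobody can have committed yet, so aborting is safe.
        return "ABORT"
    if state == "PRE-COMMITTED":
        # Everyone voted YES and the coordinator intended to commit.
        return "COMMIT"
    if state in ("COMMITTED", "ABORTED"):
        # Already in a terminal state; nothing to do.
        return None
    raise ValueError(f"unknown state: {state}")
```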

Non-Blocking Termination

The primary advantage of 3PC over 2PC is its non-blocking property under certain failure assumptions. Let's understand how this works.

The Termination Protocol:

When the coordinator fails and a participant times out waiting for a message, the participant initiates a termination protocol:

Elect a new coordinator from surviving participants
New coordinator polls all reachable participants for their states
Based on collected states, new coordinator decides:
- If ANY participant is COMMITTED → Everyone should COMMIT
- If ANY participant is ABORTED → Everyone should ABORT
- If ANY participant is PRE-COMMITTED and none COMMITTED → Safe to COMMIT
- If ALL participants are UNCERTAIN (none PRE-COMMITTED) → Safe to ABORT
- If mix of UNCERTAIN and PRE-COMMITTED → COMMIT, after first moving the UNCERTAIN participants to PRE-COMMITTED (PRE-COMMITTED means intent was commit)

3PC Termination Protocol Decision Rules
States Found Among Survivors	Decision	Rationale
Any COMMITTED	COMMIT	Someone already committed; all must commit
Any ABORTED	ABORT	Someone already aborted; all must abort
Any PRE-COMMITTED, none COMMITTED/ABORTED	COMMIT	All agreed, coordinator intended to commit
All UNCERTAIN	ABORT	No one reached PRE-COMMIT; safe to abort
Mix of UNCERTAIN and PRE-COMMITTED	COMMIT	PRE-COMMITTED proves unanimous YES
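
The decision rules in the table reduce to a short priority check. The following is a minimal sketch, assuming the new coordinator has already collected the states of all reachable participants as strings; the function name `decide` is illustrative.

```python
def decide(states):
    """Apply the termination rules to the states reported by survivors."""
    states = list(states)
    if not states:
        raise ValueError("no surviving participants to poll")
    if "COMMITTED" in states:
        return "COMMIT"   # someone already committed; all must commit
    if "ABORTED" in states:
        return "ABORT"    # someone already aborted; all must abort
    if "PRE-COMMITTED" in states:
        return "COMMIT"   # proves a unanimous YES vote
    return "ABORT"        # nobody reached PRE-COMMITTED
```

The order of the checks matters only for readability: under the crash-only assumption, COMMITTED and ABORTED never appear together among survivors.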

Why This Works:

The key invariant that 3PC maintains is:

No participant can be in COMMITTED state while any other participant is in UNCERTAIN state.

This is because:

A participant enters PRE-COMMITTED only after receiving PRE_COMMIT
The coordinator sends PRE_COMMIT to all participants before sending any DO_COMMIT
A participant enters COMMITTED only after receiving DO_COMMIT
Therefore, if anyone is COMMITTED, everyone must have passed through PRE-COMMITTED

Contrast with 2PC:

In 2PC, a participant in PREPARED state (equivalent to UNCERTAIN) cannot know if another participant has committed. In 3PC, if any participant is merely UNCERTAIN, we know for certain that no participant is COMMITTED—enabling safe abort.
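
The invariant can be checked by brute force over every point at which the coordinator might crash. The sketch below assumes all participants voted YES, that the coordinator sends messages one participant at a time in a fixed order, and that every sent message is delivered; all function names are hypothetical.

```python
def states_after_crash(n, sent):
    """3PC: states of n participants if the coordinator crashes after
    sending `sent` messages (first n PRE_COMMITs, then n DO_COMMITs)."""
    states = ["UNCERTAIN"] * n
    for i in range(min(sent, n)):
        states[i] = "PRE-COMMITTED"
    for i in range(max(0, sent - n)):
        states[i] = "COMMITTED"
    return states


def states_after_crash_2pc(n, sent):
    """2PC: participants go straight from PREPARED to COMMITTED."""
    states = ["PREPARED"] * n
    for i in range(min(sent, n)):
        states[i] = "COMMITTED"
    return states


def invariant_holds(states, uncertain="UNCERTAIN"):
    """True unless a committed and an uncertain participant coexist."""
    return not ("COMMITTED" in states and uncertain in states)


# Every crash point in 3PC preserves the invariant.
ok_3pc = all(
    invariant_holds(states_after_crash(n, k))
    for n in range(1, 6)
    for k in range(2 * n + 1)
)
# In 2PC, a crash after the first commit message already breaks it.
ok_2pc = invariant_holds(states_after_crash_2pc(3, 1), uncertain="PREPARED")
```

Here `ok_3pc` comes out true and `ok_2pc` false, which is the contrast described above.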


The PRE-COMMIT Buffer

Think of PRE-COMMITTED as a 'buffer zone' between uncertainty and commitment. All participants must pass through this buffer before any can commit. This synchronization point gives survivors enough information to resolve the transaction without the original coordinator.

3PC Failure Scenarios

Let's trace through how 3PC handles various failure scenarios, comparing with 2PC behavior.

Scenario 5.1: Coordinator Fails After Receiving All YES_VOTEs, Before Sending PRE_COMMIT

In 2PC: All participants in PREPARED state, blocked until the coordinator recovers

In 3PC:

All participants in UNCERTAIN state
Timeout triggers termination protocol
New coordinator finds all UNCERTAIN
Decision: ABORT (safe, no one pre-committed)
Transaction aborts, no blocking!

Scenario 5.2: Coordinator Fails After Sending Some PRE_COMMIT Messages

In 3PC:

Some participants in PRE-COMMITTED state
Some participants in UNCERTAIN state (didn't receive PRE_COMMIT)
Termination protocol finds mix of PRE-COMMITTED and UNCERTAIN
Decision: COMMIT (PRE-COMMITTED proves unanimous YES)
All participants commit, no blocking!


Scenario 5.3: Coordinator Fails After Sending All PRE_COMMIT, Before Sending DO_COMMIT

In 3PC:

All participants in PRE-COMMITTED state
Termination protocol finds all PRE-COMMITTED
Decision: COMMIT
All participants commit, no blocking!

Scenario 5.4: Coordinator Fails After Sending Some DO_COMMIT Messages

In 3PC:

Some participants in COMMITTED state
Some participants in PRE-COMMITTED state
Termination protocol finds at least one COMMITTED
Decision: COMMIT
Remaining participants commit

This scenario is similar to 2PC in the sense that the decision is already determined, but unlike 2PC, participants can resolve it themselves without waiting for coordinator recovery.

Scenarios 3PC Handles Well

•Coordinator crash during vote collection
•Coordinator crash after votes, before PRE_COMMIT
•Coordinator crash during PRE_COMMIT phase
•Coordinator crash during DO_COMMIT phase
•Multiple node failures (as long as the surviving participants can still reach each other)
•Message loss (with retries)

Scenarios 3PC Struggles With

•Network partitions (critical flaw)
•Combined coordinator + partition failures
•Byzantine failures (not handled)
•Total network failure (all disconnected)
•Partition that separates all participants

The Network Partition Problem

While 3PC eliminates blocking under crash failures, it has a critical vulnerability: network partitions can cause inconsistency. This is 3PC's most significant limitation.

The Dangerous Scenario:

All participants vote YES, reach UNCERTAIN state
Coordinator sends PRE_COMMIT to some participants
Network partition occurs, splitting participants:
- Partition A: Contains coordinator and participants who received PRE_COMMIT (in PRE-COMMITTED state)
- Partition B: Contains participants who didn't receive PRE_COMMIT (in UNCERTAIN state)
Each partition assumes the unreachable nodes have crashed and runs the termination protocol on its own
Partition A: Sees PRE-COMMITTED participants, decides COMMIT
Partition B: Sees only UNCERTAIN participants, decides ABORT
Result: INCONSISTENCY! Some committed, some aborted.

Violated Safety

This scenario violates the fundamental atomicity property: different participants have reached different terminal states. Unlike 2PC's blocking (which sacrifices liveness for safety), 3PC's partition vulnerability sacrifices safety for liveness. Neither is acceptable in all scenarios.


Why This Happens:

The termination protocol assumes that if no participant is PRE-COMMITTED, it's safe to abort. But during a partition:

Partition B cannot see participants in Partition A who are PRE-COMMITTED
Partition B sees only UNCERTAIN participants, believes abort is safe
Meanwhile, Partition A sees PRE-COMMITTED participants and commits

The Fundamental Trade-off:

3PC's termination protocol assumes that all surviving participants can communicate with each other. With network partitions, this assumption breaks down. Each partition sees only a subset of participants and may make inconsistent decisions.
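
The divergence is easy to reproduce by applying the termination rules to each side of the partition separately. This is a toy sketch: the participant names `p1` to `p3` and the function `decide` are illustrative, and each partition is modeled as just the set of states it can observe.

```python
def decide(states):
    """Termination rules from the decision table, applied to visible states."""
    states = list(states)
    if "COMMITTED" in states:
        return "COMMIT"
    if "ABORTED" in states:
        return "ABORT"
    if "PRE-COMMITTED" in states:
        return "COMMIT"
    return "ABORT"


# The partition hits after PRE_COMMIT reached p1 and p2 but not p3.
partition_a = {"p1": "PRE-COMMITTED", "p2": "PRE-COMMITTED"}
partition_b = {"p3": "UNCERTAIN"}

decision_a = decide(partition_a.values())  # sees PRE-COMMITTED
decision_b = decide(partition_b.values())  # sees only UNCERTAIN
diverged = decision_a != decision_b
```

Each side applies the rules correctly to what it can see, yet `diverged` is true: the rules are sound only when the visible states are the states of all survivors.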

Quorum-Based Solutions:

Some 3PC variants use quorum-based voting to mitigate this:

Require a majority of participants to agree on the decision
Only the partition with a majority can make progress
The minority partition must block

However, this reintroduces some blocking (though less than 2PC) and requires careful configuration of quorum sizes.
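
A simple-majority version of this idea can be sketched as follows. It illustrates only the three bullet points above and is not a complete quorum-based commit protocol; real variants are more involved. The function name `quorum_decide` and the `"BLOCK"` return value are assumptions of this sketch.

```python
def quorum_decide(reachable_states, total_participants):
    """Decide only if a strict majority of all participants is reachable."""
    reachable = list(reachable_states)
    if 2 * len(reachable) <= total_participants:
        # Minority (or exact half): must wait for the partition to heal.
        return "BLOCK"
    if "COMMITTED" in reachable:
        return "COMMIT"
    if "ABORTED" in reachable:
        return "ABORT"
    if "PRE-COMMITTED" in reachable:
        return "COMMIT"
    return "ABORT"
```

With three participants split two against one, the pair of UNCERTAIN participants may abort, while a lone PRE-COMMITTED participant gets `"BLOCK"` instead of committing on its own.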

2PC vs 3PC: A Comprehensive Comparison

Let's systematically compare the two protocols across multiple dimensions to understand when each is appropriate.

Two-Phase Commit vs Three-Phase Commit
Dimension	2PC	3PC
Message Complexity	4n messages (2 rounds)	6n messages (3 rounds)
Latency	2 round trips (prepare, commit)	3 round trips (can-commit, pre-commit, do-commit)
Blocking under Crash Failures	Yes (coordinator crash blocks)	No (survivors can terminate)
Safety under Partitions	Safe (blocks rather than diverge)	UNSAFE (may diverge)
Implementation Complexity	Moderate	Higher
Practical Adoption	Widespread (XA, databases)	Rare
Log Writes per Transaction	2-3 forced writes	3-4 forced writes
Recovery Complexity	Moderate	Higher
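
The message counts in the table follow from one request and one reply per participant in each round. A quick check, assuming n participants, a failure-free run, and no retransmissions (the function name `message_count` is illustrative):

```python
def message_count(n, rounds):
    """Each round costs n coordinator messages plus n replies."""
    return 2 * n * rounds


n = 5
two_pc = message_count(n, rounds=2)    # 4n
three_pc = message_count(n, rounds=3)  # 6n
```

For five participants this gives 20 messages for 2PC and 30 for 3PC, a 50% increase for the extra round.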

When to Use 2PC:

Network partitions are the primary failure concern
Safety (never diverge) is more important than liveness (never block)
Latency is critical (fewer round trips)
The coordinator is highly available (replication, failover)
Existing infrastructure supports 2PC (XA standard)

When to Use 3PC:

Network is reliable, but node crashes are common
Blocking is absolutely unacceptable
Slightly higher latency is acceptable
Network partitions are rare or can be detected and handled specially
Custom implementation is feasible

Practical Reality

In practice, 2PC with highly available coordinators is more common than 3PC. Modern systems like CockroachDB, Spanner, and TiDB use 2PC combined with Paxos/Raft-replicated coordinators. This provides non-blocking behavior (coordinator failure triggers quick leader election) while maintaining safety during partitions.

Modern Alternatives and Variations

Both 2PC and 3PC have limitations. Modern distributed databases have evolved various approaches that combine ideas from both protocols with consensus algorithms.

1. Paxos Commit

Instead of a single coordinator, use a Paxos group as the coordinator:

The coordinator's decision is replicated via Paxos before being sent
Coordinator failure triggers Paxos leader election rather than blocking
Maintains safety during partitions (Paxos guarantees agreement)
Google Spanner uses a related design: 2PC across Paxos-replicated groups

2. Raft-Based Coordination

Similar to Paxos Commit but using Raft consensus:

Coordinator state replicated across a Raft group
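Leader failure triggers a new election after an election timeout (the Raft paper suggests 150-300 ms; production deployments often configure longer values)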
Minimal blocking duration in practice
Used by CockroachDB, TiDB, YugabyteDB

3. RAMP Transactions

Read Atomic Multi-Partition transactions:

Non-blocking even with partitions
Weaker isolation (Read Atomic rather than serializable)
Appropriate for many web application workloads

4. Saga Pattern

Compensating transactions instead of distributed commit:

Each operation has a compensating rollback operation
If any step fails, execute compensations for completed steps
Eventually consistent rather than immediately consistent
Common in microservices architectures
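
The compensation logic can be sketched in a few lines. This is a minimal single-process illustration, assuming each step is a pair of plain callables and that compensations themselves do not fail; `run_saga` is a hypothetical name, not an API of any saga framework.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action raises, run the compensations of the steps that
    already completed, in reverse order, and report failure.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

The failed step's own compensation is not run, since that step never completed. Between a step and its compensation, other transactions can observe the intermediate state, which is why sagas are described as eventually consistent.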

Distributed Transaction Approaches
Approach	Blocking Risk	Partition Safety	Complexity	Consistency
2PC	High	Safe (blocks)	Moderate	Strong
3PC	Low (crash only)	UNSAFE	Higher	Strong (crash failures only)
Paxos/Raft + 2PC	Very Low	Safe	High	Strong
Sagas	None	Safe	Moderate	Eventual
RAMP	None	Safe	Moderate	Read Atomic

The Industry Direction

The industry trend is toward '2PC + Consensus' hybrids. By replicating coordinator state via Paxos or Raft, systems get the safety of 2PC (no divergence during partitions) with the liveness of 3PC (minimal blocking because coordinator failure just triggers leader election). This is the approach used by most modern NewSQL databases.

Why Pure 3PC Isn't Used:

Partition Unsafety: The fundamental partition problem is too dangerous for production systems where network issues are common.
Complexity: The extra phase adds implementation and testing complexity without solving the hard cases (partitions).
Better Alternatives Exist: Paxos/Raft + 2PC achieves 3PC's non-blocking goals while maintaining partition safety.
Latency Matters: The extra round trip hurts performance without enough compensating benefit.

However, understanding 3PC remains valuable as it illuminates the fundamental trade-offs in distributed consensus and provides insight into why more sophisticated solutions evolved.

Summary: The Quest for Non-Blocking Commit

We've thoroughly explored the Three-Phase Commit protocol—its motivation, mechanics, advantages, and critical limitations. Let's consolidate the key insights:

Key Takeaways

•Motivation: 3PC was designed to eliminate 2PC's blocking problem, where coordinator failure leaves PREPARED participants stranded indefinitely.
•Three Phases: CAN_COMMIT (voting), PRE_COMMIT (prepare to commit), DO_COMMIT (actual commit). The middle phase is the key innovation.
•PRE-COMMITTED State: This intermediate state ensures that no participant can be COMMITTED while any other is merely UNCERTAIN, enabling non-blocking termination.
•Non-Blocking Under Crashes: When the coordinator crashes, surviving participants can elect a new coordinator and resolve the transaction based on their collective states.
•Network Partition Vulnerability: 3PC can violate atomicity during network partitions. Partitioned groups may reach different decisions, causing permanent inconsistency.
•Trade-off: 2PC sacrifices liveness for safety (blocks but never diverges). 3PC sacrifices safety for liveness (may diverge but doesn't block).
•Modern Approaches: Industry has largely moved to 2PC + Paxos/Raft, which provides non-blocking behavior while maintaining partition safety.
•Practical Value: Understanding 3PC illuminates fundamental distributed systems trade-offs, even though pure 3PC is rarely deployed.

Module Completion:

You have completed the Distributed Transactions module. You now understand:

The fundamental challenge of distributed atomicity
How Two-Phase Commit achieves atomic commit across distributed nodes
The distinct roles of coordinators and participants
How failures are handled (and where 2PC struggles)
How Three-Phase Commit attempts to solve blocking
Why modern systems combine 2PC with consensus protocols

This knowledge is essential for designing, implementing, and troubleshooting distributed database systems.

Module Complete

Congratulations! You've completed the Distributed Transactions module. You now possess a comprehensive understanding of atomic commit protocols—both 2PC and 3PC—their mechanics, trade-offs, and the insights that drive modern distributed database designs. Continue to the next module to explore the CAP Theorem and its implications for distributed systems.

5 / 5

Loading learning content...

Database Management SystemsDistributed Transactions

Distributed Transactions

LevelAdvanced

Duration75 mins

TopicDistributed Transactions

5 / 5

Three-Phase Commit (3PC) Overview

Beyond the Blocking Problem

What You Will Learn

Motivation: Why Three-Phase Commit?

To understand 3PC, we must first deeply analyze why 2PC blocks and what properties would be needed to avoid blocking.

The Root Cause of 2PC Blocking:

In 2PC, when a participant enters the PREPARED state:

It has voted VOTE_COMMIT
It has promised to follow the coordinator's decision
It doesn't know if other participants voted COMMIT or ABORT
It doesn't know the coordinator's decision

If the coordinator fails, the participant is stuck. It cannot:

Commit unilaterally: Other participants might have voted ABORT, and the coordinator decided ABORT
Abort unilaterally: Other participants might have received GLOBAL_COMMIT and already committed

The problem is asymmetric information: from the PREPARED state, the participant cannot distinguish between a future COMMIT and a future ABORT.

The Key Observation

The 3PC Solution: Add a Pre-Commit Phase

3PC inserts a PRE-COMMIT phase between PREPARED and COMMITTED:

Phase 1 (Can Commit?): Coordinator asks all participants if they can commit
Phase 2 (Pre-Commit): If all can commit, coordinator tells all to prepare to commit (PRE-COMMIT)
Phase 3 (Do Commit): Coordinator tells all to actually commit

The crucial property: Before any participant receives DO_COMMIT (and enters COMMITTED), all participants have received PRE_COMMIT (and entered PRE-COMMITTED state).

This means if the coordinator fails and any participant is in PRE-COMMITTED state, that participant knows:

All participants voted to commit (otherwise we wouldn't have reached PRE-COMMIT)
No participant has committed yet (Phase 3 hasn't happened)
Either commit or abort is safe—if all remaining participants agree

This additional knowledge enables non-blocking termination.

Blocking Vulnerability: 2PC vs 3PC
Scenario	2PC Behavior	3PC Behavior
Coordinator fails during vote collection	Participants abort or block until recovery	Participants abort or block until recovery
Coordinator fails after decision, before any notification	All PREPARED participants blocked	PRE-COMMIT phase prevents this exact scenario
Coordinator fails after some but not all notified	Some committed, others blocked	PRE-COMMIT ensures uniform state among participants
Network partition during commit phase	Blocked	May commit or abort incorrectly (see limitations)

The Three Phases in Detail

Let's examine each phase of the 3PC protocol in rigorous detail.

Phase 1: Can Commit (Voting Phase)

This phase is essentially identical to 2PC's prepare phase:

Coordinator sends CAN_COMMIT message to all participants
Each participant determines if it can commit:
- Checks constraints, resources, locks
- If yes: replies YES_VOTE and enters UNCERTAIN state
- If no: replies NO_VOTE and aborts locally
Coordinator collects all votes:
- If all YES_VOTE: proceed to Phase 2
- If any NO_VOTE or timeout: send ABORT to all and terminate

Important: At the end of Phase 1, if the coordinator hasn't received all YES_VOTEs, it aborts immediately. There is no ambiguity—the coordinator knows the transaction will abort.

Converting Mermaid diagram...

Phase 2: Pre-Commit (Prepare to Commit Phase)

This is the new phase that distinguishes 3PC from 2PC:

Coordinator sends PRE_COMMIT message to all participants
Each participant receives PRE_COMMIT:
- Writes durable log record indicating readiness
- Enters PRE-COMMITTED state
- Replies with ACK_PRE_COMMIT
Coordinator collects all acknowledgments:
- If all acknowledge: proceed to Phase 3
- If timeout: can still safely abort (no one has committed yet)

Critical Property: Receiving PRE_COMMIT tells a participant that:

All participants voted YES (unanimous consent)
The coordinator intends to commit
But no participant has committed yet

This is the key insight: the PRE-COMMITTED state indicates consensus to commit without irrevocable commitment.

Converting Mermaid diagram...

Phase 3: Do Commit (Commit Phase)

The final phase mirrors 2PC's commit phase:

Coordinator sends DO_COMMIT message to all participants
Each participant receives DO_COMMIT:
- Makes changes permanent
- Enters COMMITTED state
- Replies with DONE
Coordinator collects confirmations:
- Transaction complete when all DONE received

The Complete Message Flow:

3PC Message Sequence (Successful Commit)
Phase	Coordinator Message	Participant Response	State Transition
Phase 1	CAN_COMMIT	YES_VOTE	INITIAL → UNCERTAIN
Phase 2	PRE_COMMIT	ACK_PRE_COMMIT	UNCERTAIN → PRE-COMMITTED
Phase 3	DO_COMMIT	DONE	PRE-COMMITTED → COMMITTED

State Machines for 3PC

Understanding the state machines for both coordinator and participant is essential for reasoning about 3PC's non-blocking properties.

Coordinator State Machine:

INITIAL: Transaction executing
WAITING: Sent CAN_COMMIT, waiting for votes
PRE-COMMITTING: All YES_VOTEs received, sent PRE_COMMIT, waiting for ACKs
COMMITTING: All PRE_COMMIT ACKs received, sent DO_COMMIT
COMMITTED: All DONE received, transaction complete
ABORTED: Abort decision made at any point

Converting Mermaid diagram...

Participant State Machine:

INITIAL: Executing local operations
UNCERTAIN: Voted YES_VOTE, waiting for Phase 2 message
PRE-COMMITTED: Received PRE_COMMIT, ready to commit, waiting for DO_COMMIT
COMMITTED: Received DO_COMMIT, transaction completed
ABORTED: Received ABORT or decided to abort

The Key Non-Blocking Property:

Notice that in 3PC, before any participant reaches COMMITTED, all participants must pass through PRE-COMMITTED. This means:

If coordinator fails and ANY participant is PRE-COMMITTED, all survivors know the intent was to commit
If coordinator fails and NO participant is PRE-COMMITTED, all can safely abort
There's no scenario where one participant is COMMITTED while another is only UNCERTAIN

Converting Mermaid diagram...

Timeout Behavior Difference

Non-Blocking Termination

The primary advantage of 3PC over 2PC is its non-blocking property under certain failure assumptions. Let's understand how this works.

The Termination Protocol:

When the coordinator fails and a participant times out waiting for a message, the participant initiates a termination protocol:

Elect a new coordinator from surviving participants
New coordinator polls all reachable participants for their states
Based on collected states, new coordinator decides:
- If ANY participant is COMMITTED → Everyone should COMMIT
- If ANY participant is ABORTED → Everyone should ABORT
- If ANY participant is PRE-COMMITTED and none COMMITTED → Safe to COMMIT
- If ALL participants are UNCERTAIN (none PRE-COMMITTED) → Safe to ABORT
- If mix of UNCERTAIN and PRE-COMMITTED → COMMIT (PRE-COMMITTED means intent was commit)

3PC Termination Protocol Decision Rules
States Found Among Survivors	Decision	Rationale
Any COMMITTED	COMMIT	Someone already committed; all must commit
Any ABORTED	ABORT	Someone already aborted; all must abort
Any PRE-COMMITTED, none COMMITTED/ABORTED	COMMIT	All agreed, coordinator intended to commit
All UNCERTAIN	ABORT	No one reached PRE-COMMIT; safe to abort
Mix of UNCERTAIN and PRE-COMMITTED	COMMIT	PRE-COMMITTED proves unanimous YES

Why This Works:

The key invariant that 3PC maintains is:

No participant can be in COMMITTED state while any other participant is in UNCERTAIN state.

This is because:

A participant enters PRE-COMMITTED only after receiving PRE_COMMIT
The coordinator sends PRE_COMMIT to all participants before sending any DO_COMMIT
A participant enters COMMITTED only after receiving DO_COMMIT
Therefore, if anyone is COMMITTED, everyone must have passed through PRE-COMMITTED

Contrast with 2PC:

Converting Mermaid diagram...

The PRE-COMMIT Buffer

3PC Failure Scenarios

Let's trace through how 3PC handles various failure scenarios, comparing with 2PC behavior.

Scenario 5.1: Coordinator Fails After Receiving All YES_VOTEs, Before Sending PRE_COMMIT

In 2PC: All participants in PREPARED state, blocked forever In 3PC:

All participants in UNCERTAIN state
Timeout triggers termination protocol
New coordinator finds all UNCERTAIN
Decision: ABORT (safe, no one pre-committed)
Transaction aborts, no blocking!

Scenario 5.2: Coordinator Fails After Sending Some PRE_COMMIT Messages

In 3PC:

Some participants in PRE-COMMITTED state
Some participants in UNCERTAIN state (didn't receive PRE_COMMIT)
Termination protocol finds mix of PRE-COMMITTED and UNCERTAIN
Decision: COMMIT (PRE-COMMITTED proves unanimous YES)
All participants commit, no blocking!

Converting Mermaid diagram...

Scenario 5.3: Coordinator Fails After Sending All PRE_COMMIT, Before Sending DO_COMMIT

In 3PC:

All participants in PRE-COMMITTED state
Termination protocol finds all PRE-COMMITTED
Decision: COMMIT
All participants commit, no blocking!

Scenario 5.4: Coordinator Fails After Sending Some DO_COMMIT Messages

In 3PC:

Some participants in COMMITTED state
Some participants in PRE-COMMITTED state
Termination protocol finds at least one COMMITTED
Decision: COMMIT
Remaining participants commit

This scenario is similar to 2PC in the sense that the decision is already determined, but unlike 2PC, participants can resolve it themselves without waiting for coordinator recovery.

Scenarios 3PC Handles Well

•Coordinator crash during vote collection
•Coordinator crash after votes, before PRE_COMMIT
•Coordinator crash during PRE_COMMIT phase
•Coordinator crash during DO_COMMIT phase
•Multiple node failures (if quorum survives)
•Message loss (with retries)

Scenarios 3PC Struggles With

•Network partitions (critical flaw)
•Combined coordinator + partition failures
•Byzantine failures (not handled)
•Total network failure (all disconnected)
•Partition that separates all participants

The Network Partition Problem

While 3PC eliminates blocking under crash failures, it has a critical vulnerability: network partitions can cause inconsistency. This is 3PC's most significant limitation.

The Dangerous Scenario:

All participants vote YES, reach UNCERTAIN state
Coordinator sends PRE_COMMIT to some participants
Network partition occurs, splitting participants:
- Partition A: Contains coordinator and participants who received PRE_COMMIT (in PRE-COMMITTED state)
- Partition B: Contains participants who didn't receive PRE_COMMIT (in UNCERTAIN state)
Both partitions believe they have quorum and can proceed
Partition A: Sees PRE-COMMITTED participants, decides COMMIT
Partition B: Sees only UNCERTAIN participants, decides ABORT
Result: INCONSISTENCY! Some committed, some aborted.

Violated Safety

Converting Mermaid diagram...

Why This Happens:

The termination protocol assumes that if no participant is PRE-COMMITTED, it's safe to abort. But during a partition:

Partition B cannot see participants in Partition A who are PRE-COMMITTED
Partition B sees only UNCERTAIN participants, believes abort is safe
Meanwhile, Partition A sees PRE-COMMITTED participants and commits

The Fundamental Trade-off:

Quorum-Based Solutions:

Some 3PC variants use quorum-based voting to mitigate this:

Require a majority of participants to agree on the decision
Only the partition with a majority can make progress
The minority partition must block

However, this reintroduces some blocking (though less than 2PC) and requires careful configuration of quorum sizes.
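
A minimal sketch of the majority rule follows. It is deliberately simplified: it assumes a single strict-majority quorum, whereas real quorum-based variants typically use more elaborate rules and additional states. The function name `quorum_terminate` and the string state labels are assumptions made for illustration.

```python
def quorum_terminate(reachable_states, total_participants):
    """Decide only when a strict majority of all participants is reachable."""
    reachable = list(reachable_states)
    if len(reachable) * 2 <= total_participants:
        # Minority (or exactly half): block until the partition heals.
        return "BLOCK"
    if "ABORTED" in reachable:
        return "ABORT"
    if "COMMITTED" in reachable or "PRE-COMMITTED" in reachable:
        return "COMMIT"
    return "ABORT"


# Hypothetical 5-participant transaction split 2 / 3 by a partition.
minority = quorum_terminate(["PRE-COMMITTED", "PRE-COMMITTED"], 5)
majority = quorum_terminate(["UNCERTAIN", "UNCERTAIN", "UNCERTAIN"], 5)

# An even 2 / 2 split of 4 participants leaves neither side with a majority.
even_split = quorum_terminate(["PRE-COMMITTED", "UNCERTAIN"], 4)

print(minority, majority, even_split)  # BLOCK ABORT BLOCK
```

In the 2 / 3 split, the minority waits and later learns the majority's ABORT decision, so the two sides cannot diverge. The even split shows the cost: both sides block.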

2PC vs 3PC: A Comprehensive Comparison

Let's systematically compare the two protocols across multiple dimensions to understand when each is appropriate.

Two-Phase Commit vs Three-Phase Commit
Dimension	2PC	3PC
Message Complexity	4n messages (2 rounds)	6n messages (3 rounds)
Latency	2 RTTs to commit point	3 RTTs to commit point
Blocking under Crash Failures	Yes (coordinator crash blocks)	No (survivors can terminate)
Safety under Partitions	Safe (blocks rather than diverge)	UNSAFE (may diverge)
Implementation Complexity	Moderate	Higher
Practical Adoption	Widespread (XA, databases)	Rare
Log Writes per Transaction	2-3 forced writes	3-4 forced writes
Recovery Complexity	Moderate	Higher
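
The message counts in the table follow from simple arithmetic. The sketch below assumes each round consists of one coordinator request and one reply per participant, and it ignores retries and the termination protocol. The function name `message_count` is illustrative.

```python
def message_count(participants, rounds):
    """Messages in the failure-free case: one request and one reply per participant per round."""
    return 2 * participants * rounds


n = 5  # hypothetical number of participants
two_pc = message_count(n, rounds=2)    # 4n
three_pc = message_count(n, rounds=3)  # 6n

print(two_pc, three_pc)  # 20 30
```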

When to Use 2PC:

Network partitions are the primary failure concern
Safety (never diverge) is more important than liveness (never block)
Latency is critical (fewer round trips)
The coordinator is highly available (replication, failover)
Existing infrastructure supports 2PC (XA standard)

When to Use 3PC:

Network is reliable, but node crashes are common
Blocking is absolutely unacceptable
Slightly higher latency is acceptable
Network partitions are rare or can be detected and handled specially
Custom implementation is feasible

Practical Reality

Modern Alternatives and Variations

Both 2PC and 3PC have limitations. Modern distributed databases have evolved various approaches that combine ideas from both protocols with consensus algorithms.

1. Paxos Commit

Instead of a single coordinator, use a Paxos group as the coordinator:

The coordinator's decision is replicated via Paxos before being sent
Coordinator failure triggers Paxos leader election rather than blocking
Maintains safety during partitions (Paxos guarantees agreement)
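Google Spanner uses a closely related design: 2PC in which the coordinator and every participant is itself a Paxos-replicated group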

2. Raft-Based Coordination

Similar to Paxos Commit but using Raft consensus:

Coordinator state replicated across a Raft group
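Leader failure triggers a new election once the election timeout expires (the Raft paper suggests 150-300 ms; production systems often configure longer timeouts)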
Minimal blocking duration in practice
Used by CockroachDB, TiDB, YugabyteDB

3. RAMP Transactions

Read Atomic Multi-Partition transactions:

Non-blocking even with partitions
Weaker isolation (Read Atomic, which prevents fractured reads but is weaker than snapshot isolation or serializability)
Appropriate for many web application workloads

4. Saga Pattern

Compensating transactions instead of distributed commit:

Each operation has a compensating rollback operation
If any step fails, execute compensations for completed steps
Eventually consistent rather than immediately consistent
Common in microservices architectures
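
The compensation logic can be sketched in a few lines. This is an in-process illustration only: a real saga would persist its progress and run each step against a separate service. The function `run_saga` and the order-processing step names are hypothetical.

```python
def run_saga(steps):
    """Run (action, compensation) pairs in order; on failure, undo completed steps in reverse."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Compensate the steps that already succeeded, most recent first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []


def fail_shipping():
    raise RuntimeError("shipping unavailable")


steps = [
    (lambda: log.append("reserve inventory"), lambda: log.append("release inventory")),
    (lambda: log.append("charge card"), lambda: log.append("refund card")),
    (fail_shipping, lambda: log.append("cancel shipment")),
]

ok = run_saga(steps)
print(ok, log)
# False ['reserve inventory', 'charge card', 'refund card', 'release inventory']
```

Between the failed step and the last compensation, other readers can observe the intermediate state, which is why sagas are described as eventually consistent.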

Distributed Transaction Approaches
Approach	Blocking Risk	Partition Safety	Complexity	Consistency
2PC	High	Safe (blocks)	Moderate	Strong
3PC	Low (crash only)	UNSAFE	Higher	Strong*
Paxos/Raft + 2PC	Very Low	Safe	High	Strong
Sagas	None	Safe	Moderate	Eventual
RAMP	None	Safe	Moderate	Read Atomic

* Strong only in the absence of network partitions.

The Industry Direction

Why Pure 3PC Isn't Used:

Partition Unsafety: The fundamental partition problem is too dangerous for production systems where network issues are common.
Complexity: The extra phase adds implementation and testing complexity without solving the hard cases (partitions).
Better Alternatives Exist: Paxos/Raft + 2PC achieves 3PC's non-blocking goals while maintaining partition safety.
Latency Matters: The extra round trip hurts performance without enough compensating benefit.

However, understanding 3PC remains valuable as it illuminates the fundamental trade-offs in distributed consensus and provides insight into why more sophisticated solutions evolved.

Summary: The Quest for Non-Blocking Commit

We've explored the Three-Phase Commit protocol: its motivation, mechanics, advantages, and critical limitations. Let's consolidate the key insights:

Key Takeaways

•Motivation: 3PC was designed to eliminate 2PC's blocking problem, where coordinator failure leaves PREPARED participants stranded indefinitely.
•Three Phases: CAN_COMMIT (voting), PRE_COMMIT (prepare to commit), DO_COMMIT (actual commit). The middle phase is the key innovation.
•PRE-COMMITTED State: This intermediate state ensures that no participant can be COMMITTED while any other is merely UNCERTAIN, enabling non-blocking termination.
•Non-Blocking Under Crashes: When the coordinator crashes, surviving participants can elect a new coordinator and resolve the transaction based on their collective states.
•Network Partition Vulnerability: 3PC can violate atomicity during network partitions. Partitioned groups may reach different decisions, causing permanent inconsistency.
•Trade-off: 2PC sacrifices liveness for safety (blocks but never diverges). 3PC sacrifices safety for liveness (does not block under crash failures but may diverge under partitions).
•Modern Approaches: Industry has largely moved to 2PC + Paxos/Raft, which provides non-blocking behavior while maintaining partition safety.
•Practical Value: Understanding 3PC illuminates fundamental distributed systems trade-offs, even though pure 3PC is rarely deployed.

Module Completion:

You have completed the Distributed Transactions module. You now understand:

The fundamental challenge of distributed atomicity
How Two-Phase Commit achieves atomic commit across distributed nodes
The distinct roles of coordinators and participants
How failures are handled (and where 2PC struggles)
How Three-Phase Commit attempts to solve blocking
Why modern systems combine 2PC with consensus protocols

This knowledge is essential for designing, implementing, and troubleshooting distributed database systems.
