Two-Phase Locking delivers a powerful guarantee: conflict serializability. But like all engineering solutions, 2PL involves trade-offs. Understanding these limitations is crucial for database professionals who must balance correctness against performance, availability, and system complexity.
In this page, we'll examine the dark side of 2PL—the problems it creates even as it solves the serializability problem. We'll explore why transactions can become permanently stuck, why throughput can plummet under high load, and why some transactions may never complete despite having valid requests.
These aren't theoretical concerns. Every production database operator encounters these issues. Understanding them is the difference between firefighting symptoms and engineering solutions.
By the end of this page, you will understand the core limitations of Two-Phase Locking, why deadlocks are inherent to 2PL, how 2PL reduces concurrency and impacts throughput, what starvation means and why it can occur, and how these limitations motivated the development of 2PL variants and alternative protocols.
The most critical limitation of 2PL is its susceptibility to deadlock—a situation where two or more transactions are permanently blocked, each waiting for a lock held by another in the waiting group.
Why 2PL Enables Deadlocks:
2PL's growing phase allows transactions to acquire locks incrementally. This flexibility—acquiring locks as needed rather than all at once—is what makes 2PL practical for complex transactions. But it also creates the opportunity for circular waits.
```text
-- CLASSIC DEADLOCK SCENARIO

-- Initial state: Both accounts exist with balances
-- T₁ wants to transfer from A to B
-- T₂ wants to transfer from B to A

-- Time progression:
t₁: T₁: BEGIN TRANSACTION
t₂: T₁: Lock(A)  -- Granted (exclusive for update)
t₃: T₁: Read(A)  -- balance = $1000

t₄: T₂: BEGIN TRANSACTION
t₅: T₂: Lock(B)  -- Granted (exclusive for update)
t₆: T₂: Read(B)  -- balance = $500

t₇: T₁: Lock(B)  -- BLOCKED! T₂ holds this lock
                 -- T₁ enters wait queue for B

t₈: T₂: Lock(A)  -- BLOCKED! T₁ holds this lock
                 -- T₂ enters wait queue for A

-- DEADLOCK STATE:
-- T₁ holds Lock(A), waits for Lock(B)
-- T₂ holds Lock(B), waits for Lock(A)
-- Neither can proceed
-- Neither will release (2PL forbids release before lock point)
```

Deadlock requires all four Coffman conditions:

1. **Mutual exclusion**: a lock on an item can be held exclusively by one transaction at a time.
2. **Hold and wait**: transactions hold some locks while waiting for others.
3. **No preemption**: locks cannot be forcibly taken away from their holders.
4. **Circular wait**: a cycle of transactions exists, each waiting for a lock held by the next.
2PL inherently satisfies conditions 1, 2, and 3. Condition 4 emerges from transaction access patterns.
The Unavoidable Reality:
2PL cannot prevent deadlocks without additional mechanisms. The protocol guarantees serializability, not liveness. Deadlock handling must be added separately:
| Approach | Mechanism | Trade-off |
|---|---|---|
| Detection | Periodically check for cycles in wait-for graph | Allows deadlocks, then resolves them |
| Prevention | Wait-Die or Wound-Wait schemes | May abort transactions unnecessarily |
| Timeout | Abort transactions that wait too long | Simple but may abort non-deadlocked transactions |
| Avoidance | Banker's algorithm (rarely used in DBMS) | Too conservative; reduces concurrency |
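The detection approach in the table can be sketched as a depth-first search for a cycle in the wait-for graph. This is a minimal, illustrative Python version; the function name and the adjacency-list graph format are assumptions, not any particular DBMS's internals.

```python
# Deadlock detection sketch: DFS for a cycle in the wait-for graph.
# An edge "T1" -> "T2" means "T1 is waiting for a lock held by T2".

def find_cycle(wait_for):
    """Return a list of transactions forming a cycle, or None if deadlock-free."""
    visiting, done = set(), set()

    def dfs(txn, path):
        visiting.add(txn)
        path.append(txn)
        for holder in wait_for.get(txn, ()):
            if holder in visiting:
                return path[path.index(holder):]   # back edge: cycle found
            if holder not in done:
                cycle = dfs(holder, path)
                if cycle:
                    return cycle
        visiting.discard(txn)
        done.add(txn)
        path.pop()
        return None

    for txn in list(wait_for):
        if txn not in done:
            cycle = dfs(txn, [])
            if cycle:
                return cycle
    return None
```

For the transfer scenario above, `find_cycle({"T1": ["T2"], "T2": ["T1"]})` reports the T₁/T₂ cycle; a real detector would then pick one transaction in the cycle as the victim and abort it.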
2PL's requirement to hold every lock at least until the lock point, and in Strict 2PL until commit, significantly reduces the potential for concurrent execution. This is the fundamental trade-off: correctness via lock holding versus concurrency via lock release.
The Concurrency Tax:
| Scenario | Without Locking | With 2PL | Reduction |
|---|---|---|---|
| Two readers, same row | Both execute in parallel | Both execute in parallel | 0% |
| Reader + Writer, same row | Could overlap | Writer blocks reader (or vice versa) | ~50% |
| Two writers, same row | Could overlap | Strictly sequential | ~50% |
| Long transaction + Short transactions | Short ones complete quickly | Short ones wait for long one | Severe |
| Hot spot access | High parallelism | Serialized access | Dramatic |
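The first three rows of the table reduce to the standard S/X compatibility rule: only shared-shared can coexist. A minimal sketch, with illustrative helper names rather than any real lock manager's API:

```python
# S/X lock compatibility behind the table above: only S/S coexists.

def compatible(held, requested):
    """Two lock modes can coexist only if both are shared."""
    return held == "S" and requested == "S"

def can_grant(requested, held_modes):
    """A request is grantable only if compatible with every currently held lock."""
    return all(compatible(h, requested) for h in held_modes)
```

Two readers pass (`can_grant("S", ["S"])`), while any combination involving a writer must serialize, which is where the ~50% reduction rows in the table come from.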
The Lock Duration Problem:
Consider the timeline of a transaction under different 2PL variants:
```text
Transaction Timeline: Process order, update inventory, charge payment

BASIC 2PL (theoretical; rarely used):
|--Lock(O)--Read(O)--Unlock(O)--Lock(I)--Write(I)--Unlock(I)--Lock(P)--Write(P)--Unlock(P)--|
[  Growing  ][                 Shrinking (releases as it goes)                              ]

Lock(O) held: ████░░░░░░░░░░░░░░░░░░░░
Lock(I) held: ░░░░░░░░████░░░░░░░░░░░░
Lock(P) held: ░░░░░░░░░░░░░░░░████░░░░

STRICT 2PL (standard implementation):
|--Lock(O)--Read(O)--Lock(I)--Write(I)--Lock(P)--Write(P)--COMMIT--Unlocks--|
[                         Growing                         ][   Shrinking    ]

Lock(O) held: ████████████████████████
Lock(I) held: ░░░░░░░░████████████████
Lock(P) held: ░░░░░░░░░░░░░░░░████████

COMPARISON:
- Basic 2PL: each lock held only while needed
- Strict 2PL: all locks held until commit
- Resource O is locked 100% of the transaction time in Strict 2PL vs ~20% in Basic 2PL
- Other transactions wanting O must wait up to 5x longer under Strict 2PL
```

When a long-running transaction holds locks on hot resources, short transactions queue up behind it like cars behind a slow truck on a single-lane road. Even after the long transaction completes, the queued transactions execute sequentially, creating a 'convoy' that takes far longer than the sum of individual transaction times.
Quantifying the Impact:
Consider a system with:
Without contention: Throughput = 100 TPS, Latency = 10ms
With 2PL on hot spot:
Starvation occurs when a transaction is repeatedly delayed indefinitely, never able to acquire the locks it needs despite the system not being in deadlock. Unlike deadlock, which involves mutual blocking, starvation involves unfair scheduling.
How Starvation Happens:
```text
SCENARIO: Writer starvation amid continuous readers

Resource: Inventory count for a popular product
- Many transactions read inventory (for display)
- Occasional transactions write inventory (for orders)

TIME SEQUENCE:
t₀: R₁ starts, acquires S-lock on Inventory
t₁: R₂ starts, acquires S-lock (compatible with R₁)
t₂: W₁ starts, requests X-lock, WAITS (blocked by R₁, R₂)
t₃: R₃ starts, acquires S-lock (compatible with R₁, R₂)
t₄: R₁ finishes, releases S-lock
t₅: R₄ starts, acquires S-lock (compatible with R₂, R₃)
t₆: R₂ finishes, releases S-lock
t₇: R₅ starts, acquires S-lock (compatible with R₃, R₄)
...

W₁ NEVER ACQUIRES THE LOCK!
As long as new readers keep arriving before all existing readers finish,
W₁ remains in the wait queue indefinitely.
```

This is starvation, not deadlock: no circular wait exists, and the system is making progress (readers complete), but W₁ is unfairly excluded.

Many databases implement a read barrier: when a write request arrives while readers hold the lock, new read requests are queued behind the writer. Existing readers complete, then the writer executes, then queued readers proceed. This prevents writer starvation at the cost of slightly increased read latency.
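The read-barrier policy just described can be sketched as a FIFO lock queue: any request that arrives while others are waiting queues behind them, so new readers cannot leapfrog a waiting writer. This is a simplified, single-threaded Python model; the class and method names are illustrative assumptions.

```python
from collections import deque

class LockQueue:
    """Toy lock queue with FIFO fairness (acts as a read barrier)."""

    def __init__(self):
        self.holders = {}        # txn -> "S" or "X"
        self.waiters = deque()   # (txn, mode) in arrival order

    def request(self, txn, mode):
        """Grant immediately if possible; otherwise queue. Returns True if granted."""
        if self._grantable(mode):
            self.holders[txn] = mode
            return True
        self.waiters.append((txn, mode))
        return False

    def _grantable(self, mode):
        if self.waiters:         # the barrier: never jump ahead of waiters
            return False
        if mode == "S":
            return all(m == "S" for m in self.holders.values())
        return not self.holders  # X requires exclusivity

    def release(self, txn):
        """Release txn's lock, then grant queued requests in FIFO order."""
        del self.holders[txn]
        while self.waiters:
            t, m = self.waiters[0]
            s_ok = m == "S" and all(v == "S" for v in self.holders.values())
            x_ok = m == "X" and not self.holders
            if s_ok or x_ok:
                self.waiters.popleft()
                self.holders[t] = m
            else:
                break
```

Replaying the starvation scenario against this queue, R₃ queues behind W₁ instead of joining R₁ and R₂, so W₁ is granted the lock as soon as the original readers finish.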
In basic 2PL (where locks can be released before commit), a dangerous situation can arise: cascading aborts. When one transaction aborts, other transactions that read its uncommitted data must also abort, potentially triggering a chain reaction.
The Cascading Abort Scenario:
```text
-- CASCADING ABORT SCENARIO (Basic 2PL)

-- Initial: Account.balance = $1000

t₁: T₁: Lock(Account)
t₂: T₁: Write(Account) ← $1500   -- Adding bonus
t₃: T₁: Unlock(Account)          -- Basic 2PL allows early unlock
     -- T₁ is now in shrinking phase

t₄: T₂: Lock(Account)            -- Granted! T₁ released
t₅: T₂: Read(Account) → $1500    -- T₂ reads T₁'s uncommitted data
t₆: T₂: Unlock(Account)

t₇: T₃: Lock(Account)
t₈: T₃: Read(Account) → $1500    -- T₃ also reads uncommitted data
t₉: T₃: Based on $1500, approves loan

t₁₀: T₁: ABORT!                  -- T₁'s bonus processing fails!
                                 -- Account must revert to $1000

-- CASCADE EFFECT:
-- T₂ read $1500, which was never committed → T₂ must ABORT
-- T₃ read $1500, which was never committed → T₃ must ABORT
-- T₃ approved a loan based on non-existent funds!

-- This is a RECOVERABILITY violation
-- The schedule was serializable but NOT recoverable
```

Cascading aborts create severe problems: completed work must be undone, aborts multiply unpredictably, and, as the loan approval shows, decisions may be taken on data that never officially existed.
The Recoverability Issue:
A schedule is recoverable if, for every transaction Tⱼ that reads data written by Tᵢ, the commit of Tⱼ occurs after the commit of Tᵢ. Basic 2PL does NOT guarantee recoverability.
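The recoverability condition can be checked mechanically over a simple schedule representation. Here is a sketch, assuming operations are encoded as `("W", txn, item)`, `("R", txn, item)`, and `("C", txn)` tuples; this encoding is an illustrative assumption.

```python
def is_recoverable(schedule):
    """True if every committed reader commits after the writer it read from."""
    last_writer = {}      # item -> txn that most recently wrote it
    reads_from = set()    # (reader, writer) dependencies
    commit_pos = {}       # txn -> position of its commit
    for pos, op in enumerate(schedule):
        kind, txn = op[0], op[1]
        if kind == "W":
            last_writer[op[2]] = txn
        elif kind == "R":
            writer = last_writer.get(op[2])
            if writer is not None and writer != txn:
                reads_from.add((txn, writer))
        elif kind == "C":
            commit_pos[txn] = pos
    # A committed reader is safe only if its writer committed earlier.
    return all(
        writer in commit_pos and commit_pos[reader] > commit_pos[writer]
        for reader, writer in reads_from
        if reader in commit_pos
    )
```

A schedule where T₂ commits a read of T₁'s data before T₁ commits (or where T₁ never commits at all) fails this check, exactly the situation Basic 2PL permits.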
Solution: Strict 2PL
Strict 2PL solves this by holding all exclusive locks until commit. Since other transactions cannot read T₁'s writes until T₁ commits, cascading aborts become impossible. This is why virtually all production systems use Strict 2PL despite its lower concurrency.
| Variant | Lock Release Timing | Cascading Aborts | Recoverability |
|---|---|---|---|
| Basic 2PL | After lock point (anywhere) | Possible | NOT guaranteed |
| Strict 2PL | Exclusive locks at commit | Impossible | Guaranteed |
| Rigorous 2PL | ALL locks at commit | Impossible | Guaranteed |
Standard 2PL with row-level locking cannot prevent phantom reads—a situation where a transaction sees different sets of rows for the same query because another transaction inserted or deleted rows matching the query criteria.
Why Phantoms Escape Row Locks:
```text
-- PHANTOM READ SCENARIO

-- T₁: Calculate total salary expense for the Engineering department
-- T₂: Add a new engineer

-- T₁'s execution:
T₁: SELECT SUM(salary) FROM employees WHERE dept = 'Engineering';
    -- Scans existing rows, acquires S-locks on rows with dept = 'Engineering'
    -- Finds: Alice ($100k), Bob ($120k), Charlie ($90k)
    -- Result: $310k total

-- Meanwhile, T₂ executes:
T₂: INSERT INTO employees (name, dept, salary)
    VALUES ('Diana', 'Engineering', 130000);
    -- Acquires X-lock on the NEW row
    -- No conflict with T₁'s locks (Diana's row didn't exist during T₁'s scan)
T₂: COMMIT;

-- T₁ continues:
T₁: SELECT COUNT(*) FROM employees WHERE dept = 'Engineering';
    -- Scans again, now sees 4 rows!
    -- Alice, Bob, Charlie, Diana (PHANTOM!)

-- T₁ computes average: $310k / 4 = $77.5k (WRONG!)
-- Correct average: $440k / 4 = $110k

-- THE PROBLEM:
-- T₁ locked existing rows but couldn't lock a row that didn't exist yet
-- Diana's row is a "phantom" that appeared between T₁'s operations
```

Why Row Locks Are Insufficient:
2PL locks existing data items, but phantom problems involve data items that do not yet exist at query time: there is no row on which to place a lock until the insert actually happens.
This is a crucial limitation often overlooked. Standard 2PL with row locking guarantees conflict serializability only for operations on existing rows. To achieve true serializability including phantom prevention, additional mechanisms (predicate locks, gap locks, or MVCC) are required.
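A predicate lock can be modeled as a stored condition that incoming inserts are tested against. The following is a toy sketch with assumed class and method names; real systems such as InnoDB use gap and next-key locks on index ranges rather than arbitrary predicates, precisely because general predicate tests are expensive.

```python
class PredicateLockTable:
    """Toy predicate-lock table: readers register predicates, inserts are tested."""

    def __init__(self):
        self.locks = []   # (txn, predicate) pairs

    def read_with_predicate(self, txn, predicate):
        """Register that txn's query depends on all rows matching predicate."""
        self.locks.append((txn, predicate))

    def conflicting_txns(self, txn, new_row):
        """Transactions whose predicate the proposed insert would violate."""
        return [t for t, pred in self.locks if t != txn and pred(new_row)]
```

In the salary scenario, T₁ would register `dept == 'Engineering'`; T₂'s insert of Diana matches that predicate, so the insert blocks until T₁ finishes, and the phantom never appears.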
2PL performance can degrade dramatically as contention increases. The relationship between throughput and concurrency often follows a pattern that surprises developers accustomed to linear scaling.
The Contention Curve:
```text
THROUGHPUT vs. CONTENTION LEVEL

Throughput |            *****
   (TPS)   |         ***     ***
           |       **           ***
           |      *               ****
           |     *                    *****
           |    *                          ****
           |   *
           +-------------------------------------
              Low    Medium    High    Very High
                      Contention Level

PHASES:
1. LOW CONTENTION: near-linear scaling
   - Transactions rarely conflict
   - Lock waits are brief
   - Throughput grows with concurrency

2. MODERATE CONTENTION: diminishing returns
   - Some lock conflicts appear
   - Waiting time increases
   - Throughput growth slows

3. HIGH CONTENTION: plateau or decline
   - Frequent lock conflicts
   - Significant waiting time
   - Adding more transactions doesn't increase throughput

4. VERY HIGH CONTENTION: thrashing
   - Constant blocking
   - Deadlock rate increases
   - Throughput actually DECREASES
   - More concurrent transactions = WORSE performance
```

The Mathematics of Contention:
For a simple model, assume each pair of concurrent transactions conflicts independently with probability p. The probability that a given transaction conflicts with at least one of n others is then P(conflict) = 1 - (1 - p)^n.

For p = 0.01 (a 1% pairwise conflict probability):
| n (Concurrent Txns) | P(Conflict) |
|---|---|
| 5 | 4.9% |
| 10 | 9.6% |
| 50 | 39.5% |
| 100 | 63.4% |
| 200 | 86.6% |
Even with only a 1% pairwise conflict probability, at 200 concurrent transactions a given transaction has an 87% chance of conflicting with at least one other, so someone is almost always waiting.
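The table's values follow P(conflict) = 1 - (1 - p)^n, the probability of conflicting with at least one of n concurrent transactions under an independent pairwise conflict probability p. A quick sketch to reproduce them:

```python
def p_conflict(p, n):
    """Probability of at least one conflict among n independent chances at p each."""
    return 1 - (1 - p) ** n

# Reproduce the contention table for p = 0.01:
for n in (5, 10, 50, 100, 200):
    print(f"n={n:>3}: {p_conflict(0.01, n):.1%}")
# Prints 4.9%, 9.6%, 39.5%, 63.4%, 86.6% -- matching the table above.
```

The model deliberately ignores queueing effects and deadlocks, so real systems degrade even faster than these numbers suggest.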
Real-World Impact:
Hot spots (heavily accessed data) amplify this dramatically. A single frequently-updated counter, a popular product's inventory, or a session table can become a serialization bottleneck that limits the entire system's throughput.
2PL's limitations have motivated the development of alternative concurrency control mechanisms. Understanding how 2PL compares helps in choosing the right approach for specific workloads.
| Aspect | 2PL | Timestamp Ordering | MVCC | Optimistic CC |
|---|---|---|---|---|
| Basic Mechanism | Lock before access | Timestamp comparison | Multiple versions | Validate at commit |
| Blocking? | Yes (waits for locks) | No (abort instead) | Readers never block | No (abort at end) |
| Deadlock Risk | High | None | None (for reads) | None |
| Abort Rate | Low | Can be high | Low | High under contention |
| Phantom Prevention | Requires predicate locks | Natural | Requires SSI | Requires commit validation |
| Best For | OLTP with moderate contention | Distributed systems | Read-heavy workloads | Low-contention workloads |
Modern Hybrid Approaches:
Most production databases don't use pure 2PL. Instead, they combine mechanisms, typically pairing multi-version reads with lock-based writes.
These hybrids attempt to get the best of both worlds: 2PL's simplicity and correctness for writes, combined with MVCC's reader/writer non-blocking for reads.
Despite its limitations, 2PL remains the foundation of write concurrency control in virtually all major databases. Its theoretical guarantees, decades of engineering refinement, and well-understood behavior make it the practical choice. Alternatives typically supplement rather than replace 2PL.
Two-Phase Locking provides a powerful guarantee of serializability, but at significant cost. Understanding these trade-offs is essential for effective database system design and operation.
Module Conclusion:
You've now completed the Two-Phase Locking module. You understand why deadlocks are inherent to 2PL, how lock holding reduces concurrency and throughput, why starvation and cascading aborts can occur, why phantoms escape row-level locks, and how these limitations motivated 2PL variants and alternative protocols.
Armed with this knowledge, you're prepared to explore 2PL variants (Strict, Rigorous, Conservative) in the next module, and eventually alternative protocols like timestamp ordering and MVCC.
Congratulations! You've mastered Two-Phase Locking—the cornerstone of database concurrency control. You understand both its power (guaranteed serializability) and its limitations (deadlocks, reduced concurrency, starvation). This knowledge forms the foundation for understanding all advanced concurrency control topics.