Loading content...
We've learned the rules of Two-Phase Locking: acquire locks during the growing phase, release during the shrinking phase, never acquire after releasing. These rules seem reasonable, but the critical question remains: Why do these rules guarantee correct concurrent execution?
This isn't a question of intuition—it's a question that demands mathematical proof. The correctness of 2PL was formally proven in 1976, establishing that any schedule produced by 2PL-compliant transactions is conflict serializable. This proof is one of the foundational results in database theory.
In this page, we'll walk through this proof step by step, building both formal understanding and practical intuition. By the end, you won't just believe that 2PL guarantees serializability—you'll understand why it must.
By the end of this page, you will understand the formal definition of conflict serializability, how the lock point determines serialization order, the proof that 2PL schedules are always conflict-serializable, and intuitive visualizations of why the two-phase structure works.
Before proving that 2PL guarantees serializability, let's ensure we have a crystal-clear understanding of what conflict serializability means.
Key Definitions:
Why Conflict Serializability Matters:
Serial schedules are trivially correct—there's no concurrency, so no concurrency problems can occur. But serial execution wastes resources (one transaction at a time).
Conflict serializability gives us the best of both worlds: concurrent execution with serial-equivalent results. If a concurrent schedule is conflict serializable, its effects are identical to some serial execution order. Users cannot distinguish the concurrent execution from a serial one.
| Operation 1 | Operation 2 | Conflict? | Reason |
|---|---|---|---|
| T₁: Read(A) | T₂: Read(A) | No | Both reads — order doesn't matter |
| T₁: Read(A) | T₂: Write(A) | Yes | Read-Write — order determines what T₁ sees |
| T₁: Write(A) | T₂: Read(A) | Yes | Write-Read — order determines what T₂ sees |
| T₁: Write(A) | T₂: Write(A) | Yes | Write-Write — order determines final value |
| T₁: Read(A) | T₂: Write(B) | No | Different items — no interference |
The precedence graph (also called the conflict graph or serialization graph) is the tool we use to test whether a schedule is conflict serializable.
Construction Rules:
12345678910111213141516171819202122232425262728293031
SCHEDULE S:R₁(A), W₂(A), R₂(B), W₁(B), R₃(A), W₃(B) Where:- R₁(A) means Transaction 1 reads A- W₂(A) means Transaction 2 writes A STEP 1: Identify conflicting pairs┌────────────────────────────────────────────────────────────┐│ Pair │ Conflict? │ Edge Added │├───────────────────┼───────────┼────────────────────────────┤│ R₁(A) - W₂(A) │ Yes │ T₁ → T₂ (R-W on A) ││ W₂(A) - R₃(A) │ Yes │ T₂ → T₃ (W-R on A) ││ R₂(B) - W₁(B) │ Yes │ T₂ → T₁ (R-W on B) ││ W₁(B) - W₃(B) │ Yes │ T₁ → T₃ (W-W on B) ││ R₂(B) - W₃(B) │ Yes │ T₂ → T₃ (R-W on B) │└────────────────────────────────────────────────────────────┘ STEP 2: Draw the precedence graph T₁ ←──── T₂ │ │ │ │ ↓ ↓ T₃ STEP 3: Check for cyclesPath: T₁ → T₂ → T₁ ← CYCLE DETECTED!(T₁ → T₂ from A, T₂ → T₁ from B) CONCLUSION: Schedule S is NOT conflict serializableThe cycle means there's no valid serial ordering.A schedule is conflict serializable if and only if its precedence graph is acyclic (has no cycles).
If the graph is acyclic, a topological sort of the nodes gives a valid serial ordering. If there's a cycle, no serial ordering can satisfy all the constraints—the schedule cannot be serialized.
The lock point of a transaction is the moment when it has acquired all the locks it will ever need—the transition point between the growing and shrinking phases. This concept is absolutely crucial to understanding why 2PL guarantees serializability.
Key Insight:
In a 2PL schedule, the order of lock points defines a valid serialization order. If transaction Tᵢ's lock point occurs before Tⱼ's lock point, then in the equivalent serial schedule, Tᵢ precedes Tⱼ.
Why Lock Point Order Works:
Think about what it means for Tᵢ's lock point to occur before Tⱼ's:
This creates a clear temporal ordering among transactions based on their lock points.
1234567891011121314151617181920
EXAMPLE: Three transactions with 2PL Time → T₁: |--Lock(A)--Lock(B)--|*LP₁*|--Unlock(B)--Unlock(A)--|COMMIT| T₂: |--Lock(C)------Wait(A)...|--Lock(A)--|*LP₂*|--Unlock--|COMMIT| T₃: |--Lock(D)--Wait(B).........|--Lock(B)--|*LP₃*|--Unlock--|COMMIT| Lock Point Order: LP₁ < LP₂ < LP₃ Therefore: Serial order T₁, T₂, T₃ is valid WHY THIS ORDER WORKS:- T₂ waited for A (held by T₁) → T₂ depends on T₁ finishing first- T₃ waited for B (held by T₁) → T₃ depends on T₁ finishing first- T₃ also waited for T₂ implicitly via the lock schedule The concurrent schedule is equivalent to: T₁; T₂; T₃ executing seriallyWe now present the formal proof that 2PL schedules are always conflict serializable. The proof proceeds by contradiction, showing that if a 2PL schedule were not serializable (i.e., had a cycle in its precedence graph), a contradiction would arise.
Theorem: Every legal schedule under Two-Phase Locking is conflict serializable.
Proof Structure:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748
PROOF BY CONTRADICTION ASSUMPTION: There exists a legal 2PL schedule S that is not conflict serializable. This means its precedence graph has a cycle. Let the cycle be: T₁ → T₂ → T₃ → ... → Tₖ → T₁ WHAT EACH EDGE MEANS:An edge Tᵢ → Tⱼ exists because there are conflicting operations whereTᵢ's operation on some item X precedes Tⱼ's conflicting operation on X. For Tᵢ to access X before Tⱼ, and for both to complete without conflict:- Tᵢ must HOLD a lock on X while accessing it- Tⱼ must WAIT for that lock until Tᵢ releases it- Therefore: unlock_i(X) < lock_j(X) in time EXAMINING THE CYCLE:T₁ → T₂: unlock₁(X₁) < lock₂(X₁) for some item X₁T₂ → T₃: unlock₂(X₂) < lock₃(X₂) for some item X₂...Tₖ → T₁: unlockₖ(Xₖ) < lock₁(Xₖ) for some item Xₖ NOW APPLY 2PL CONSTRAINTS:For each transaction Tᵢ following 2PL:- All lock operations precede all unlock operations- Therefore: lock_i(any) < unlock_i(any) COMBINING THE CONSTRAINTS:From T₁ → T₂: unlock₁(X₁) < lock₂(X₁)By 2PL for T₂: lock₂(X₂) < unlock₂(X₂)And: lock₂(X₁) ≤ lock₂(X₂) (all locks before any unlock)Therefore: unlock₁(X₁) < unlock₂(X₂) Extending this through the cycle:unlock₁(X₁) < unlock₂(X₂) < unlock₃(X₃) < ... < unlockₖ(Xₖ) < lock₁(Xₖ) But by 2PL for T₁: lock₁(Xₖ) < unlock₁(X₁) COMBINING:unlock₁(X₁) < ... < lock₁(Xₖ) < unlock₁(X₁) This says: unlock₁(X₁) < unlock₁(X₁) CONTRADICTION! A timestamp cannot be less than itself. CONCLUSION: Our assumption was false. No cycle can exist.Therefore, all 2PL schedules have acyclic precedence graphs.Therefore, all 2PL schedules are conflict serializable. ∎The proof shows that any cycle in the precedence graph would require a transaction to unlock something after it has already unlocked it (temporal impossibility). The two-phase structure—all locks before any unlocks—creates a "ratchet" effect that makes backward dependencies impossible.
The formal proof establishes correctness, but intuition helps us truly understand the mechanism. Let's build that intuition through several mental models:
Mental Model 1: The Claiming Game
Imagine transactions as players in a resource-claiming game:
Mental Model 4: The Handoff Chain
Think of conflicts as handoffs:
Why Cycles Are Impossible:
A cycle would require: T₁ hands to T₂ hands to T₃ hands to T₁
But "hands to" means "lock point is before". So we'd need:
This is impossible! You can't have LP₁ both before and after itself.
The key insight is that dependencies in 2PL only flow one way—from earlier lock points to later lock points. This unidirectional flow makes cycles mathematically impossible, guaranteeing a valid serial ordering exists.
Let's work through complete examples showing how to determine the equivalent serial order for 2PL schedules.
12345678910111213141516171819
SCHEDULE S1 (2PL Compliant): Time: 1 2 3 4 5 6 7 8 9 10 11 12 13 14T₁: L(A) R(A) L(B) R(B) U(A) U(B) COMMITT₂: L(C) R(C) L(A)--wait--→→ L(A) R(A) U(C) U(A) COMMIT Legend: L=Lock, R=Read, U=Unlock LOCK POINT ANALYSIS:- T₁'s lock point: After L(B) at time 4 (all locks acquired)- T₂'s lock point: After L(A) at time 9 (had to wait for T₁) Lock Point Order: LP₁ (time 4) < LP₂ (time 9) SERIAL ORDER: T₁ ; T₂ VERIFICATION:- T₂ read A after T₁ read A (if T₁ wrote, T₂ would see it)- This matches serial execution where T₁ completes before T₂ starts ✓The serializability guarantee of 2PL has profound implications for how database systems are designed and how applications interact with them.
For Database Engine Developers:
For Application Developers:
2PL guarantees serializability within its scope. It does NOT guarantee:
We've rigorously established why Two-Phase Locking guarantees conflict serializability. This guarantee is the foundation upon which all transactional database systems build their concurrency control. Let's consolidate our understanding:
What's next:
Now that we understand the power of 2PL, we must also understand its limitations. The next page examines the weaknesses and trade-offs of Two-Phase Locking: deadlocks, reduced concurrency, potential for starvation, and why variants like Strict 2PL and Rigorous 2PL were developed to address some of these issues.
You now understand why 2PL guarantees serializability, both formally and intuitively. This understanding is essential for database professionals—it explains why the protocol works, when it applies, and what assumptions underlie its guarantee. You're equipped to reason about transaction correctness at a deep level.