We've established that cascadeless schedules prevent cascading rollbacks by ensuring transactions only read committed data. But consider this scenario: T₁ writes X = 100, T₂ then overwrites X = 200, T₁ commits, and finally T₂ aborts.
Question: What should X be after T₂'s abort?
The answer seems obvious: restore X to T₁'s committed value of 100. But the recovery process is surprisingly tricky. T₂'s abort needs to undo its write, and the natural way to do that is to restore the "before image" (the value X held just before T₂'s write). That before image is 100, a value T₁ wrote while still uncommitted, so restoring it gives the right answer only because T₁ happened to commit. What if T₁ had aborted instead?
Strict schedules eliminate this complexity entirely. By preventing any transaction from reading OR writing data that another uncommitted transaction has written, strict schedules make recovery trivially simple: undo using before-images, redo using after-images, with no concern for uncommitted transaction interactions.
By the end of this page, you will understand the formal definition of strict schedules, why they provide the strongest recoverability guarantees, how strict schedules simplify recovery operations, the relationship between strictness and locking protocols, and the performance implications of strictness in practice.
Cascadeless schedules handle reads of uncommitted data, but they don't address writes to data that's been written by an uncommitted transaction. This creates a subtle recovery problem.
The Write-Write Dependency Problem:
```
Initial State: X = 50 (committed)

Time   T₁              T₂              X Value    Notes
────   ──────────────  ──────────────  ─────────  ───────────────────────────────
t1     W(X) ← 100                      100        T₁ writes (uncommitted)
t2                     W(X) ← 200      200        T₂ overwrites (uncommitted)
t3     C                               200        T₁ commits
t4                     A               ???        T₂ aborts - what should X be?

RECOVERY DILEMMA:

Option A: Restore T₂'s "before image" (what X was before T₂'s write)
- Before image = 100 (T₁'s uncommitted value at t2)
- But T₁ has now committed, so 100 is actually correct
- This works... but only by accident

Option B: What if the order was different?

Time   T₁              T₂              X Value
────   ──────────────  ──────────────  ─────────
t1     W(X) ← 100                      100        T₁ writes
t2                     W(X) ← 200      200        T₂ writes
t3                     C               200        T₂ commits
t4     A                               ???        T₁ aborts

Now what? T₁'s before image was 50, but restoring 50 would undo
T₂'s committed value of 200! We have a dependency conflict.
```

The core issue is that when transactions can write to data modified by uncommitted transactions, before-images become unreliable for recovery. The before-image might be another transaction's uncommitted value, which is correct only if that transaction later commits (Option A), or a value that a committed transaction has since replaced, so restoring it destroys committed work (Option B).
Strict schedules solve this by preventing the situation entirely.
Without strictness, before-images for recovery might contain uncommitted data from other transactions. This means undoing one transaction could accidentally undo or corrupt another transaction's changes, even if that other transaction commits.
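To see the failure mode concretely, here is a minimal sketch (hypothetical code, not from the original text; the values mirror Option B above) of a recovery manager that naively restores before-images without enforcing strictness:

```python
# Minimal sketch of naive before-image recovery (hypothetical example).
# Values mirror "Option B" above: T1 writes X, T2 overwrites and commits,
# then T1 aborts and its stale before-image clobbers T2's committed value.

data = {"X": 50}                 # committed initial state
before_images = {}               # txn_id -> {item: value in place just before its write}

def write(txn_id, item, value):
    # Record whatever value is currently in place, committed or not.
    before_images.setdefault(txn_id, {})[item] = data[item]
    data[item] = value

def naive_abort(txn_id):
    # Restore before-images blindly -- unsafe without strictness.
    for item, old in before_images.pop(txn_id, {}).items():
        data[item] = old

write("T1", "X", 100)            # T1 writes X = 100 (uncommitted)
write("T2", "X", 200)            # T2 overwrites X = 200, then commits: X = 200 is durable
naive_abort("T1")                # T1 aborts; its recorded before-image of X is 50

print(data["X"])                 # 50 -- T2's committed value of 200 has been lost
```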
Definition: A schedule S is strict if and only if, for every transaction Tᵢ in S that writes a data item X:
No other transaction Tⱼ can read or write X until Tᵢ has committed or aborted.
Formally: If Tᵢ writes X at time t and Tⱼ (j ≠ i) reads or writes X at a later time t' > t, then:
t' > commit(Tᵢ) or t' > abort(Tᵢ)
In simpler terms: Once a transaction writes to a data item, that item is "locked" until the transaction completes (commits or aborts). No other transaction can touch it.
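To make the definition concrete, here is a small sketch (the schedule representation and function name are illustrative, not from the original) that scans a schedule of (transaction, operation, item) steps and reports whether it is strict:

```python
# Sketch: check strictness of a schedule given as a list of steps.
# Each step is (txn, op, item) with op in {"R", "W", "C", "A"};
# commit/abort steps use item=None. Representation is illustrative only.

def is_strict(schedule):
    dirty = {}          # item -> txn with an uncommitted write on it
    for txn, op, item in schedule:
        if op in ("C", "A"):
            # Transaction finished: its writes are no longer "dirty".
            dirty = {x: t for x, t in dirty.items() if t != txn}
        elif op in ("R", "W"):
            writer = dirty.get(item)
            if writer is not None and writer != txn:
                # Reading or writing an item with an uncommitted write: not strict.
                return False
            if op == "W":
                dirty[item] = txn
    return True

# The opening scenario: T2 writes X while T1's write is still uncommitted.
not_strict = [("T1", "W", "X"), ("T2", "W", "X"), ("T1", "C", None), ("T2", "A", None)]
strict     = [("T1", "W", "X"), ("T1", "C", None), ("T2", "W", "X"), ("T2", "A", None)]

print(is_strict(not_strict))   # False
print(is_strict(strict))       # True
```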
Comparison with cascadeless:
| Property | Dirty Read | Write After Uncommitted Write | Guarantees |
|---|---|---|---|
| Recoverable | Allowed (reader must commit after writer) | Allowed | Recovery is possible |
| Cascadeless | Prevented | Allowed | No cascading rollbacks |
| Strict | Prevented | Prevented | Simple before-image recovery |
Set-theoretic relationship:
Strict Schedules ⊂ Cascadeless Schedules ⊂ Recoverable Schedules ⊂ All Schedules
Strict schedules represent the strongest practical recoverability property. There's an even stronger property (rigorous schedules, which additionally prevent writes to data that an uncommitted transaction has read), but strict schedules are the most commonly used in practice because they enable efficient recovery while maintaining good concurrency.
The power of strict schedules lies in their recovery properties. With strictness, the database can use a simple, efficient recovery algorithm:
Before-Image Reliability:
In a strict schedule, when transaction Tᵢ writes to data item X, the before-image it records is guaranteed to be a committed value: no other uncommitted transaction can have a pending write on X, so undoing Tᵢ never disturbs another transaction's changes.
Undo Recovery Algorithm (Strict Schedule):
```
For each aborted transaction Tᵢ:
    For each data item X that Tᵢ wrote:
        Restore X to its before-image (guaranteed committed value)
    Mark Tᵢ as aborted

Done. No need to check other transactions.
```
```python
from dataclasses import dataclass, field
from typing import Dict, List, Any, Optional
from enum import Enum


class TransactionStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"


@dataclass
class LogEntry:
    txn_id: str
    item: str
    before_image: Any
    after_image: Any
    operation: str  # "WRITE", "COMMIT", "ABORT"


@dataclass
class StrictScheduleRecovery:
    """
    Demonstrates why strict schedules enable simple recovery.

    Key insight: In a strict schedule, before-images are always
    committed values, making undo operations trivially correct.
    """
    committed_data: Dict[str, Any] = field(default_factory=dict)
    current_data: Dict[str, Any] = field(default_factory=dict)
    log: List[LogEntry] = field(default_factory=list)
    locks: Dict[str, str] = field(default_factory=dict)  # item -> txn holding exclusive lock

    def write(self, txn_id: str, item: str, value: Any) -> bool:
        """
        Write with strict scheduling - must acquire exclusive lock.
        Lock held until commit/abort (enforces strictness).
        """
        # Check for strict scheduling constraint
        if item in self.locks and self.locks[item] != txn_id:
            print(f"  {txn_id}: BLOCKED - {item} locked by {self.locks[item]} (strict)")
            return False

        # Acquire exclusive lock (or already have it)
        self.locks[item] = txn_id

        # Before image is ALWAYS the committed value (strict guarantee)
        before = self.committed_data.get(item)
        after = value
        self.log.append(LogEntry(txn_id, item, before, after, "WRITE"))
        self.current_data[item] = value
        print(f"  {txn_id}: W({item}) = {value}, before_image = {before} (guaranteed committed)")
        return True

    def commit(self, txn_id: str) -> None:
        """Commit - make writes permanent and release locks."""
        # Update committed data
        for entry in self.log:
            if entry.txn_id == txn_id and entry.operation == "WRITE":
                self.committed_data[entry.item] = entry.after_image

        # Release locks
        items_to_unlock = [item for item, holder in self.locks.items() if holder == txn_id]
        for item in items_to_unlock:
            del self.locks[item]

        self.log.append(LogEntry(txn_id, "", None, None, "COMMIT"))
        print(f"  {txn_id}: COMMIT - locks released, data committed")

    def abort(self, txn_id: str) -> None:
        """
        Abort - restore before-images (simple due to strictness).

        In a strict schedule, this is trivially correct:
        - Before-images are always committed values
        - No other transaction has written to our items
        - Simply restore and release locks
        """
        print(f"  {txn_id}: ABORT - restoring before-images...")

        # Restore before-images (guaranteed correct due to strictness)
        for entry in reversed(self.log):
            if entry.txn_id == txn_id and entry.operation == "WRITE":
                old_value = self.current_data.get(entry.item)
                self.current_data[entry.item] = entry.before_image
                print(f"    Restore {entry.item}: {old_value} → {entry.before_image} (committed value)")

        # Release locks
        items_to_unlock = [item for item, holder in self.locks.items() if holder == txn_id]
        for item in items_to_unlock:
            del self.locks[item]
            print(f"    Release lock on {item}")

        self.log.append(LogEntry(txn_id, "", None, None, "ABORT"))
        print(f"  {txn_id}: ABORT complete - no cascade, no dependency analysis needed")

    def show_state(self):
        print(f"  Current data: {self.current_data}")
        print(f"  Committed data: {self.committed_data}")
        print(f"  Active locks: {self.locks}")


# Demonstrate strict schedule recovery
print("=" * 70)
print("STRICT SCHEDULE RECOVERY DEMONSTRATION")
print("=" * 70)

db = StrictScheduleRecovery()
db.committed_data = {"X": 50, "Y": 100}
db.current_data = {"X": 50, "Y": 100}

print("\nInitial state:")
db.show_state()

print("\n--- T1 writes X ---")
db.write("T1", "X", 150)

print("\n--- T2 tries to write X (blocked by strict scheduling) ---")
result = db.write("T2", "X", 200)
if not result:
    print("  T2 must wait for T1 to complete")

print("\n--- T2 writes Y (different item, allowed) ---")
db.write("T2", "Y", 250)

print("\nState after writes:")
db.show_state()

print("\n--- T1 commits ---")
db.commit("T1")

print("\n--- Now T2 can write X ---")
db.write("T2", "X", 300)

print("\n--- T2 aborts ---")
db.abort("T2")

print("\nFinal state (after T2 abort):")
db.show_state()
print("\n✓ Recovery was simple: X = T1's committed value (150), Y = original (100)")
```

The most common way to enforce strict schedules is through Strict Two-Phase Locking (Strict 2PL). This protocol extends basic 2PL with a simple additional rule:
Strict 2PL Rule:
All exclusive (write) locks held by a transaction must be retained until the transaction commits or aborts.
This single rule guarantees strict schedules: no transaction can read or write an item that another uncommitted transaction has written, because the writer's exclusive lock blocks every conflicting access until it commits or aborts.
| Protocol | Lock Release Rule | Schedule Property | Recovery Complexity |
|---|---|---|---|
| Basic 2PL | After lock point (shrinking phase) | Serializable | Complex |
| Strict 2PL | Exclusive locks at commit/abort | Serializable + Strict | Simple |
| Rigorous 2PL | All locks at commit/abort | Serializable + Rigorous | Simplest |
Why Strict 2PL works:
Two-Phase property (basic 2PL): Guarantees serializability—the schedule is equivalent to some serial execution.
Exclusive lock retention: By holding write locks until commit/abort, no other transaction can read or write a dirty item, so dirty reads and writes after uncommitted writes are both impossible, adding strictness (and cascadelessness) on top of serializability.
Most commercial databases use Strict 2PL for their locking-based concurrency control because it provides both serializability and simple recovery.
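To make the contrast concrete, here is a toy sketch (hypothetical code, not any real lock manager) in which a basic-2PL transaction may release an exclusive lock before committing, exposing its uncommitted write, while a Strict 2PL transaction can only release exclusive locks at commit or abort:

```python
# Sketch contrasting basic 2PL (early exclusive-lock release) with Strict 2PL.
# Hypothetical toy lock table; real engines are far more elaborate.

class Txn:
    def __init__(self, name, locks, strict):
        self.name, self.locks, self.strict = name, locks, strict
        self.held = set()

    def write_lock(self, item):
        if self.locks.get(item) not in (None, self.name):
            raise RuntimeError(f"{self.name} must wait for {self.locks[item]}")
        self.locks[item] = self.name
        self.held.add(item)

    def early_release(self, item):
        # Legal under basic 2PL once the lock point has passed,
        # but it lets other transactions see or overwrite this txn's dirty write.
        if self.strict:
            raise RuntimeError("Strict 2PL: exclusive locks release only at commit/abort")
        self.locks.pop(item, None)
        self.held.discard(item)

    def commit(self):
        for item in self.held:
            self.locks.pop(item, None)
        self.held.clear()

locks = {}
t1 = Txn("T1", locks, strict=False)   # basic 2PL
t1.write_lock("X")
t1.early_release("X")                 # allowed: another txn may now touch dirty X

t2 = Txn("T2", locks, strict=True)    # Strict 2PL
t2.write_lock("X")
try:
    t2.early_release("X")
except RuntimeError as e:
    print(e)                          # early release is rejected
t2.commit()                           # lock released only here
```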
```
STRICT 2PL EXECUTION EXAMPLE

Transaction T₁                      Transaction T₂                  Locks Held
─────────────────────────────────────────────────────────────────────────────
BEGIN                                                               {}
                                    BEGIN                           {}
WRITE(X)                                                            {X-T₁}
[Acquire exclusive lock]
                                    WRITE(Y)                        {X-T₁, Y-T₂}
                                    [Acquire exclusive lock]
READ(Y)                                                             {X-T₁, Y-T₂}
[BLOCKED - T₂ holds exclusive
 lock on Y]
                                    COMMIT                          {X-T₁}
                                    [Release Y lock at commit]
READ(Y) [now succeeds]                                              {X-T₁, Y-T₁}
[Acquire shared lock, sees
 T₂'s committed value]
WRITE(Z)                                                            {X-T₁, Y-T₁, Z-T₁}
[Acquire exclusive lock]
COMMIT                                                              {}
[Release ALL locks at commit]
─────────────────────────────────────────────────────────────────────────────
OBSERVATIONS:

1. T₁ was blocked when trying to read Y (held by uncommitted T₂)
   → No dirty reads (cascadeless)

2. No write could happen to X while T₁ held it uncommitted
   → No write-after-uncommitted-write (strict)

3. If T₁ had aborted, X would simply restore to its before-image
   → Simple recovery

4. Neither transaction needed to check what the other was doing
   → Decoupled, simple logic
```

When you use SERIALIZABLE isolation in most databases with locking (SQL Server, MySQL with traditional locking), you're likely using Strict 2PL. This gives you both serializable schedules AND simple recovery properties.
Multi-Version Concurrency Control (MVCC) systems like PostgreSQL and Oracle handle strictness differently than lock-based systems. Understanding this is important because MVCC is the dominant concurrency control mechanism today.
MVCC Read Behavior: Reads never block and never see uncommitted data. Each query reads from a snapshot of committed row versions, so reads are cascadeless; but because they can proceed while a writer is still active, they are not strict in the formal sense.
MVCC Write Behavior: Writes acquire exclusive row locks. A second writer to the same row blocks until the first writer commits or aborts, so writes do satisfy the strictness condition.
| Database | Read Strictness | Write Strictness | Mechanism |
|---|---|---|---|
| PostgreSQL | Cascadeless (snapshot) | Strict (row locks) | MVCC + Exclusive row locks for writes |
| Oracle | Cascadeless (read consistency) | Strict (row locks) | MVCC with undo segments + row locks |
| MySQL InnoDB | Cascadeless (MVCC) | Strict (row locks) | MVCC (undo logs) + row locks |
| SQL Server (RCSI) | Cascadeless (row versions) | Strict (row locks) | Version store + row locks |
The MVCC + Locking Hybrid:
Most MVCC systems use a hybrid approach: reads are served from committed row versions (snapshots) and take no locks, while writes acquire exclusive row locks that are held until the transaction commits or aborts.
This provides non-blocking, cascadeless reads together with strict write behavior, so the before-images used for undo are always committed values.
Example: PostgreSQL Write Behavior
```sql
-- Demonstrating write strictness in PostgreSQL

-- Session 1: Start transaction and update
BEGIN;
UPDATE accounts SET balance = 1000 WHERE id = 1;
-- Row is now locked by Session 1

-- Session 2: Try to update the same row
BEGIN;
UPDATE accounts SET balance = 2000 WHERE id = 1;
-- Session 2 BLOCKS here, waiting for Session 1's exclusive lock

-- This blocking enforces strictness:
-- Session 2 cannot write to data that Session 1 has written
-- until Session 1 commits or aborts

-- Session 1: Commit
COMMIT;
-- Now Session 2's UPDATE proceeds

-- Session 2's before-image will be Session 1's COMMITTED value (1000)
-- Not Session 1's uncommitted value from earlier
-- This makes recovery straightforward

-- If Session 2 later aborts:
ROLLBACK;
-- Simply restore before-image (1000), which is a committed value
-- Recovery is trivially correct

-- Note: Reads are not blocked (MVCC)
-- Session 2 could READ accounts WHERE id = 1 during Session 1's transaction
-- It would see the committed value before Session 1's update
-- This is cascadeless (not dirty read) but not blocking like writes
```

MVCC reads are cascadeless but not strict (they happen while the writer is active). However, MVCC writes are strict (blocked until the previous writer commits). Since the recovery concern is primarily about writes (before-images for undo), MVCC systems achieve the practical benefits of strictness for recovery purposes.
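As a minimal Python sketch of the hybrid (illustrative only; the version-store structure is an assumption, not PostgreSQL's actual implementation), reads return the latest committed version without blocking, while a second writer is refused until the row's lock is free:

```python
# Sketch of the MVCC + write-lock hybrid (illustrative, not real engine internals).
# Reads never block: they return the latest committed value.
# Writes are strict: an item with an uncommitted write rejects other writers.

class MVCCStore:
    def __init__(self):
        self.committed = {}      # item -> committed value
        self.pending = {}        # item -> (txn, uncommitted value); acts as a row lock

    def read(self, txn, item):
        # Snapshot-style read: never sees uncommitted data, never blocks.
        return self.committed.get(item)

    def write(self, txn, item, value):
        holder = self.pending.get(item)
        if holder and holder[0] != txn:
            return False          # would block in a real system (strict for writes)
        self.pending[item] = (txn, value)
        return True

    def commit(self, txn):
        for item, (t, v) in list(self.pending.items()):
            if t == txn:
                self.committed[item] = v
                del self.pending[item]

    def abort(self, txn):
        # Undo is trivial: pending versions are discarded;
        # committed versions were never overwritten in place.
        self.pending = {i: p for i, p in self.pending.items() if p[0] != txn}

db = MVCCStore()
db.committed["balance"] = 500

db.write("T1", "balance", 1000)          # T1's uncommitted write
print(db.read("T2", "balance"))          # 500 -- cascadeless read, not blocked
print(db.write("T2", "balance", 2000))   # False -- write refused (strict)
db.commit("T1")
print(db.write("T2", "balance", 2000))   # True -- proceeds after T1 commits
```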
Strict schedules come with performance trade-offs. Understanding these helps in making informed decisions about database configuration and workload design.
Primary Performance Impact: Write Blocking
In strict schedules, transactions writing to the same data items must serialize: a second writer waits until the first commits or aborts, so heavily updated ("hot") items become serialization points whose write throughput is bounded by how long each transaction holds them.
The Lock Duration Problem:
With Strict 2PL, exclusive locks are held from acquisition until transaction completion. For long-running transactions, this can block every other writer of the same items for the transaction's full duration, lengthen lock wait queues, and increase the likelihood of deadlocks and lock timeouts.
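As a rough back-of-the-envelope illustration (the numbers are assumptions, not from this text), write throughput on a single hot item under strict locking is bounded by how long each transaction holds its exclusive lock:

```python
# Rough illustration: write throughput on one hot row under strict locking.
# Numbers are assumed and purely illustrative.

lock_hold_seconds = 0.010     # each txn holds the row's exclusive lock ~10 ms
max_tps_on_row = 1 / lock_hold_seconds
print(f"Upper bound for that row: {max_tps_on_row:.0f} writes/second")

# A long-running transaction makes this much worse: every other writer
# of the row queues behind it for the transaction's full duration.
long_txn_seconds = 2.0
print(f"Writers can wait up to {long_txn_seconds:.1f} s behind a long transaction")
```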
| Property | Impact on Contention | Impact on Recovery | Typical Use Case |
|---|---|---|---|
| Recoverable only | Lowest blocking | Complex recovery | Rarely used (risky) |
| Cascadeless | Read blocking only | Moderate recovery | MVCC snapshot reads |
| Strict | Read + write blocking | Simple recovery | Most production systems |
Mitigation Strategies for Strict Schedule Performance: keep transactions short, avoid holding locks across user interaction or external calls, update heavily contended items as late in the transaction as possible, and spread hot data across partitions or shards.
Most applications benefit from strict schedules despite the contention cost. The simplicity of recovery, predictability of behavior, and elimination of subtle bugs outweigh the performance cost for typical OLTP workloads. High-performance systems achieve scaling through sharding and replication rather than weakening consistency.
There's an even stronger schedule property called rigorous schedules. Understanding the distinction helps complete the picture of recoverability hierarchy.
Definition of Rigorous Schedule:
A schedule is rigorous if no transaction Tⱼ can write a data item X until every transaction that has read or written X has committed or aborted, and no transaction can read X until every transaction that has written X has committed or aborted.
The difference from strict:
Rigorous schedules are even more restrictive—they also wait for readers to finish before allowing new operations on data.
| Aspect | Strict Schedule | Rigorous Schedule |
|---|---|---|
| Blocks write after uncommitted read? | No | Yes |
| Blocks write after uncommitted write? | Yes | Yes |
| Blocks read after uncommitted write? | Yes | Yes |
| Blocks read after uncommitted read? | No | No (reading doesn't modify) |
| Lock equivalent | Exclusive locks held to commit | All locks held to commit |
| Implementation | Strict 2PL | Rigorous 2PL |
| Practical usage | Very common | Rare (overly restrictive) |
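To make the extra restriction concrete, here is a small sketch extending the earlier strictness check (same illustrative schedule representation; the function name is my own): a rigorous check also tracks uncommitted reads and rejects writes to items they touched.

```python
# Sketch: rigorous-schedule check, extending the is_strict idea.
# Steps are (txn, op, item) with op in {"R", "W", "C", "A"}; item=None for C/A.

def is_rigorous(schedule):
    dirty_writes = {}             # item -> txn with an uncommitted write
    dirty_reads = {}              # item -> set of txns with uncommitted reads
    for txn, op, item in schedule:
        if op in ("C", "A"):
            dirty_writes = {x: t for x, t in dirty_writes.items() if t != txn}
            for readers in dirty_reads.values():
                readers.discard(txn)
        elif op == "R":
            if dirty_writes.get(item) not in (None, txn):
                return False      # read after uncommitted write
            dirty_reads.setdefault(item, set()).add(txn)
        elif op == "W":
            if dirty_writes.get(item) not in (None, txn):
                return False      # write after uncommitted write
            if dirty_reads.get(item, set()) - {txn}:
                return False      # write after uncommitted read (extra rule vs strict)
            dirty_writes[item] = txn
    return True

# Strict but not rigorous: T2 writes X while T1 has only READ it (uncommitted).
schedule = [("T1", "R", "X"), ("T2", "W", "X"), ("T1", "C", None), ("T2", "C", None)]
print(is_rigorous(schedule))      # False
```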
Why rigorous schedules are rarely used:
The additional restriction (blocking writes after uncommitted reads) provides minimal benefit:
For recovery: The key benefit of strictness is reliable before-images for writes. Reads don't create before-images, so protecting reads from overwrites during their transaction doesn't help recovery.
For concurrency: Holding read locks until commit significantly reduces concurrency. A long-running read query would block all writers to that data.
For correctness: Strict schedules already provide all the correctness properties needed—serializability and simple recovery.
The bottom line: Strict schedules hit the sweet spot—simple recovery properties with reasonable concurrency. Rigorous schedules add restrictions that cost concurrency without providing commensurate benefits.
MVCC systems typically don't match either strict or rigorous definitions exactly because reads don't block and don't take traditional locks. However, they achieve equivalent recovery properties through version management and write locks.
Strict schedules provide the strongest practical recoverability guarantees, enabling simple and reliable database recovery. Let's consolidate the key concepts:

- A schedule is strict when no transaction reads or writes a data item written by another transaction until that writer commits or aborts.
- Strictness makes before-images reliable: undo simply restores committed values, with no cascades and no dependency analysis.
- Strict ⊂ cascadeless ⊂ recoverable: strictness is the strongest property in common production use.
- Strict 2PL enforces strictness by holding exclusive locks until commit or abort; MVCC systems achieve strict write behavior through row locks while keeping reads non-blocking.
- The cost is write blocking on contended items; rigorous schedules add further restrictions that rarely pay for themselves.
What's next:
We've explored the complete hierarchy of recoverability properties: recoverable, cascadeless, and strict schedules. The final page brings everything together, examining the recovery implications of these different schedule types and how databases design their recovery systems around these properties.
You now understand strict schedules—the property that enables simple, reliable database recovery by ensuring before-images are always committed values. You've learned how Strict 2PL and MVCC implement these guarantees, and the performance considerations involved. Next, we'll explore the broader recovery implications of these schedule properties.