Database Management System: Transaction Concepts

Recoverability

Level: Intermediate

Duration: 75 mins

Topic: Transaction Concepts

Recoverable Schedule

When Transactions Commit Too Soon

Imagine a banking system where Transaction T₁ transfers $1,000 from Account A to Account B. Before T₁ completes, Transaction T₂ reads the updated balance of Account B and uses it to calculate interest. T₂ commits successfully. Then disaster strikes—T₁ encounters an error and must be aborted.

The question becomes: What happens to T₂?

T₂ has already committed based on data that was never actually finalized. The database is now in an inconsistent state: T₂'s calculations rest on a dirty value that never existed in any committed database state. Worse, we cannot roll back T₂, because committed transactions are supposed to be permanent.

This scenario illustrates a non-recoverable schedule—a transaction execution order that makes proper recovery impossible. Understanding recoverability is fundamental to designing database systems that can survive failures without compromising data integrity.

What You Will Learn

By the end of this page, you will understand the formal definition of recoverable schedules, why commit ordering is crucial for recovery, how to identify recoverable vs. non-recoverable schedules, and the relationship between recoverability and the ACID properties.

The Recovery Problem

Database systems must handle failures gracefully. When a transaction fails—whether due to system crash, constraint violation, deadlock, or explicit abort—the database must restore itself to a consistent state. This restoration process is called recovery.

Recovery relies on a fundamental principle: we can undo uncommitted transactions by reversing their changes. This works because uncommitted transactions haven't made any guarantees to the outside world—their effects are tentative and can be erased.

However, this principle creates a potential conflict with concurrent execution:

The Fundamental Recovery Conflict

•Uncommitted Transaction T₁ writes a value X
•Transaction T₂ reads the value X written by T₁
•T₂ commits its transaction successfully
•T₁ aborts (fails, must be undone)
•Conflict: T₂'s committed result depends on T₁'s aborted (never-existed) value

This conflict represents an irrecoverable situation. The database cannot undo T₁ without affecting T₂, but T₂ is already committed. The durability guarantee (the 'D' in ACID) says committed transactions must persist. Yet consistency (the 'C') says the database must be in a valid state.

We cannot satisfy both requirements. This is why preventing irrecoverable schedules is essential—they create impossible recovery scenarios.
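
The five steps of the conflict can be traced in a few lines. The sketch below is a toy model, not a real recovery manager: the dictionary `db`, the before-image variable, and the 5% interest figure are all illustrative assumptions.

```python
# Toy trace of the conflict: T1 writes, T2 reads and commits, T1 aborts.
# All names and numbers here are illustrative assumptions.
db = {"B": 500, "interest": 0}        # committed state before either transaction

# Step 1: uncommitted T1 writes B, saving a before-image for undo
before_image_B = db["B"]
db["B"] = db["B"] + 1000              # tentative value 1500

# Steps 2-3: T2 reads T1's uncommitted value and commits a result derived from it
t2_read = db["B"]
db["interest"] = t2_read * 5 // 100   # 75, now durable because T2 committed

# Step 4: T1 aborts, so recovery restores the before-image
db["B"] = before_image_B

# Step 5: the committed interest no longer matches any committed balance
expected_interest = db["B"] * 5 // 100
print(db)                                   # {'B': 500, 'interest': 75}
print(db["interest"] == expected_interest)  # False: 75 vs 25
```

Undoing T₁ is easy; the problem is that the value 75 was written by a committed transaction and cannot be undone with it.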

The Irrecoverability Trap

Once a transaction commits based on uncommitted data from another transaction that later aborts, the database enters an irrecoverable state. No recovery algorithm can correctly restore consistency without violating the durability guarantee. Prevention is the only solution.

Formal Definition of Recoverable Schedule

To prevent irrecoverable situations, we need precise rules governing when transactions can commit. This leads to the formal definition of a recoverable schedule.

Definition: A schedule S is recoverable if and only if, for every pair of transactions Tᵢ and Tⱼ in S where Tⱼ reads a data item written by Tᵢ:

commit(Tᵢ) < commit(Tⱼ), or Tⱼ does not commit

In plain terms: If transaction Tⱼ reads data written by transaction Tᵢ, then Tⱼ must not commit until after Tᵢ commits. If Tᵢ aborts instead, Tⱼ must abort as well; it may never commit.

This definition ensures that when a transaction commits, all the data it read has been finalized by committed transactions. No committed transaction ever depends on uncommitted (potentially aborted) data.
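
The same condition can be stated symbolically. In the sketch below, $c_i$ denotes the commit operation of $T_i$ and $<_S$ the order of operations within $S$; both symbols are notation assumed here for compactness.

```latex
\[
S \text{ is recoverable} \iff
\forall\, T_i, T_j \in S,\ i \neq j:\;
\bigl( T_j \text{ reads from } T_i \text{ in } S \;\wedge\; c_j \in S \bigr)
\implies
\bigl( c_i \in S \;\wedge\; c_i <_S c_j \bigr)
\]
```

Read left to right: whenever a reader commits, each transaction it read from must also have committed, and earlier. A reader that aborts imposes no constraint.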

Read-Write Dependency and Commit Ordering
Scenario	T₁ Status	T₂ Reads from T₁	T₂ Commits Before T₁	Recoverable?
Proper ordering	Commits	Yes	No	✓ Yes
Early commit	Eventually commits	Yes	Yes (violates order)	✗ No
T₁ aborts first	Aborts	Yes	T₂ should abort	✓ Yes (if T₂ aborts)
No dependency	Any	No	Any	✓ Yes (irrelevant)
T₂ commits, T₁ later aborts	Aborts	Yes	Yes	✗ No (irrecoverable)

Understanding the "reads from" relationship:

Transaction Tⱼ reads from transaction Tᵢ when:

Tᵢ writes a value to data item X
Tⱼ subsequently reads that value of X
No other transaction writes to X between Tᵢ's write and Tⱼ's read

This creates a data dependency from Tᵢ to Tⱼ. The correctness of Tⱼ's execution depends on the correctness of Tᵢ's write.
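
The three conditions translate directly into a single scan over the schedule. The sketch below assumes a schedule is a list of `(transaction, operation, item)` tuples, a representation chosen here for illustration, and `reads_from` is a hypothetical helper name. It also ignores the fact that an abort would restore the previous value.

```python
from typing import Dict, List, Optional, Set, Tuple

Op = Tuple[str, str, Optional[str]]  # (txn, "R" | "W" | "C" | "A", item or None)

def reads_from(schedule: List[Op]) -> Set[Tuple[str, str]]:
    """Return (writer, reader) pairs: reader read an item last written by writer."""
    last_writer: Dict[str, str] = {}
    pairs: Set[Tuple[str, str]] = set()
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn          # this write hides any earlier one
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                pairs.add((writer, txn))
    return pairs

schedule = [
    ("T1", "W", "X"),
    ("T3", "W", "X"),   # overwrites T1's value before anyone reads it
    ("T2", "R", "X"),   # so T2 reads from T3, not from T1
]
print(reads_from(schedule))  # {('T3', 'T2')}
```

The intervening write by T3 is what the third condition rules out: T2 has no dependency on T1 here.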

The Transitive Nature of Dependencies

If T₃ reads from T₂, and T₂ reads from T₁, then T₁ must commit before T₂, and T₂ must commit before T₃. Dependencies are transitive—a chain of reads creates a chain of required commit orderings.
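
A chain of reads can be expanded into every commit-order constraint it implies. The sketch below computes that transitive closure from a set of direct `(writer, reader)` pairs. The function names are hypothetical helpers, and the standard-library `graphlib` module (Python 3.9+) is used to produce one commit order consistent with the constraints.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

def implied_commit_constraints(direct: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Transitive closure of 'must commit before' over direct reads-from pairs."""
    closure = set(direct)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def one_valid_commit_order(direct: Set[Tuple[str, str]]) -> List[str]:
    """A commit order in which every writer commits before its readers."""
    predecessors: Dict[str, Set[str]] = {}
    for writer, reader in direct:
        predecessors.setdefault(reader, set()).add(writer)
        predecessors.setdefault(writer, set())
    return list(TopologicalSorter(predecessors).static_order())

direct = {("T1", "T2"), ("T2", "T3")}   # T2 reads from T1, T3 reads from T2
print(sorted(implied_commit_constraints(direct)))
# [('T1', 'T2'), ('T1', 'T3'), ('T2', 'T3')]
print(one_valid_commit_order(direct))   # ['T1', 'T2', 'T3']
```

The derived pair `('T1', 'T3')` is the transitive constraint: T₃ never read from T₁ directly, yet it still cannot commit before T₁. If the pairs form a cycle, `static_order()` raises `graphlib.CycleError`.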

Illustrative Examples

Let's examine concrete schedule examples to build intuition for identifying recoverable and non-recoverable schedules. We'll use the notation:

R(X): Read data item X
W(X): Write data item X
C: Commit
A: Abort

Recoverable Schedule

•T₁: W(X)
•T₂: R(X) — reads T₁'s write
•T₁: C — T₁ commits first
•T₂: C — T₂ commits after T₁
•Analysis: T₂ reads from T₁, so T₁ must commit before T₂. This constraint is satisfied—T₁ commits, then T₂ commits. ✓ Recoverable

Non-Recoverable Schedule

•T₁: W(X)
•T₂: R(X) — reads T₁'s write
•T₂: C — T₂ commits first!
•T₁: A — T₁ aborts
•Analysis: T₂ reads from T₁, requiring T₁ to commit before T₂. But T₂ committed while T₁ was still active. When T₁ aborts, T₂'s committed state is based on invalid data. ✗ Non-recoverable

A more complex example:

Consider this schedule with three transactions:

Time:  1     2     3     4     5     6     7     8
T₁:    R(A)  W(A)        C
T₂:                R(A)        W(B)        C
T₃:                                  R(B)        C

Dependency Analysis:

T₂ reads A written by T₁ → T₁ must commit before T₂
T₃ reads B written by T₂ → T₂ must commit before T₃

Required commit order: T₁ → T₂ → T₃

Actual commit order in schedule: T₁ commits first, then T₂, then T₃.

This satisfies the constraints, so the schedule is recoverable.

Quick Check for Recoverability

To check if a schedule is recoverable: (1) Identify all read-write dependencies (which transaction reads from which), (2) For each dependency where Tⱼ reads from Tᵢ, verify that if Tⱼ commits, Tᵢ has already committed before it, (3) If all dependencies satisfy this ordering, the schedule is recoverable.

Detecting Non-Recoverable Schedules

Detecting non-recoverable schedules requires tracking read-write dependencies and commit orderings. Here's a systematic approach:

Algorithm for Recoverability Check:

Build the dependency graph: Create a directed graph where each node is a transaction. Add an edge from Tᵢ to Tⱼ if Tⱼ reads a data item last written by Tᵢ.
Record commit times: Note when each transaction commits (or if it aborts).
Verify ordering: For each edge (Tᵢ → Tⱼ) in the dependency graph:
- If both transactions commit, check that commit(Tᵢ) < commit(Tⱼ)
- If Tⱼ commits and Tᵢ aborts or has not yet committed, the schedule is non-recoverable
Conclusion: If all edges satisfy the ordering constraint, the schedule is recoverable.

recoverability_check.py
Python
from typing import Dict, List, Tuple
from enum import Enum
from dataclasses import dataclass, field
 
class TransactionStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"
 
@dataclass
class Transaction:
    id: str
    status: TransactionStatus = TransactionStatus.ACTIVE
    commit_order: int = -1  # -1 means not yet committed
 
@dataclass
class ScheduleAnalyzer:
    """
    Analyzes transaction schedules for recoverability.
    
    Recoverability Definition:
    A schedule is recoverable iff for every pair of transactions
    Ti and Tj where Tj reads from Ti:
    - If both commit: commit(Ti) < commit(Tj)
    - If Ti aborts: Tj must not commit
    """
    transactions: Dict[str, Transaction] = field(default_factory=dict)
    last_writer: Dict[str, str] = field(default_factory=dict)  # item -> transaction_id
    dependencies: List[Tuple[str, str]] = field(default_factory=list)  # (writer, reader)
    commit_counter: int = 0
    
    def write(self, txn_id: str, item: str) -> None:
        """Record a write operation."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        self.last_writer[item] = txn_id
        print(f"  {txn_id}: W({item})")
    
    def read(self, txn_id: str, item: str) -> None:
        """Record a read operation and track dependency."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        
        # If another transaction wrote this item, create dependency
        if item in self.last_writer:
            writer_id = self.last_writer[item]
            if writer_id != txn_id:
                self.dependencies.append((writer_id, txn_id))
                print(f"  {txn_id}: R({item}) -- reads from {writer_id}")
            else:
                print(f"  {txn_id}: R({item}) -- reads own write")
        else:
            print(f"  {txn_id}: R({item}) -- reads initial value")
    
    def commit(self, txn_id: str) -> None:
        """Record a commit operation."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        
        self.transactions[txn_id].status = TransactionStatus.COMMITTED
        self.transactions[txn_id].commit_order = self.commit_counter
        self.commit_counter += 1
        print(f"  {txn_id}: COMMIT (order: {self.transactions[txn_id].commit_order})")
    
    def abort(self, txn_id: str) -> None:
        """Record an abort operation."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        
        self.transactions[txn_id].status = TransactionStatus.ABORTED
        print(f"  {txn_id}: ABORT")
    
    def is_recoverable(self) -> Tuple[bool, str]:
        """
        Check if the schedule is recoverable.
        
        Returns:
            Tuple of (is_recoverable, explanation)
        """
        print("\nAnalyzing recoverability...")
        print(f"Dependencies (writer → reader): {self.dependencies}")
        
        for writer_id, reader_id in self.dependencies:
            writer = self.transactions.get(writer_id)
            reader = self.transactions.get(reader_id)
            
            if not writer or not reader:
                continue
            
            # Only a committed reader can violate recoverability
            if reader.status != TransactionStatus.COMMITTED:
                continue
            
            # Case 1: Reader committed, writer aborted
            if writer.status == TransactionStatus.ABORTED:
                return False, (
                    f"Non-recoverable: {reader_id} committed but "
                    f"depends on {writer_id} which aborted"
                )
            
            # Case 2: Reader committed while writer is still active
            if writer.status == TransactionStatus.ACTIVE:
                return False, (
                    f"Non-recoverable: {reader_id} committed but "
                    f"{writer_id}, which it reads from, has not committed"
                )
            
            # Case 3: Both committed, check order
            if reader.commit_order < writer.commit_order:
                return False, (
                    f"Non-recoverable: {reader_id} (order {reader.commit_order}) "
                    f"committed before {writer_id} (order {writer.commit_order}), "
                    f"but {reader_id} reads from {writer_id}"
                )
        
        return True, "Schedule is recoverable: all commit orderings satisfy dependencies"
 
# Example: Non-Recoverable Schedule
print("=" * 60)
print("Example 1: NON-RECOVERABLE SCHEDULE")
print("=" * 60)
analyzer1 = ScheduleAnalyzer()
analyzer1.write("T1", "X")
analyzer1.read("T2", "X")    # T2 reads from T1
analyzer1.commit("T2")        # T2 commits first (violation!)
analyzer1.abort("T1")         # T1 aborts
result1, explanation1 = analyzer1.is_recoverable()
print(f"Result: {explanation1}\n")
 
# Example: Recoverable Schedule
print("=" * 60)
print("Example 2: RECOVERABLE SCHEDULE")
print("=" * 60)
analyzer2 = ScheduleAnalyzer()
analyzer2.write("T1", "X")
analyzer2.read("T2", "X")    # T2 reads from T1
analyzer2.commit("T1")        # T1 commits first (correct!)
analyzer2.commit("T2")        # T2 commits after
result2, explanation2 = analyzer2.is_recoverable()
print(f"Result: {explanation2}\n")

Practical Implementation Considerations:

In real database systems, ensuring recoverability is typically enforced through:

Strict Two-Phase Locking (Strict 2PL): Holds all exclusive locks until the transaction commits or aborts, preventing other transactions from reading uncommitted data.
Multi-Version Concurrency Control (MVCC): Transactions read committed versions, never uncommitted data.
Commit protocols: The system delays a transaction's commit until all transactions it read from have committed.

These mechanisms guarantee recoverability by construction rather than by checking after the fact.

Dirty Reads and Recoverability

A dirty read occurs when a transaction reads data written by another transaction that has not yet committed. Dirty reads are fundamentally connected to recoverability issues.

The relationship:

Non-recoverable schedules always involve dirty reads
However, not all dirty reads lead to non-recoverable schedules
A dirty read breaks recoverability only when the reader commits before the writer has committed

This distinction is important: dirty reads are a necessary but not sufficient condition for non-recoverability.

Dirty Reads vs. Non-Recoverability
Scenario	Dirty Read?	Recoverable?	Explanation
T₂ reads T₁'s uncommitted write; T₁ commits; T₂ commits	Yes	Yes	Dirty read occurred, but commit order is correct
T₂ reads T₁'s uncommitted write; T₂ commits; T₁ commits	Yes	No	Dirty read + wrong commit order → irrecoverable
T₂ reads T₁'s uncommitted write; T₁ aborts; T₂ aborts	Yes	Yes	Dirty read cascaded to abort, but recoverable
T₂ reads T₁'s uncommitted write; T₂ commits; T₁ aborts	Yes	No	Committed transaction depends on aborted data
T₂ reads only committed data	No	Yes	No dirty reads = guaranteed recoverable
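
The rows of this table can be reproduced mechanically. The sketch below encodes each schedule with the R/W/C/A notation as `(transaction, operation, item)` tuples, a representation assumed here for illustration. `classify` is a hypothetical helper that reports whether a schedule contains a dirty read and whether it is recoverable. As a simplification, an aborted transaction's writes simply vanish rather than being restored to the prior writer's value.

```python
from typing import Dict, List, Optional, Set, Tuple

Op = Tuple[str, str, Optional[str]]  # (txn, "R" | "W" | "C" | "A", item or None)

def classify(schedule: List[Op]) -> Tuple[bool, bool]:
    """Return (has_dirty_read, is_recoverable) for a schedule."""
    last_writer: Dict[str, str] = {}
    committed: Set[str] = set()
    finished: Set[str] = set()           # committed or aborted so far
    sources: Dict[str, Set[str]] = {}    # reader -> writers it read from
    dirty, recoverable = False, True

    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                sources.setdefault(txn, set()).add(writer)
                if writer not in finished:
                    dirty = True         # writer still active at read time
        elif op == "C":
            # Every transaction this one read from must already be committed
            if any(w not in committed for w in sources.get(txn, ())):
                recoverable = False
            committed.add(txn)
            finished.add(txn)
        elif op == "A":
            finished.add(txn)
            for key in [k for k, w in last_writer.items() if w == txn]:
                del last_writer[key]     # simplification: aborted writes vanish
    return dirty, recoverable

W, R = ("T1", "W", "X"), ("T2", "R", "X")
C1, C2, A1, A2 = ("T1", "C", None), ("T2", "C", None), ("T1", "A", None), ("T2", "A", None)

print(classify([W, R, C1, C2]))  # (True, True)
print(classify([W, R, C2, C1]))  # (True, False)
print(classify([W, R, A1, A2]))  # (True, True)
print(classify([W, R, C2, A1]))  # (True, False)
print(classify([W, C1, R, C2]))  # (False, True)
```

The five calls follow the table rows in order, and the two flags vary independently except in the last row, where the absence of dirty reads forces recoverability.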

The Conservative Approach

Preventing dirty reads entirely (through locking or MVCC) is the most common approach because it eliminates the possibility of non-recoverable schedules by construction. However, this comes with performance costs. Some systems allow dirty reads for performance but use additional mechanisms to ensure recoverability.

SQL Isolation Levels and Dirty Reads:

Isolation Level	Allows Dirty Reads	Recoverable by Design
READ UNCOMMITTED	Yes	Must track dependencies
READ COMMITTED	No	Yes (by construction)
REPEATABLE READ	No	Yes (by construction)
SERIALIZABLE	No	Yes (by construction)

The READ UNCOMMITTED isolation level explicitly allows dirty reads, placing the burden of ensuring recoverability on the application or other mechanisms.

Commit Ordering Strategies

Ensuring recoverable schedules requires controlling when transactions can commit. There are several strategies database systems employ:

Strategy 1: Deferred Commit

When a transaction Tⱼ reads from an uncommitted transaction Tᵢ, the system defers Tⱼ's commit until after Tᵢ commits (or aborts, causing Tⱼ to abort too).

Tⱼ issues COMMIT → System checks dependencies → 
If Tᵢ still active: wait for Tᵢ
If Tᵢ committed: proceed with Tⱼ commit
If Tᵢ aborted: abort Tⱼ too
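
The decision flow above amounts to a three-way check at commit time. The sketch below is one way to express it. The `Status` enum, the `deps` mapping, and the function `on_commit_request` are hypothetical names, and a real system would suspend the waiting transaction rather than return a string.

```python
from enum import Enum
from typing import Dict, Set

class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"

def on_commit_request(txn: str,
                      status: Dict[str, Status],
                      deps: Dict[str, Set[str]]) -> str:
    """Decide what to do when txn asks to commit.

    deps[txn] is the set of transactions whose writes txn has read.
    Returns "abort", "wait", or "commit".
    """
    sources = deps.get(txn, set())
    if any(status[s] is Status.ABORTED for s in sources):
        status[txn] = Status.ABORTED      # cascading abort
        return "abort"
    if any(status[s] is Status.ACTIVE for s in sources):
        return "wait"                     # commit is deferred, txn stays active
    status[txn] = Status.COMMITTED
    return "commit"

status = {"T1": Status.ACTIVE, "T2": Status.ACTIVE}
deps = {"T2": {"T1"}}                     # T2 read a value written by T1

print(on_commit_request("T2", status, deps))  # wait
status["T1"] = Status.COMMITTED
print(on_commit_request("T2", status, deps))  # commit
```

Checking for aborted sources before active ones means a transaction with one aborted and one active dependency is aborted immediately instead of waiting for an outcome that cannot help it.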

Strategy 2: Prevent Dirty Reads (Blocking)

Prevent transactions from reading uncommitted data entirely using locks or versioning:

Tⱼ requests READ(X) → X was written by uncommitted Tᵢ → 
Block Tⱼ until Tᵢ commits or aborts
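
A minimal lock table shows the blocking rule. This is a sketch under simplifying assumptions: only exclusive write locks are modeled, the class and method names are hypothetical, and a blocked reader is reported through a return value instead of actually being suspended.

```python
from typing import Dict, Optional

class WriteLockTable:
    """Exclusive locks held from first write until commit or abort."""

    def __init__(self) -> None:
        self._holder: Dict[str, str] = {}   # item -> txn holding the write lock

    def write(self, txn: str, item: str) -> bool:
        """Take the write lock on item; False means txn must wait."""
        holder = self._holder.get(item)
        if holder is not None and holder != txn:
            return False                    # another writer holds the lock
        self._holder[item] = txn
        return True

    def can_read(self, txn: str, item: str) -> bool:
        """True if item carries no uncommitted write by another transaction."""
        holder: Optional[str] = self._holder.get(item)
        return holder is None or holder == txn

    def finish(self, txn: str) -> None:
        """Release every lock held by txn (called on commit or abort)."""
        for item in [i for i, t in self._holder.items() if t == txn]:
            del self._holder[item]

locks = WriteLockTable()
locks.write("T1", "X")
print(locks.can_read("T2", "X"))   # False: T2 must block, X is uncommitted
locks.finish("T1")                 # T1 commits or aborts
print(locks.can_read("T2", "X"))   # True: only committed data is visible
```

Because locks are released only in `finish`, a reader never observes a value whose writer could still abort.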

Strategy 3: Multi-Version Reads

Maintain multiple versions of data. Transactions always read the most recent committed version:

Tⱼ requests READ(X) → System returns last committed version of X
(Even if uncommitted Tᵢ has written a newer version)
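
The read rule can be sketched with a per-item version list. The `VersionStore` class below is a hypothetical, heavily simplified model: it tracks only whether each version's writer has committed, assumes writers of the same item do not overlap, and ignores the snapshots, timestamps, and garbage collection that real MVCC engines need.

```python
from typing import Any, Dict, List, Optional, Set, Tuple

class VersionStore:
    def __init__(self) -> None:
        self._versions: Dict[str, List[Tuple[str, Any]]] = {}  # item -> [(txn, value)]
        self._committed: Set[str] = set()

    def write(self, txn: str, item: str, value: Any) -> None:
        """Append a new version; it stays invisible to others until txn commits."""
        self._versions.setdefault(item, []).append((txn, value))

    def commit(self, txn: str) -> None:
        self._committed.add(txn)

    def abort(self, txn: str) -> None:
        """Discard every version written by txn."""
        for item in list(self._versions):
            self._versions[item] = [v for v in self._versions[item] if v[0] != txn]

    def read(self, txn: str, item: str) -> Optional[Any]:
        """Return txn's own latest write, else the newest committed version."""
        for writer, value in reversed(self._versions.get(item, [])):
            if writer == txn or writer in self._committed:
                return value
        return None

store = VersionStore()
store.write("T0", "X", 100)
store.commit("T0")
store.write("T1", "X", 200)        # T1 has not committed yet

print(store.read("T2", "X"))       # 100: T2 skips T1's uncommitted version
print(store.read("T1", "X"))       # 200: T1 sees its own write
store.commit("T1")
print(store.read("T2", "X"))       # 200: visible once T1 has committed
```

Since a reader never depends on an uncommitted writer, no reads-from edge to an active transaction is ever created, and an abort only has to drop the aborted versions.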

Strategy Comparison

•Deferred Commit: Allows dirty reads but ensures recoverability. May cause cascading aborts. Used when read performance is critical and abort rates are low.
•Blocking Reads: Prevents dirty reads entirely. May cause lock waits and reduced concurrency. Traditional approach with 2PL.
•Multi-Version Reads: Best of both worlds—high read concurrency with no dirty reads. Requires additional storage for versions. Used in MVCC systems (PostgreSQL, Oracle, MySQL InnoDB).

Modern Database Preference

Most modern OLTP databases use Multi-Version Concurrency Control (MVCC) because it provides excellent read concurrency without dirty reads. Writers don't block readers, and readers always see consistent committed data. This makes recoverability automatic while maintaining high performance.

Real-World Implications

Understanding recoverability has direct implications for database administration, application development, and system design:

For Database Administrators:

Choosing isolation levels: Lower isolation levels (READ UNCOMMITTED) may improve performance but require understanding of recoverability guarantees
Recovery planning: Non-recoverable situations cannot be fixed by recovery procedures—they must be prevented
Monitoring: Watch for patterns that might indicate recoverability risks in custom configurations

For Application Developers:

Transaction design: Understanding dependencies helps design transactions that commit in the right order
Error handling: Knowing that cascading aborts can occur helps design robust retry logic
Performance tuning: Choosing appropriate isolation levels requires understanding the recoverability tradeoffs

Recoverability Considerations by Use Case
Use Case	Recommended Approach	Justification
Financial transactions	Strict schedules (covered later)	Cannot tolerate any data inconsistency
Analytics/reporting	MVCC with snapshot isolation	Read consistency without blocking writers
High-throughput logging	May accept some risk	Volume makes strict approaches costly
Master-slave replication	Serializable at master	Replication depends on consistent commit order
Distributed transactions	Two-phase commit	Coordinates commits across systems

The Hidden Cost of Non-Recoverability

In production systems, non-recoverable schedules are especially dangerous because they may not manifest as immediate failures. The database might appear to work correctly until a specific sequence of events (abort after dependent commit) creates an irrecoverable state. Testing may not catch this because it requires specific failure timing.

Summary: Recoverable Schedule

Recoverable schedules are a fundamental requirement for database systems that must survive failures. Let's consolidate the key concepts:

Key Takeaways

•A recoverable schedule ensures that if Tⱼ reads from Tᵢ, then Tᵢ commits before Tⱼ commits; if Tᵢ aborts, Tⱼ must abort too.
•Non-recoverable schedules create impossible recovery situations where committed transactions depend on aborted data.
•Dirty reads are necessary but not sufficient for non-recoverability—the commit ordering determines recoverability.
•Preventing non-recoverability can be done through deferred commits, blocking reads, or multi-version concurrency control.
•Modern databases typically use MVCC to guarantee recoverability while maintaining high concurrency.
•Understanding recoverability helps in choosing isolation levels, designing transactions, and implementing robust error handling.

What's next:

Recoverable schedules guarantee that recovery is possible, but they don't address the cost of recovery. When a transaction aborts, other transactions that read its data may also need to abort. This can trigger a chain reaction called a cascading rollback. The next page explores this phenomenon and its implications for system performance and availability.

Page Complete

You now understand recoverable schedules—the fundamental requirement that commit ordering must respect read-write dependencies. This ensures that database recovery is always possible. Next, we'll explore what happens when recovery does occur and how cascading rollbacks can amplify the impact of a single transaction failure.

1 / 5

Loading learning content...

Database Management SystemTransaction Concepts

Recoverability

LevelIntermediate

Duration75 mins

TopicTransaction Concepts

1 / 5

Recoverable Schedule

When Transactions Commit Too Soon

The question becomes: What happens to T₂?

What You Will Learn

The Recovery Problem

However, this principle creates a potential conflict with concurrent execution:

The Fundamental Recovery Conflict

•Uncommitted Transaction T₁ writes a value X
•Transaction T₂ reads the value X written by T₁
•T₂ commits its transaction successfully
•T₁ aborts (fails, must be undone)
•Conflict: T₂'s committed result depends on T₁'s aborted (never-existed) value

We cannot satisfy both requirements. This is why preventing irrecoverable schedules is essential—they create impossible recovery scenarios.

The Irrecoverability Trap

Formal Definition of Recoverable Schedule

To prevent irrecoverable situations, we need precise rules governing when transactions can commit. This leads to the formal definition of a recoverable schedule.

Definition: A schedule S is recoverable if and only if, for every pair of transactions Tᵢ and Tⱼ in S where Tⱼ reads a data item written by Tᵢ:

commit(Tᵢ) < commit(Tⱼ) or Tᵢ aborts

In plain terms: If transaction Tⱼ reads data written by transaction Tᵢ, then Tⱼ must not commit until after Tᵢ commits (or Tᵢ aborts, in which case Tⱼ should also abort).

Read-Write Dependency and Commit Ordering
Scenario	T₁ Status	T₂ Reads from T₁	T₂ Commits Before T₁	Recoverable?
Proper ordering	Commits	Yes	No	✓ Yes
Early commit	Eventually commits	Yes	Yes (violates order)	✗ No
T₁ aborts first	Aborts	Yes	T₂ should abort	✓ Yes (if T₂ aborts)
No dependency	Any	No	Any	✓ Yes (irrelevant)
T₂ commits, T₁ later aborts	Aborts	Yes	Yes	✗ No (irrecoverable)

Understanding the "reads from" relationship:

Transaction Tⱼ reads from transaction Tᵢ when:

Tᵢ writes a value to data item X
Tⱼ subsequently reads that value of X
No other transaction writes to X between Tᵢ's write and Tⱼ's read

This creates a data dependency from Tᵢ to Tⱼ. The correctness of Tⱼ's execution depends on the correctness of Tᵢ's write.

The Transitive Nature of Dependencies

Illustrative Examples

Let's examine concrete schedule examples to build intuition for identifying recoverable and non-recoverable schedules. We'll use the notation:

R(X): Read data item X
W(X): Write data item X
C: Commit
A: Abort

Recoverable Schedule

•T₁: W(X)
•T₂: R(X) — reads T₁'s write
•T₁: C — T₁ commits first
•T₂: C — T₂ commits after T₁
•
•Analysis: T₂ reads from T₁, so T₁ must commit before T₂. This constraint is satisfied—T₁ commits, then T₂ commits. ✓ Recoverable

Non-Recoverable Schedule

•T₁: W(X)
•T₂: R(X) — reads T₁'s write
•T₂: C — T₂ commits first!
•T₁: A — T₁ aborts
•
•Analysis: T₂ reads from T₁, requiring T₁ to commit before T₂. But T₂ committed while T₁ was still active. When T₁ aborts, T₂'s committed state is based on invalid data. ✗ Non-recoverable

A more complex example:

Consider this schedule with three transactions:

T₁: R(A)  W(A)       C
T₂:           R(A)       W(B)  C
T₃:                  R(B)        C

Dependency Analysis:

T₂ reads A written by T₁ → T₁ must commit before T₂
T₃ reads B written by T₂ → T₂ must commit before T₃

Required commit order: T₁ → T₂ → T₃

Actual commit order in schedule: T₁ (line shows C first), then T₂ (C), then T₃ (C)

This satisfies the constraints, so the schedule is recoverable.

Quick Check for Recoverability

Detecting Non-Recoverable Schedules

Detecting non-recoverable schedules requires tracking read-write dependencies and commit orderings. Here's a systematic approach:

Algorithm for Recoverability Check:

Build the dependency graph: Create a directed graph where each node is a transaction. Add an edge from Tᵢ to Tⱼ if Tⱼ reads a data item last written by Tᵢ.
Record commit times: Note when each transaction commits (or if it aborts).
Verify ordering: For each edge (Tᵢ → Tⱼ) in the dependency graph:
- If both transactions commit, check that commit(Tᵢ) < commit(Tⱼ)
- If Tᵢ aborts and Tⱼ commits, the schedule is non-recoverable
Conclusion: If all edges satisfy the ordering constraint, the schedule is recoverable.

recoverability_check.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
from typing import Dict, List, Tuple, Set
from enum import Enum
from dataclasses import dataclass, field
 
class TransactionStatus(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"
 
@dataclass
class Transaction:
    id: str
    status: TransactionStatus = TransactionStatus.ACTIVE
    commit_order: int = -1  # -1 means not yet committed
 
@dataclass
class ScheduleAnalyzer:
    """
    Analyzes transaction schedules for recoverability.
    
    Recoverability Definition:
    A schedule is recoverable iff for every pair of transactions
    Ti and Tj where Tj reads from Ti:
    - If both commit: commit(Ti) < commit(Tj)
    - If Ti aborts: Tj must not commit
    """
    transactions: Dict[str, Transaction] = field(default_factory=dict)
    last_writer: Dict[str, str] = field(default_factory=dict)  # item -> transaction_id
    dependencies: List[Tuple[str, str]] = field(default_factory=list)  # (writer, reader)
    commit_counter: int = 0
    
    def write(self, txn_id: str, item: str) -> None:
        """Record a write operation."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        self.last_writer[item] = txn_id
        print(f"  {txn_id}: W({item})")
    
    def read(self, txn_id: str, item: str) -> None:
        """Record a read operation and track dependency."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        
        # If another transaction wrote this item, create dependency
        if item in self.last_writer:
            writer_id = self.last_writer[item]
            if writer_id != txn_id:
                self.dependencies.append((writer_id, txn_id))
                print(f"  {txn_id}: R({item}) -- reads from {writer_id}")
            else:
                print(f"  {txn_id}: R({item}) -- reads own write")
        else:
            print(f"  {txn_id}: R({item}) -- reads initial value")
    
    def commit(self, txn_id: str) -> None:
        """Record a commit operation."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        
        self.transactions[txn_id].status = TransactionStatus.COMMITTED
        self.transactions[txn_id].commit_order = self.commit_counter
        self.commit_counter += 1
        print(f"  {txn_id}: COMMIT (order: {self.transactions[txn_id].commit_order})")
    
    def abort(self, txn_id: str) -> None:
        """Record an abort operation."""
        if txn_id not in self.transactions:
            self.transactions[txn_id] = Transaction(id=txn_id)
        
        self.transactions[txn_id].status = TransactionStatus.ABORTED
        print(f"  {txn_id}: ABORT")
    
    def is_recoverable(self) -> Tuple[bool, str]:
        """
        Check if the schedule is recoverable.
        
        Returns:
            Tuple of (is_recoverable, explanation)
        """
        print("
Analyzing recoverability...")
        print(f"Dependencies (writer → reader): {self.dependencies}")
        
        for writer_id, reader_id in self.dependencies:
            writer = self.transactions.get(writer_id)
            reader = self.transactions.get(reader_id)
            
            if not writer or not reader:
                continue
            
            # Case 1: Reader committed, writer aborted
            if (reader.status == TransactionStatus.COMMITTED and 
                writer.status == TransactionStatus.ABORTED):
                return False, (
                    f"Non-recoverable: {reader_id} committed but "
                    f"depends on {writer_id} which aborted"
                )
            
            # Case 2: Both committed, check order
            if (reader.status == TransactionStatus.COMMITTED and 
                writer.status == TransactionStatus.COMMITTED):
                if reader.commit_order < writer.commit_order:
                    return False, (
                        f"Non-recoverable: {reader_id} (order {reader.commit_order}) "
                        f"committed before {writer_id} (order {writer.commit_order}), "
                        f"but {reader_id} reads from {writer_id}"
                    )
        
        return True, "Schedule is recoverable: all commit orderings satisfy dependencies"
 
# Example: Non-Recoverable Schedule
print("=" * 60)
print("Example 1: NON-RECOVERABLE SCHEDULE")
print("=" * 60)
analyzer1 = ScheduleAnalyzer()
analyzer1.write("T1", "X")
analyzer1.read("T2", "X")    # T2 reads from T1
analyzer1.commit("T2")        # T2 commits first (violation!)
analyzer1.abort("T1")         # T1 aborts
result1, explanation1 = analyzer1.is_recoverable()
print(f"Result: {explanation1}
")
 
# Example: Recoverable Schedule
print("=" * 60)
print("Example 2: RECOVERABLE SCHEDULE")
print("=" * 60)
analyzer2 = ScheduleAnalyzer()
analyzer2.write("T1", "X")
analyzer2.read("T2", "X")    # T2 reads from T1
analyzer2.commit("T1")        # T1 commits first (correct!)
analyzer2.commit("T2")        # T2 commits after
result2, explanation2 = analyzer2.is_recoverable()
print(f"Result: {explanation2}
")

Practical Implementation Considerations:

In real database systems, ensuring recoverability is typically enforced through:

Strict Two-Phase Locking (Strict 2PL): Holds all exclusive locks until commit, preventing other transactions from reading uncommitted data.
Multi-Version Concurrency Control (MVCC): Transactions read committed versions, never uncommitted data.
Commit protocols: The system delays a transaction's commit until all transactions it read from have committed.

These mechanisms guarantee recoverability by construction rather than by checking after the fact.

Dirty Reads and Recoverability

A dirty read occurs when a transaction reads data written by another transaction that has not yet committed. Dirty reads are fundamentally connected to recoverability issues.

The relationship:

Non-recoverable schedules always involve dirty reads
However, not all dirty reads lead to non-recoverable schedules
A dirty read becomes problematic only when the reader commits before the writer

This distinction is important: dirty reads are a necessary but not sufficient condition for non-recoverability.

Dirty Reads vs. Non-Recoverability
Scenario	Dirty Read?	Recoverable?	Explanation
T₂ reads T₁'s uncommitted write; T₁ commits; T₂ commits	Yes	Yes	Dirty read occurred, but commit order is correct
T₂ reads T₁'s uncommitted write; T₂ commits; T₁ commits	Yes	No	Dirty read + wrong commit order → irrecoverable
T₂ reads T₁'s uncommitted write; T₁ aborts; T₂ aborts	Yes	Yes	Dirty read cascaded to abort, but recoverable
T₂ reads T₁'s uncommitted write; T₂ commits; T₁ aborts	Yes	No	Committed transaction depends on aborted data
T₂ reads only committed data	No	Yes	No dirty reads = guaranteed recoverable

The Conservative Approach

SQL Isolation Levels and Dirty Reads:

Isolation Level	Allows Dirty Reads	Recoverable by Design
READ UNCOMMITTED	Yes	Must track dependencies
READ COMMITTED	No	Yes (by construction)
REPEATABLE READ	No	Yes (by construction)
SERIALIZABLE	No	Yes (by construction)

The READ UNCOMMITTED isolation level explicitly allows dirty reads, placing the burden of ensuring recoverability on the application or other mechanisms.

Commit Ordering Strategies

Ensuring recoverable schedules requires controlling when transactions can commit. There are several strategies database systems employ:

Strategy 1: Deferred Commit

When a transaction Tⱼ reads from an uncommitted transaction Tᵢ, the system defers Tⱼ's commit until after Tᵢ commits (or aborts, causing Tⱼ to abort too).

Tⱼ issues COMMIT → System checks dependencies → 
If Tᵢ still active: wait for Tᵢ
If Tᵢ committed: proceed with Tⱼ commit
If Tᵢ aborted: abort Tⱼ too
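
The decision above can be sketched in a few lines of Python. This is an illustrative sketch only: `Status`, `commit_decision`, and the `reads_from` dependency map are hypothetical names, and a real system would track dependencies inside its transaction manager rather than in plain dictionaries.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    COMMITTED = "committed"
    ABORTED = "aborted"


def commit_decision(txn, reads_from, status):
    """Decide what happens when `txn` issues COMMIT.

    reads_from maps a transaction to the set of transactions it read from.
    status maps each transaction to its current Status.
    Returns "abort", "wait", or "commit".
    """
    dependencies = reads_from.get(txn, set())
    # If any transaction we read from has aborted, our input was invalid.
    if any(status[d] is Status.ABORTED for d in dependencies):
        return "abort"
    # If any transaction we read from is still running, defer the commit.
    if any(status[d] is Status.ACTIVE for d in dependencies):
        return "wait"
    return "commit"


reads_from = {"T2": {"T1"}}
status = {"T1": Status.ACTIVE, "T2": Status.ACTIVE}
print(commit_decision("T2", reads_from, status))  # wait

status["T1"] = Status.COMMITTED
print(commit_decision("T2", reads_from, status))  # commit
```

Checking for aborted dependencies first means a transaction with one aborted and one active dependency is aborted immediately instead of waiting for an outcome that cannot save it.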

Strategy 2: Prevent Dirty Reads (Blocking)

Prevent transactions from reading uncommitted data entirely by making the reader wait on the writer's lock:

Tⱼ requests READ(X) → X was written by uncommitted Tᵢ → 
Block Tⱼ until Tᵢ commits or aborts
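
A minimal sketch of the blocking rule follows, assuming a simplified lock table that tracks only exclusive (write) locks. The class `LockTable` and its methods are hypothetical, and instead of suspending a thread the sketch returns the string "block" to show the decision.

```python
class LockTable:
    """Tracks exclusive locks that are held until the writer finishes."""

    def __init__(self):
        self._exclusive = {}  # item -> transaction holding the write lock

    def write(self, txn, item):
        holder = self._exclusive.get(item)
        if holder not in (None, txn):
            return "block"  # another transaction is writing this item
        self._exclusive[item] = txn
        return "granted"

    def read(self, txn, item):
        holder = self._exclusive.get(item)
        if holder not in (None, txn):
            return "block"  # item was written by an uncommitted transaction
        return "granted"

    def finish(self, txn):
        """Release every lock held by txn at commit or abort."""
        for item in [i for i, t in self._exclusive.items() if t == txn]:
            del self._exclusive[item]


locks = LockTable()
locks.write("T1", "X")
print(locks.read("T2", "X"))  # block
locks.finish("T1")            # T1 commits or aborts
print(locks.read("T2", "X"))  # granted
```

Shared (read) locks are omitted here for brevity; a full lock manager would also record them so that writers wait for active readers.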

Strategy 3: Multi-Version Reads

Maintain multiple versions of each data item. Reads return the most recent committed version (under snapshot isolation, the latest version committed before the reader's snapshot was taken):

Tⱼ requests READ(X) → System returns last committed version of X
(Even if uncommitted Tᵢ has written a newer version)
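
The idea can be sketched with a single versioned item. This is a simplified illustration, not how any particular database stores versions: the class `VersionedItem` is hypothetical, and it always returns the latest committed version instead of implementing per-transaction snapshots.

```python
class VersionedItem:
    """One data item with committed versions and pending uncommitted writes."""

    def __init__(self, initial):
        self._committed = [initial]  # committed versions, oldest first
        self._pending = {}           # txn -> uncommitted value

    def write(self, txn, value):
        self._pending[txn] = value

    def read(self, txn):
        # A transaction sees its own uncommitted write; everyone else
        # sees the most recent committed version.
        if txn in self._pending:
            return self._pending[txn]
        return self._committed[-1]

    def commit(self, txn):
        if txn in self._pending:
            self._committed.append(self._pending.pop(txn))

    def abort(self, txn):
        # Discarding the pending value is all the undo work required.
        self._pending.pop(txn, None)


x = VersionedItem(100)
x.write("T1", 200)
print(x.read("T2"))  # 100, T1's write is not visible yet
x.commit("T1")
print(x.read("T2"))  # 200
```

Because a reader never observes a pending value, it never depends on an uncommitted transaction, so no commit ordering needs to be enforced for reads.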

Strategy Comparison

•Deferred Commit: Allows dirty reads but ensures recoverability. May cause cascading aborts. Used when read performance is critical and abort rates are low.
•Blocking Reads: Prevents dirty reads entirely. May cause lock waits and reduced concurrency. Traditional approach with 2PL.
•Multi-Version Reads: Combines high read concurrency with no dirty reads, because readers neither block on writers nor see uncommitted data. Requires additional storage for old versions. Used in MVCC systems (PostgreSQL, Oracle, MySQL InnoDB).

Real-World Implications

Understanding recoverability has direct implications for database administration, application development, and system design:

For Database Administrators:

Choosing isolation levels: Lower isolation levels (READ UNCOMMITTED) may improve performance but require understanding of recoverability guarantees
Recovery planning: Non-recoverable situations cannot be fixed by recovery procedures—they must be prevented
Monitoring: Watch for patterns that might indicate recoverability risks in custom configurations

For Application Developers:

Transaction design: Understanding dependencies helps design transactions that commit in the right order
Error handling: Knowing that cascading aborts can occur helps design robust retry logic
Performance tuning: Choosing appropriate isolation levels requires understanding the recoverability tradeoffs

Recoverability Considerations by Use Case
Use Case	Recommended Approach	Justification
Financial transactions	Strict schedules (covered later)	Cannot tolerate any data inconsistency
Analytics/reporting	MVCC with snapshot isolation	Read consistency without blocking writers
High-throughput logging	May accept some risk	Volume makes strict approaches costly
Master-slave replication	Serializable at master	Replication depends on consistent commit order
Distributed transactions	Two-phase commit	Coordinates commits across systems

Summary: Recoverable Schedule

Recoverable schedules are a fundamental requirement for database systems that must survive failures. Let's consolidate the key concepts:

Key Takeaways

•In a recoverable schedule, if Tⱼ reads from Tᵢ, then Tⱼ commits only after Tᵢ has committed; if Tᵢ aborts, Tⱼ must abort as well.
•Non-recoverable schedules create impossible recovery situations where committed transactions depend on aborted data.
•Dirty reads are necessary but not sufficient for non-recoverability—the commit ordering determines recoverability.
•Preventing non-recoverability can be done through deferred commits, blocking reads, or multi-version concurrency control.
•Modern databases typically use MVCC to guarantee recoverability while maintaining high concurrency.
•Understanding recoverability helps in choosing isolation levels, designing transactions, and implementing robust error handling.
