Loading content...
In the previous page, we learned to read and analyze transaction schedules. Now comes the critical question: Is this schedule correct?
Database systems allow concurrent transaction execution for performance, but not all interleavings produce correct results. The formal notion of correctness is serializability—a schedule is correct if its outcome is equivalent to executing the transactions one at a time, in some serial order.
Serializability testing is the most algorithmically rich topic in transaction interviews. You'll be asked to construct precedence graphs, determine cycle presence, and reason about different flavors of equivalence. This page gives you complete mastery of these techniques.
By completing this page, you will be able to: (1) Understand and apply the formal definition of conflict serializability; (2) Construct precedence graphs from any schedule; (3) Determine serializability through cycle detection; (4) Distinguish conflict from view serializability; (5) Derive equivalent serial orderings when they exist.
Serial execution is trivially correct—each transaction sees a complete, consistent database state. But serial execution is slow because transactions wait for each other unnecessarily. Concurrent execution improves throughput by allowing transactions to overlap.
The insight is: We can allow concurrency as long as the result is indistinguishable from some serial execution. This is the serializability requirement.
A schedule S is serializable if it is equivalent to some serial schedule S_s. The key question is: what does "equivalent" mean?
There are two main notions of equivalence:
Conflict Equivalence: Two schedules are conflict equivalent if they have the same operations and every pair of conflicting operations appears in the same order.
View Equivalence: Two schedules are view equivalent if they have the same reads-from relationships and the same final writes for each data item.
| Property | Conflict Serializability | View Serializability |
|---|---|---|
| Definition | Conflict equivalent to a serial schedule | View equivalent to a serial schedule |
| Testing Complexity | O(n²) - polynomial time (precedence graph) | NP-complete in general |
| Relationship | Subset of view serializable schedules | Superset of conflict serializable schedules |
| Practical Use | Most common in DBMS implementations | Theoretical interest, rarely implemented |
| Detects | All conflict-based anomalies | More permissive, allows blind writes |
Every conflict serializable schedule is also view serializable, but not vice versa. In interview contexts, when someone says 'serializable' without qualification, they almost always mean conflict serializable. View serializability is tested less often but may appear in theoretical questions.
Conflict equivalence captures the idea that if we can transform one schedule into another by swapping only non-conflicting operations, the schedules produce identical results.
Recall that two operations conflict if they:
Non-conflicting operations can be swapped without changing the schedule's outcome. Two reads on the same item can be swapped. Operations on different items from different transactions can be swapped.
Two schedules S₁ and S₂ are conflict equivalent if:
A schedule is conflict serializable if it is conflict equivalent to some serial schedule.
S: R₁(A) R₂(B) W₁(A) W₂(B) C₁ C₂
Serial: R₁(A) W₁(A) C₁ R₂(B) W₂(B) C₂Conflicting pairs in S:
• R₁(A) vs W₁(A): Same transaction - NOT a conflict
• R₂(B) vs W₂(B): Same transaction - NOT a conflict
• R₁(A) vs R₂(B): Different items - NOT a conflict
• W₁(A) vs W₂(B): Different items - NOT a conflict
No conflicts between operations of different transactions!
Therefore, we can freely reorder to get the serial schedule:
R₁(A) → R₂(B) → W₁(A) → W₂(B) can become
R₁(A) → W₁(A) → R₂(B) → W₂(B)
✓ S is conflict serializable (equivalent to T₁ → T₂)To test conflict serializability by brute force, you attempt to transform the given schedule into a serial schedule by repeatedly swapping adjacent non-conflicting operations.
This technique works for small schedules but is impractical for large ones. It does, however, build intuition for what conflict serializability means.
Swap Rules:
The swap technique is useful for small schedules (2-3 transactions, 4-6 operations). For larger schedules, the precedence graph method is far more efficient. However, understanding swapping helps you verify precedence graph results.
The precedence graph (also called serialization graph) is the standard tool for testing conflict serializability. It transforms the problem into a simple graph theory question: Does the graph contain a cycle?
Step 1: Create Nodes
Step 2: Create Edges
Step 3: Test for Cycles
Step 4: Derive Serial Order
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586
def build_precedence_graph(schedule, transactions): """ Build a precedence graph from a transaction schedule. Args: schedule: List of (transaction_id, operation, data_item) tuples e.g., [('T1', 'R', 'A'), ('T2', 'W', 'A'), ...] transactions: Set of transaction IDs Returns: Adjacency list representing the precedence graph """ from collections import defaultdict graph = defaultdict(set) # Process each pair of operations for i in range(len(schedule)): for j in range(i + 1, len(schedule)): t1, op1, item1 = schedule[i] t2, op2, item2 = schedule[j] # Check for conflict: different transactions, same item, at least one write if t1 != t2 and item1 == item2 and (op1 == 'W' or op2 == 'W'): # Add edge from t1 to t2 (t1's operation came first) graph[t1].add(t2) return graph def has_cycle(graph, transactions): """ Detect if the precedence graph has a cycle using DFS. Returns: (has_cycle: bool, cycle_path: list if cycle exists) """ WHITE, GRAY, BLACK = 0, 1, 2 color = {t: WHITE for t in transactions} parent = {} def dfs(node, path): color[node] = GRAY for neighbor in graph.get(node, []): if color[neighbor] == GRAY: # Found cycle - reconstruct path cycle_start = path.index(neighbor) return True, path[cycle_start:] + [neighbor] if color[neighbor] == WHITE: result = dfs(neighbor, path + [neighbor]) if result[0]: return result color[node] = BLACK return False, [] for t in transactions: if color[t] == WHITE: result = dfs(t, [t]) if result[0]: return result return False, [] def topological_sort(graph, transactions): """ Perform topological sort to get equivalent serial order. Only valid if graph is acyclic. """ from collections import deque in_degree = {t: 0 for t in transactions} for node in graph: for neighbor in graph[node]: in_degree[neighbor] += 1 queue = deque([t for t in transactions if in_degree[t] == 0]) result = [] while queue: node = queue.popleft() result.append(node) for neighbor in graph.get(node, []): in_degree[neighbor] -= 1 if in_degree[neighbor] == 0: queue.append(neighbor) return result if len(result) == len(transactions) else NoneSchedule S: R₁(A) W₂(A) R₃(A) W₁(A) W₃(B) R₂(B) C₁ C₂ C₃
Step 1: Identify Transactions
Step 2: Find All Conflicts
On data item A:
On data item B:
Step 3: Construct Graph
Edges: T₁ → T₂, T₂ → T₃, T₂ → T₁, T₃ → T₁, T₃ → T₂
Step 4: Detect Cycle
Conclusion: Schedule S is NOT conflict serializable (contains cycles).
Let's work through several examples of increasing complexity to build fluency with precedence graph construction.
Schedule: R₁(A) R₂(A) W₂(A) W₁(A) C₁ C₂
Conflict Analysis on A:
Precedence Graph:
T₁ ⇄ T₂ (bidirectional edges = cycle)
Result: NOT conflict serializable
Explanation: The schedule exhibits a lost update pattern. Both transactions read A, then both write A. The final value depends on who writes last, but neither transaction saw the other's write. This is the classic conflict pattern that creates cycles.
The most common error in precedence graph construction is missing conflicts. Systematically check EVERY pair of operations on the SAME data item from DIFFERENT transactions. Create a checklist for each data item to ensure completeness.
View serializability is a more permissive criterion than conflict serializability. A schedule is view serializable if it is view equivalent to some serial schedule.
Two schedules S and S' are view equivalent if they satisfy ALL THREE conditions:
1. Initial Read Condition For each data item X, if transaction T reads the initial value of X in S, then T must also read the initial value of X in S'.
2. Reads-From Condition (Updated Read) For each data item X, if T_j reads the value of X written by T_i in S, then T_j must also read the value of X written by T_i in S'.
3. Final Write Condition For each data item X, if T_i performs the final write of X in S, then T_i must also perform the final write of X in S'.
These conditions ensure that transactions "see" the same data and leave the database in the same final state.
Schedule S: R₁(A) W₂(A) W₁(A) W₃(A) C₁ C₂ C₃
Note: W₂(A) is a 'blind write' - T₂ writes A without reading it first.Conflict Analysis:
- R₁(A) before W₂(A): T₁ → T₂
- R₁(A) before W₁(A): Same txn
- W₂(A) before W₁(A): T₂ → T₁
- W₁(A) before W₃(A): T₁ → T₃
- W₂(A) before W₃(A): T₂ → T₃
Precedence Graph has edges T₁→T₂ and T₂→T₁ → CYCLE
∴ NOT conflict serializable
View Equivalence to T₁ → T₂ → T₃:
1. Initial read: T₁ reads initial A in both ✓
2. Reads-from: Only T₁ reads (from initial) ✓
3. Final write: T₃ does final write in both ✓
Blind writes can be reordered without affecting view!
∴ IS view serializableA blind write is a write operation where the transaction never read the data item before writing it. Blind writes can sometimes be reordered without affecting the schedule's "view" because:
NP-Complete Testing: Unlike conflict serializability (polynomial time via precedence graphs), testing view serializability is NP-complete.
Limited Practical Gain: The additional schedules allowed by view serializability (those with blind writes that happen to work) are rare in practice.
Implementation Complexity: DBMS would need to track reads-from relationships and final writes, adding significant overhead.
For these reasons, real database systems use conflict serializability (via 2PL) as their correctness criterion.
If asked about view serializability, first test for conflict serializability. If conflict serializable, it's automatically view serializable. If NOT conflict serializable, check for blind writes—they're the only way to be view serializable without being conflict serializable.
Beyond serializability, schedules are also classified by their recoverability properties—how well they handle transaction failures.
From most restrictive to least restrictive:
Is schedule S recoverable? For each reads-from pair (T_i, T_j, X), check: Does C_i appear before C_j in S?
Is schedule S cascadeless? For each reads-from pair (T_i, T_j, X), check: Does C_i or A_i appear before R_j(X) in S?
Is schedule S strict? For each data item X, if T_i writes X:
| Schedule | Recoverable? | Cascadeless? | Strict? |
|---|---|---|---|
| W₁(A) R₂(A) C₁ C₂ | Yes (C₁ before C₂) | No (R₂ before C₁) | No |
| W₁(A) C₁ R₂(A) C₂ | Yes | Yes | Yes |
| W₁(A) R₂(A) C₂ C₁ | No (C₂ before C₁) | No | No |
| W₁(A) R₂(A) W₂(A) C₁ C₂ | Yes | No | No |
Most production databases implement strict schedules through Strict Two-Phase Locking (S2PL), which holds all locks until transaction completion. This automatically ensures both conflict serializability and strictness.
Serializability questions in interviews follow predictable patterns. Recognizing these patterns helps you solve problems quickly and confidently.
Before diving into graph construction, do a quick scan: If all of one transaction's operations complete before another begins on any shared data item, those two transactions won't create a cycle between them. This can quickly simplify analysis for larger schedules.
Serializability testing is foundational to transaction problem-solving. Let's consolidate the essential skills:
What's Next:
With serializability testing mastered, we'll move to Lock Analysis—understanding how Two-Phase Locking (2PL) and its variants guarantee serializability, how to trace lock acquisition and release through a schedule, and how to identify protocol violations.
You now have complete mastery of serializability testing. The precedence graph method gives you a systematic, error-resistant approach to any serializability question. Combined with understanding of view serializability and recoverability, you can comprehensively classify any transaction schedule.