Loading learning content...
We've established that serial schedules are our benchmark for correctness. We've also seen that concurrent schedules provide the performance modern systems need. The critical question now becomes:
How do we determine if a concurrent schedule is "equivalent" to a serial schedule?
The answer isn't as simple as it might seem. "Equivalent" could mean many things:
Each interpretation leads to a different formal definition of equivalence, with different properties and different computational complexity for testing. Understanding these distinctions is essential for appreciating why conflict equivalence became the practical standard in database systems.
This page explores the various notions of schedule equivalence, their formal definitions, their relationships, and their practical implications for database concurrency control.
By the end of this page, you will understand three types of schedule equivalence (result, view, conflict), their formal definitions and relationships, why conflict equivalence is the practical choice, and how to determine if two schedules are equivalent under each definition.
Serializability is defined in terms of equivalence:
A schedule S is serializable if and only if
S is EQUIVALENT to some serial schedule
But what does "equivalent" mean? The definition of equivalence directly determines:
Imagine two schedules, S₁ and S₂, both operating on the same transactions. They could be related in various ways:
Database theory formalizes specific points on this spectrum as precise definitions of equivalence.
The database literature defines three primary notions of schedule equivalence:
| Equivalence Type | Intuition | Status |
|---|---|---|
| Result Equivalence | Same final database state | Too weak (coincidental matches) |
| View Equivalence | Same reads observe same writes | Theoretically ideal but NP-complete |
| Conflict Equivalence | Same ordering of conflicting operations | Practical standard (polynomial test) |
Each is a different formalization of "the same behavior." We'll explore all three, but focus on conflict equivalence as it's the foundation for practical serializability testing.
Each equivalence definition partitions the space of schedules into equivalence classes. Schedules in the same class are considered 'the same' under that definition. A schedule is serializable if it belongs to the same equivalence class as some serial schedule.
The simplest notion of equivalence focuses solely on the outcome:
Definition: Two schedules S₁ and S₂ are result equivalent if they produce the same final database state when starting from the same initial state.
S₁ ≡ᵣ S₂ ⟺ FinalState(S₁, InitialDB) = FinalState(S₂, InitialDB)
Result equivalence captures the idea: "If the database ends up the same, who cares how we got there?" This seems reasonable at first—applications typically care about final results, not intermediate steps.
Consider two schedules on a variable X initially set to 10:
S₁: r₁(X) w₁(X) // T₁ reads X=10, writes X=15 (X = X + 5)
r₂(X) w₂(X) // T₂ reads X=15, writes X=30 (X = X * 2)
Final: X = 30
S₂: r₂(X) w₂(X) // T₂ reads X=10, writes X=20 (X = X * 2)
r₁(X) w₁(X) // T₁ reads X=20, writes X=25 (X = X + 5)
Final: X = 25
These schedules are NOT result equivalent (30 ≠ 25), even though both are serial and therefore both correct.
Result equivalence suffers from a critical flaw: coincidental equivalence.
Consider:
Initial: X = 10
S₁ (Serial T₁→T₂):
T₁: X = 0 (constant assignment)
T₂: X = 0 (constant assignment)
Final: X = 0
S₂ (Interleaved, with lost update):
r₁(X) = 10
r₂(X) = 10
w₁(X) = 0 // T₁ writes
w₂(X) = 0 // T₂ overwrites (lost update!)
Final: X = 0
S₂ contains a lost update (T₁'s write was overwritten), but produces the same result as S₁ due to the specific values involved.
The Deeper Problem: Result equivalence depends on the initial state and specific data values. A schedule that's result-equivalent for one initial state might not be for another. We need a notion of equivalence that's state-independent.
Result equivalence is too weak because: (1) It can be satisfied by coincidence rather than correct behavior, (2) It's state-dependent—different initial states give different results, (3) Testing requires knowing specific data values, not just the schedule structure, (4) It doesn't ensure intermediate reads were consistent. We need a stronger notion that guarantees correct behavior regardless of data values.
To address the shortcomings of result equivalence, we introduce a stronger notion that considers not just the final state, but what each transaction "sees" during execution.
Definition: Two schedules S₁ and S₂ are view equivalent if they satisfy the following three conditions:
For every data item X, if transaction Tᵢ reads the initial value of X in S₁, then Tᵢ must also read the initial value of X in S₂.
If rᵢ(X) reads initial value in S₁ → rᵢ(X) reads initial value in S₂
For every read operation rᵢ(X) that reads a value written by wⱼ(X) in S₁, the same relationship must hold in S₂.
If rᵢ(X) reads from wⱼ(X) in S₁ → rᵢ(X) reads from wⱼ(X) in S₂
For every data item X, the transaction that performs the final write to X must be the same in both schedules.
If Tᵢ performs final write to X in S₁ → Tᵢ performs final write to X in S₂
Transactions: T₁, T₂, T₃ operating on X
S₁ (Serial T₁→T₂→T₃):
r₁(X) w₁(X) r₂(X) w₂(X) r₃(X) w₃(X)
S₂ (Interleaved):
r₁(X) r₂(X) w₁(X) w₂(X) r₃(X) w₃(X)Analysis of S₁:
- T₁ reads initial X ✓
- T₂ reads from T₁'s write ✓
- T₃ reads from T₂'s write ✓
- Final write by T₃ ✓
Analysis of S₂:
- T₁ reads initial X ✓ (Condition 1: matches)
- T₂ reads initial X ✗ (Condition 2 violated!)
In S₁: T₂ reads from T₁
In S₂: T₂ reads initial (before T₁ writes)
Conclusion: S₁ ≢ᵥ S₂ (NOT view equivalent)The schedules differ in what T₂ reads: in S₁ it reads T₁'s output, in S₂ it reads the initial value. This changes T₂'s 'view' of the database, hence not view equivalent.
Stronger than Result Equivalence: View-equivalent schedules are always result-equivalent, but not vice versa. View equivalence ensures the same computational process, not just the same final result.
State-Independent: View equivalence depends only on the schedule structure, not the actual data values. Two schedules are either view-equivalent or not, regardless of initial database state.
Ideal for Correctness: View equivalence captures exactly what we mean by "the same execution from the transactions' perspective."
Testing view serializability (checking if a schedule is view-equivalent to any serial schedule) is NP-complete. This means:
This is why view equivalence, despite its theoretical elegance, is not used in practice.
View serializability is more permissive than conflict serializability—some schedules are view-serializable but not conflict-serializable. These additional schedules, however, come at the cost of NP-complete testing. In practice, we settle for conflict serializability, which is testable in polynomial time and sufficient for most applications.
Conflict equivalence provides a practical, efficiently testable notion of equivalence that captures the essential correctness properties we need.
Two operations conflict if:
Types of Conflicts:
| Conflict Type | Operations | Notation | Description |
|---|---|---|---|
| Read-Write (RW) | rᵢ(X), wⱼ(X) | RW conflict | One reads, another writes same item |
| Write-Read (WR) | wᵢ(X), rⱼ(X) | WR conflict | One writes, another reads same item |
| Write-Write (WW) | wᵢ(X), wⱼ(X) | WW conflict | Both write same item |
Non-Conflicts:
Definition: Two schedules S₁ and S₂ are conflict equivalent if:
Formally:
S₁ ≡c S₂ ⟺ For all conflicting pairs (oᵢ, oⱼ):
oᵢ < oⱼ in S₁ ⟺ oᵢ < oⱼ in S₂
Conflict equivalence states: "Schedules are equivalent if all the 'important' orderings are the same." The "important" orderings are those that affect the outcome—which means orderings between conflicting operations.
Non-conflicting operations can be reordered freely without changing the result. Only conflicting operations must maintain their relative order.
Transactions:
T₁: r₁(A) w₁(A)
T₂: r₂(A) w₂(A) r₂(B) w₂(B)
S₁: r₁(A) w₁(A) r₂(A) w₂(A) r₂(B) w₂(B)
S₂: r₂(B) r₁(A) w₁(A) r₂(A) w₂(B) w₂(A)Step 1: Identify all conflicts
Data item A:
- r₁(A) vs r₂(A): No conflict (both reads)
- r₁(A) vs w₂(A): RW conflict ✓
- w₁(A) vs r₂(A): WR conflict ✓
- w₁(A) vs w₂(A): WW conflict ✓
Data item B:
- No T₁ operations on B, so no conflicts involving B
Step 2: Check ordering of each conflict
| Conflict | S₁ Order | S₂ Order | Match? |
|----------|----------|----------|--------|
| r₁(A) < w₂(A) | T₁ first | T₁ first | ✓ |
| w₁(A) < r₂(A) | T₁ first | T₁ first | ✓ |
| w₁(A) < w₂(A) | T₁ first | T₁ first | ✓ |
All conflicts maintain the same relative order.
Conclusion: S₁ ≡c S₂ (conflict equivalent)Despite the different interleaving of non-conflicting operations (r₂(B), w₂(B) moved around), all conflicting operations maintain the same order. The schedules are conflict equivalent.
Conflict equivalence allows us to reorder non-conflicting operations freely. This flexibility is what enables concurrent execution: transactions can interleave their non-conflicting operations for better resource utilization, as long as the conflicting operations maintain their order relative to some serial schedule.
The three equivalence types form a hierarchy, with each level being more restrictive than the next:
┌─────────────────────────────────────────────────────────────┐
│ RESULT EQUIVALENT SCHEDULES │
│ (Largest set) │
│ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ VIEW EQUIVALENT SCHEDULES │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────┐ │ │
│ │ │ CONFLICT EQUIVALENT SCHEDULES │ │ │
│ │ │ (Smallest set) │ │ │
│ │ │ │ │ │
│ │ │ Efficiently testable (polynomial) │ │ │
│ │ └───────────────────────────────────────────┘ │ │
│ │ │ │
│ │ + Additional view-equivalent schedules │ │
│ │ Testing is NP-complete │ │
│ └────────────────────────────────────────────────────┘ │
│ │
│ + Coincidentally result-equivalent schedules │
│ Not suitable for guaranteeing correctness │
└─────────────────────────────────────────────────────────────┘
Conflict Equivalent ⟹ View Equivalent ⟹ Result Equivalent
But NOT the reverse:
What schedules fall in the gap between conflict and view serializability? These involve blind writes—writes that don't depend on any prior reads.
Example:
T₁: w₁(X) // Writes X without reading
T₂: w₂(X) // Writes X without reading
T₃: r₃(X) // Reads X
Schedule S: w₁(X) w₂(X) r₃(X)
This schedule is:
View-serializable: Equivalent to serial T₁→T₂→T₃
NOT conflict-serializable:
Such schedules with "blind writes" that get overwritten are view-serializable but not conflict-serializable.
The gap between view and conflict serializability is small and involves edge cases (blind writes). For most practical applications, conflict serializability captures all truly useful schedules. The efficiency gain from polynomial-time testing far outweighs missing the few extra view-serializable schedules.
| Property | Result | View | Conflict |
|---|---|---|---|
| Set size | Largest | Medium | Smallest |
| State-independent | No | Yes | Yes |
| Test complexity | Requires values | NP-complete | Polynomial |
| Practical use | None | Theoretical | Standard |
| Captures correctness | Weakly | Fully | Mostly |
Given two schedules, we can test conflict equivalence through a systematic process:
Input: Two schedules S₁ and S₂ over the same transactions
Output: True if S₁ ≡c S₂, False otherwise
1. Verify both schedules have the same transactions and operations
- If not, return False
2. Find all conflicting pairs of operations:
- For each pair of transactions (Tᵢ, Tⱼ) where i ≠ j
- For each data item X accessed by both
- Check if at least one access is a write
- If so, add (opᵢ, opⱼ) to conflict set
3. For each conflicting pair:
- Determine the order in S₁
- Determine the order in S₂
- If orders differ, return False
4. Return True
This polynomial complexity makes conflict equivalence testing practical for real systems.
T₁: r₁(A) w₁(B)
T₂: r₂(B) w₂(A)
S₁: r₁(A) r₂(B) w₁(B) w₂(A)
S₂: r₂(B) r₁(A) w₂(A) w₁(B)Step 1: Same operations? ✓
Step 2: Find conflicts
| Op 1 | Op 2 | Same Item? | At least one write? | Conflict? |
|------|------|------------|--------------------|-----------|
| r₁(A) | r₂(B) | No | - | No |
| r₁(A) | w₂(A) | Yes (A) | Yes | RW ✓ |
| w₁(B) | r₂(B) | Yes (B) | Yes | WR ✓ |
| w₁(B) | w₂(A) | No | - | No |
Conflicting pairs: {(r₁(A), w₂(A)), (w₁(B), r₂(B))}
Step 3: Check orderings
| Conflict | S₁ Order | S₂ Order | Match? |
|----------|----------|----------|--------|
| r₁(A) vs w₂(A) | r₁ at pos 1, w₂ at pos 4 → T₁ first | r₁ at pos 2, w₂ at pos 3 → T₁ first | ✓ |
| w₁(B) vs r₂(B) | w₁ at pos 3, r₂ at pos 2 → T₂ first | w₁ at pos 4, r₂ at pos 1 → T₂ first | ✓ |
All conflicts match!
Conclusion: S₁ ≡c S₂Both schedules have the same conflict orderings: T₁ goes first on data item A (r₁ before w₂), and T₂ goes first on data item B (r₂ before w₁). The schedules are conflict equivalent.
An alternative view: Two schedules are conflict-equivalent if one can be transformed into the other by swapping adjacent non-conflicting operations. Each swap preserves conflict equivalence, and if S₁ can be transformed to S₂ through such swaps, they're conflict equivalent. This 'swap perspective' helps understand why conflict equivalence works—non-conflicting operations are truly independent.
With equivalence defined, we can formally define serializability:
Definition: A schedule S is conflict serializable if it is conflict equivalent to some serial schedule.
S is conflict serializable ⟺ ∃ serial schedule Ss : S ≡c Ss
Definition: A schedule S is view serializable if it is view equivalent to some serial schedule.
S is view serializable ⟺ ∃ serial schedule Ss : S ≡v Ss
Conflict Serializable ⊂ View Serializable ⊂ All Schedules
- All conflict-serializable schedules are view-serializable
- Not all view-serializable schedules are conflict-serializable
- Many schedules are neither
To test if S is conflict serializable, we could:
Naive approach:
This is O(n! × test_cost)—impractical!
Better approach: Precedence Graph
Instead of comparing with all serial schedules, we construct a precedence graph (also called a serializability graph) and check for cycles:
This reduces testing to O(V + E) graph traversal—polynomial time!
The next page covers precedence graphs in complete detail.
A cycle in the precedence graph means T₁ must come before T₂ (due to one conflict) but T₂ must come before T₁ (due to another conflict). This is contradictory—no serial order can satisfy both requirements. Hence, cycles indicate non-serializability.
We've explored the formal foundations of schedule equivalence—the essential concept for defining and testing serializability. Let's consolidate the key insights:
What's Next:
Now that we understand conflict equivalence, we're ready to dive deeper into conflict equivalence specifically—examining its formal properties and the mechanism for testing it. The next page explores conflict operations in detail, building toward the precedence graph method for efficiently testing conflict serializability.
You now understand the three types of schedule equivalence, their formal definitions, their relationships, and why conflict equivalence is the practical choice for database systems. Next, we'll explore conflict equivalence in depth, examining the theory that makes efficient serializability testing possible.