Loading content...
When multiple transactions execute concurrently, their individual operations don't simply run in parallel and then merge results. Instead, they interleave—operations from different transactions are woven together in time, executed one after another in some specific ordering chosen by the database scheduler.
Think of it like a card shuffle: each transaction is a deck of cards (operations), and the scheduler interleaves them into a single deck. The final order determines whether the result is correct (equivalent to running transactions one at a time) or corrupt (exhibiting one of the concurrency problems we've discussed).
Interleaving is both the mechanism that enables concurrency benefits and the source of concurrency problems. Understanding how interleaving works, what makes certain interleavings safe, and how to reason about interleaved execution is fundamental to mastering database concurrency.
By the end of this page, you will understand how operation interleaving works mechanically, how to represent and analyze interleaved schedules, the concept of equivalence between schedules, what distinguishes safe interleavings from dangerous ones, and how the scheduler makes interleaving decisions.
Interleaving refers to the mixing of operations from multiple transactions into a single sequence for execution. Rather than completing one transaction before starting another, operations from different transactions are executed in alternating fashion.
Formal Definition:
Given transactions T₁ and T₂ with operations:
An interleaved schedule is any sequence containing all operations that:
Examples of valid interleavings:
12345678910111213141516171819202122232425262728293031323334
-- Two transactions with their operations: -- Transaction T1 (Bank Transfer):-- r1(A) : Read Account A balance-- w1(A) : Write new Account A balance (debit)-- r1(B) : Read Account B balance -- w1(B) : Write new Account B balance (credit) -- Transaction T2 (Balance Inquiry):-- r2(A) : Read Account A balance-- r2(B) : Read Account B balance-- r2(C) : Read Account C balance -- SERIAL SCHEDULE S1 (T1 before T2):-- r1(A) w1(A) r1(B) w1(B) | r2(A) r2(B) r2(C)-- └────── T1 ──────┘ └────── T2 ───────┘ -- SERIAL SCHEDULE S2 (T2 before T1):-- r2(A) r2(B) r2(C) | r1(A) w1(A) r1(B) w1(B)-- └────── T2 ──────┘ └────── T1 ──────────────┘ -- INTERLEAVED SCHEDULE S3:-- r1(A) r2(A) w1(A) r2(B) r1(B) w1(B) r2(C)-- └T1 └T2 └T1 └T2 └T1 └T1 └T2 -- In S3, operations alternate between T1 and T2-- The internal order within each transaction is preserved:-- T1: r1(A) → w1(A) → r1(B) → w1(B) ✓-- T2: r2(A) → r2(B) → r2(C) ✓ -- But wait! In S3, T2 reads A before T1 writes A,-- but T2 reads B AFTER T1 writes B!-- T2 might see an inconsistent state (old A, new B)-- This interleaving could cause problems!Why Interleaving Happens:
Interleaving is not random—it's a consequence of practical system behavior:
The scheduler interleaves operations to maximize resource utilization while (ideally) maintaining correctness.
For n transactions with operations of lengths k₁, k₂, ..., kₙ, the number of possible interleaved schedules is (k₁+k₂+...+kₙ)! / (k₁! × k₂! × ... × kₙ!). Even for modest numbers, this explodes: 2 transactions with 5 operations each have 252 possible interleavings. Only a small fraction are typically 'safe' (serializable).
The scheduler (also called the concurrency control manager or serialization manager) is the DBMS component responsible for determining how operations are interleaved. It makes real-time decisions about when each operation can execute.
Scheduler Responsibilities:
The Fundamental Challenge:
The scheduler must make decisions incrementally and without knowledge of the future. When operation O arrives:
Yet it must decide: execute now, delay, or abort the transaction?
Two Approaches to Scheduling:
Pessimistic (Locking-based): Assume conflicts will occur. Prevent problems by acquiring locks before accessing data. Operations wait until locks are available.
Optimistic (Validation-based): Assume conflicts are rare. Execute operations freely, but validate before commit that no conflicts occurred. Abort if validation fails.
Most production databases use some form of pessimistic concurrency control (locking or MVCC).
123456789101112131415161718192021222324252627282930
-- Scheduler decision-making process (conceptual) -- Incoming operation queue:-- 1. T1: r(A) -- T1 wants to read A-- 2. T2: w(A) -- T2 wants to write A -- 3. T1: w(A) -- T1 wants to write A -- PESSIMISTIC SCHEDULER (2PL) DECISIONS: -- Operation 1: T1 wants r(A)-- Check: Does anyone hold a conflicting lock on A? No.-- Decision: Grant shared lock on A to T1, execute r(A)-- Lock table: A -> S-lock by T1 -- Operation 2: T2 wants w(A)-- Check: Does anyone hold a conflicting lock on A?-- Yes! T1 holds S-lock, conflicts with X-lock needed for write.-- Decision: BLOCK T2 until T1 releases lock-- Queue: T2 waiting for X-lock on A -- Operation 3: T1 wants w(A)-- Check: Does T1 hold a compatible lock? T1 has S-lock.-- Decision: Upgrade S-lock to X-lock, execute w(A)-- Lock table: A -> X-lock by T1 -- Later: T1 commits, releases X-lock on A-- Wake up T2, grant X-lock on A, execute w(A) -- Result: Operations executed as r1(A) w1(A) w2(A)-- This is equivalent to serial T1, T2 - safe!The scheduler doesn't try to enumerate all possible interleavings. Instead, it uses protocols (like 2PL) that guarantee that whatever interleaving results will be correct. The protocol constrains which orderings are possible, ensuring only safe interleavings can occur.
Not all interleavings are created equal. Some produce results identical to running transactions serially; others produce corrupted or inconsistent data. Understanding what makes an interleaving 'safe' is central to concurrency control.
The Serializability Standard:
An interleaved schedule is safe (or correct) if it's serializable—meaning it produces the same result as some serial execution of the same transactions. The transactions might have executed with interleaved operations, but the end result is as if they ran one at a time in some order.
Formal Definition:
Schedule S is serializable if there exists a serial schedule S' with the same transactions such that:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950
-- Example: Analyzing two different interleavings-- Initial state: A = 100, B = 100 -- Transaction T1: Transfer $10 from A to B-- r1(A): Read A-- A = A - 10-- w1(A): Write A-- r1(B): Read B-- B = B + 10-- w1(B): Write B -- Transaction T2: Calculate A + B (should always be 200)-- r2(A): Read A-- r2(B): Read B-- sum = A + B -- ========================================-- SCHEDULE S1 (SAFE - Serializable)-- ========================================-- r1(A) w1(A) r1(B) w1(B) r2(A) r2(B)-- -- Timeline:-- r1(A): T1 reads A=100-- w1(A): T1 writes A=90-- r1(B): T1 reads B=100-- w1(B): T1 writes B=110-- r2(A): T2 reads A=90-- r2(B): T2 reads B=110-- T2's sum: 90+110 = 200 ✓---- This is equivalent to serial schedule: T1 then T2-- SAFE! -- ========================================-- SCHEDULE S2 (UNSAFE - Not Serializable)-- ========================================-- r1(A) w1(A) r2(A) r2(B) r1(B) w1(B)---- Timeline:-- r1(A): T1 reads A=100-- w1(A): T1 writes A=90-- r2(A): T2 reads A=90 (after T1's write)-- r2(B): T2 reads B=100 (before T1's write!)-- r1(B): T1 reads B=100-- w1(B): T1 writes B=110-- T2's sum: 90+100 = 190 ✗ (WRONG!)---- Not equivalent to T1,T2 (would give 200)-- Not equivalent to T2,T1 (would give 200)-- UNSAFE - produces incorrect result!| Characteristic | Safe (Serializable) | Unsafe (Non-serializable) |
|---|---|---|
| Equivalent to some serial schedule | Yes | No |
| Final database state | Same as some serial execution | Different from all serial executions |
| Read values | Consistent with some serial order | May see partial/inconsistent states |
| Occurrence | May or may not happen naturally | Should be prevented by scheduler |
| Example anomalies | None | Lost updates, dirty reads, etc. |
The scheduler cannot simply try all interleavings and pick a safe one—there are too many possibilities. Instead, it must use protocols that guarantee any resulting interleaving will be safe. This is why we need concurrency control mechanisms like locking, timestamps, and MVCC.
The key to understanding which interleavings are safe lies in the concept of conflicting operations. Not all pairs of operations interact—only conflicting operations affect each other's results when reordered.
Definition of Conflicting Operations:
Two operations O₁ (from T₁) and O₂ (from T₂) conflict if:
Why Conflicts Matter:
Non-conflicting operations can be reordered without affecting the result. But conflicting operations' relative order matters—swapping them changes the outcome.
123456789101112131415161718192021222324252627282930313233343536373839404142
-- Identifying conflicts in a schedule -- Schedule S: r1(A) w2(A) r2(B) w1(B) r1(C) w2(C) -- Step 1: List all operation pairs from DIFFERENT transactions-- (T1: r1(A), w1(B), r1(C))-- (T2: w2(A), r2(B), w2(C)) -- Step 2: Check each pair for conflicts -- r1(A) vs w2(A): Different transactions ✓, Same item (A) ✓, One write ✓-- → CONFLICT! -- r1(A) vs r2(B): Different transactions ✓, Different items (A≠B) ✗-- → No conflict -- r1(A) vs w2(C): Different transactions ✓, Different items (A≠C) ✗-- → No conflict -- w2(A) vs w1(B): Different transactions ✓, Different items (A≠B) ✗-- → No conflict -- w2(A) vs r1(C): Different transactions ✓, Different items (A≠C) ✗-- → No conflict -- r2(B) vs w1(B): Different transactions ✓, Same item (B) ✓, One write ✓-- → CONFLICT! -- r2(B) vs r1(C): Different transactions ✓, Different items (B≠C) ✗-- → No conflict -- w1(B) vs w2(C): Different transactions ✓, Different items (B≠C) ✗-- → No conflict -- r1(C) vs w2(C): Different transactions ✓, Same item (C) ✓, One write ✓-- → CONFLICT! -- Summary of conflicts in S:-- 1. r1(A) before w2(A) - T1 reads A before T2 writes A-- 2. w1(B) before r2(B) - Wait, check order in S...-- In S: r2(B) comes before w1(B), so T2 reads B before T1 writes B-- 3. r1(C) before w2(C) - T1 reads C before T2 writes CConflict Equivalence:
Two schedules are conflict equivalent if:
If a schedule is conflict equivalent to a serial schedule, it's called conflict serializable and is guaranteed to be safe.
The Precedence Graph Method:
To test conflict serializability:
You can freely reorder non-conflicting operations without changing the outcome. A safe schedule is one where the conflicting operations are ordered consistently with some serial order. This is the foundation of conflict serializability testing and many concurrency control protocols.
Interleaving directly impacts the ACID properties—particularly Isolation—and indirectly affects Consistency. Understanding this relationship clarifies why proper interleaving management is essential.
Impact on Isolation:
The Isolation property promises that concurrent transactions don't interfere with each other—each appears to execute alone. Unsafe interleaving violates this promise:
A serializable interleaving preserves perfect isolation. Weaker isolation levels permit specific safe violations for better performance.
| ACID Property | How Interleaving Can Violate | Example |
|---|---|---|
| Atomicity | Indirect: rollback complexity increases | Cascading aborts from dirty reads |
| Consistency | Seeing partial updates violates invariants | Sum check sees mid-transfer state |
| Isolation | Transactions see each other's effects | All concurrency anomalies |
| Durability | Not directly affected | N/A |
The Consistency Connection:
Database consistency means the database always satisfies its integrity constraints. A transaction takes the database from one consistent state to another. But during the transaction, the database may temporarily violate constraints (e.g., mid-transfer, money has left Account A but not arrived at Account B).
If another transaction sees this mid-transaction state through bad interleaving, it sees an inconsistent database—even though no individual transaction violated consistency.
Example:
Constraint: A + B + C = $1000 (total money in system)
T1 (transfer): subtract from A, add to B
Midway: A has $100 less, B hasn't been updated yet
T2 (sum): reads all accounts, computes A + B + C = $900 (WRONG!)
T2 saw a consistent-constraint violation because interleaving exposed T1's intermediate state.
123456789101112131415161718192021222324252627282930313233
-- Consistency violation through interleaving -- Invariant: Total account balance must always equal $1000-- Initial: A = $500, B = $300, C = $200 (sum = $1000) -- T1: Transfer $100 from A to B-- Step 1: A = A - 100 (A becomes $400)-- Step 2: B = B + 100 (B becomes $400) -- T2: Audit - verify total = $1000-- Step 1: Read A-- Step 2: Read B-- Step 3: Read C-- Check: A + B + C should equal $1000 -- PROBLEMATIC INTERLEAVING:-- w1(A) : T1 debits A → A = $400-- r2(A) : T2 reads A = $400-- r2(B) : T2 reads B = $300 (T1 hasn't credited yet!)-- r2(C) : T2 reads C = $200-- T2 commits with sum = $400 + $300 + $200 = $900-- w1(B) : T1 credits B → B = $400-- T1 commits -- Result:-- - T2's audit shows $900, not $1000 — CONSISTENCY VIOLATED!-- - Final state is actually consistent (A=$400, B=$400, C=$200 = $1000)-- - But T2 observed an inconsistent state due to interleaving -- This is why we need controlled interleaving:-- Either T2 sees all of T1's effects (T1 then T2) -- Or T2 sees none of T1's effects (T2 then T1)-- But not partial effects (mid-T1 state)Among ACID properties, Isolation is most directly about interleaving. Atomicity, Consistency, and Durability can be satisfied by individual transactions. Isolation is inherently about the relationships BETWEEN transactions—how they interleave without interfering.
Being able to visualize and analyze interleavings is a crucial skill for database developers and administrators. Several notation and visualization methods exist.
Timeline Notation:
The most intuitive representation shows time flowing downward (or left-to-right), with each transaction in its own column:
T1 T2
| |
r1(A) |
| r2(A)
w1(A) |
| r2(B)
r1(B) |
w1(B) |
| r2(C)
commit |
| commit
This clearly shows the temporal ordering and which operations overlap.
Operation Sequence Notation:
A compact representation lists operations in order with subscripts indicating transactions:
r₁(A) r₂(A) w₁(A) r₂(B) r₁(B) w₁(B) r₂(C)
This makes it easy to identify conflicts and check ordering.
The Precedence Graph:
A directed graph showing transaction dependencies from conflicts:
T1 ──→ T2 (means T1 must come before T2 in equivalent serial schedule)
The graph is acyclic if and only if the schedule is conflict serializable.
12345678910111213141516171819202122232425262728293031323334353637
-- Building a precedence graph for conflict serializability testing -- Schedule S: r1(A) w2(A) r2(B) w1(B) r1(C) w2(C) -- Step 1: Identify all conflicts (from earlier analysis)-- Conflict 1: r1(A) before w2(A) → Edge T1 → T2-- Conflict 2: r2(B) before w1(B) → Edge T2 → T1 -- Conflict 3: r1(C) before w2(C) → Edge T1 → T2 -- Step 2: Draw the precedence graph-- Nodes: T1, T2-- Edges: T1 → T2 (from conflicts 1 and 3)-- T2 → T1 (from conflict 2) -- Step 3: Check for cycles-- T1 → T2 → T1 ← CYCLE! -- Conclusion: Schedule S is NOT conflict serializable-- There is no serial schedule equivalent to S -- Why the cycle means non-serializability:-- - T1 → T2 means T1 must come before T2 in serial order-- - T2 → T1 means T2 must come before T1 in serial order-- - Both cannot be true simultaneously!-- - Therefore, no serial order is equivalent -- COMPARE WITH SAFE SCHEDULE:-- Schedule S': r1(A) r1(B) w1(B) r2(A) r2(B) r2(C) w2(A) w2(C) -- Conflicts in S':-- r1(A) before w2(A) → T1 → T2-- w1(B) before r2(B) → T1 → T2-- (no T2 → T1 edges!) -- Precedence graph: T1 → T2 (no cycle)-- Equivalent serial order: T1, T2-- Schedule S' IS conflict serializable!While you won't manually draw precedence graphs in production, understanding them helps you reason about why certain transaction combinations cause problems and why concurrency control protocols work. The acyclicity test is the theoretical foundation for proving protocols correct.
The specific interleaving that occurs in any execution depends on many factors, some controllable and others not. Understanding these factors helps in predicting and debugging concurrency behavior.
System-Level Factors:
Application-Level Factors:
The Heisenbug Problem:
Concurrency bugs are notoriously hard to debug because:
This is why databases use proven concurrency control protocols rather than relying on testing to find all problematic interleavings. It's mathematically impossible to test all possible interleavings.
From a million operations: Even a modest workload generates astronomical numbers of possible interleavings. Testing can never cover them all. Correctness must come from protocol design, not exhaustive testing.
A system that exhibits no concurrency bugs in testing may still have bugs. The specific interleavings tested may not include the problematic ones. Only formal correctness guarantees (like serializability through 2PL) provide assurance. This is why database engines use mathematically proven protocols.
We have explored interleaving in depth—the mechanism that enables concurrency benefits while potentially causing concurrency problems. Understanding interleaving is foundational for all concurrency control topics that follow.
What's next:
Having understood how interleaving works and what makes it dangerous, the final page of this module examines Why We Need Control—a comprehensive look at why databases must actively manage concurrency rather than hope for the best. We'll see how uncontrolled concurrency leads to real-world failures and why concurrency control mechanisms are non-negotiable.
You now understand interleaving—how operations from concurrent transactions are woven together, what makes certain interleavings safe, and why this process is both the source of concurrency benefits and the root of concurrency problems. Next, we synthesize everything into understanding why active control is essential.