Concurrency Overview - Learning Module

Loading content...

0/252

Interleaving

The Dance of Operations

When multiple transactions execute concurrently, their individual operations don't simply run in parallel and then merge results. Instead, they interleave—operations from different transactions are woven together in time, executed one after another in some specific ordering chosen by the database scheduler.

Think of it like a card shuffle: each transaction is a deck of cards (operations), and the scheduler interleaves them into a single deck. The final order determines whether the result is correct (equivalent to running transactions one at a time) or corrupt (exhibiting one of the concurrency problems we've discussed).

Interleaving is both the mechanism that enables concurrency benefits and the source of concurrency problems. Understanding how interleaving works, what makes certain interleavings safe, and how to reason about interleaved execution is fundamental to mastering database concurrency.

What You Will Learn

By the end of this page, you will understand how operation interleaving works mechanically, how to represent and analyze interleaved schedules, the concept of equivalence between schedules, what distinguishes safe interleavings from dangerous ones, and how the scheduler makes interleaving decisions.

Understanding Interleaving

Interleaving refers to the mixing of operations from multiple transactions into a single sequence for execution. Rather than completing one transaction before starting another, operations from different transactions are executed in alternating fashion.

Formal Definition:

Given transactions T₁ and T₂ with operations:

T₁: O₁₁, O₁₂, O₁₃ (in this order)
T₂: O₂₁, O₂₂ (in this order)

An interleaved schedule is any sequence containing all operations that:

Includes every operation from every transaction exactly once
Preserves the internal ordering of operations within each transaction

Examples of valid interleavings:

O₁₁, O₁₂, O₁₃, O₂₁, O₂₂ (serial: T₁ then T₂)
O₂₁, O₂₂, O₁₁, O₁₂, O₁₃ (serial: T₂ then T₁)
O₁₁, O₂₁, O₁₂, O₂₂, O₁₃ (interleaved)
O₂₁, O₁₁, O₁₂, O₂₂, O₁₃ (interleaved differently)

Interleaving Visualization
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
-- Two transactions with their operations:
 
-- Transaction T1 (Bank Transfer):
--   r1(A)  : Read Account A balance
--   w1(A)  : Write new Account A balance (debit)
--   r1(B)  : Read Account B balance  
--   w1(B)  : Write new Account B balance (credit)
 
-- Transaction T2 (Balance Inquiry):
--   r2(A)  : Read Account A balance
--   r2(B)  : Read Account B balance
--   r2(C)  : Read Account C balance
 
-- SERIAL SCHEDULE S1 (T1 before T2):
-- r1(A) w1(A) r1(B) w1(B) | r2(A) r2(B) r2(C)
-- └────── T1 ──────┘       └────── T2 ───────┘
 
-- SERIAL SCHEDULE S2 (T2 before T1):
-- r2(A) r2(B) r2(C) | r1(A) w1(A) r1(B) w1(B)
-- └────── T2 ──────┘ └────── T1 ──────────────┘
 
-- INTERLEAVED SCHEDULE S3:
-- r1(A) r2(A) w1(A) r2(B) r1(B) w1(B) r2(C)
-- └T1   └T2   └T1   └T2   └T1   └T1   └T2
 
-- In S3, operations alternate between T1 and T2
-- The internal order within each transaction is preserved:
--   T1: r1(A) → w1(A) → r1(B) → w1(B) ✓
--   T2: r2(A) → r2(B) → r2(C) ✓
 
-- But wait! In S3, T2 reads A before T1 writes A,
-- but T2 reads B AFTER T1 writes B!
-- T2 might see an inconsistent state (old A, new B)
-- This interleaving could cause problems!

Why Interleaving Happens:

Interleaving is not random—it's a consequence of practical system behavior:

I/O Waits: When T₁ waits for disk I/O, the scheduler lets T₂ proceed
Lock Waits: When T₁ waits for a lock, T₂ can execute other operations
Multi-core Execution: Different CPU cores may execute different transactions' operations truly in parallel
Network/Client Delays: While T₁ waits for client communication, T₂ can proceed

The scheduler interleaves operations to maximize resource utilization while (ideally) maintaining correctness.

The Number of Possible Interleavings

For n transactions with operations of lengths k₁, k₂, ..., kₙ, the number of possible interleaved schedules is (k₁+k₂+...+kₙ)! / (k₁! × k₂! × ... × kₙ!). Even for modest numbers, this explodes: 2 transactions with 5 operations each have 252 possible interleavings. Only a small fraction are typically 'safe' (serializable).

The Scheduler's Role

The scheduler (also called the concurrency control manager or serialization manager) is the DBMS component responsible for determining how operations are interleaved. It makes real-time decisions about when each operation can execute.

Scheduler Responsibilities:

Receive operations from transactions as they're submitted
Decide execution order based on correctness requirements and performance goals
Delay or block operations that would cause problems if executed immediately
Abort transactions when necessary to prevent or resolve deadlocks
Maintain serializability (or the configured isolation level) guarantee

Converting Mermaid diagram...

The Fundamental Challenge:

The scheduler must make decisions incrementally and without knowledge of the future. When operation O arrives:

The scheduler doesn't know what operations will come next
It doesn't know if the current transaction will commit or abort
It doesn't know how long other transactions will take

Yet it must decide: execute now, delay, or abort the transaction?

Two Approaches to Scheduling:

Pessimistic (Locking-based): Assume conflicts will occur. Prevent problems by acquiring locks before accessing data. Operations wait until locks are available.
Optimistic (Validation-based): Assume conflicts are rare. Execute operations freely, but validate before commit that no conflicts occurred. Abort if validation fails.

Most production databases use some form of pessimistic concurrency control (locking or MVCC).

Scheduler Decision Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Scheduler decision-making process (conceptual)
 
-- Incoming operation queue:
-- 1. T1: r(A)   -- T1 wants to read A
-- 2. T2: w(A)   -- T2 wants to write A  
-- 3. T1: w(A)   -- T1 wants to write A
 
-- PESSIMISTIC SCHEDULER (2PL) DECISIONS:
 
-- Operation 1: T1 wants r(A)
--   Check: Does anyone hold a conflicting lock on A? No.
--   Decision: Grant shared lock on A to T1, execute r(A)
--   Lock table: A -> S-lock by T1
 
-- Operation 2: T2 wants w(A)
--   Check: Does anyone hold a conflicting lock on A?
--   Yes! T1 holds S-lock, conflicts with X-lock needed for write.
--   Decision: BLOCK T2 until T1 releases lock
--   Queue: T2 waiting for X-lock on A
 
-- Operation 3: T1 wants w(A)
--   Check: Does T1 hold a compatible lock? T1 has S-lock.
--   Decision: Upgrade S-lock to X-lock, execute w(A)
--   Lock table: A -> X-lock by T1
 
-- Later: T1 commits, releases X-lock on A
--   Wake up T2, grant X-lock on A, execute w(A)
 
-- Result: Operations executed as r1(A) w1(A) w2(A)
-- This is equivalent to serial T1, T2 - safe!

The Scheduler's Goal

The scheduler doesn't try to enumerate all possible interleavings. Instead, it uses protocols (like 2PL) that guarantee that whatever interleaving results will be correct. The protocol constrains which orderings are possible, ensuring only safe interleavings can occur.

Safe vs. Unsafe Interleavings

Not all interleavings are created equal. Some produce results identical to running transactions serially; others produce corrupted or inconsistent data. Understanding what makes an interleaving 'safe' is central to concurrency control.

The Serializability Standard:

An interleaved schedule is safe (or correct) if it's serializable—meaning it produces the same result as some serial execution of the same transactions. The transactions might have executed with interleaved operations, but the end result is as if they ran one at a time in some order.

Formal Definition:

Schedule S is serializable if there exists a serial schedule S' with the same transactions such that:

Both S and S' produce the same final database state, AND
Both S and S' return the same values to read operations in each transaction

Safe vs Unsafe Interleaving
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
-- Example: Analyzing two different interleavings
-- Initial state: A = 100, B = 100
 
-- Transaction T1: Transfer $10 from A to B
--   r1(A): Read A
--   A = A - 10
--   w1(A): Write A
--   r1(B): Read B
--   B = B + 10
--   w1(B): Write B
 
-- Transaction T2: Calculate A + B (should always be 200)
--   r2(A): Read A
--   r2(B): Read B
--   sum = A + B
 
-- ========================================
-- SCHEDULE S1 (SAFE - Serializable)
-- ========================================
-- r1(A) w1(A) r1(B) w1(B) r2(A) r2(B)
-- 
-- Timeline:
--   r1(A): T1 reads A=100
--   w1(A): T1 writes A=90
--   r1(B): T1 reads B=100
--   w1(B): T1 writes B=110
--   r2(A): T2 reads A=90
--   r2(B): T2 reads B=110
--   T2's sum: 90+110 = 200 ✓
--
-- This is equivalent to serial schedule: T1 then T2
-- SAFE!
 
-- ========================================
-- SCHEDULE S2 (UNSAFE - Not Serializable)
-- ========================================
-- r1(A) w1(A) r2(A) r2(B) r1(B) w1(B)
--
-- Timeline:
--   r1(A): T1 reads A=100
--   w1(A): T1 writes A=90
--   r2(A): T2 reads A=90 (after T1's write)
--   r2(B): T2 reads B=100 (before T1's write!)
--   r1(B): T1 reads B=100
--   w1(B): T1 writes B=110
--   T2's sum: 90+100 = 190 ✗ (WRONG!)
--
-- Not equivalent to T1,T2 (would give 200)
-- Not equivalent to T2,T1 (would give 200)
-- UNSAFE - produces incorrect result!

Characteristics of Safe vs Unsafe Interleavings
Characteristic	Safe (Serializable)	Unsafe (Non-serializable)
Equivalent to some serial schedule	Yes	No
Final database state	Same as some serial execution	Different from all serial executions
Read values	Consistent with some serial order	May see partial/inconsistent states
Occurrence	May or may not happen naturally	Should be prevented by scheduler
Example anomalies	None	Lost updates, dirty reads, etc.

The Challenge

The scheduler cannot simply try all interleavings and pick a safe one—there are too many possibilities. Instead, it must use protocols that guarantee any resulting interleaving will be safe. This is why we need concurrency control mechanisms like locking, timestamps, and MVCC.

Conflict Operations and Ordering

The key to understanding which interleavings are safe lies in the concept of conflicting operations. Not all pairs of operations interact—only conflicting operations affect each other's results when reordered.

Definition of Conflicting Operations:

Two operations O₁ (from T₁) and O₂ (from T₂) conflict if:

They belong to different transactions (T₁ ≠ T₂)
They access the same data item
At least one is a write operation

Why Conflicts Matter:

Non-conflicting operations can be reordered without affecting the result. But conflicting operations' relative order matters—swapping them changes the outcome.

Read-Read: No conflict. Both get same value regardless of order.
Read-Write: Conflict! Order determines whether read sees old or new value.
Write-Read: Conflict! Same reason as above.
Write-Write: Conflict! Order determines which value persists.

Conflict Analysis
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
-- Identifying conflicts in a schedule
 
-- Schedule S: r1(A) w2(A) r2(B) w1(B) r1(C) w2(C)
 
-- Step 1: List all operation pairs from DIFFERENT transactions
-- (T1: r1(A), w1(B), r1(C))
-- (T2: w2(A), r2(B), w2(C))
 
-- Step 2: Check each pair for conflicts
 
-- r1(A) vs w2(A): Different transactions ✓, Same item (A) ✓, One write ✓
--   → CONFLICT!
 
-- r1(A) vs r2(B): Different transactions ✓, Different items (A≠B) ✗
--   → No conflict
 
-- r1(A) vs w2(C): Different transactions ✓, Different items (A≠C) ✗
--   → No conflict
 
-- w2(A) vs w1(B): Different transactions ✓, Different items (A≠B) ✗
--   → No conflict
 
-- w2(A) vs r1(C): Different transactions ✓, Different items (A≠C) ✗
--   → No conflict
 
-- r2(B) vs w1(B): Different transactions ✓, Same item (B) ✓, One write ✓
--   → CONFLICT!
 
-- r2(B) vs r1(C): Different transactions ✓, Different items (B≠C) ✗
--   → No conflict
 
-- w1(B) vs w2(C): Different transactions ✓, Different items (B≠C) ✗
--   → No conflict
 
-- r1(C) vs w2(C): Different transactions ✓, Same item (C) ✓, One write ✓
--   → CONFLICT!
 
-- Summary of conflicts in S:
--   1. r1(A) before w2(A) - T1 reads A before T2 writes A
--   2. w1(B) before r2(B) - Wait, check order in S...
--      In S: r2(B) comes before w1(B), so T2 reads B before T1 writes B
--   3. r1(C) before w2(C) - T1 reads C before T2 writes C

Conflict Equivalence:

Two schedules are conflict equivalent if:

They contain the same transactions and operations
All pairs of conflicting operations are ordered the same way in both schedules

If a schedule is conflict equivalent to a serial schedule, it's called conflict serializable and is guaranteed to be safe.

The Precedence Graph Method:

To test conflict serializability:

Create a node for each transaction
For each pair of conflicting operations where Oᵢ (from Tᵢ) comes before Oⱼ (from Tⱼ), draw an edge Tᵢ → Tⱼ
If the graph has no cycles, the schedule is conflict serializable
If there's a cycle, the schedule is NOT conflict serializable

The Key Insight

You can freely reorder non-conflicting operations without changing the outcome. A safe schedule is one where the conflicting operations are ordered consistently with some serial order. This is the foundation of conflict serializability testing and many concurrency control protocols.

Interleaving and ACID Properties

Interleaving directly impacts the ACID properties—particularly Isolation—and indirectly affects Consistency. Understanding this relationship clarifies why proper interleaving management is essential.

Impact on Isolation:

The Isolation property promises that concurrent transactions don't interfere with each other—each appears to execute alone. Unsafe interleaving violates this promise:

Dirty reads occur when interleaving allows reading uncommitted writes
Non-repeatable reads occur when interleaving allows committed changes between reads
Lost updates occur when interleaving allows write-write conflicts to overwrite

A serializable interleaving preserves perfect isolation. Weaker isolation levels permit specific safe violations for better performance.

ACID Property Violations from Bad Interleaving
ACID Property	How Interleaving Can Violate	Example
Atomicity	Indirect: rollback complexity increases	Cascading aborts from dirty reads
Consistency	Seeing partial updates violates invariants	Sum check sees mid-transfer state
Isolation	Transactions see each other's effects	All concurrency anomalies
Durability	Not directly affected	N/A

The Consistency Connection:

Database consistency means the database always satisfies its integrity constraints. A transaction takes the database from one consistent state to another. But during the transaction, the database may temporarily violate constraints (e.g., mid-transfer, money has left Account A but not arrived at Account B).

If another transaction sees this mid-transaction state through bad interleaving, it sees an inconsistent database—even though no individual transaction violated consistency.

Example:

Constraint: A + B + C = $1000 (total money in system)

T1 (transfer): subtract from A, add to B
Midway: A has $100 less, B hasn't been updated yet
T2 (sum): reads all accounts, computes A + B + C = $900 (WRONG!)

T2 saw a consistent-constraint violation because interleaving exposed T1's intermediate state.

ACID Violation Through Interleaving
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- Consistency violation through interleaving
 
-- Invariant: Total account balance must always equal $1000
-- Initial: A = $500, B = $300, C = $200 (sum = $1000)
 
-- T1: Transfer $100 from A to B
--   Step 1: A = A - 100  (A becomes $400)
--   Step 2: B = B + 100  (B becomes $400)
 
-- T2: Audit - verify total = $1000
--   Step 1: Read A
--   Step 2: Read B
--   Step 3: Read C
--   Check: A + B + C should equal $1000
 
-- PROBLEMATIC INTERLEAVING:
-- w1(A)    : T1 debits A → A = $400
-- r2(A)    : T2 reads A = $400
-- r2(B)    : T2 reads B = $300 (T1 hasn't credited yet!)
-- r2(C)    : T2 reads C = $200
-- T2 commits with sum = $400 + $300 + $200 = $900
-- w1(B)    : T1 credits B → B = $400
-- T1 commits
 
-- Result:
-- - T2's audit shows $900, not $1000 — CONSISTENCY VIOLATED!
-- - Final state is actually consistent (A=$400, B=$400, C=$200 = $1000)
-- - But T2 observed an inconsistent state due to interleaving
 
-- This is why we need controlled interleaving:
-- Either T2 sees all of T1's effects (T1 then T2) 
-- Or T2 sees none of T1's effects (T2 then T1)
-- But not partial effects (mid-T1 state)

Isolation Is The Key

Among ACID properties, Isolation is most directly about interleaving. Atomicity, Consistency, and Durability can be satisfied by individual transactions. Isolation is inherently about the relationships BETWEEN transactions—how they interleave without interfering.

Visualizing Interleavings

Being able to visualize and analyze interleavings is a crucial skill for database developers and administrators. Several notation and visualization methods exist.

Timeline Notation:

The most intuitive representation shows time flowing downward (or left-to-right), with each transaction in its own column:

     T1         T2
     |          |
     r1(A)      |
     |          r2(A)
     w1(A)      |
     |          r2(B)
     r1(B)      |
     w1(B)      |
     |          r2(C)
     commit     |
     |          commit

This clearly shows the temporal ordering and which operations overlap.

Operation Sequence Notation:

A compact representation lists operations in order with subscripts indicating transactions:

r₁(A) r₂(A) w₁(A) r₂(B) r₁(B) w₁(B) r₂(C)

This makes it easy to identify conflicts and check ordering.

The Precedence Graph:

A directed graph showing transaction dependencies from conflicts:

T1 ──→ T2  (means T1 must come before T2 in equivalent serial schedule)

The graph is acyclic if and only if the schedule is conflict serializable.

Converting Mermaid diagram...

Precedence Graph Construction
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Building a precedence graph for conflict serializability testing
 
-- Schedule S: r1(A) w2(A) r2(B) w1(B) r1(C) w2(C)
 
-- Step 1: Identify all conflicts (from earlier analysis)
--   Conflict 1: r1(A) before w2(A) → Edge T1 → T2
--   Conflict 2: r2(B) before w1(B) → Edge T2 → T1  
--   Conflict 3: r1(C) before w2(C) → Edge T1 → T2
 
-- Step 2: Draw the precedence graph
--   Nodes: T1, T2
--   Edges: T1 → T2 (from conflicts 1 and 3)
--          T2 → T1 (from conflict 2)
 
-- Step 3: Check for cycles
--   T1 → T2 → T1  ← CYCLE!
 
-- Conclusion: Schedule S is NOT conflict serializable
-- There is no serial schedule equivalent to S
 
-- Why the cycle means non-serializability:
--   - T1 → T2 means T1 must come before T2 in serial order
--   - T2 → T1 means T2 must come before T1 in serial order
--   - Both cannot be true simultaneously!
--   - Therefore, no serial order is equivalent
 
-- COMPARE WITH SAFE SCHEDULE:
-- Schedule S': r1(A) r1(B) w1(B) r2(A) r2(B) r2(C) w2(A) w2(C)
 
-- Conflicts in S':
--   r1(A) before w2(A) → T1 → T2
--   w1(B) before r2(B) → T1 → T2
--   (no T2 → T1 edges!)
 
-- Precedence graph: T1 → T2 (no cycle)
-- Equivalent serial order: T1, T2
-- Schedule S' IS conflict serializable!

Practical Use

While you won't manually draw precedence graphs in production, understanding them helps you reason about why certain transaction combinations cause problems and why concurrency control protocols work. The acyclicity test is the theoretical foundation for proving protocols correct.

Factors Affecting Interleaving

The specific interleaving that occurs in any execution depends on many factors, some controllable and others not. Understanding these factors helps in predicting and debugging concurrency behavior.

System-Level Factors:

CPU Scheduling: Which transaction's thread gets CPU time affects which operations proceed
I/O Completion Order: Disk operations complete in unpredictable order due to seek optimization
Network Latency: Distributed transactions experience variable communication delays
Lock Queue Order: Which waiting transaction gets a released lock

Application-Level Factors:

Transaction Arrival Time: When transactions start relative to each other
Operation Mix: Read-heavy transactions interleave more freely than write-heavy ones
Data Access Patterns: Transactions accessing the same data conflict more
Transaction Length: Longer transactions provide more interleaving opportunities

Non-Determinism in Interleaving

•Timing Sensitivity — The same transactions running milliseconds apart may interleave completely differently.
•Load Dependence — Under high load, more transactions compete, creating more complex interleavings.
•Hardware Variability — CPU speed, disk latency, and memory availability all affect timing.
•Non-reproducibility — The same program rarely produces the exact same interleaving twice.
•Debugging Difficulty — Concurrency bugs are hard to reproduce because the problematic interleaving may occur rarely.

The Heisenbug Problem:

Concurrency bugs are notoriously hard to debug because:

The bug only appears with specific interleavings that may occur rarely
Adding debugging code (print statements, breakpoints) changes timing, hiding the bug
The bug may disappear under observation (like Heisenberg's uncertainty principle)

This is why databases use proven concurrency control protocols rather than relying on testing to find all problematic interleavings. It's mathematically impossible to test all possible interleavings.

From a million operations: Even a modest workload generates astronomical numbers of possible interleavings. Testing can never cover them all. Correctness must come from protocol design, not exhaustive testing.

The Fallacy of 'Works in Testing'

A system that exhibits no concurrency bugs in testing may still have bugs. The specific interleavings tested may not include the problematic ones. Only formal correctness guarantees (like serializability through 2PL) provide assurance. This is why database engines use mathematically proven protocols.

Summary: Interleaving

We have explored interleaving in depth—the mechanism that enables concurrency benefits while potentially causing concurrency problems. Understanding interleaving is foundational for all concurrency control topics that follow.

Key Takeaways

•Interleaving is the mixing of operations from multiple transactions into a single execution sequence.
•The scheduler determines interleaving by deciding when each operation executes, based on correctness and performance goals.
•Safe interleavings are serializable—equivalent to some serial execution order.
•Conflicting operations (different transactions, same data, at least one write) are what make certain interleavings dangerous.
•The precedence graph visualizes transaction dependencies; acyclicity indicates serializability.
•ACID properties, especially Isolation, depend on proper interleaving control.
•Non-determinism makes concurrency bugs hard to find through testing; formal protocols are required.

What's next:

Having understood how interleaving works and what makes it dangerous, the final page of this module examines Why We Need Control—a comprehensive look at why databases must actively manage concurrency rather than hope for the best. We'll see how uncontrolled concurrency leads to real-world failures and why concurrency control mechanisms are non-negotiable.

Page Complete

You now understand interleaving—how operations from concurrent transactions are woven together, what makes certain interleavings safe, and why this process is both the source of concurrency benefits and the root of concurrency problems. Next, we synthesize everything into understanding why active control is essential.

Interleaving

The Dance of Operations

What You Will Learn

Understanding Interleaving

Formal Definition:

Given transactions T₁ and T₂ with operations:

T₁: O₁₁, O₁₂, O₁₃ (in this order)
T₂: O₂₁, O₂₂ (in this order)

An interleaved schedule is any sequence containing all operations that:

Includes every operation from every transaction exactly once
Preserves the internal ordering of operations within each transaction

Examples of valid interleavings:

O₁₁, O₁₂, O₁₃, O₂₁, O₂₂ (serial: T₁ then T₂)
O₂₁, O₂₂, O₁₁, O₁₂, O₁₃ (serial: T₂ then T₁)
O₁₁, O₂₁, O₁₂, O₂₂, O₁₃ (interleaved)
O₂₁, O₁₁, O₁₂, O₂₂, O₁₃ (interleaved differently)

Interleaving Visualization
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
-- Two transactions with their operations:
 
-- Transaction T1 (Bank Transfer):
--   r1(A)  : Read Account A balance
--   w1(A)  : Write new Account A balance (debit)
--   r1(B)  : Read Account B balance  
--   w1(B)  : Write new Account B balance (credit)
 
-- Transaction T2 (Balance Inquiry):
--   r2(A)  : Read Account A balance
--   r2(B)  : Read Account B balance
--   r2(C)  : Read Account C balance
 
-- SERIAL SCHEDULE S1 (T1 before T2):
-- r1(A) w1(A) r1(B) w1(B) | r2(A) r2(B) r2(C)
-- └────── T1 ──────┘       └────── T2 ───────┘
 
-- SERIAL SCHEDULE S2 (T2 before T1):
-- r2(A) r2(B) r2(C) | r1(A) w1(A) r1(B) w1(B)
-- └────── T2 ──────┘ └────── T1 ──────────────┘
 
-- INTERLEAVED SCHEDULE S3:
-- r1(A) r2(A) w1(A) r2(B) r1(B) w1(B) r2(C)
-- └T1   └T2   └T1   └T2   └T1   └T1   └T2
 
-- In S3, operations alternate between T1 and T2
-- The internal order within each transaction is preserved:
--   T1: r1(A) → w1(A) → r1(B) → w1(B) ✓
--   T2: r2(A) → r2(B) → r2(C) ✓
 
-- But wait! In S3, T2 reads A before T1 writes A,
-- but T2 reads B AFTER T1 writes B!
-- T2 might see an inconsistent state (old A, new B)
-- This interleaving could cause problems!

Why Interleaving Happens:

Interleaving is not random—it's a consequence of practical system behavior:

I/O Waits: When T₁ waits for disk I/O, the scheduler lets T₂ proceed
Lock Waits: When T₁ waits for a lock, T₂ can execute other operations
Multi-core Execution: Different CPU cores may execute different transactions' operations truly in parallel
Network/Client Delays: While T₁ waits for client communication, T₂ can proceed

The scheduler interleaves operations to maximize resource utilization while (ideally) maintaining correctness.

The Number of Possible Interleavings

The Scheduler's Role

Scheduler Responsibilities:

Receive operations from transactions as they're submitted
Decide execution order based on correctness requirements and performance goals
Delay or block operations that would cause problems if executed immediately
Abort transactions when necessary to prevent or resolve deadlocks
Maintain serializability (or the configured isolation level) guarantee

Converting Mermaid diagram...

The Fundamental Challenge:

The scheduler must make decisions incrementally and without knowledge of the future. When operation O arrives:

The scheduler doesn't know what operations will come next
It doesn't know if the current transaction will commit or abort
It doesn't know how long other transactions will take

Yet it must decide: execute now, delay, or abort the transaction?

Two Approaches to Scheduling:

Pessimistic (Locking-based): Assume conflicts will occur. Prevent problems by acquiring locks before accessing data. Operations wait until locks are available.
Optimistic (Validation-based): Assume conflicts are rare. Execute operations freely, but validate before commit that no conflicts occurred. Abort if validation fails.

Most production databases use some form of pessimistic concurrency control (locking or MVCC).

Scheduler Decision Example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
-- Scheduler decision-making process (conceptual)
 
-- Incoming operation queue:
-- 1. T1: r(A)   -- T1 wants to read A
-- 2. T2: w(A)   -- T2 wants to write A  
-- 3. T1: w(A)   -- T1 wants to write A
 
-- PESSIMISTIC SCHEDULER (2PL) DECISIONS:
 
-- Operation 1: T1 wants r(A)
--   Check: Does anyone hold a conflicting lock on A? No.
--   Decision: Grant shared lock on A to T1, execute r(A)
--   Lock table: A -> S-lock by T1
 
-- Operation 2: T2 wants w(A)
--   Check: Does anyone hold a conflicting lock on A?
--   Yes! T1 holds S-lock, conflicts with X-lock needed for write.
--   Decision: BLOCK T2 until T1 releases lock
--   Queue: T2 waiting for X-lock on A
 
-- Operation 3: T1 wants w(A)
--   Check: Does T1 hold a compatible lock? T1 has S-lock.
--   Decision: Upgrade S-lock to X-lock, execute w(A)
--   Lock table: A -> X-lock by T1
 
-- Later: T1 commits, releases X-lock on A
--   Wake up T2, grant X-lock on A, execute w(A)
 
-- Result: Operations executed as r1(A) w1(A) w2(A)
-- This is equivalent to serial T1, T2 - safe!

The Scheduler's Goal

Safe vs. Unsafe Interleavings

The Serializability Standard:

Formal Definition:

Schedule S is serializable if there exists a serial schedule S' with the same transactions such that:

Both S and S' produce the same final database state, AND
Both S and S' return the same values to read operations in each transaction

Safe vs Unsafe Interleaving
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
-- Example: Analyzing two different interleavings
-- Initial state: A = 100, B = 100
 
-- Transaction T1: Transfer $10 from A to B
--   r1(A): Read A
--   A = A - 10
--   w1(A): Write A
--   r1(B): Read B
--   B = B + 10
--   w1(B): Write B
 
-- Transaction T2: Calculate A + B (should always be 200)
--   r2(A): Read A
--   r2(B): Read B
--   sum = A + B
 
-- ========================================
-- SCHEDULE S1 (SAFE - Serializable)
-- ========================================
-- r1(A) w1(A) r1(B) w1(B) r2(A) r2(B)
-- 
-- Timeline:
--   r1(A): T1 reads A=100
--   w1(A): T1 writes A=90
--   r1(B): T1 reads B=100
--   w1(B): T1 writes B=110
--   r2(A): T2 reads A=90
--   r2(B): T2 reads B=110
--   T2's sum: 90+110 = 200 ✓
--
-- This is equivalent to serial schedule: T1 then T2
-- SAFE!
 
-- ========================================
-- SCHEDULE S2 (UNSAFE - Not Serializable)
-- ========================================
-- r1(A) w1(A) r2(A) r2(B) r1(B) w1(B)
--
-- Timeline:
--   r1(A): T1 reads A=100
--   w1(A): T1 writes A=90
--   r2(A): T2 reads A=90 (after T1's write)
--   r2(B): T2 reads B=100 (before T1's write!)
--   r1(B): T1 reads B=100
--   w1(B): T1 writes B=110
--   T2's sum: 90+100 = 190 ✗ (WRONG!)
--
-- Not equivalent to T1,T2 (would give 200)
-- Not equivalent to T2,T1 (would give 200)
-- UNSAFE - produces incorrect result!

Characteristics of Safe vs Unsafe Interleavings
Characteristic	Safe (Serializable)	Unsafe (Non-serializable)
Equivalent to some serial schedule	Yes	No
Final database state	Same as some serial execution	Different from all serial executions
Read values	Consistent with some serial order	May see partial/inconsistent states
Occurrence	May or may not happen naturally	Should be prevented by scheduler
Example anomalies	None	Lost updates, dirty reads, etc.

The Challenge

Conflict Operations and Ordering

Definition of Conflicting Operations:

Two operations O₁ (from T₁) and O₂ (from T₂) conflict if:

They belong to different transactions (T₁ ≠ T₂)
They access the same data item
At least one is a write operation

Why Conflicts Matter:

Non-conflicting operations can be reordered without affecting the result. But conflicting operations' relative order matters—swapping them changes the outcome.

Read-Read: No conflict. Both get same value regardless of order.
Read-Write: Conflict! Order determines whether read sees old or new value.
Write-Read: Conflict! Same reason as above.
Write-Write: Conflict! Order determines which value persists.

Conflict Analysis
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
-- Identifying conflicts in a schedule
 
-- Schedule S: r1(A) w2(A) r2(B) w1(B) r1(C) w2(C)
 
-- Step 1: List all operation pairs from DIFFERENT transactions
-- (T1: r1(A), w1(B), r1(C))
-- (T2: w2(A), r2(B), w2(C))
 
-- Step 2: Check each pair for conflicts
 
-- r1(A) vs w2(A): Different transactions ✓, Same item (A) ✓, One write ✓
--   → CONFLICT!
 
-- r1(A) vs r2(B): Different transactions ✓, Different items (A≠B) ✗
--   → No conflict
 
-- r1(A) vs w2(C): Different transactions ✓, Different items (A≠C) ✗
--   → No conflict
 
-- w2(A) vs w1(B): Different transactions ✓, Different items (A≠B) ✗
--   → No conflict
 
-- w2(A) vs r1(C): Different transactions ✓, Different items (A≠C) ✗
--   → No conflict
 
-- r2(B) vs w1(B): Different transactions ✓, Same item (B) ✓, One write ✓
--   → CONFLICT!
 
-- r2(B) vs r1(C): Different transactions ✓, Different items (B≠C) ✗
--   → No conflict
 
-- w1(B) vs w2(C): Different transactions ✓, Different items (B≠C) ✗
--   → No conflict
 
-- r1(C) vs w2(C): Different transactions ✓, Same item (C) ✓, One write ✓
--   → CONFLICT!
 
-- Summary of conflicts in S:
--   1. r1(A) before w2(A) - T1 reads A before T2 writes A
--   2. w1(B) before r2(B) - Wait, check order in S...
--      In S: r2(B) comes before w1(B), so T2 reads B before T1 writes B
--   3. r1(C) before w2(C) - T1 reads C before T2 writes C

Conflict Equivalence:

Two schedules are conflict equivalent if:

They contain the same transactions and operations
All pairs of conflicting operations are ordered the same way in both schedules

If a schedule is conflict equivalent to a serial schedule, it's called conflict serializable and is guaranteed to be safe.

The Precedence Graph Method:

To test conflict serializability:

Create a node for each transaction
For each pair of conflicting operations where Oᵢ (from Tᵢ) comes before Oⱼ (from Tⱼ), draw an edge Tᵢ → Tⱼ
If the graph has no cycles, the schedule is conflict serializable
If there's a cycle, the schedule is NOT conflict serializable

The Key Insight

Interleaving and ACID Properties

Impact on Isolation:

The Isolation property promises that concurrent transactions don't interfere with each other—each appears to execute alone. Unsafe interleaving violates this promise:

Dirty reads occur when interleaving allows reading uncommitted writes
Non-repeatable reads occur when interleaving allows committed changes between reads
Lost updates occur when interleaving allows write-write conflicts to overwrite

A serializable interleaving preserves perfect isolation. Weaker isolation levels permit specific safe violations for better performance.

ACID Property Violations from Bad Interleaving
ACID Property	How Interleaving Can Violate	Example
Atomicity	Indirect: rollback complexity increases	Cascading aborts from dirty reads
Consistency	Seeing partial updates violates invariants	Sum check sees mid-transfer state
Isolation	Transactions see each other's effects	All concurrency anomalies
Durability	Not directly affected	N/A

The Consistency Connection:

If another transaction sees this mid-transaction state through bad interleaving, it sees an inconsistent database—even though no individual transaction violated consistency.

Example:

Constraint: A + B + C = $1000 (total money in system)

T1 (transfer): subtract from A, add to B
Midway: A has $100 less, B hasn't been updated yet
T2 (sum): reads all accounts, computes A + B + C = $900 (WRONG!)

T2 saw a consistent-constraint violation because interleaving exposed T1's intermediate state.

ACID Violation Through Interleaving
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
-- Consistency violation through interleaving
 
-- Invariant: Total account balance must always equal $1000
-- Initial: A = $500, B = $300, C = $200 (sum = $1000)
 
-- T1: Transfer $100 from A to B
--   Step 1: A = A - 100  (A becomes $400)
--   Step 2: B = B + 100  (B becomes $400)
 
-- T2: Audit - verify total = $1000
--   Step 1: Read A
--   Step 2: Read B
--   Step 3: Read C
--   Check: A + B + C should equal $1000
 
-- PROBLEMATIC INTERLEAVING:
-- w1(A)    : T1 debits A → A = $400
-- r2(A)    : T2 reads A = $400
-- r2(B)    : T2 reads B = $300 (T1 hasn't credited yet!)
-- r2(C)    : T2 reads C = $200
-- T2 commits with sum = $400 + $300 + $200 = $900
-- w1(B)    : T1 credits B → B = $400
-- T1 commits
 
-- Result:
-- - T2's audit shows $900, not $1000 — CONSISTENCY VIOLATED!
-- - Final state is actually consistent (A=$400, B=$400, C=$200 = $1000)
-- - But T2 observed an inconsistent state due to interleaving
 
-- This is why we need controlled interleaving:
-- Either T2 sees all of T1's effects (T1 then T2) 
-- Or T2 sees none of T1's effects (T2 then T1)
-- But not partial effects (mid-T1 state)

Isolation Is The Key

Visualizing Interleavings

Being able to visualize and analyze interleavings is a crucial skill for database developers and administrators. Several notation and visualization methods exist.

Timeline Notation:

The most intuitive representation shows time flowing downward (or left-to-right), with each transaction in its own column:

     T1         T2
     |          |
     r1(A)      |
     |          r2(A)
     w1(A)      |
     |          r2(B)
     r1(B)      |
     w1(B)      |
     |          r2(C)
     commit     |
     |          commit

This clearly shows the temporal ordering and which operations overlap.

Operation Sequence Notation:

A compact representation lists operations in order with subscripts indicating transactions:

r₁(A) r₂(A) w₁(A) r₂(B) r₁(B) w₁(B) r₂(C)

This makes it easy to identify conflicts and check ordering.

The Precedence Graph:

A directed graph showing transaction dependencies from conflicts:

T1 ──→ T2  (means T1 must come before T2 in equivalent serial schedule)

The graph is acyclic if and only if the schedule is conflict serializable.

Converting Mermaid diagram...

Precedence Graph Construction
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
-- Building a precedence graph for conflict serializability testing
 
-- Schedule S: r1(A) w2(A) r2(B) w1(B) r1(C) w2(C)
 
-- Step 1: Identify all conflicts (from earlier analysis)
--   Conflict 1: r1(A) before w2(A) → Edge T1 → T2
--   Conflict 2: r2(B) before w1(B) → Edge T2 → T1  
--   Conflict 3: r1(C) before w2(C) → Edge T1 → T2
 
-- Step 2: Draw the precedence graph
--   Nodes: T1, T2
--   Edges: T1 → T2 (from conflicts 1 and 3)
--          T2 → T1 (from conflict 2)
 
-- Step 3: Check for cycles
--   T1 → T2 → T1  ← CYCLE!
 
-- Conclusion: Schedule S is NOT conflict serializable
-- There is no serial schedule equivalent to S
 
-- Why the cycle means non-serializability:
--   - T1 → T2 means T1 must come before T2 in serial order
--   - T2 → T1 means T2 must come before T1 in serial order
--   - Both cannot be true simultaneously!
--   - Therefore, no serial order is equivalent
 
-- COMPARE WITH SAFE SCHEDULE:
-- Schedule S': r1(A) r1(B) w1(B) r2(A) r2(B) r2(C) w2(A) w2(C)
 
-- Conflicts in S':
--   r1(A) before w2(A) → T1 → T2
--   w1(B) before r2(B) → T1 → T2
--   (no T2 → T1 edges!)
 
-- Precedence graph: T1 → T2 (no cycle)
-- Equivalent serial order: T1, T2
-- Schedule S' IS conflict serializable!

Practical Use

Factors Affecting Interleaving

The specific interleaving that occurs in any execution depends on many factors, some controllable and others not. Understanding these factors helps in predicting and debugging concurrency behavior.

System-Level Factors:

CPU Scheduling: Which transaction's thread gets CPU time affects which operations proceed
I/O Completion Order: Disk operations complete in unpredictable order due to seek optimization
Network Latency: Distributed transactions experience variable communication delays
Lock Queue Order: Which waiting transaction gets a released lock

Application-Level Factors:

Transaction Arrival Time: When transactions start relative to each other
Operation Mix: Read-heavy transactions interleave more freely than write-heavy ones
Data Access Patterns: Transactions accessing the same data conflict more
Transaction Length: Longer transactions provide more interleaving opportunities

Non-Determinism in Interleaving

•Timing Sensitivity — The same transactions running milliseconds apart may interleave completely differently.
•Load Dependence — Under high load, more transactions compete, creating more complex interleavings.
•Hardware Variability — CPU speed, disk latency, and memory availability all affect timing.
•Non-reproducibility — The same program rarely produces the exact same interleaving twice.
•Debugging Difficulty — Concurrency bugs are hard to reproduce because the problematic interleaving may occur rarely.

The Heisenbug Problem:

Concurrency bugs are notoriously hard to debug because:

The bug only appears with specific interleavings that may occur rarely
Adding debugging code (print statements, breakpoints) changes timing, hiding the bug
The bug may disappear under observation (like Heisenberg's uncertainty principle)

This is why databases use proven concurrency control protocols rather than relying on testing to find all problematic interleavings. It's mathematically impossible to test all possible interleavings.

The Fallacy of 'Works in Testing'

Summary: Interleaving

Key Takeaways

•Interleaving is the mixing of operations from multiple transactions into a single execution sequence.
•The scheduler determines interleaving by deciding when each operation executes, based on correctness and performance goals.
•Safe interleavings are serializable—equivalent to some serial execution order.
•Conflicting operations (different transactions, same data, at least one write) are what make certain interleavings dangerous.
•The precedence graph visualizes transaction dependencies; acyclicity indicates serializability.
•ACID properties, especially Isolation, depend on proper interleaving control.
•Non-determinism makes concurrency bugs hard to find through testing; formal protocols are required.

What's next:

Page Complete