Imagine a busy banking system processing thousands of transactions simultaneously. At any given moment, multiple customers are transferring funds, checking balances, and making payments—all operating on shared account data. The database management system must orchestrate these concurrent operations in a way that preserves data integrity while maximizing throughput.
But how do we reason about whether a particular interleaving of operations is correct? How can we determine if concurrent execution produces the same results as running transactions one after another? The answer lies in the formal concept of a schedule—the fundamental abstraction for analyzing concurrent transaction execution.
A schedule is essentially a transcript of what happened: a precise, ordered record of every database operation executed by every transaction. Understanding schedules is the first step toward understanding serializability, the gold standard for concurrent transaction correctness.
By the end of this page, you will understand the formal definition of a schedule, the notation used to represent schedules, the distinction between serial and concurrent schedules, and why schedules serve as the mathematical foundation for reasoning about transaction correctness in database systems.
A schedule (also called a history) is a sequence of operations from one or more transactions that represents a particular execution order. Formally, a schedule S over a set of transactions T = {T₁, T₂, ..., Tₙ} is an ordered list of operations where:
- S contains all the operations of every transaction in T (and no others)
- operations belonging to the same transaction appear in S in the same order as in that transaction
The schedule captures not just what operations occurred, but the precise order in which they executed across all participating transactions.
We use the term 'schedule' rather than 'execution' because a schedule is an abstraction. Real execution involves timing, CPU scheduling, disk I/O, and many other details. A schedule abstracts away these implementation details, focusing purely on the logical order of operations relevant to correctness analysis.
Schedules are composed of operations—the atomic units of database access. For serializability analysis, we focus on the following fundamental operations:
| Operation | Notation | Description |
|---|---|---|
| Read | R(X) or rᵢ(X) | Transaction reads data item X |
| Write | W(X) or wᵢ(X) | Transaction writes data item X |
| Commit | C or cᵢ | Transaction commits successfully |
| Abort | A or aᵢ | Transaction aborts/rolls back |
The subscript i identifies which transaction performs the operation. For example, r₁(A) means "Transaction T₁ reads data item A," while w₂(B) means "Transaction T₂ writes data item B."
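This notation maps naturally onto a simple data representation. The sketch below is an illustrative convention (not any particular library's API): each operation becomes a (kind, transaction, item) tuple, with item set to None for commits and aborts, and a schedule is simply an ordered list of such tuples.

```python
def op(kind, txn, item=None):
    """Build one operation tuple, e.g. op('r', 1, 'A') for r1(A).
    kind is 'r', 'w', 'c', or 'a'; item is None for commit/abort."""
    return (kind, txn, item)

# T1's operations r1(A) w1(A) r1(B) w1(B) c1 expressed as data:
t1 = [op('r', 1, 'A'), op('w', 1, 'A'),
      op('r', 1, 'B'), op('w', 1, 'B'), op('c', 1)]
print(t1)
```

Any analysis routine can then treat a schedule as an ordered list of these tuples.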
Important: For conflict serializability analysis, we typically focus on read and write operations since conflicts arise from how transactions access shared data items. Commit and abort operations become critical when analyzing recoverability.
T₁: Transfer $100 from A to B
1. Read A
2. A = A - 100
3. Write A
4. Read B
5. B = B + 100
6. Write B
7. Commit
T₂: Apply 10% interest to A
1. Read A
2. A = A * 1.10
3. Write A
4. Commit

Compact notation for T₁: r₁(A) w₁(A) r₁(B) w₁(B) c₁
Compact notation for T₂: r₂(A) w₂(A) c₂
Possible Schedule S₁:
r₁(A) w₁(A) r₁(B) w₁(B) c₁ r₂(A) w₂(A) c₂
Possible Schedule S₂:
r₁(A) r₂(A) w₁(A) w₂(A) r₁(B) w₁(B) c₁ c₂

S₁ is a serial schedule (T₁ runs completely before T₂). S₂ is a concurrent schedule with interleaved operations. Both are valid schedules, but they may produce different final results!
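Whether a schedule is serial can be checked mechanically: each transaction's operations must form one contiguous block. A minimal sketch, assuming operations are encoded as (kind, txn, item) tuples:

```python
def is_serial(schedule):
    """True if each transaction's operations form one contiguous block,
    i.e. no transaction resumes after another one has started."""
    finished = set()   # transactions whose block has already ended
    current = None
    for kind, txn, item in schedule:
        if txn != current:
            if txn in finished:        # txn reappears later: interleaved
                return False
            if current is not None:
                finished.add(current)
            current = txn
    return True

s1 = [('r',1,'A'), ('w',1,'A'), ('r',1,'B'), ('w',1,'B'), ('c',1,None),
      ('r',2,'A'), ('w',2,'A'), ('c',2,None)]
s2 = [('r',1,'A'), ('r',2,'A'), ('w',1,'A'), ('w',2,'A'),
      ('r',1,'B'), ('w',1,'B'), ('c',1,None), ('c',2,None)]
print(is_serial(s1), is_serial(s2))  # True False
```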
A data item is the unit of data that transactions read and write. The granularity of data items varies depending on the system and analysis context:
Data items can range from a single attribute (e.g., Account.balance) to a row, a page, a table, or the entire database.

For serializability theory, we abstract data items as simple named entities (A, B, X, Y, etc.) without specifying their physical representation. The key insight is that conflicts arise when two operations access the same data item.
| Granularity | Concurrency | Overhead | Use Case |
|---|---|---|---|
| Attribute | Highest | Very High | Column-specific updates |
| Tuple/Row | High | Moderate | OLTP workloads (default) |
| Page | Medium | Low | Bulk operations, range scans |
| Table | Low | Very Low | Schema changes, full scans |
| Database | Minimal | Minimal | Backup, global operations |
The behavior of operations determines whether conflicts can occur:
Read Operations (R(X)):
- Return the current value of X without changing it
- Any number of reads of the same item can run in any order with the same result

Write Operations (W(X)):
- Replace the value of X, changing the database state
- The final value of X depends on which write ran last, so their order matters
The Critical Insight: Reads are inherently compatible with each other, but writes introduce ordering dependencies. This asymmetry is the foundation of conflict analysis.
Think of reads as 'observing' and writes as 'modifying.' Multiple observers can look at the same object simultaneously without interfering. But if someone modifies the object, the order matters: what observers see depends on whether they look before or after the modification.
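This asymmetry can be stated as a predicate: two operations conflict exactly when they belong to different transactions, access the same data item, and at least one of them is a write. A small sketch, again using the assumed (kind, txn, item) tuple encoding:

```python
def conflicts(op1, op2):
    """True if op1 and op2 conflict: different transactions, same data
    item, and at least one write. Commits/aborts (item=None) never
    conflict under this definition."""
    k1, t1, x1 = op1
    k2, t2, x2 = op2
    return (t1 != t2
            and x1 is not None and x1 == x2
            and 'w' in (k1, k2))

print(conflicts(('r',1,'A'), ('r',2,'A')))  # False: two reads never conflict
print(conflicts(('r',1,'A'), ('w',2,'A')))  # True: read-write on the same item
print(conflicts(('w',1,'A'), ('w',2,'B')))  # False: different data items
```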
Schedules can be represented in several ways, each useful for different purposes. Mastering these representations is essential for analyzing serializability.
The simplest representation lists operations in their execution order as a linear sequence:
S = r₁(A) r₂(A) w₁(A) w₂(A) r₁(B) w₁(B) c₁ c₂
This notation is compact and easy to write, making it ideal for exam questions and quick analysis. However, it can become unwieldy for schedules with many transactions.
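For experimentation, the linear notation is also easy to parse. A minimal sketch, assuming plain-ASCII subscripts (r1(A) rather than r₁(A)) and the (kind, txn, item) tuple convention:

```python
import re

def parse_schedule(text):
    """Parse compact notation like 'r1(A) w1(A) c1' into a list of
    (kind, txn, item) tuples; item is None for commit/abort."""
    ops = []
    for m in re.finditer(r'([rw])(\d+)\((\w+)\)|([ca])(\d+)', text):
        if m.group(1):                                   # read or write
            ops.append((m.group(1), int(m.group(2)), m.group(3)))
        else:                                            # commit or abort
            ops.append((m.group(4), int(m.group(5)), None))
    return ops

print(parse_schedule('r1(A) r2(A) w1(A) w2(A) r1(B) w1(B) c1 c2'))
```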
A table format shows the temporal progression with each transaction in its own column:
| Time | T₁ | T₂ |
|---|---|---|
| t₁ | r(A) | |
| t₂ | | r(A) |
| t₃ | w(A) | |
| t₄ | | w(A) |
| t₅ | r(B) | |
| t₆ | w(B) | |
| t₇ | c | |
| t₈ | | c |
This format makes it easy to visualize the interleaving and identify which transaction performs each operation.
A visual timeline shows transactions as horizontal bars with operations marked along them:
T₁: ──r(A)────w(A)────r(B)────w(B)────c──────────────
│ │
▼ ▼
T₂: ─────────r(A)────────w(A)────────────────────c───
Timeline diagrams excel at showing the temporal relationship between operations across transactions.
For rigorous mathematical analysis, schedules can be defined as ordered sets:
S = (O, <S)
where:
- O is the set of all operations from all transactions
- <S is a partial order on O defining execution order
This formal notation is used in academic papers and advanced theoretical discussions.
Transaction operations:
T₁: r(X) w(X)
T₂: r(X) r(Y) w(Y)
T₃: r(Y) w(Y)

Linear: r₁(X) r₂(X) r₃(Y) w₁(X) r₂(Y) w₂(Y) w₃(Y) c₁ c₂ c₃
Tabular:
Time | T₁ | T₂ | T₃
-----|------|------|-----
1 | r(X) | |
2 | | r(X) |
3 | | | r(Y)
4 | w(X) | |
5 | | r(Y) |
6 | | w(Y) |
7 | | | w(Y)
8 | c | |
9 | | c |
10   |      |      | c

All representations describe the exact same execution order. Choose the format that best suits your analysis needs.
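Converting between representations is mechanical. The sketch below renders a schedule, given as (kind, txn, item) tuples (an assumed encoding), into the tabular format:

```python
def to_table(schedule):
    """Render a schedule as a time/transaction table, one row per
    operation, with each operation in its transaction's column."""
    txns = sorted({t for _, t, _ in schedule})
    rows = ['Time | ' + ' | '.join(f'T{t}  ' for t in txns)]
    for time, (kind, txn, item) in enumerate(schedule, start=1):
        label = f'{kind}({item})' if item else kind   # c/a have no item
        cells = [label if t == txn else '' for t in txns]
        rows.append(f'{time:4} | ' + ' | '.join(f'{c:4}' for c in cells))
    return '\n'.join(rows)

print(to_table([('r',1,'X'), ('r',2,'X'), ('w',1,'X'),
                ('c',1,None), ('c',2,None)]))
```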
A complete schedule satisfies two important properties:
- It contains every operation of every participating transaction
- Every transaction's last operation is either a commit or an abort
Complete schedules represent finished executions where all transactions have reached a final state. When analyzing serializability, we typically work with complete schedules because they represent the full picture of what happened.
Complete Schedule Example:
S = r₁(A) w₁(A) r₂(A) w₂(A) c₁ c₂
Both T₁ and T₂ have all their operations included and both have committed.
An incomplete schedule (also called a partial schedule or prefix) represents an execution in progress:
Incomplete Schedule Example:
S' = r₁(A) r₂(A) w₁(A)
This is incomplete because:
- T₁ hasn't committed yet
- T₂ still has operations to perform
- Neither transaction has terminated
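The completeness check itself is simple: every participating transaction's last operation must be a commit or an abort. A sketch, using the assumed (kind, txn, item) tuple encoding:

```python
def is_complete(schedule, txns):
    """True if every transaction in txns terminates within the schedule,
    i.e. its last operation is a commit ('c') or abort ('a')."""
    last = {}
    for kind, txn, item in schedule:
        last[txn] = kind                       # remember each txn's last op
    return all(last.get(t) in ('c', 'a') for t in txns)

complete   = [('r',1,'A'), ('w',1,'A'), ('r',2,'A'), ('w',2,'A'),
              ('c',1,None), ('c',2,None)]
incomplete = [('r',1,'A'), ('r',2,'A'), ('w',1,'A')]
print(is_complete(complete, {1, 2}), is_complete(incomplete, {1, 2}))  # True False
```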
Why do incomplete schedules matter?
In real systems, the scheduler must make decisions before transactions complete. Concurrency control protocols must ensure that any incomplete schedule can be extended to a serializable complete schedule—or detect when this becomes impossible and abort transactions.
A schedule that looks fine while incomplete may become non-serializable once completed. Good concurrency control protocols prevent this by never allowing schedules to reach states where no serializable completion exists. This is why most protocols are conservative—they prevent potential problems rather than fixing them later.
Interleaving is the mixing of operations from multiple concurrent transactions during execution. The DBMS interleaves transaction operations to maximize resource utilization and system throughput.
Consider a transaction that reads data, performs computation, and writes results:
T₁: READ (disk I/O) → COMPUTE (CPU) → WRITE (disk I/O)
During disk I/O, the CPU sits idle. During computation, the disk sits idle. If we run transactions one at a time (serially), resources are underutilized:
Serial execution (wasted resources):
CPU: [idle][T₁ compute][idle][idle][T₂ compute][idle]
Disk: [T₁ read][idle][T₁ write][T₂ read][idle][T₂ write]
With interleaving, we can overlap I/O from one transaction with computation from another:
Interleaved execution (better utilization):
CPU: [idle][T₁ compute][T₂ compute][idle]
Disk: [T₁ read][T₂ read][T₁ write][T₂ write]
Interleaving improves throughput and resource utilization, but introduces complexity. The database must ensure that interleaved execution produces correct results—results equivalent to some non-interleaved (serial) execution. This is precisely what serializability guarantees.
For n transactions, each with m operations, the number of possible interleavings is astronomically large. This is calculated as:
Number of interleavings = (n × m)! / (m!)ⁿ
Example: Just 3 transactions with 3 operations each yields 9! / (3!)³ = 1,680 possible interleavings.
Example: 5 transactions with 5 operations each yields 25! / (5!)⁵ ≈ 6.2 × 10¹⁴ possible interleavings.
This exponential growth makes exhaustive checking impractical. We need efficient algorithms—like precedence graphs—to determine serializability without examining every possible interleaving.
The Challenge: Among all possible interleavings, some produce correct results (equivalent to some serial execution), while others produce anomalies. The scheduler must either:
- prevent bad interleavings from arising in the first place (pessimistic protocols such as two-phase locking), or
- allow interleavings optimistically, then detect problems and abort transactions when necessary (optimistic protocols).
| Transactions | Ops/Transaction | Total Ops | Possible Interleavings |
|---|---|---|---|
| 2 | 2 | 4 | 6 |
| 2 | 3 | 6 | 20 |
| 3 | 3 | 9 | 1,680 |
| 4 | 4 | 16 | ~63 million |
| 5 | 5 | 25 | ~6.2 × 10¹⁴ |
| 10 | 10 | 100 | ~2.4 × 10⁹² |
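The formula can be evaluated directly; a short sketch:

```python
from math import factorial

def interleavings(n, m):
    """Ways to interleave n transactions of m operations each while
    preserving each transaction's internal order: (n*m)! / (m!)^n."""
    return factorial(n * m) // factorial(m) ** n

for n, m in [(2, 2), (2, 3), (3, 3), (4, 4), (5, 5)]:
    print(f'{n} txns x {m} ops: {interleavings(n, m):,} interleavings')
```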
Schedules are classified into two fundamental categories based on the degree of interleaving:
A serial schedule executes transactions one at a time, with no interleaving. Each transaction runs to completion before the next one begins:
Serial Schedule (T₁ before T₂):
S₁ = r₁(A) w₁(A) r₁(B) w₁(B) c₁ r₂(A) w₂(A) c₂
|____________T₁______________| |____T₂____|
Properties of Serial Schedules:
- Always correct: with no interleaving, transactions cannot interfere with one another
- Simple to reason about: the result is just each transaction applied in turn
- Poor performance: no concurrency, so CPU and I/O resources sit idle
For n transactions, there are exactly n! possible serial schedules (one for each ordering).
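Since a serial schedule is just an ordering of whole transactions, all n! of them can be enumerated directly. A sketch, with transactions given as lists of (kind, txn, item) tuples:

```python
from itertools import permutations

def serial_schedules(transactions):
    """Yield every serial schedule: one per ordering of the transactions,
    each formed by concatenating whole transactions."""
    for order in permutations(transactions):
        yield [op for txn in order for op in txn]

t1 = [('r',1,'A'), ('w',1,'A'), ('c',1,None)]
t2 = [('r',2,'A'), ('w',2,'A'), ('c',2,None)]
print(len(list(serial_schedules([t1, t2]))))  # 2 (= 2! serial schedules)
```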
A concurrent schedule (also called an interleaved schedule) has operations from different transactions mixed together:
Concurrent Schedule:
S₂ = r₁(A) r₂(A) w₁(A) r₁(B) w₂(A) w₁(B) c₁ c₂
Here, T₂'s operations are interspersed with T₁'s operations.
Properties of Concurrent Schedules:
- Better resource utilization and higher throughput than serial execution
- Far more numerous than serial schedules, since any interleaving is a candidate
- Not automatically correct: some interleavings produce anomalies, so they must be controlled
The goal of serializability is to identify which concurrent schedules are 'safe'—meaning they produce results identical to some serial schedule. A concurrent schedule that is equivalent to a serial schedule gives us the best of both worlds: the correctness of serial execution with the performance benefits of concurrency.
Initial: A = 1000, B = 500
T₁: Transfer $100 from A to B
r₁(A), A=A-100, w₁(A), r₁(B), B=B+100, w₁(B)
T₂: Transfer $50 from A to B
r₂(A), A=A-50, w₂(A), r₂(B), B=B+50, w₂(B)

Serial S1 (T₁→T₂): Final A=850, B=650 ✓
Serial S2 (T₂→T₁): Final A=850, B=650 ✓
Concurrent S3:
r₁(A) A=1000
r₂(A) A=1000 // T₂ reads before T₁ writes
w₁(A) A=900 // T₁ writes A-100
w₂(A) A=950 // T₂ overwrites with its calculation!
...
Final A=950, B=650 ✗ (T₁'s update lost!)

The serial schedules both preserve the total ($1,500). The bad concurrent schedule S3 is incorrect because T₂ read A before T₁ wrote it, so w₂(A) overwrote w₁(A) and T₁'s $100 debit vanished: the classic 'lost update' problem. The accounts now total $1,600, and this schedule is NOT equivalent to any serial schedule.
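The lost update can be reproduced in a few lines. The sketch below replays schedule S3 as straight-line code, holding each transaction's reads in local variables; the elided B operations are assumed to run in the order r₁(B) w₁(B) r₂(B) w₂(B), which matches the stated final balances.

```python
db = {'A': 1000, 'B': 500}

a1 = db['A']          # r1(A): T1 reads A = 1000
a2 = db['A']          # r2(A): T2 reads A = 1000, before T1 writes!
db['A'] = a1 - 100    # w1(A): T1 writes 900
db['A'] = a2 - 50     # w2(A): T2 overwrites with 950; T1's debit is lost
b1 = db['B']          # r1(B): T1 reads B = 500
db['B'] = b1 + 100    # w1(B): T1 writes 600
b2 = db['B']          # r2(B): T2 reads B = 600
db['B'] = b2 + 50     # w2(B): T2 writes 650

print(db)             # {'A': 950, 'B': 650}: the lost-update anomaly
```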
Before we move to analyzing schedules for serializability, let's consolidate the essential properties that characterize schedules:
| Property | Description | Importance |
|---|---|---|
| Completeness | All operations from all transactions are present | Required for final correctness analysis |
| Order Preservation | Operations within each transaction maintain their original order | Ensures transaction semantics |
| Termination | Every transaction ends with commit or abort | Defines final database state |
| Property | Description | Importance |
|---|---|---|
| Serializability | Equivalent to some serial schedule | Correctness criterion |
| Recoverability | Safe to commit without risking cascading aborts | Durability requirement |
| Strictness | Even stronger than recoverability | Simplifies recovery |
| Property | Description | Importance |
|---|---|---|
| Concurrency Level | Degree of interleaving | Throughput optimization |
| Response Time | Time from start to commit | User experience |
| Abort Rate | Frequency of rollbacks | System efficiency |
Concurrency control involves balancing three competing goals: (1) High concurrency for performance, (2) Guaranteed correctness (serializability), and (3) Simple recovery. No protocol optimizes all three simultaneously. Two-phase locking emphasizes correctness but may reduce concurrency. Optimistic protocols increase concurrency but may require more aborts.
Understanding schedules is not merely academic—it's the foundation for all database concurrency control. Here's why this matters:
Schedules give us a formal language to ask and answer: "Is this execution correct?" Without the schedule abstraction, we would have no systematic way to compare different executions or prove correctness properties.
Every concurrency control protocol—two-phase locking, timestamp ordering, optimistic concurrency control, multi-version approaches—is designed to produce schedules with specific properties. Understanding schedules is essential to understanding why these protocols work.
When concurrent transactions produce unexpected results, analyzing the execution as a schedule helps pinpoint exactly which interleaving of operations caused the anomaly.
Schedule analysis also informs transaction design: shorter transactions that touch fewer shared data items leave less room for bad interleavings.
Architectural choices, such as which concurrency control protocol to use, likewise depend on which schedule properties the system must guarantee.
What's Next:
Now that we understand what schedules are and how to represent them, we need to explore serial schedules in depth. Serial schedules serve as our reference point—the "gold standard" of correctness that concurrent schedules must match. The next page examines serial schedules, their properties, and why they define what we mean by a "correct" execution.
You now understand the foundational concept of schedules in database transaction processing. You can formally define a schedule, represent it in multiple formats, distinguish between serial and concurrent schedules, and appreciate why this abstraction is essential for reasoning about concurrent transaction correctness. Next, we'll dive deeper into serial schedules and their role as the correctness benchmark.