Before databases supported concurrent execution, transactions ran one at a time. Transaction T₁ would complete entirely—every read, every write, every decision—before Transaction T₂ could begin. This sequential execution model, known as serial execution, represents the simplest guarantee of correctness.
When transactions execute serially, they cannot interfere with each other. There are no race conditions, no lost updates, no dirty reads. Each transaction observes a consistent database state, makes its changes, and leaves a consistent state for the next transaction. It's clean, simple, and provably correct.
The concept of a serial schedule formalizes this serial execution. Understanding serial schedules is essential because they serve as the definition of correctness for concurrent schedules. A concurrent schedule is considered correct if and only if it produces results equivalent to some serial schedule.
In database theory, serial schedules are not just a type of schedule—they are the benchmark by which all other schedules are judged.
By the end of this page, you will understand the formal definition of serial schedules, why they guarantee correctness, how to enumerate all serial schedules for a given set of transactions, the performance limitations of serial execution, and why serial schedules serve as the reference point for serializability theory.
A serial schedule is a schedule in which transactions execute sequentially without any interleaving of operations from different transactions.
Formal Definition:
Given a set of transactions T = {T₁, T₂, ..., Tₙ}, a schedule S is serial if and only if:
For every pair of transactions Tᵢ and Tⱼ (where i ≠ j), either all operations of Tᵢ precede all operations of Tⱼ in S, or all operations of Tⱼ precede all operations of Tᵢ.
In other words, transactions appear as contiguous, non-overlapping blocks in the schedule.
Equivalent Definition:
A schedule S is serial if there exists an ordering (permutation) π of transactions such that S = Tπ(1) · Tπ(2) · ... · Tπ(n), where · denotes concatenation and each Tπ(k) is the complete sequence of operations for transaction Tπ(k).
Transaction operations:
T₁: r₁(A) w₁(A)
T₂: r₂(B) w₂(B)
T₃: r₃(A) r₃(B) w₃(C)
Schedule A: r₁(A) w₁(A) r₂(B) w₂(B) r₃(A) r₃(B) w₃(C)
Schedule B: r₂(B) w₂(B) r₃(A) r₃(B) w₃(C) r₁(A) w₁(A)
Schedule C: r₁(A) r₂(B) w₁(A) w₂(B) r₃(A) r₃(B) w₃(C)
Schedule D: r₃(A) r₃(B) w₃(C) r₂(B) w₂(B) r₁(A) w₁(A)

Schedule A: SERIAL (order: T₁ → T₂ → T₃)
Schedule B: SERIAL (order: T₂ → T₃ → T₁)
Schedule C: NOT SERIAL (T₁ and T₂ operations are interleaved)
Schedule D: SERIAL (order: T₃ → T₂ → T₁)

In serial schedules, each transaction's operations form a contiguous block. Schedule C interleaves T₁ and T₂ (r₁, r₂, w₁, w₂), making it non-serial.
A quick visual test: In a tabular representation, draw a horizontal line after each transaction's last operation. In a serial schedule, operations from different transactions never appear on the same 'segment' between lines. If you ever see operations from two different transactions intermixed between any two horizontal lines, the schedule is not serial.
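The contiguity test can also be automated. Below is a minimal Python sketch (the function name `is_serial` and the list-of-transaction-ids encoding are my own illustration, not from a standard library): a schedule is serial exactly when no transaction's id reappears after its block has ended.

```python
def is_serial(schedule):
    """Check that each transaction's operations form one contiguous block.

    `schedule` lists the owning transaction id of each operation in order,
    e.g. Schedule C above becomes [1, 2, 1, 2, 3, 3, 3].
    """
    finished = set()   # transactions whose contiguous block has already ended
    current = None
    for txn in schedule:
        if txn != current:
            if txn in finished:
                return False   # transaction reappears => operations interleaved
            if current is not None:
                finished.add(current)
            current = txn
    return True

schedule_a = [1, 1, 2, 2, 3, 3, 3]   # r1 w1 r2 w2 r3 r3 w3
schedule_c = [1, 2, 1, 2, 3, 3, 3]   # r1 r2 w1 w2 r3 r3 w3

print(is_serial(schedule_a))  # True
print(is_serial(schedule_c))  # False
```

Note that only the transaction ids matter for this test; the specific reads and writes are irrelevant to whether a schedule is serial.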
The correctness of serial schedules stems from a fundamental property: complete isolation. In a serial schedule, when transaction Tᵢ executes:
- It reads only values written by transactions that have already completed
- No other transaction can observe or overwrite its intermediate, uncommitted values
- The database state cannot change underneath it mid-execution
This provides perfect isolation—the 'I' in ACID at its strongest level.
If we assume:
1. The database starts in a consistent state
2. Each transaction, executed alone on a consistent state, leaves the database in a consistent state
Then serial execution preserves consistency through induction:
Base case: Database starts consistent (given)
Inductive step: Transaction Tᵢ runs on consistent database
→ Tᵢ transforms it to another consistent state (by assumption 2)
→ Database still consistent for Tᵢ₊₁
Conclusion: Database remains consistent after all transactions
This is the theoretical foundation for database correctness: transactions are units of consistency preservation, and running them serially chains these preservations together.
Serial execution doesn't magically make incorrect transactions correct. The DBMS trusts that each transaction, considered in isolation, will maintain database consistency. Serial execution simply ensures that this assumption—the 'consistency contract' between transactions and the database—is honored by preventing interference.
For a given set of transactions, there are multiple possible serial schedules. The number of serial schedules equals the number of ways to order the transactions—a factorial relationship.
For n transactions, the number of possible serial schedules is:
Number of serial schedules = n!
Examples:
- 2 transactions: 2! = 2 serial schedules
- 3 transactions: 3! = 6 serial schedules
- 5 transactions: 5! = 120 serial schedules
- 10 transactions: 10! = 3,628,800 serial schedules
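The enumeration itself is just a permutation of the transaction set. A short Python sketch, using the three transactions from the earlier example (the `transactions` dictionary encoding is my own illustration):

```python
from itertools import permutations
from math import factorial

# The three transactions from the earlier example, as ordered operation lists
transactions = {
    "T1": ["r1(A)", "w1(A)"],
    "T2": ["r2(B)", "w2(B)"],
    "T3": ["r3(A)", "r3(B)", "w3(C)"],
}

# One serial schedule per permutation of the transaction set: concatenate
# each transaction's complete operation sequence in the chosen order.
serial_schedules = [
    [op for name in order for op in transactions[name]]
    for order in permutations(transactions)
]

print(len(serial_schedules))   # 6, i.e. 3!
assert len(serial_schedules) == factorial(len(transactions))
```

Each of the six schedules has all 7 operations, differing only in which transaction's block comes first, second, and third.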
Importantly, different serial schedules may leave the database in different final states. This is not a problem—all serial schedules are considered correct, even if they produce different outcomes.
Initial state: X = 10
T₁: Read X, Compute X = X + 5, Write X
(r₁(X) w₁(X))
T₂: Read X, Compute X = X * 2, Write X
(r₂(X) w₂(X))
Serial Schedule 1: T₁ → T₂
Serial Schedule 2: T₂ → T₁

Serial Schedule 1 (T₁ → T₂):
r₁(X): X = 10
X = 10 + 5 = 15
w₁(X): X = 15
r₂(X): X = 15
X = 15 * 2 = 30
w₂(X): X = 30
Final: X = 30
Serial Schedule 2 (T₂ → T₁):
r₂(X): X = 10
X = 10 * 2 = 20
w₂(X): X = 20
r₁(X): X = 20
X = 20 + 5 = 25
w₁(X): X = 25
Final: X = 25

The two serial schedules produce different final values (30 vs 25). Both are correct because both represent valid, isolated executions. The difference reflects the inherent non-commutativity of the operations—the order genuinely matters.
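The non-commutativity is easy to verify directly: modeling each transaction as a pure function on X, composing them in the two orders reproduces the two traces above.

```python
def t1(x):
    """T1: read X, compute X + 5, write X."""
    return x + 5

def t2(x):
    """T2: read X, compute X * 2, write X."""
    return x * 2

x = 10
print(t2(t1(x)))  # Serial Schedule 1 (T1 then T2): 30
print(t1(t2(x)))  # Serial Schedule 2 (T2 then T1): 25
```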
Don't confuse 'different results' with 'incorrect results.' When transactions arrive at a database, the system doesn't inherently know the 'right' order. Any serial order is acceptable because it represents a valid hypothetical scenario where transactions genuinely arrived in that sequence. The key insight is that concurrent schedules must match SOME serial order, not a specific one.
For three transactions, the 3! = 6 serial schedules are:

| Schedule # | Order | Notation |
|---|---|---|
| 1 | T₁ → T₂ → T₃ | T₁ · T₂ · T₃ |
| 2 | T₁ → T₃ → T₂ | T₁ · T₃ · T₂ |
| 3 | T₂ → T₁ → T₃ | T₂ · T₁ · T₃ |
| 4 | T₂ → T₃ → T₁ | T₂ · T₃ · T₁ |
| 5 | T₃ → T₁ → T₂ | T₃ · T₁ · T₂ |
| 6 | T₃ → T₂ → T₁ | T₃ · T₂ · T₁ |
If serial schedules are always correct, why don't databases just execute transactions serially? The answer is performance. Serial execution suffers from severe limitations that make it impractical for real-world systems.
Modern databases operate on systems with multiple resources:
- Multiple CPU cores
- Independent disk I/O channels
- Large main memory and buffer pools
- Network bandwidth for client communication
Serial execution uses only one resource at a time per transaction:
Serial execution timeline:
Transaction: [Read from disk]→[CPU compute]→[Write to disk]→[Wait for commit]
CPU: [----idle----][processing][---------idle----------]
Disk: [--reading---][---idle---][---writing---][--idle--]
With serial execution, while one transaction waits for disk I/O, the CPU sits idle. While computing, the disk sits idle. This leads to utilization rates often below 20%.
Consider a transaction that takes 100ms on average: serial execution caps throughput at 1 / 0.1s = 10 transactions per second, no matter how powerful the hardware. For high-volume OLTP systems processing thousands of transactions per second, serial execution is simply not viable.
Serial execution forces transactions to wait in line:
Scenario: 100 pending transactions, each taking 50ms
Serial execution:
- Transaction 1: Waits 0ms, completes at 50ms
- Transaction 50: Waits 2450ms, completes at 2500ms
- Transaction 100: Waits 4950ms, completes at 5000ms
- Average wait: ~2500ms
Concurrent execution (with 10 parallel streams, ideally):
- Transaction 100: Waits ~450ms, completes at ~500ms
- Average wait: ~250ms (10x improvement)
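The wait-time figures above follow from simple arithmetic. A sketch of the model (the '~2500ms' and '~250ms' in the scenario are rounded; under this idealized model the exact averages come out to 2475ms and 225ms):

```python
n, service_ms, streams = 100, 50, 10

# Serial: the k-th transaction (0-indexed) waits for the k ahead of it
serial_waits = [k * service_ms for k in range(n)]
print(serial_waits[49], serial_waits[99])   # transaction 50 and 100: 2450 4950
print(sum(serial_waits) / n)                # average wait: 2475.0 ms

# Idealized 10-way parallelism: the queue drains `streams` at a time
parallel_waits = [(k // streams) * service_ms for k in range(n)]
print(parallel_waits[99])                   # transaction 100: 450
print(sum(parallel_waits) / n)              # average wait: 225.0 ms
```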
Interestingly, some modern systems have revisited serial execution. H-Store/VoltDB use single-threaded execution per partition, achieving serial execution within partitions while allowing parallelism across partitions. Redis executes commands serially in a single thread. These systems work because they're optimized for in-memory data with very fast operations, minimizing the idle time that makes serial execution inefficient for disk-based systems.
Given that serial schedules are correct but impractical, the database community developed a crucial insight:
A concurrent schedule is correct if it produces the same result as some serial schedule.
This principle, called serializability, defines correctness in terms of equivalence to serial execution. It allows databases to:
- Interleave operations from multiple transactions for performance
- Keep CPU and disk busy simultaneously
- Still guarantee results identical to some serial execution
┌─────────────────────────────────────────────────────────────┐
│ SERIALIZABILITY PRINCIPLE │
│ │
│ A concurrent schedule S is CORRECT if and only if │
│ S is EQUIVALENT to some serial schedule Ss │
│ │
│ S ≡ Ss → S is correct │
│ S ≢ any Ss → S may be incorrect │
└─────────────────────────────────────────────────────────────┘
Notice we don't require equivalence to a specific serial schedule—just some serial schedule. This flexibility is important because different serial orders produce different (but all valid) results.
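The 'some serial schedule' quantifier can be made concrete with a brute-force check. This is a toy sketch (the helpers `serial_final_states` and `result_serializable` are my own illustration, and real systems never test serializability this way): model transactions as state-to-state functions, compute the final state of every serial order, and ask whether a given outcome matches any of them.

```python
from itertools import permutations

# Hypothetical transactions modeled as state -> state functions
transactions = {
    "T1": lambda s: {**s, "X": s["X"] + 5},   # X = X + 5
    "T2": lambda s: {**s, "X": s["X"] * 2},   # X = X * 2
}

def serial_final_states(initial):
    """Final state produced by each of the n! serial orders."""
    states = []
    for order in permutations(transactions):
        state = dict(initial)
        for name in order:
            state = transactions[name](state)
        states.append(state)
    return states

def result_serializable(final_state, initial):
    """True if final_state matches SOME serial order's outcome."""
    return final_state in serial_final_states(initial)

initial = {"X": 10}
print(result_serializable({"X": 30}, initial))  # True  (T1 then T2)
print(result_serializable({"X": 25}, initial))  # True  (T2 then T1)
print(result_serializable({"X": 20}, initial))  # False (a lost update)
```

The X = 20 case is what a bad interleaving produces when both transactions read X = 10 before either writes: T₂'s write of 20 clobbers T₁'s write of 15, an outcome no serial order can yield.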
The concept of "equivalence" can be defined in multiple ways, each leading to a different notion of serializability:
| Equivalence Type | Definition | Resulting Serializability |
|---|---|---|
| Result Equivalence | Same final database state | Result Serializability |
| Conflict Equivalence | Same ordering of conflicting operations | Conflict Serializability |
| View Equivalence | Same reads observe same writes | View Serializability |
Conflict serializability (our focus) is the most practical: it can be tested efficiently by checking a precedence graph for cycles, and it is the notion enforced by common protocols such as two-phase locking.
View serializability is more permissive: it accepts some schedules that conflict serializability rejects (for example, certain schedules with blind writes), but testing it is NP-complete, making it impractical to enforce directly.
Think of serial schedules as 'reference implementations'—we know they're correct, but we don't actually want to use them. Concurrent schedules are the 'optimized implementations'—faster, but we need to verify they match the reference. Serializability is the verification that the optimized version produces the same results as some reference implementation.
Serial schedules possess several important properties beyond mere correctness. Understanding these properties helps clarify what we're trying to achieve with concurrent schedules.
1. Transaction Contiguity
All operations of each transaction appear consecutively, with no interleaving:
Serial: [----T₁----][----T₂----][----T₃----]
Not: [--T₁--][--T₂--][--T₁--][--T₃--]...
2. Total Ordering
Every pair of transactions has a defined before/after relationship: either all of Tᵢ executes before Tⱼ begins, or all of Tⱼ executes before Tᵢ begins.
3. Deterministic Conflict Resolution
For any two conflicting operations, the order is determined by the transaction order: if Tᵢ precedes Tⱼ in the schedule, every conflict between them is resolved with Tᵢ's operation first.
4. Always Serializable (by definition)
Every serial schedule is trivially serializable—it's equivalent to itself.
5. Recoverable and Strict
Serial schedules are inherently recoverable and strict: each transaction reads and overwrites only data written by transactions that have already committed, so cascading aborts are impossible.
6. Deadlock-Free
Deadlocks cannot occur in serial execution: only one transaction is active at a time, so there can never be a cycle of transactions waiting on each other's locks.
| Property | Status | Reason |
|---|---|---|
| Serializable | Always ✓ | Equivalent to itself |
| Recoverable | Always ✓ | Reads only committed data |
| Cascadeless | Always ✓ | No dirty reads possible |
| Strict | Always ✓ | Write after commit only |
| Deadlock-free | Always ✓ | Single active transaction |
| Conflict-free | Always ✓ | No concurrent operations |
Serial schedules have ALL desirable properties, but not all properties are equally important for practical systems. Serializability ensures correctness. Recoverability ensures durability. Deadlock-freedom is nice but achievable through other means (deadlock detection). When designing concurrency protocols, we typically target serializability and recoverability as non-negotiable, while accepting that deadlocks may require detection and resolution.
Understanding the relationship between serial and concurrent schedules is fundamental to serializability theory.
All possible schedules for a set of transactions can be visualized as a containment hierarchy:
┌─────────────────────────────────────────────────┐
│ ALL POSSIBLE SCHEDULES │
│ │
│ ┌────────────────────────────────────────┐ │
│ │ SERIALIZABLE SCHEDULES │ │
│ │ │ │
│ │ ┌─────────────────────────────┐ │ │
│ │ │ SERIAL SCHEDULES (n!) │ │ │
│ │ │ │ │ │
│ │ │ (Correct by definition) │ │ │
│ │ └─────────────────────────────┘ │ │
│ │ │ │
│ │ (Correct concurrent schedules) │ │
│ └────────────────────────────────────────┘ │
│ │
│ (Incorrectly interleaved schedules) │
└─────────────────────────────────────────────────┘
The disparity between serial and total schedules grows explosively:
| Transactions | Ops Each | Serial Schedules | Total Schedules | Serial % |
|---|---|---|---|---|
| 2 | 2 | 2 | 6 | 33.3% |
| 2 | 3 | 2 | 20 | 10.0% |
| 3 | 2 | 6 | 90 | 6.7% |
| 3 | 3 | 6 | 1,680 | 0.36% |
| 4 | 3 | 24 | 369,600 | 0.0065% |
| 5 | 4 | 120 | ~3×10¹¹ | ~4×10⁻⁸% |
Serial schedules become an infinitesimally small fraction of the total schedule space. This is why testing every schedule for equivalence to every serial schedule is impractical—we need smarter algorithms.
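The 'Total Schedules' column follows from a standard counting argument: the number of ways to interleave n transactions of k operations each, while preserving each transaction's internal order, is the multinomial coefficient (n·k)! / (k!)ⁿ. A short sketch that reproduces the table:

```python
from math import factorial

def total_schedules(n, k):
    """Interleavings of n transactions of k operations each, preserving
    each transaction's internal operation order: (n*k)! // (k!)**n."""
    return factorial(n * k) // factorial(k) ** n

for n, k in [(2, 2), (2, 3), (3, 2), (3, 3), (4, 3)]:
    print(n, k, factorial(n), total_schedules(n, k))
# Prints, per row: n, k, serial schedules (n!), total schedules
# 2 2 2 6
# 2 3 2 20
# 3 2 6 90
# 3 3 6 1680
# 4 3 24 369600
```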
Serial schedules serve as 'ground truth' for correctness, but they're impractical to execute. Serializable concurrent schedules give us correctness with performance. The challenge is efficiently determining which concurrent schedules are serializable—and that's where conflict analysis and precedence graphs come in, which we'll explore in subsequent pages.
We've thoroughly explored serial schedules—the foundation of database correctness theory. Let's consolidate the key insights:
What's Next:
Serial schedules define what we mean by correct execution, but we need concurrent schedules for practical execution. The next page explores equivalent schedules—how we formally define when two schedules produce the same results. This equivalence concept is the bridge between the theoretical correctness of serial schedules and the practical efficiency of concurrent schedules.
You now understand serial schedules: their formal definition, why they're correct, how to enumerate them, their limitations, and their role as the correctness benchmark for all database execution. Next, we'll explore what it means for two schedules to be 'equivalent'—the key concept linking serial schedules to practical concurrent execution.