Before databases supported concurrent execution, transactions ran one at a time. Transaction T₁ would complete entirely—every read, every write, every decision—before Transaction T₂ could begin. This sequential execution model, known as serial execution, represents the simplest guarantee of correctness.
When transactions execute serially, they cannot interfere with each other. There are no race conditions, no lost updates, no dirty reads. Each transaction observes a consistent database state, makes its changes, and leaves a consistent state for the next transaction. It's clean, simple, and provably correct.
The concept of a serial schedule formalizes this serial execution. Understanding serial schedules is essential because they serve as the definition of correctness for concurrent schedules. A concurrent schedule is considered correct if and only if it produces results equivalent to some serial schedule.
In database theory, serial schedules are not just a type of schedule—they are the benchmark by which all other schedules are judged.
By the end of this page, you will understand the formal definition of serial schedules, why they guarantee correctness, how to enumerate all serial schedules for a given set of transactions, the performance limitations of serial execution, and why serial schedules serve as the reference point for serializability theory.
A serial schedule is a schedule in which transactions execute sequentially without any interleaving of operations from different transactions.
Formal Definition:
Given a set of transactions T = {T₁, T₂, ..., Tₙ}, a schedule S is serial if and only if:
For every pair of transactions Tᵢ and Tⱼ (where i ≠ j), either all operations of Tᵢ precede all operations of Tⱼ in S, or all operations of Tⱼ precede all operations of Tᵢ.
In other words, transactions appear as contiguous, non-overlapping blocks in the schedule.
Equivalent Definition:
A schedule S is serial if there exists an ordering (permutation) π of transactions such that S = Tπ(1) · Tπ(2) · ... · Tπ(n), where · denotes concatenation and each Tπ(k) is the complete sequence of operations for transaction Tπ(k).
Transaction operations:
T₁: r₁(A) w₁(A)
T₂: r₂(B) w₂(B)
T₃: r₃(A) r₃(B) w₃(C)
Schedule A: r₁(A) w₁(A) r₂(B) w₂(B) r₃(A) r₃(B) w₃(C)
Schedule B: r₂(B) w₂(B) r₃(A) r₃(B) w₃(C) r₁(A) w₁(A)
Schedule C: r₁(A) r₂(B) w₁(A) w₂(B) r₃(A) r₃(B) w₃(C)
Schedule D: r₃(A) r₃(B) w₃(C) r₂(B) w₂(B) r₁(A) w₁(A)

Schedule A: SERIAL (order: T₁ → T₂ → T₃)
Schedule B: SERIAL (order: T₂ → T₃ → T₁)
Schedule C: NOT SERIAL (T₁ and T₂ operations are interleaved)
Schedule D: SERIAL (order: T₃ → T₂ → T₁)

In serial schedules, each transaction's operations form a contiguous block. Schedule C interleaves T₁ and T₂ (r₁, r₂, w₁, w₂), making it non-serial.
A quick visual test: In a tabular representation, draw a horizontal line after each transaction's last operation. In a serial schedule, operations from different transactions never appear on the same 'segment' between lines. If you ever see operations from two different transactions intermixed between any two horizontal lines, the schedule is not serial.
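The contiguity test can also be automated. Below is a minimal Python sketch (the function name `is_serial` and the list-of-transaction-ids encoding are my own illustration, not from a standard library): a schedule is serial exactly when no transaction's id reappears after its block has ended.

```python
def is_serial(schedule):
    """Check that each transaction's operations form one contiguous block.

    `schedule` lists the owning transaction id of each operation in order,
    e.g. Schedule C above becomes [1, 2, 1, 2, 3, 3, 3].
    """
    finished = set()   # transactions whose contiguous block has already ended
    current = None
    for txn in schedule:
        if txn != current:
            if txn in finished:
                return False   # transaction reappears => operations interleaved
            if current is not None:
                finished.add(current)
            current = txn
    return True

schedule_a = [1, 1, 2, 2, 3, 3, 3]   # r1 w1 r2 w2 r3 r3 w3
schedule_c = [1, 2, 1, 2, 3, 3, 3]   # r1 r2 w1 w2 r3 r3 w3

print(is_serial(schedule_a))  # True
print(is_serial(schedule_c))  # False
```

Note that only the transaction ids matter for this test; the specific reads and writes are irrelevant to whether a schedule is serial.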
The correctness of serial schedules stems from a fundamental property: complete isolation. In a serial schedule, when transaction Tᵢ executes:
- It reads only values written by transactions that have already completed
- No other transaction can observe or overwrite its intermediate, uncommitted values
- The database state cannot change underneath it mid-execution
This provides perfect isolation—the 'I' in ACID at its strongest level.
If we assume:
1. The database starts in a consistent state
2. Each transaction, executed alone on a consistent state, leaves the database in a consistent state
Then serial execution preserves consistency through induction:
Base case: Database starts consistent (given)
Inductive step: Transaction Tᵢ runs on consistent database
→ Tᵢ transforms it to another consistent state (by assumption 2)
→ Database still consistent for Tᵢ₊₁
Conclusion: Database remains consistent after all transactions
This is the theoretical foundation for database correctness: transactions are units of consistency preservation, and running them serially chains these preservations together.
Serial execution doesn't magically make incorrect transactions correct. The DBMS trusts that each transaction, considered in isolation, will maintain database consistency. Serial execution simply ensures that this assumption—the 'consistency contract' between transactions and the database—is honored by preventing interference.
For a given set of transactions, there are multiple possible serial schedules. The number of serial schedules equals the number of ways to order the transactions—a factorial relationship.
For n transactions, the number of possible serial schedules is:
Number of serial schedules = n!
Examples:
- 2 transactions: 2! = 2 serial schedules
- 3 transactions: 3! = 6 serial schedules
- 5 transactions: 5! = 120 serial schedules
- 10 transactions: 10! = 3,628,800 serial schedules
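The enumeration itself is just a permutation of the transaction set. A short Python sketch, using the three transactions from the earlier example (the `transactions` dictionary encoding is my own illustration):

```python
from itertools import permutations
from math import factorial

# The three transactions from the earlier example, as ordered operation lists
transactions = {
    "T1": ["r1(A)", "w1(A)"],
    "T2": ["r2(B)", "w2(B)"],
    "T3": ["r3(A)", "r3(B)", "w3(C)"],
}

# One serial schedule per permutation of the transaction set: concatenate
# each transaction's complete operation sequence in the chosen order.
serial_schedules = [
    [op for name in order for op in transactions[name]]
    for order in permutations(transactions)
]

print(len(serial_schedules))   # 6, i.e. 3!
assert len(serial_schedules) == factorial(len(transactions))
```

Each of the six schedules has all 7 operations, differing only in which transaction's block comes first, second, and third.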
Importantly, different serial schedules may leave the database in different final states. This is not a problem—all serial schedules are considered correct, even if they produce different outcomes.
Initial state: X = 10
T₁: Read X, Compute X = X + 5, Write X
(r₁(X) w₁(X))
T₂: Read X, Compute X = X * 2, Write X
(r₂(X) w₂(X))
Serial Schedule 1: T₁ → T₂
Serial Schedule 2: T₂ → T₁

Serial Schedule 1 (T₁ → T₂):
r₁(X): X = 10
X = 10 + 5 = 15
w₁(X): X = 15
r₂(X): X = 15
X = 15 * 2 = 30
w₂(X): X = 30
Final: X = 30
Serial Schedule 2 (T₂ → T₁):
r₂(X): X = 10
X = 10 * 2 = 20
w₂(X): X = 20
r₁(X): X = 20
X = 20 + 5 = 25
w₁(X): X = 25
Final: X = 25

The two serial schedules produce different final values (30 vs 25). Both are correct because both represent valid, isolated executions. The difference reflects the inherent non-commutativity of the operations—the order genuinely matters.
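The non-commutativity is easy to verify directly: modeling each transaction as a pure function on X, composing them in the two orders reproduces the two traces above.

```python
def t1(x):
    """T1: read X, compute X + 5, write X."""
    return x + 5

def t2(x):
    """T2: read X, compute X * 2, write X."""
    return x * 2

x = 10
print(t2(t1(x)))  # Serial Schedule 1 (T1 then T2): 30
print(t1(t2(x)))  # Serial Schedule 2 (T2 then T1): 25
```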
Don't confuse 'different results' with 'incorrect results.' When transactions arrive at a database, the system doesn't inherently know the 'right' order. Any serial order is acceptable because it represents a valid hypothetical scenario where transactions genuinely arrived in that sequence. The key insight is that concurrent schedules must match SOME serial order, not a specific one.
For three transactions, the 3! = 6 serial schedules are:

| Schedule # | Order | Notation |
|---|---|---|
| 1 | T₁ → T₂ → T₃ | T₁ · T₂ · T₃ |
| 2 | T₁ → T₃ → T₂ | T₁ · T₃ · T₂ |
| 3 | T₂ → T₁ → T₃ | T₂ · T₁ · T₃ |
| 4 | T₂ → T₃ → T₁ | T₂ · T₃ · T₁ |
| 5 | T₃ → T₁ → T₂ | T₃ · T₁ · T₂ |
| 6 | T₃ → T₂ → T₁ | T₃ · T₂ · T₁ |
If serial schedules are always correct, why don't databases just execute transactions serially? The answer is performance. Serial execution suffers from severe limitations that make it impractical for real-world systems.
Modern databases operate on systems with multiple resources:
- Multiple CPU cores
- Independent disk I/O channels
- Large main memory and buffer pools
- Network bandwidth for client communication
Serial execution uses only one resource at a time per transaction:
Serial execution timeline:
Transaction: [Read from disk]→[CPU compute]→[Write to disk]→[Wait for commit]
CPU: [----idle----][processing][---------idle----------]
Disk: [--reading---][---idle---][---writing---][--idle--]
With serial execution, while one transaction waits for disk I/O, the CPU sits idle. While computing, the disk sits idle. This leads to utilization rates often below 20%.
Consider a transaction that takes 100ms on average: serial execution caps throughput at 1 / 0.1s = 10 transactions per second, no matter how powerful the hardware. For high-volume OLTP systems processing thousands of transactions per second, serial execution is simply not viable.
Serial execution forces transactions to wait in line:
Scenario: 100 pending transactions, each taking 50ms
Serial execution:
- Transaction 1: Waits 0ms, completes at 50ms
- Transaction 50: Waits 2450ms, completes at 2500ms
- Transaction 100: Waits 4950ms, completes at 5000ms
- Average wait: ~2500ms
Concurrent execution (with 10 parallel streams, ideally):
- Transaction 100: Waits ~450ms, completes at ~500ms
- Average wait: ~250ms (10x improvement)
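The wait-time figures above follow from simple arithmetic. A sketch of the model (the '~2500ms' and '~250ms' in the scenario are rounded; under this idealized model the exact averages come out to 2475ms and 225ms):

```python
n, service_ms, streams = 100, 50, 10

# Serial: the k-th transaction (0-indexed) waits for the k ahead of it
serial_waits = [k * service_ms for k in range(n)]
print(serial_waits[49], serial_waits[99])   # transaction 50 and 100: 2450 4950
print(sum(serial_waits) / n)                # average wait: 2475.0 ms

# Idealized 10-way parallelism: the queue drains `streams` at a time
parallel_waits = [(k // streams) * service_ms for k in range(n)]
print(parallel_waits[99])                   # transaction 100: 450
print(sum(parallel_waits) / n)              # average wait: 225.0 ms
```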
Interestingly, some modern systems have revisited serial execution. H-Store/VoltDB use single-threaded execution per partition, achieving serial execution within partitions while allowing parallelism across partitions. Redis executes commands serially in a single thread. These systems work because they're optimized for in-memory data with very fast operations, minimizing the idle time that makes serial execution inefficient for disk-based systems.
Given that serial schedules are correct but impractical, the database community developed a crucial insight:
A concurrent schedule is correct if it produces the same result as some serial schedule.
This principle, called serializability, defines correctness in terms of equivalence to serial execution. It allows databases to:
- Interleave operations from multiple transactions for performance
- Keep CPU and disk busy simultaneously
- Still guarantee results identical to some serial execution
┌─────────────────────────────────────────────────────────────┐
│ SERIALIZABILITY PRINCIPLE │
│ │
│ A concurrent schedule S is CORRECT if and only if │
│ S is EQUIVALENT to some serial schedule Ss │
│ │
│ S ≡ Ss → S is correct │
│ S ≢ any Ss → S may be incorrect │
└─────────────────────────────────────────────────────────────┘
Notice we don't require equivalence to a specific serial schedule—just some serial schedule. This flexibility is important because different serial orders produce different (but all valid) results.
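The 'some serial schedule' quantifier can be made concrete with a brute-force check. This is a toy sketch (the helpers `serial_final_states` and `result_serializable` are my own illustration, and real systems never test serializability this way): model transactions as state-to-state functions, compute the final state of every serial order, and ask whether a given outcome matches any of them.

```python
from itertools import permutations

# Hypothetical transactions modeled as state -> state functions
transactions = {
    "T1": lambda s: {**s, "X": s["X"] + 5},   # X = X + 5
    "T2": lambda s: {**s, "X": s["X"] * 2},   # X = X * 2
}

def serial_final_states(initial):
    """Final state produced by each of the n! serial orders."""
    states = []
    for order in permutations(transactions):
        state = dict(initial)
        for name in order:
            state = transactions[name](state)
        states.append(state)
    return states

def result_serializable(final_state, initial):
    """True if final_state matches SOME serial order's outcome."""
    return final_state in serial_final_states(initial)

initial = {"X": 10}
print(result_serializable({"X": 30}, initial))  # True  (T1 then T2)
print(result_serializable({"X": 25}, initial))  # True  (T2 then T1)
print(result_serializable({"X": 20}, initial))  # False (a lost update)
```

The X = 20 case is what a bad interleaving produces when both transactions read X = 10 before either writes: T₂'s write of 20 clobbers T₁'s write of 15, an outcome no serial order can yield.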
The concept of "equivalence" can be defined in multiple ways, each leading to a different notion of serializability:
| Equivalence Type | Definition | Resulting Serializability |
|---|---|---|
| Result Equivalence | Same final database state | Result Serializability |
| Conflict Equivalence | Same ordering of conflicting operations | Conflict Serializability |
| View Equivalence | Same reads observe same writes | View Serializability |
Conflict serializability (our focus) is the most practical: it can be tested efficiently by checking a precedence graph for cycles, and it is the notion enforced by common protocols such as two-phase locking.
View serializability is more permissive: it accepts some schedules that conflict serializability rejects (for example, certain schedules with blind writes), but testing it is NP-complete, making it impractical to enforce directly.
Think of serial schedules as 'reference implementations'—we know they're correct, but we don't actually want to use them. Concurrent schedules are the 'optimized implementations'—faster, but we need to verify they match the reference. Serializability is the verification that the optimized version produces the same results as some reference implementation.
Serial schedules possess several important properties beyond mere correctness. Understanding these properties helps clarify what we're trying to achieve with concurrent schedules.
1. Transaction Contiguity
All operations of each transaction appear consecutively, with no interleaving:
Serial: [----T₁----][----T₂----][----T₃----]
Not: [--T₁--][--T₂--][--T₁--][--T₃--]...
2. Total Ordering
Every pair of transactions has a defined before/after relationship: either all of Tᵢ executes before Tⱼ begins, or all of Tⱼ executes before Tᵢ begins.
3. Deterministic Conflict Resolution
For any two conflicting operations, the order is determined by the transaction order: if Tᵢ precedes Tⱼ in the schedule, every conflict between them is resolved with Tᵢ's operation first.
4. Always Serializable (by definition)
Every serial schedule is trivially serializable—it's equivalent to itself.
5. Recoverable and Strict
Serial schedules are inherently recoverable and strict: each transaction reads and overwrites only data written by transactions that have already committed, so cascading aborts are impossible.
6. Deadlock-Free
Deadlocks cannot occur in serial execution: only one transaction is active at a time, so there can never be a cycle of transactions waiting on each other's locks.
| Property | Status | Reason |
|---|---|---|
| Serializable | Always ✓ | Equivalent to itself |
| Recoverable | Always ✓ | Reads only committed data |
| Cascadeless | Always ✓ | No dirty reads possible |
| Strict | Always ✓ | Write after commit only |
| Deadlock-free | Always ✓ | Single active transaction |
| Conflict-free | Always ✓ | No concurrent operations |
Serial schedules have ALL desirable properties, but not all properties are equally important for practical systems. Serializability ensures correctness. Recoverability ensures durability. Deadlock-freedom is nice but achievable through other means (deadlock detection). When designing concurrency protocols, we typically target serializability and recoverability as non-negotiable, while accepting that deadlocks may require detection and resolution.
Understanding the relationship between serial and concurrent schedules is fundamental to serializability theory.
All possible schedules for a set of transactions can be visualized as a containment hierarchy:
┌─────────────────────────────────────────────────┐
│ ALL POSSIBLE SCHEDULES │
│ │
│ ┌────────────────────────────────────────┐ │
│ │ SERIALIZABLE SCHEDULES │ │
│ │ │ │
│ │ ┌─────────────────────────────┐ │ │
│ │ │ SERIAL SCHEDULES (n!) │ │ │
│ │ │ │ │ │
│ │ │ (Correct by definition) │ │ │
│ │ └─────────────────────────────┘ │ │
│ │ │ │
│ │ (Correct concurrent schedules) │ │
│ └────────────────────────────────────────┘ │
│ │
│ (Incorrectly interleaved schedules) │
└─────────────────────────────────────────────────┘
The disparity between serial and total schedules grows explosively:
| Transactions | Ops Each | Serial Schedules | Total Schedules | Serial % |
|---|---|---|---|---|
| 2 | 2 | 2 | 6 | 33.3% |
| 2 | 3 | 2 | 20 | 10.0% |
| 3 | 2 | 6 | 90 | 6.7% |
| 3 | 3 | 6 | 1,680 | 0.36% |
| 4 | 3 | 24 | 369,600 | 0.0065% |
| 5 | 4 | 120 | ~3×10¹¹ | ~4×10⁻⁸% |
Serial schedules become an infinitesimally small fraction of the total schedule space. This is why testing every schedule for equivalence to every serial schedule is impractical—we need smarter algorithms.
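The 'Total Schedules' column follows from a standard counting argument: the number of ways to interleave n transactions of k operations each, while preserving each transaction's internal order, is the multinomial coefficient (n·k)! / (k!)ⁿ. A short sketch that reproduces the table:

```python
from math import factorial

def total_schedules(n, k):
    """Interleavings of n transactions of k operations each, preserving
    each transaction's internal operation order: (n*k)! // (k!)**n."""
    return factorial(n * k) // factorial(k) ** n

for n, k in [(2, 2), (2, 3), (3, 2), (3, 3), (4, 3)]:
    print(n, k, factorial(n), total_schedules(n, k))
# Prints, per row: n, k, serial schedules (n!), total schedules
# 2 2 2 6
# 2 3 2 20
# 3 2 6 90
# 3 3 6 1680
# 4 3 24 369600
```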
Serial schedules serve as 'ground truth' for correctness, but they're impractical to execute. Serializable concurrent schedules give us correctness with performance. The challenge is efficiently determining which concurrent schedules are serializable—and that's where conflict analysis and precedence graphs come in, which we'll explore in subsequent pages.
We've thoroughly explored serial schedules—the foundation of database correctness theory. Let's consolidate the key insights:
What's Next:
Serial schedules define what we mean by correct execution, but we need concurrent schedules for practical execution. The next page explores equivalent schedules—how we formally define when two schedules produce the same results. This equivalence concept is the bridge between the theoretical correctness of serial schedules and the practical efficiency of concurrent schedules.
You now understand serial schedules: their formal definition, why they're correct, how to enumerate them, their limitations, and their role as the correctness benchmark for all database execution. Next, we'll explore what it means for two schedules to be 'equivalent'—the key concept linking serial schedules to practical concurrent execution.