ACID properties—Atomicity, Consistency, Isolation, and Durability—define what it means for a database to be reliable. But properties are just promises; they mean nothing without implementation. The recovery system is where ACID properties become reality.
Every ACID property depends on mechanisms we've studied: logging, checkpoints, stable storage, and recovery algorithms. Understanding these connections reveals why recovery is not just one subsystem among many—it's the foundation that makes transactional databases trustworthy.
This page ties together everything we've learned by showing how each ACID property is implemented and maintained through recovery-related mechanisms.
By the end of this page, you will understand how Atomicity depends on undo logging, how Durability depends on write-ahead logging and stable storage, how Consistency is preserved through recovery, and how Isolation interacts with recovery during abort and crash scenarios. You'll see ACID as an integrated system, not four independent properties.
Before diving into the connections, let's precisely define each ACID property with an eye toward implementation:
| Property | Definition | Implementation Requirement | Primary Mechanism |
|---|---|---|---|
| Atomicity | All operations of a transaction complete, or none do | Ability to undo partial transactions | Undo logging, rollback |
| Consistency | Transactions take the database from one consistent state to another | Preserve invariants even after failures | Constraints + complete recovery |
| Isolation | Concurrent transactions appear to execute serially | Prevent interference, handle concurrent aborts | Locking, undo for cascading aborts |
| Durability | Committed transactions survive any subsequent failure | Committed data reaches stable storage | Redo logging, WAL, stable storage |
The Recovery System's Role in Each:
┌────────────────────────────────────────────────────────┐
│                    ACID PROPERTIES                     │
└────────────────────────────────────────────────────────┘
      │              │              │              │
      ▼              ▼              ▼              ▼
┌───────────┐  ┌───────────┐  ┌───────────┐  ┌───────────┐
│ Atomicity │  │Consistency│  │ Isolation │  │Durability │
└─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
      │              │              │              │
┌─────▼─────┐  ┌─────▼─────┐  ┌─────▼─────┐  ┌─────▼─────┐
│   UNDO    │  │ Complete  │  │   Abort   │  │   REDO    │
│  Logging  │  │ Recovery  │  │ Cascading │  │  Logging  │
│           │  │ + Checks  │  │  Aborts   │  │ + Stable  │
│ Rollback  │  │           │  │           │  │  Storage  │
└─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘
      │              │              │              │
      └──────────────┴───────┬──────┴──────────────┘
                             │
                             ▼
              ┌─────────────────────────────┐
              │       RECOVERY SYSTEM       │
              │   (Logging, Checkpoints,    │
              │    Stable Storage, ARIES)   │
              └─────────────────────────────┘
Notice that all four properties feed into the recovery system. This isn't coincidence—transactional guarantees are fundamentally about surviving failures, and failure survival is the recovery system's job.
While ACID is presented as four separate properties, they're deeply interconnected. Atomicity without durability means completed transactions might vanish. Isolation without atomicity means concurrent readers might see partial states. The recovery system implements these interdependencies.
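Atomicity guarantees that transactions are indivisible: either all operations complete, or none do. This seems simple until you consider that a transaction can stop partway through, either because the application requests an abort or because the system crashes before commit.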
How does atomicity survive these scenarios?
Atomicity Timeline:
Transaction T1: Insert A, Update B, Delete C
Time ────────────────────────────────────────────────────────────────▶
|─────── T1 Active ───────|
│
│  Insert A logged  (before: ∅, after: A)
│
│  Update B logged  (before: b, after: B)
│
│  Delete C logged  (before: c, after: ∅)
│
▼
SCENARIO 1: Application requests COMMIT
→ Write commit record to the log
→ Force log (through the commit record) to stable storage
→ Ack to application
→ T1 is atomic (all operations committed)
SCENARIO 2: Application requests ABORT
→ Undo: Restore C (insert c)
→ Undo: Restore B (update to b)
→ Undo: Restore A (delete A)
→ Write abort record
→ T1 is atomic (no operations visible)
SCENARIO 3: CRASH before commit
→ On recovery, T1 is uncommitted
→ Undo phase reverses all T1 modifications
→ T1 is atomic (no operations visible)
In all scenarios, the transaction's effects are all-or-nothing. The undo log makes 'nothing' possible even after partial execution.
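The abort path in Scenario 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular engine implements rollback: the `Transaction` and `UndoRecord` classes are hypothetical, the "database" is a plain dictionary, and `None` stands in for ∅ (so the sketch assumes `None` is never stored as a real value).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class UndoRecord:
    key: str
    before: Optional[Any]  # before-image; None means the key did not exist


@dataclass
class Transaction:
    db: Dict[str, Any]
    undo_log: List[UndoRecord] = field(default_factory=list)

    def write(self, key: str, value: Any) -> None:
        # Record the before-image first, then apply the change.
        self.undo_log.append(UndoRecord(key, self.db.get(key)))
        self.db[key] = value

    def delete(self, key: str) -> None:
        self.undo_log.append(UndoRecord(key, self.db.get(key)))
        self.db.pop(key, None)

    def abort(self) -> None:
        # Undo in reverse order so each before-image is restored correctly.
        for rec in reversed(self.undo_log):
            if rec.before is None:
                self.db.pop(rec.key, None)   # undo an insert
            else:
                self.db[rec.key] = rec.before  # undo an update or delete
        self.undo_log.clear()


# T1 from the timeline: Insert A, Update B, Delete C, then ABORT.
db = {"B": "b", "C": "c"}
t1 = Transaction(db)
t1.write("A", "A")
t1.write("B", "B")
t1.delete("C")
t1.abort()
```

After `abort()`, `db` holds exactly the values it had before T1 started. Processing the undo records newest-first matters when a transaction touches the same item more than once, because the oldest before-image must be the last one restored.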
Under STEAL policy, uncommitted modifications may reach the disk before the transaction commits. This is precisely why undo is necessary at crash recovery—those uncommitted changes are on disk and must be reversed. Without STEAL, atomicity would be simpler but memory management would suffer.
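Durability guarantees that committed transactions survive all subsequent failures. This requires that the log records describing a transaction's changes reach stable storage before the commit is acknowledged.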
The WAL + Redo Solution:
Durability Timeline:
Transaction T1: Update X from 100 to 200
Time ────────────────────────────────────────────────────────────────▶
│ Log record written to buffer:
│ "T1: Page P, Offset O, Before=100, After=200"
│ │
│ Modification applied to buffer page P
│ (Page P now has value 200 in memory)
│ │
│ Application requests COMMIT
│ │
│ LOG FORCED TO STABLE STORAGE
│ (fsync completes)
│ │
│ Commit record written and forced
│ │
│ Commit acknowledged to application
│ │
▼ ▼
AT THIS POINT: Durability is guaranteed!
SCENARIO A: No crash
→ Buffer manager eventually writes page P to disk
→ System operates normally
SCENARIO B: Crash before page P written to disk
→ On recovery, log shows T1 committed
→ Page P on disk still has old value (100)
→ Redo phase applies: "Set Page P, Offset O = 200"
→ Page P now has value 200
→ T1's durability is confirmed
SCENARIO C: Crash after page P written to disk
→ On recovery, redo phase checks page
→ PageLSN shows modification already applied
→ Redo skips the record (the change is already on the page)
→ T1's durability is confirmed
The key insight: the log, not the data pages, is the source of durability. As long as a transaction's log records reach stable storage before its commit is acknowledged, recovery can reconstruct any missing modifications.
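Scenarios B and C differ only in whether the page on disk already reflects the logged update. A rough Python sketch of that check follows; the `Page` and `UpdateRecord` types and the `redo` function are hypothetical names for illustration, and the sketch assumes each page stores the LSN of the last record applied to it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Page:
    value: int
    page_lsn: int = 0  # LSN of the last log record applied to this page


@dataclass
class UpdateRecord:
    lsn: int
    page_id: str
    after: int  # after-image to reinstall during redo


def redo(log: List[UpdateRecord], disk: Dict[str, Page]) -> List[int]:
    """Reapply logged updates in LSN order; return the LSNs actually applied."""
    applied = []
    for rec in log:
        page = disk[rec.page_id]
        if page.page_lsn >= rec.lsn:
            continue  # page already contains this change (Scenario C)
        page.value = rec.after
        page.page_lsn = rec.lsn
        applied.append(rec.lsn)
    return applied


log = [UpdateRecord(lsn=10, page_id="P", after=200)]

# Scenario B: the crash happened before page P reached disk.
disk_b = {"P": Page(value=100, page_lsn=0)}
applied_b = redo(log, disk_b)

# Scenario C: page P was flushed before the crash.
disk_c = {"P": Page(value=200, page_lsn=10)}
applied_c = redo(log, disk_c)
```

In both cases page P ends with the value 200. Running `redo` a second time applies nothing, which is the property that lets recovery itself be interrupted and restarted safely.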
Under NO-FORCE policy, commit doesn't wait for data pages to be written—only log records. Since log writes are sequential (fast) while data page writes are random (slow), NO-FORCE dramatically reduces commit latency. Redo makes this possible by guaranteeing committed work can be recovered from the log.
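The commit path under NO-FORCE can be sketched as follows. This is an illustrative toy, not a production log manager: `WriteAheadLog` and `commit` are hypothetical names, records are stored as JSON lines, and the data page lives only in an in-memory dictionary. The one real system call it relies on is `os.fsync`, and whether that call truly reaches stable media depends on the operating system and hardware.

```python
import json
import os
import tempfile


class WriteAheadLog:
    def __init__(self, path: str) -> None:
        self._file = open(path, "a", encoding="utf-8")

    def append(self, record: dict) -> None:
        # Buffered, sequential append; not yet durable.
        self._file.write(json.dumps(record) + "\n")

    def force(self) -> None:
        # Push buffered records to the OS, then ask the OS to flush to disk.
        self._file.flush()
        os.fsync(self._file.fileno())

    def close(self) -> None:
        self._file.close()


def commit(wal: WriteAheadLog, txn: str, buffer_pool: dict,
           page_id: str, before: int, after: int) -> str:
    wal.append({"txn": txn, "type": "update", "page": page_id,
                "before": before, "after": after})
    buffer_pool[page_id] = after  # page changes in memory only (NO-FORCE)
    wal.append({"txn": txn, "type": "commit"})
    wal.force()                   # the only synchronous I/O on the commit path
    return "committed"            # acknowledge only after the force returns


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "wal.log")
    wal = WriteAheadLog(log_path)
    buffer_pool = {"P": 100}
    status = commit(wal, "T1", buffer_pool, "P", before=100, after=200)
    wal.close()
    with open(log_path, encoding="utf-8") as f:
        records = [json.loads(line) for line in f]
```

The data page is never written to disk in this sketch, yet the update and commit records are both in the log file by the time `commit` returns. That is the whole trade: one sequential log flush per commit instead of a random write per modified page.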
Consistency is the most nuanced ACID property. It encompasses:
Recovery's Role in Consistency:
Consistency Levels:
| Consistency Aspect | Pre-Crash State | Post-Recovery State | Mechanism |
|---|---|---|---|
| Schema constraints | Valid (enforced at commit) | Valid | Atomic commit/recovery |
| Referential integrity | Valid | Valid | Same—partial commits impossible |
| Index consistency | May be in-flight updates | Valid | Redo/undo applied to indexes too |
| Physical page structure | May have partial writes | Valid | Torn page protection + recovery |
| Transaction state | May be indeterminate | Committed or aborted | Recovery resolves all transactions |
The Consistency Guarantee:
If the database was consistent before the crash (and transactions maintained consistency during execution), then recovery produces a consistent database. Recovery doesn't validate consistency—it assumes consistent transactions and ensures only complete transactions take effect.
Responsibility Split:
The database cannot verify arbitrary business rules. If an application creates a transaction that leaves data in a logically inconsistent state (e.g., negative bank balance without triggering a constraint), the database will commit it. Consistency requires correct application logic plus database mechanism support.
Isolation is primarily implemented by concurrency control (locking, MVCC), but recovery has important interactions with isolation, especially during abort scenarios:
Scenario: Cascading Abort
With some isolation levels and locking schemes, one transaction aborting can force other transactions to abort:
T1: Update X = 100 → 200
T2: Read X (sees 200) ← T2 read uncommitted value from T1
T2: Continue processing based on X=200
T1: ABORT
→ T1's change is undone: X = 100
→ But T2 has already seen X=200!
→ T2 has a 'dirty read'
→ T2 must also abort to maintain isolation
→ This is a 'cascading abort'
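One way to picture the bookkeeping behind a cascading abort is a dependency graph from each writer to the transactions that read its uncommitted data. The sketch below is a hypothetical illustration of that idea (the `DependencyTracker` class is not taken from any real system); it computes which transactions must abort when one of them does.

```python
from collections import defaultdict
from typing import Dict, Set


class DependencyTracker:
    def __init__(self) -> None:
        # writer -> transactions that read the writer's uncommitted data
        self._readers_of: Dict[str, Set[str]] = defaultdict(set)

    def record_dirty_read(self, reader: str, writer: str) -> None:
        self._readers_of[writer].add(reader)

    def abort_set(self, txn: str) -> Set[str]:
        """Return txn plus every transaction that transitively depends on it."""
        to_abort: Set[str] = set()
        stack = [txn]
        while stack:
            current = stack.pop()
            if current in to_abort:
                continue
            to_abort.add(current)
            stack.extend(self._readers_of.get(current, ()))
        return to_abort


tracker = DependencyTracker()
tracker.record_dirty_read(reader="T2", writer="T1")  # T2 read X=200 from T1
tracker.record_dirty_read(reader="T3", writer="T2")  # T3 read T2's uncommitted output
cascade = tracker.abort_set("T1")
```

Here aborting T1 drags T2 and T3 with it, while aborting T3 alone affects nobody else. Schemes that forbid dirty reads keep this graph empty, so the abort set is always just the aborting transaction.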
Recovery's Role:
| Isolation Level | Dirty Read Possible? | Cascading Abort Risk | Recovery Complexity |
|---|---|---|---|
| Read Uncommitted | Yes | High—readers may see uncommitted data that gets rolled back | Must handle cascading aborts |
| Read Committed | No | None: readers only see committed data, so an abort cannot invalidate a read | Standard recovery |
| Repeatable Read | No | None: no dirty reads, so no cascading aborts | Standard recovery |
| Serializable | No | None—full isolation prevents all anomalies | Standard recovery |
| MVCC-based levels | No | None—readers use snapshots, never see uncommitted | Garbage collection needed for old versions |
Crash During Concurrent Execution:
When a crash occurs with multiple active transactions:
Recovery and Strict 2PL:
Strict 2PL (holding write locks until commit) actually simplifies recovery:
This is why strict 2PL is the most common concurrency control scheme in traditional databases—it provides both serializability and clean recovery semantics.
MVCC (Multi-Version Concurrency Control) maintains multiple versions of data items. Readers access old, committed versions while writers create new versions. This eliminates read-write conflicts, but adds complexity: old versions must be garbage collected, and recovery must account for version chains. The tradeoff is often worthwhile for read-heavy workloads.
Having examined each property individually, let's appreciate how they form an integrated system. The log is the unifying mechanism:
The Log as ACID Foundation:
┌─────────────────────────────────────────────────────────────────────────┐
│ TRANSACTION LOG │
├─────────────────────────────────────────────────────────────────────────┤
│ │
│ LSN 100: T1 BEGIN │
│ LSN 101: T1 Update P1 (before=A, after=B) ← Atomicity (undo) │
│ LSN 102: T2 BEGIN │
│ LSN 103: T2 Update P2 (before=X, after=Y) ← Durability (redo) │
│ LSN 104: T1 Update P3 (before=M, after=N) ← Both │
│ LSN 105: T2 COMMIT ← Durability │
│ LSN 106: T1 ABORT ← Atomicity │
│ LSN 107: CLR for LSN 104 (undo P3: N→M) ← Atomicity │
│ LSN 108: CLR for LSN 101 (undo P1: B→A) ← Atomicity │
│ LSN 109: T1 END │
│ │
└─────────────────────────────────────────────────────────────────────────┘
After this sequence:
- T2 is committed and durable (DURABILITY)
- T1 is fully rolled back, no visible effects in the database (ATOMICITY)
- Database is consistent (CONSISTENCY via complete tx only)
- T2 didn't see T1's uncommitted changes (ISOLATION via locks/MVCC)
The Recovery Algorithm Unifies All Properties:
| ARIES Phase | Atomicity Support | Durability Support | Consistency Support | Isolation Support |
|---|---|---|---|---|
| Analysis | Identifies uncommitted transactions | Identifies redo point | N/A | N/A |
| Redo | N/A | Reapplies all logged changes (repeating history), which restores committed work | Restores complete tx state | Rebuilds pre-crash state |
| Undo | Reverses uncommitted changes | N/A | Removes partial tx effects | Cleans up for future tx |
Every phase serves multiple properties. The recovery algorithm is the implementation of ACID.
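To make the three phases concrete, here is a deliberately simplified Python sketch that runs them over the first part of the example log, assuming the crash happens right after LSN 105 (T2 COMMIT) while T1 is still active. It is not ARIES: checkpoints, the dirty page table, PageLSN checks, and CLRs are all omitted, and the `LogRecord` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class LogRecord:
    lsn: int
    txn: str
    kind: str                      # "begin", "update", or "commit"
    page: Optional[str] = None
    before: Optional[str] = None
    after: Optional[str] = None


def analysis(log: List[LogRecord]) -> Set[str]:
    """Find 'loser' transactions: begun but not committed at the crash."""
    active: Set[str] = set()
    for rec in log:
        if rec.kind == "begin":
            active.add(rec.txn)
        elif rec.kind == "commit":
            active.discard(rec.txn)
    return active


def redo(log: List[LogRecord], pages: Dict[str, str]) -> None:
    """Repeat history: reapply every logged update, committed or not."""
    for rec in log:
        if rec.kind == "update":
            pages[rec.page] = rec.after


def undo(log: List[LogRecord], pages: Dict[str, str], losers: Set[str]) -> None:
    """Reverse the losers' updates, newest first, using before-images."""
    for rec in reversed(log):
        if rec.kind == "update" and rec.txn in losers:
            pages[rec.page] = rec.before


def recover(log: List[LogRecord], pages: Dict[str, str]) -> Set[str]:
    losers = analysis(log)
    redo(log, pages)
    undo(log, pages, losers)
    return losers


log = [
    LogRecord(100, "T1", "begin"),
    LogRecord(101, "T1", "update", page="P1", before="A", after="B"),
    LogRecord(102, "T2", "begin"),
    LogRecord(103, "T2", "update", page="P2", before="X", after="Y"),
    LogRecord(104, "T1", "update", page="P3", before="M", after="N"),
    LogRecord(105, "T2", "commit"),
]

# Disk state at the crash: P1 was stolen (uncommitted B on disk),
# P2 was never flushed (committed Y missing), P3 is untouched.
pages = {"P1": "B", "P2": "X", "P3": "M"}
losers = recover(log, pages)
```

After `recover`, T2's committed update is present (P2 = Y) and both of T1's updates are gone (P1 = A, P3 = M), regardless of which pages happened to reach disk before the crash.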
When reasoning about ACID, think log-centric: The log is the source of truth. Data pages are an optimization for fast random access. During recovery, the log reconstructs the guaranteed-correct state. This mental model clarifies why logging is so fundamental to transactional databases.
Understanding the ACID-recovery connection has practical implications for database configuration, monitoring, and troubleshooting:
synchronous_commit=off (PostgreSQL) or innodb_flush_log_at_trx_commit=2 (MySQL) reduces commit latency but creates a window of potential data loss.
| Setting | Strong ACID | Reduced ACID | Risk |
|---|---|---|---|
| Log sync at commit | Every commit syncs log | Batched/delayed sync | Recent commits may be lost on crash |
| Replication mode | Synchronous (wait for standby) | Asynchronous | Committed data may be lost if primary fails |
| Checkpoint interval | Frequent (2-5 min) | Infrequent (30 min) | Longer recovery time after crash |
| Isolation level | Serializable | Read Committed | Possible anomalies (non-repeatable reads) |
Monitoring for ACID Health:
Operators should monitor:
Troubleshooting Recovery Issues:
Default or inherited settings are not always tuned for production durability; some favor performance instead. Before deploying, explicitly verify: Are commits synchronous? Is the log on reliable storage? Is replication synchronous? Document your durability posture, because discovering a misconfiguration through data loss is far more costly than checking up front.
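A verification step like this can be partly automated. The sketch below is one possible shape for such a check, under clear assumptions: the setting names (`synchronous_commit`, `fsync`, and `full_page_writes` for PostgreSQL, `innodb_flush_log_at_trx_commit` for MySQL) are real, but the `audit_durability` function, the `RISKY_VALUES` table, and the idea of passing settings in as a dictionary are illustrative. How the values are fetched from a live server is left out.

```python
from typing import Dict, List, Set

# Values that weaken crash durability, keyed by setting name.
# This table is an illustrative starting point, not an exhaustive list.
RISKY_VALUES: Dict[str, Set[str]] = {
    "synchronous_commit": {"off"},              # PostgreSQL
    "fsync": {"off"},                           # PostgreSQL
    "full_page_writes": {"off"},                # PostgreSQL
    "innodb_flush_log_at_trx_commit": {"0", "2"},  # MySQL / InnoDB
}


def audit_durability(settings: Dict[str, str]) -> List[str]:
    """Return 'name=value' for each setting found at a risky value."""
    findings = []
    for name, risky in RISKY_VALUES.items():
        value = settings.get(name)
        if value is not None and value.lower() in risky:
            findings.append(f"{name}={value}")
    return findings


postgres_findings = audit_durability(
    {"synchronous_commit": "off", "fsync": "on", "full_page_writes": "on"}
)
mysql_findings = audit_durability({"innodb_flush_log_at_trx_commit": "1"})
```

A check like this only covers log-flush behavior. Storage reliability and replication mode still need to be confirmed separately.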
ACID properties are the promises databases make; the recovery system is how those promises are kept. Every aspect of recovery—logging, checkpoints, stable storage, and recovery algorithms—exists to implement one or more ACID guarantees. Let's consolidate the key insights:
Module Complete:
This concludes Module 2: Recovery Concepts. You now have a comprehensive understanding of:
With this conceptual foundation, you're prepared for the detailed chapters ahead on specific recovery mechanisms: Write-Ahead Logging (WAL), Checkpoints, and the ARIES algorithm in depth.