Loading content...
What if transactions could write changes to the database before committing? This simple question opens the door to the Immediate Update technique—also known as UNDO/REDO—which represents the dominant approach in modern database management systems.
Unlike deferred update, which holds all modifications in memory until commit, immediate update allows the buffer manager to flush modified pages to disk at any time, even while transactions are still active. This flexibility solves the critical limitation of deferred update: transactions are no longer constrained by buffer pool size.
By the end of this page, you will understand how immediate update enables scalable transaction processing, why Write-Ahead Logging (WAL) is essential, how recovery handles both incomplete and committed transactions, and the trade-offs that come with this additional flexibility.
The Immediate Update technique relaxes the constraint imposed by deferred update:
Uncommitted data may be written to the stable database before a transaction commits.
This means that at any point during execution, the database on disk may contain a mixture of:
Why 'UNDO/REDO'?
The name describes the recovery operations required:
Both operations are necessary because we lose control over the relationship between transaction state and database state.
| Aspect | Policy | Rationale |
|---|---|---|
| Write Timing | Anytime (before or after commit) | Flexible buffer management |
| Buffer Management | Pages can be flushed at will (STEAL) | No buffer pool size constraints |
| Log Content | Both old and new values | Enables both UNDO and REDO |
| Recovery Action | UNDO uncommitted, REDO committed | Handles both failure scenarios |
| Transaction Size | Unlimited (can exceed buffer pool) | Supports bulk operations |
The Trade-off:
Immediate update trades recovery simplicity for operational flexibility. While deferred update offers a clean 'REDO-only' recovery, immediate update requires a more complex recovery procedure that must both undo incomplete work and redo lost committed work. This complexity is the price of supporting arbitrary transaction sizes and flexible memory management.
Immediate update (with UNDO/REDO recovery) is the foundation of virtually all modern commercial DBMS implementations, including Oracle, PostgreSQL, MySQL InnoDB, SQL Server, and DB2. The ARIES algorithm, which we'll study later, is an optimized form of immediate update.
With uncommitted changes potentially on disk, we face a critical question: How do we ensure we can always undo them? The answer is Write-Ahead Logging (WAL)—the most important rule in database recovery.
The WAL Protocol:
Before any modification is written to the database, the corresponding log record must first be written to stable storage.
This rule has two components:
UNDO Rule (WAL for atomicity): Before writing a dirty page to disk, all log records for that page (containing before-images) must be on stable storage. This ensures we can undo any uncommitted change found in the database.
REDO Rule (WAL for durability): Before acknowledging a commit, all log records for the transaction (including the COMMIT record) must be on stable storage. This ensures we can redo any committed change not yet in the database.
123456789101112131415161718192021222324252627282930313233
// The WAL Protocol: Two fundamental rules RULE 1: UNDO Guarantee (Atomicity)// "Log before data" for dirty pages PROCEDURE Flush_Page_To_Disk(page P): // Find the most recent LSN of any log record affecting P page_lsn := P.pageLSN // Ensure all log records up to that LSN are on stable storage IF log_buffer.flushed_LSN < page_lsn THEN FORCE_LOG_UP_TO(page_lsn) END IF // NOW it is safe to write the page WRITE_PAGE_TO_DISK(P) RULE 2: REDO Guarantee (Durability)// "Force log at commit" PROCEDURE Commit_Transaction(transaction T): // Create the COMMIT log record commit_record := create_commit_record(T) append_to_log_buffer(commit_record) // Force all log records for T (including COMMIT) to disk FORCE_LOG_UP_TO(commit_record.LSN) // NOW the transaction is committed // Database pages can be written later (or not at all until recovery) RETURN "COMMITTED"Why WAL Works:
Consider what happens during recovery after a crash:
If an uncommitted change is in the database: The log contains the before-image (because of Rule 1). We can UNDO the change.
If a committed change is NOT in the database: The log contains the after-image (because of Rule 2). We can REDO the change.
WAL ensures that the log is always ahead of the database. The log contains everything we need to reconstruct any consistent state.
The Log Sequence Number (LSN):
LSNs are monotonically increasing identifiers assigned to each log record. Every database page also maintains a pageLSN indicating the most recent log record that modified it. This connection between pages and log records is essential for recovery:
Never, under any circumstances, write a database page to disk if the log records describing its modifications aren't already on stable storage. Violating WAL can cause unrecoverable data corruption—changes in the database with no corresponding log entries.
Under immediate update, log records must carry more information than in deferred update systems. Both UNDO and REDO operations require sufficient data to reconstruct or reverse any modification.
Essential Log Record Components:
| Record Type | Contents | Purpose |
|---|---|---|
| [START, T] | Transaction ID, Timestamp | Marks transaction beginning |
| [WRITE, T, P, offset, before, after] | Full modification details | Enables both UNDO and REDO |
| [COMMIT, T] | Transaction ID, Commit timestamp | Marks successful completion |
| [ABORT, T] | Transaction ID | Marks intentional rollback |
| [CLR, T, P, ...] | Compensation Log Record | Records UNDO actions (prevents re-UNDO) |
| [CHECKPOINT] | Active transaction list, dirty pages | Bounds recovery work (discussed in ARIES) |
Example Log Sequence:
Consider transaction T1 updating account A (balance: $500 → $400) and account B (balance: $200 → $300):
LSN=100: [START, T1]
LSN=101: [WRITE, T1, Page7, offset=24, before=500, after=400] // A: 500→400
LSN=102: [WRITE, T1, Page12, offset=48, before=200, after=300] // B: 200→300
LSN=103: [COMMIT, T1]
Unlike deferred update, we record BOTH the before-image (500, 200) AND the after-image (400, 300). This allows:
The PrevLSN Chain:
Each log record includes a pointer to the previous log record for the same transaction. This creates per-transaction chains that simplify rollback:
LSN=100: [START, T1, prevLSN=null]
LSN=101: [WRITE, T1, ..., prevLSN=100]
LSN=102: [WRITE, T1, ..., prevLSN=101]
LSN=103: [COMMIT, T1, prevLSN=102]
To roll back T1, follow the chain backwards from the most recent record, undoing each WRITE.
Log records in immediate update are approximately twice the size of deferred update records because they include both before and after images. For write-heavy workloads with large modifications, this can significantly increase log I/O and storage requirements.
The key advantage of immediate update is buffer management flexibility. The buffer manager can operate under a STEAL policy:
STEAL: The buffer manager may 'steal' a buffer frame by writing its dirty page to disk, even if the page contains modifications from uncommitted transactions.
This is the opposite of the NO-STEAL policy required by deferred update. With STEAL:
FORCE vs NO-FORCE:
Orthogonal to STEAL/NO-STEAL is another policy choice:
Most modern systems use STEAL with NO-FORCE:
This combination (STEAL/NO-FORCE) requires both UNDO and REDO during recovery, which is exactly what immediate update provides.
| Policy | FORCE | NO-FORCE |
|---|---|---|
| NO-STEAL | No UNDO, No REDO (impractical) | No UNDO, REDO needed (Deferred Update) |
| STEAL | UNDO needed, No REDO (uncommon) | UNDO and REDO needed (Immediate Update) |
The STEAL/NO-FORCE combination is considered optimal for general-purpose databases: it removes transaction size limits (STEAL) while providing fast commits without synchronous data page writes (NO-FORCE). This is why ARIES, which supports STEAL/NO-FORCE, became the industry standard.
Recovery in an immediate update system is more complex than deferred update because we must handle two distinct problems:
The classic recovery algorithm uses two passes:
Pass 1: REDO (Forward Scan)
Scan the log from the beginning (or last checkpoint) to the end, replaying ALL write operations regardless of whether the transaction committed. This restores the database to its exact pre-crash state, including uncommitted modifications.
Pass 2: UNDO (Backward Scan)
Scan the log backwards, undoing all modifications from transactions that didn't commit. This removes the effects of incomplete transactions.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354
PROCEDURE Recover_Database(): // Phase 1: Analysis - Identify transaction states committed := {} // Transactions with COMMIT active := {} // Transactions without COMMIT (will be undone) FOR EACH log_record IN log (forward): CASE log_record.type OF: START: active.add(log_record.txn_id) COMMIT: committed.add(log_record.txn_id) active.remove(log_record.txn_id) ABORT: active.remove(log_record.txn_id) END CASE END FOR // Phase 2: REDO - Replay all modifications (forward scan) // This restores database to pre-crash state FOR EACH log_record IN log (forward): IF log_record.type = WRITE THEN page := read_page(log_record.page_id) IF page.pageLSN < log_record.LSN THEN // Page doesn't have this update, apply it apply_redo(log_record, page) page.pageLSN := log_record.LSN write_page(page) END IF // If page.pageLSN >= record.LSN, update already present END IF END FOR // Phase 3: UNDO - Reverse uncommitted modifications (backward scan) undo_list := collect_write_records_for(active) WHILE undo_list is not empty: record := undo_list.pop_highest_LSN() page := read_page(record.page_id) // Apply UNDO using before-image apply_undo(record, page) // Write CLR to log (prevents re-undo on subsequent crash) write_CLR(record) write_page(page) END WHILE // All active transactions are now rolled back FOR EACH txn_id IN active: write_abort_record(txn_id) END FOR RETURN successRecovery Example:
Consider this log state at crash time:
LSN=100: [START, T1]
LSN=101: [WRITE, T1, A, before=500, after=400]
LSN=102: [COMMIT, T1]
LSN=103: [START, T2]
LSN=104: [WRITE, T2, B, before=200, after=300]
LSN=105: [START, T3]
LSN=106: [WRITE, T3, C, before=100, after=150]
LSN=107: [WRITE, T2, D, before=50, after=75]
LSN=108: [COMMIT, T3]
--- CRASH ---
Recovery Analysis:
| Transaction | Has COMMIT? | Action |
|---|---|---|
| T1 | Yes | REDO if needed |
| T2 | No | REDO, then UNDO |
| T3 | Yes | REDO if needed |
REDO Phase (forward): Apply all writes (LSN 101, 104, 106, 107)
UNDO Phase (backward): Undo T2's writes in reverse order:
When we undo a modification, we write a CLR to record this action. CLRs ensure that if the system crashes during recovery, we don't undo the same operation twice. CLRs are 'redo-only'—they're replayed during REDO but never undone themselves.
The basic UNDO/REDO algorithm can be optimized in several ways. Understanding these optimizations provides insight into how production systems achieve practical recovery times.
LSN-Based REDO Optimization:
Not every log record needs to be applied during REDO. If a page's pageLSN is greater than or equal to a log record's LSN, that modification is already present—the page was flushed after the modification occurred. We can skip the redo:
if page.pageLSN >= log_record.LSN:
skip // Already applied
else:
apply_redo(log_record) // Need to redo
This optimization is crucial because it means:
Checkpointing to Bound Recovery:
Without checkpoints, recovery must scan the entire log—potentially gigabytes or terabytes of data. Checkpoints create 'safe points' that limit how far back we need to look:
During recovery:
Parallel REDO:
Modern systems parallelize REDO by:
UNDO Optimization:
UNDO is inherently serial for each transaction (must undo in reverse order), but:
| Phase | Direction | Scope | Parallelization |
|---|---|---|---|
| Analysis | Forward | Checkpoint to crash | Single-threaded (fast) |
| REDO | Forward | Min recLSN to crash | Highly parallelizable |
| UNDO | Backward | Per-transaction chains | Parallel across transactions |
Modern databases aim for sub-minute recovery times even after crashes with significant dirty data. This is achieved through aggressive checkpointing, efficient log structure, and parallel recovery. ARIES, which we'll study next, incorporates all these optimizations.
Before discussing recovery further, it's important to understand how immediate update handles transaction rollback during normal operation (not crash recovery). When a transaction explicitly aborts or must be rolled back due to conflicts, the system must undo its modifications.
The Rollback Process:
The CLRs serve two purposes:
123456789101112131415161718192021222324252627282930313233343536373839
PROCEDURE Rollback_Transaction(transaction T): // Start from T's most recent log record current_LSN := T.lastLSN WHILE current_LSN != NULL: record := read_log_record(current_LSN) IF record.type = WRITE THEN // Undo this modification page := read_page(record.page_id) page[record.offset] := record.before_image // Write CLR (Compensation Log Record) clr := { type: CLR, txn_id: T.id, page_id: record.page_id, offset: record.offset, undo_value: record.before_image, undoNextLSN: record.prevLSN // Points to next record to undo } write_log_record(clr) // Flush page (respecting WAL) page.pageLSN := clr.LSN // Page may be flushed now or later END IF // Move to previous record in T's chain current_LSN := record.prevLSN END WHILE // Write ABORT record write_log_record({type: ABORT, txn_id: T.id}) // Release locks release_all_locks(T) RETURN successWhy CLRs are Essential:
Consider what happens if the system crashes during a rollback:
Without CLRs, recovery would:
With CLRs:
The undoNextLSN Pointer:
Each CLR includes an undoNextLSN field pointing to the next log record that needs to be undone. This creates a 'skip chain' that allows UNDO to efficiently navigate past already-compensated operations.
Transaction rollback in immediate update requires I/O proportional to the number of modifications. For a transaction that modified 10,000 pages, rollback must read and write each page. This is fundamentally different from deferred update where abort is nearly free. In immediate update, long transactions are expensive to abort.
The immediate update (UNDO/REDO) algorithm provides the flexibility needed by modern, general-purpose database systems. Let's consolidate the key concepts:
What's Next:
In the next page, we'll explore Shadow Paging—a fundamentally different approach to recovery that avoids logging entirely by maintaining shadow copies of modified pages. While elegant in concept, shadow paging has practical limitations that explain why logging-based approaches dominate.
You now understand immediate update recovery: why WAL is critical, how STEAL/NO-FORCE enables flexible buffer management, the two-phase recovery procedure, and the role of CLRs in crash-safe rollback. This knowledge directly prepares you for understanding ARIES, the industry-standard recovery algorithm.