Recovery Algorithm Types - Learning Module

Loading content...

0/252

Immediate Update (UNDO/REDO)

The Freedom to Write Early

What if transactions could write changes to the database before committing? This simple question opens the door to the Immediate Update technique—also known as UNDO/REDO—which represents the dominant approach in modern database management systems.

Unlike deferred update, which holds all modifications in memory until commit, immediate update allows the buffer manager to flush modified pages to disk at any time, even while transactions are still active. This flexibility solves the critical limitation of deferred update: transactions are no longer constrained by buffer pool size.

What You Will Learn

By the end of this page, you will understand how immediate update enables scalable transaction processing, why Write-Ahead Logging (WAL) is essential, how recovery handles both incomplete and committed transactions, and the trade-offs that come with this additional flexibility.

The Fundamental Concept

The Immediate Update technique relaxes the constraint imposed by deferred update:

Uncommitted data may be written to the stable database before a transaction commits.

This means that at any point during execution, the database on disk may contain a mixture of:

Committed, persisted changes (normal, expected state)
Uncommitted changes from active transactions (in-flight modificationss)
Partial changes from transactions that will never commit (due to future abort or failure)

Why 'UNDO/REDO'?

The name describes the recovery operations required:

UNDO: Changes from uncommitted transactions may be in the database and must be reversed
REDO: Changes from committed transactions may NOT be in the database and must be reapplied

Both operations are necessary because we lose control over the relationship between transaction state and database state.

Immediate Update Policy Summary
Aspect	Policy	Rationale
Write Timing	Anytime (before or after commit)	Flexible buffer management
Buffer Management	Pages can be flushed at will (STEAL)	No buffer pool size constraints
Log Content	Both old and new values	Enables both UNDO and REDO
Recovery Action	UNDO uncommitted, REDO committed	Handles both failure scenarios
Transaction Size	Unlimited (can exceed buffer pool)	Supports bulk operations

The Trade-off:

Immediate update trades recovery simplicity for operational flexibility. While deferred update offers a clean 'REDO-only' recovery, immediate update requires a more complex recovery procedure that must both undo incomplete work and redo lost committed work. This complexity is the price of supporting arbitrary transaction sizes and flexible memory management.

The Dominant Approach

Immediate update (with UNDO/REDO recovery) is the foundation of virtually all modern commercial DBMS implementations, including Oracle, PostgreSQL, MySQL InnoDB, SQL Server, and DB2. The ARIES algorithm, which we'll study later, is an optimized form of immediate update.

Write-Ahead Logging (WAL)

With uncommitted changes potentially on disk, we face a critical question: How do we ensure we can always undo them? The answer is Write-Ahead Logging (WAL)—the most important rule in database recovery.

The WAL Protocol:

Before any modification is written to the database, the corresponding log record must first be written to stable storage.

This rule has two components:

UNDO Rule (WAL for atomicity): Before writing a dirty page to disk, all log records for that page (containing before-images) must be on stable storage. This ensures we can undo any uncommitted change found in the database.
REDO Rule (WAL for durability): Before acknowledging a commit, all log records for the transaction (including the COMMIT record) must be on stable storage. This ensures we can redo any committed change not yet in the database.

wal_protocol.pseudo

Pseudocode

// The WAL Protocol: Two fundamental rules
 
RULE 1: UNDO Guarantee (Atomicity)
// "Log before data" for dirty pages
 
PROCEDURE Flush_Page_To_Disk(page P):
    // Find the most recent LSN of any log record affecting P
    page_lsn := P.pageLSN
    
    // Ensure all log records up to that LSN are on stable storage
    IF log_buffer.flushed_LSN < page_lsn THEN
        FORCE_LOG_UP_TO(page_lsn)
    END IF
    
    // NOW it is safe to write the page
    WRITE_PAGE_TO_DISK(P)
 
 
RULE 2: REDO Guarantee (Durability)
// "Force log at commit"
 
PROCEDURE Commit_Transaction(transaction T):
    // Create the COMMIT log record
    commit_record := create_commit_record(T)
    append_to_log_buffer(commit_record)
    
    // Force all log records for T (including COMMIT) to disk
    FORCE_LOG_UP_TO(commit_record.LSN)
    
    // NOW the transaction is committed
    // Database pages can be written later (or not at all until recovery)
    
    RETURN "COMMITTED"

Why WAL Works:

Consider what happens during recovery after a crash:

If an uncommitted change is in the database: The log contains the before-image (because of Rule 1). We can UNDO the change.
If a committed change is NOT in the database: The log contains the after-image (because of Rule 2). We can REDO the change.

WAL ensures that the log is always ahead of the database. The log contains everything we need to reconstruct any consistent state.

The Log Sequence Number (LSN):

LSNs are monotonically increasing identifiers assigned to each log record. Every database page also maintains a pageLSN indicating the most recent log record that modified it. This connection between pages and log records is essential for recovery:

When flushing a page, we check: is the log flushed past this page's LSN?
During REDO, we check: is this page already updated past this log record's LSN?

The Cardinal Rule

Never, under any circumstances, write a database page to disk if the log records describing its modifications aren't already on stable storage. Violating WAL can cause unrecoverable data corruption—changes in the database with no corresponding log entries.

Log Record Structure

Under immediate update, log records must carry more information than in deferred update systems. Both UNDO and REDO operations require sufficient data to reconstruct or reverse any modification.

Essential Log Record Components:

LSN (Log Sequence Number): Unique, monotonically increasing identifier
Transaction ID: The transaction that generated this record
Previous LSN: LSN of the previous log record for this transaction (enables transaction-specific processing)
Record Type: START, WRITE, COMMIT, ABORT, CLR, etc.
Page ID: For WRITE records, identifies the affected page
Offset/Slot: Position within the page
Before-Image: The original value (for UNDO)
After-Image: The new value (for REDO)

Log Record Types in Immediate Update
Record Type	Contents	Purpose
[START, T]	Transaction ID, Timestamp	Marks transaction beginning
[WRITE, T, P, offset, before, after]	Full modification details	Enables both UNDO and REDO
[COMMIT, T]	Transaction ID, Commit timestamp	Marks successful completion
[ABORT, T]	Transaction ID	Marks intentional rollback
[CLR, T, P, ...]	Compensation Log Record	Records UNDO actions (prevents re-UNDO)
[CHECKPOINT]	Active transaction list, dirty pages	Bounds recovery work (discussed in ARIES)

Example Log Sequence:

Consider transaction T1 updating account A (balance: $500 → $400) and account B (balance: $200 → $300):

LSN=100: [START, T1]
LSN=101: [WRITE, T1, Page7, offset=24, before=500, after=400]  // A: 500→400
LSN=102: [WRITE, T1, Page12, offset=48, before=200, after=300] // B: 200→300
LSN=103: [COMMIT, T1]

Unlike deferred update, we record BOTH the before-image (500, 200) AND the after-image (400, 300). This allows:

UNDO: If T1 must be rolled back, use before-images to restore original values
REDO: If committed but changes lost, use after-images to reapply

The PrevLSN Chain:

Each log record includes a pointer to the previous log record for the same transaction. This creates per-transaction chains that simplify rollback:

LSN=100: [START, T1, prevLSN=null]
LSN=101: [WRITE, T1, ..., prevLSN=100]
LSN=102: [WRITE, T1, ..., prevLSN=101]
LSN=103: [COMMIT, T1, prevLSN=102]

To roll back T1, follow the chain backwards from the most recent record, undoing each WRITE.

Storage Cost

Log records in immediate update are approximately twice the size of deferred update records because they include both before and after images. For write-heavy workloads with large modifications, this can significantly increase log I/O and storage requirements.

Buffer Management: The STEAL Policy

The key advantage of immediate update is buffer management flexibility. The buffer manager can operate under a STEAL policy:

STEAL: The buffer manager may 'steal' a buffer frame by writing its dirty page to disk, even if the page contains modifications from uncommitted transactions.

This is the opposite of the NO-STEAL policy required by deferred update. With STEAL:

The buffer pool is never 'blocked' by active transactions
Any page can be evicted when memory is needed
Transactions can modify more data than fits in memory
The system gracefully handles memory pressure

STEAL Policy Benefits

•No transaction size limit
•Efficient buffer utilization
•Supports bulk operations
•Handles memory pressure gracefully
•Enables very large transactions

STEAL Policy Costs

•Requires UNDO during recovery
•Must log before-images
•Larger log records
•More complex recovery algorithm
•Potential cascading abort issues

FORCE vs NO-FORCE:

Orthogonal to STEAL/NO-STEAL is another policy choice:

FORCE: All modified pages must be written to disk before commit completes
NO-FORCE: Modified pages may remain in buffer after commit (written lazily)

Most modern systems use STEAL with NO-FORCE:

STEAL enables unlimited transaction sizes
NO-FORCE reduces commit latency (just force the log, not all data pages)

This combination (STEAL/NO-FORCE) requires both UNDO and REDO during recovery, which is exactly what immediate update provides.

Buffer Policy Combinations
Policy	FORCE	NO-FORCE
NO-STEAL	No UNDO, No REDO (impractical)	No UNDO, REDO needed (Deferred Update)
STEAL	UNDO needed, No REDO (uncommon)	UNDO and REDO needed (Immediate Update)

The ARIES Sweet Spot

The STEAL/NO-FORCE combination is considered optimal for general-purpose databases: it removes transaction size limits (STEAL) while providing fast commits without synchronous data page writes (NO-FORCE). This is why ARIES, which supports STEAL/NO-FORCE, became the industry standard.

Recovery Procedure

Recovery in an immediate update system is more complex than deferred update because we must handle two distinct problems:

Uncommitted changes in the database: Must be undone
Committed changes NOT in the database: Must be redone

The classic recovery algorithm uses two passes:

Pass 1: REDO (Forward Scan)

Scan the log from the beginning (or last checkpoint) to the end, replaying ALL write operations regardless of whether the transaction committed. This restores the database to its exact pre-crash state, including uncommitted modifications.

Pass 2: UNDO (Backward Scan)

Scan the log backwards, undoing all modifications from transactions that didn't commit. This removes the effects of incomplete transactions.

immediate_update_recovery.pseudo

Pseudocode

PROCEDURE Recover_Database():
    // Phase 1: Analysis - Identify transaction states
    committed := {}      // Transactions with COMMIT
    active := {}         // Transactions without COMMIT (will be undone)
    
    FOR EACH log_record IN log (forward):
        CASE log_record.type OF:
            START:
                active.add(log_record.txn_id)
            COMMIT:
                committed.add(log_record.txn_id)
                active.remove(log_record.txn_id)
            ABORT:
                active.remove(log_record.txn_id)
        END CASE
    END FOR
    
    // Phase 2: REDO - Replay all modifications (forward scan)
    // This restores database to pre-crash state
    FOR EACH log_record IN log (forward):
        IF log_record.type = WRITE THEN
            page := read_page(log_record.page_id)
            IF page.pageLSN < log_record.LSN THEN
                // Page doesn't have this update, apply it
                apply_redo(log_record, page)
                page.pageLSN := log_record.LSN
                write_page(page)
            END IF
            // If page.pageLSN >= record.LSN, update already present
        END IF
    END FOR
    
    // Phase 3: UNDO - Reverse uncommitted modifications (backward scan)
    undo_list := collect_write_records_for(active)
    
    WHILE undo_list is not empty:
        record := undo_list.pop_highest_LSN()
        page := read_page(record.page_id)
        
        // Apply UNDO using before-image
        apply_undo(record, page)
        
        // Write CLR to log (prevents re-undo on subsequent crash)
        write_CLR(record)
        
        write_page(page)
    END WHILE
    
    // All active transactions are now rolled back
    FOR EACH txn_id IN active:
        write_abort_record(txn_id)
    END FOR
    
    RETURN success

Recovery Example:

Consider this log state at crash time:

LSN=100: [START, T1]
LSN=101: [WRITE, T1, A, before=500, after=400]
LSN=102: [COMMIT, T1]
LSN=103: [START, T2]
LSN=104: [WRITE, T2, B, before=200, after=300]
LSN=105: [START, T3]
LSN=106: [WRITE, T3, C, before=100, after=150]
LSN=107: [WRITE, T2, D, before=50, after=75]
LSN=108: [COMMIT, T3]
--- CRASH ---

Recovery Analysis:

Transaction	Has COMMIT?	Action
T1	Yes	REDO if needed
T2	No	REDO, then UNDO
T3	Yes	REDO if needed

REDO Phase (forward): Apply all writes (LSN 101, 104, 106, 107)

UNDO Phase (backward): Undo T2's writes in reverse order:

Undo LSN 107: D restored to 50
Undo LSN 104: B restored to 200

Compensation Log Records (CLRs)

When we undo a modification, we write a CLR to record this action. CLRs ensure that if the system crashes during recovery, we don't undo the same operation twice. CLRs are 'redo-only'—they're replayed during REDO but never undone themselves.

Optimization Considerations

The basic UNDO/REDO algorithm can be optimized in several ways. Understanding these optimizations provides insight into how production systems achieve practical recovery times.

LSN-Based REDO Optimization:

Not every log record needs to be applied during REDO. If a page's pageLSN is greater than or equal to a log record's LSN, that modification is already present—the page was flushed after the modification occurred. We can skip the redo:

if page.pageLSN >= log_record.LSN:
    skip  // Already applied
else:
    apply_redo(log_record)  // Need to redo

This optimization is crucial because it means:

Pages flushed shortly before the crash need minimal redo
Hot pages that were updated frequently benefit most
Clean pages (unmodified since last flush) are untouched

Checkpointing to Bound Recovery:

Without checkpoints, recovery must scan the entire log—potentially gigabytes or terabytes of data. Checkpoints create 'safe points' that limit how far back we need to look:

Record all active transactions
Record all dirty pages and their minimum LSN
Write a CHECKPOINT record

During recovery:

Start REDO from the minimum recLSN of any dirty page at checkpoint time
Only consider transactions active at or after the checkpoint

Parallel REDO:

Modern systems parallelize REDO by:

Dividing the log into segments processed concurrently
Partitioning pages across recovery threads
Using page-level locking during recovery

UNDO Optimization:

UNDO is inherently serial for each transaction (must undo in reverse order), but:

Multiple transactions can be undone in parallel
Transactions with few modifications complete quickly
Most transactions are short, limiting UNDO work

Recovery Phase Characteristics
Phase	Direction	Scope	Parallelization
Analysis	Forward	Checkpoint to crash	Single-threaded (fast)
REDO	Forward	Min recLSN to crash	Highly parallelizable
UNDO	Backward	Per-transaction chains	Parallel across transactions

Recovery Time Objective (RTO)

Modern databases aim for sub-minute recovery times even after crashes with significant dirty data. This is achieved through aggressive checkpointing, efficient log structure, and parallel recovery. ARIES, which we'll study next, incorporates all these optimizations.

Transaction Rollback (Normal Operation)

Before discussing recovery further, it's important to understand how immediate update handles transaction rollback during normal operation (not crash recovery). When a transaction explicitly aborts or must be rolled back due to conflicts, the system must undo its modifications.

The Rollback Process:

Follow the transaction's log chain backwards (using prevLSN pointers)
For each WRITE record encountered, apply the UNDO operation using the before-image
Write a CLR for each undone operation
Write an ABORT record to finalize

The CLRs serve two purposes:

They record the rollback actions (for recovery if we crash mid-rollback)
They point backwards, indicating how far we've undone

transaction_rollback.pseudo

Pseudocode

PROCEDURE Rollback_Transaction(transaction T):
    // Start from T's most recent log record
    current_LSN := T.lastLSN
    
    WHILE current_LSN != NULL:
        record := read_log_record(current_LSN)
        
        IF record.type = WRITE THEN
            // Undo this modification
            page := read_page(record.page_id)
            page[record.offset] := record.before_image
            
            // Write CLR (Compensation Log Record)
            clr := {
                type: CLR,
                txn_id: T.id,
                page_id: record.page_id,
                offset: record.offset,
                undo_value: record.before_image,
                undoNextLSN: record.prevLSN  // Points to next record to undo
            }
            write_log_record(clr)
            
            // Flush page (respecting WAL)
            page.pageLSN := clr.LSN
            // Page may be flushed now or later
        END IF
        
        // Move to previous record in T's chain
        current_LSN := record.prevLSN
    END WHILE
    
    // Write ABORT record
    write_log_record({type: ABORT, txn_id: T.id})
    
    // Release locks
    release_all_locks(T)
    
    RETURN success

Why CLRs are Essential:

Consider what happens if the system crashes during a rollback:

Transaction T was being rolled back
Some WRITE records were undone, CLRs written
CRASH occurs
Recovery begins

Without CLRs, recovery would:

REDO all of T's writes (restoring T's modifications)
UNDO all of T's writes (since T never committed)
But some writes were already undone before the crash!

With CLRs:

REDO replays the CLRs, applying the undo operations
UNDO sees that these operations are already compensated
No double-undo occurs

The undoNextLSN Pointer:

Each CLR includes an undoNextLSN field pointing to the next log record that needs to be undone. This creates a 'skip chain' that allows UNDO to efficiently navigate past already-compensated operations.

Rollback Cost

Transaction rollback in immediate update requires I/O proportional to the number of modifications. For a transaction that modified 10,000 pages, rollback must read and write each page. This is fundamentally different from deferred update where abort is nearly free. In immediate update, long transactions are expensive to abort.

Summary: Immediate Update

The immediate update (UNDO/REDO) algorithm provides the flexibility needed by modern, general-purpose database systems. Let's consolidate the key concepts:

Key Takeaways

•Core flexibility: Uncommitted changes may be written to the database, enabling unlimited transaction sizes
•WAL is essential: Log records must reach stable storage before corresponding database changes to ensure recoverability
•Log structure: Both before-images (for UNDO) and after-images (for REDO) are recorded
•STEAL policy: Buffer manager can evict dirty pages from uncommitted transactions
•NO-FORCE policy: Data pages needn't be forced at commit, only the log (fast commits)
•Two-phase recovery: REDO to restore pre-crash state, UNDO to remove uncommitted changes
•CLRs prevent double-undo: Compensation Log Records ensure crash-safe rollback
•Industry standard: The STEAL/NO-FORCE combination with UNDO/REDO is the foundation of ARIES and all major DBMS

What's Next:

In the next page, we'll explore Shadow Paging—a fundamentally different approach to recovery that avoids logging entirely by maintaining shadow copies of modified pages. While elegant in concept, shadow paging has practical limitations that explain why logging-based approaches dominate.

Page Complete

You now understand immediate update recovery: why WAL is critical, how STEAL/NO-FORCE enables flexible buffer management, the two-phase recovery procedure, and the role of CLRs in crash-safe rollback. This knowledge directly prepares you for understanding ARIES, the industry-standard recovery algorithm.

Immediate Update (UNDO/REDO)

The Freedom to Write Early

What You Will Learn

The Fundamental Concept

The Immediate Update technique relaxes the constraint imposed by deferred update:

Uncommitted data may be written to the stable database before a transaction commits.

This means that at any point during execution, the database on disk may contain a mixture of:

Committed, persisted changes (normal, expected state)
Uncommitted changes from active transactions (in-flight modificationss)
Partial changes from transactions that will never commit (due to future abort or failure)

Why 'UNDO/REDO'?

The name describes the recovery operations required:

UNDO: Changes from uncommitted transactions may be in the database and must be reversed
REDO: Changes from committed transactions may NOT be in the database and must be reapplied

Both operations are necessary because we lose control over the relationship between transaction state and database state.

Immediate Update Policy Summary
Aspect	Policy	Rationale
Write Timing	Anytime (before or after commit)	Flexible buffer management
Buffer Management	Pages can be flushed at will (STEAL)	No buffer pool size constraints
Log Content	Both old and new values	Enables both UNDO and REDO
Recovery Action	UNDO uncommitted, REDO committed	Handles both failure scenarios
Transaction Size	Unlimited (can exceed buffer pool)	Supports bulk operations

The Trade-off:

The Dominant Approach

Write-Ahead Logging (WAL)

The WAL Protocol:

Before any modification is written to the database, the corresponding log record must first be written to stable storage.

This rule has two components:

UNDO Rule (WAL for atomicity): Before writing a dirty page to disk, all log records for that page (containing before-images) must be on stable storage. This ensures we can undo any uncommitted change found in the database.
REDO Rule (WAL for durability): Before acknowledging a commit, all log records for the transaction (including the COMMIT record) must be on stable storage. This ensures we can redo any committed change not yet in the database.

wal_protocol.pseudo

Pseudocode

// The WAL Protocol: Two fundamental rules
 
RULE 1: UNDO Guarantee (Atomicity)
// "Log before data" for dirty pages
 
PROCEDURE Flush_Page_To_Disk(page P):
    // Find the most recent LSN of any log record affecting P
    page_lsn := P.pageLSN
    
    // Ensure all log records up to that LSN are on stable storage
    IF log_buffer.flushed_LSN < page_lsn THEN
        FORCE_LOG_UP_TO(page_lsn)
    END IF
    
    // NOW it is safe to write the page
    WRITE_PAGE_TO_DISK(P)
 
 
RULE 2: REDO Guarantee (Durability)
// "Force log at commit"
 
PROCEDURE Commit_Transaction(transaction T):
    // Create the COMMIT log record
    commit_record := create_commit_record(T)
    append_to_log_buffer(commit_record)
    
    // Force all log records for T (including COMMIT) to disk
    FORCE_LOG_UP_TO(commit_record.LSN)
    
    // NOW the transaction is committed
    // Database pages can be written later (or not at all until recovery)
    
    RETURN "COMMITTED"

Why WAL Works:

Consider what happens during recovery after a crash:

If an uncommitted change is in the database: The log contains the before-image (because of Rule 1). We can UNDO the change.
If a committed change is NOT in the database: The log contains the after-image (because of Rule 2). We can REDO the change.

WAL ensures that the log is always ahead of the database. The log contains everything we need to reconstruct any consistent state.

The Log Sequence Number (LSN):

When flushing a page, we check: is the log flushed past this page's LSN?
During REDO, we check: is this page already updated past this log record's LSN?

The Cardinal Rule

Log Record Structure

Under immediate update, log records must carry more information than in deferred update systems. Both UNDO and REDO operations require sufficient data to reconstruct or reverse any modification.

Essential Log Record Components:

LSN (Log Sequence Number): Unique, monotonically increasing identifier
Transaction ID: The transaction that generated this record
Previous LSN: LSN of the previous log record for this transaction (enables transaction-specific processing)
Record Type: START, WRITE, COMMIT, ABORT, CLR, etc.
Page ID: For WRITE records, identifies the affected page
Offset/Slot: Position within the page
Before-Image: The original value (for UNDO)
After-Image: The new value (for REDO)

Log Record Types in Immediate Update
Record Type	Contents	Purpose
[START, T]	Transaction ID, Timestamp	Marks transaction beginning
[WRITE, T, P, offset, before, after]	Full modification details	Enables both UNDO and REDO
[COMMIT, T]	Transaction ID, Commit timestamp	Marks successful completion
[ABORT, T]	Transaction ID	Marks intentional rollback
[CLR, T, P, ...]	Compensation Log Record	Records UNDO actions (prevents re-UNDO)
[CHECKPOINT]	Active transaction list, dirty pages	Bounds recovery work (discussed in ARIES)

Example Log Sequence:

Consider transaction T1 updating account A (balance: $500 → $400) and account B (balance: $200 → $300):

LSN=100: [START, T1]
LSN=101: [WRITE, T1, Page7, offset=24, before=500, after=400]  // A: 500→400
LSN=102: [WRITE, T1, Page12, offset=48, before=200, after=300] // B: 200→300
LSN=103: [COMMIT, T1]

Unlike deferred update, we record BOTH the before-image (500, 200) AND the after-image (400, 300). This allows:

UNDO: If T1 must be rolled back, use before-images to restore original values
REDO: If committed but changes lost, use after-images to reapply

The PrevLSN Chain:

Each log record includes a pointer to the previous log record for the same transaction. This creates per-transaction chains that simplify rollback:

LSN=100: [START, T1, prevLSN=null]
LSN=101: [WRITE, T1, ..., prevLSN=100]
LSN=102: [WRITE, T1, ..., prevLSN=101]
LSN=103: [COMMIT, T1, prevLSN=102]

To roll back T1, follow the chain backwards from the most recent record, undoing each WRITE.

Storage Cost

Buffer Management: The STEAL Policy

The key advantage of immediate update is buffer management flexibility. The buffer manager can operate under a STEAL policy:

STEAL: The buffer manager may 'steal' a buffer frame by writing its dirty page to disk, even if the page contains modifications from uncommitted transactions.

This is the opposite of the NO-STEAL policy required by deferred update. With STEAL:

The buffer pool is never 'blocked' by active transactions
Any page can be evicted when memory is needed
Transactions can modify more data than fits in memory
The system gracefully handles memory pressure

STEAL Policy Benefits

•No transaction size limit
•Efficient buffer utilization
•Supports bulk operations
•Handles memory pressure gracefully
•Enables very large transactions

STEAL Policy Costs

•Requires UNDO during recovery
•Must log before-images
•Larger log records
•More complex recovery algorithm
•Potential cascading abort issues

FORCE vs NO-FORCE:

Orthogonal to STEAL/NO-STEAL is another policy choice:

FORCE: All modified pages must be written to disk before commit completes
NO-FORCE: Modified pages may remain in buffer after commit (written lazily)

Most modern systems use STEAL with NO-FORCE:

STEAL enables unlimited transaction sizes
NO-FORCE reduces commit latency (just force the log, not all data pages)

This combination (STEAL/NO-FORCE) requires both UNDO and REDO during recovery, which is exactly what immediate update provides.

Buffer Policy Combinations
Policy	FORCE	NO-FORCE
NO-STEAL	No UNDO, No REDO (impractical)	No UNDO, REDO needed (Deferred Update)
STEAL	UNDO needed, No REDO (uncommon)	UNDO and REDO needed (Immediate Update)

The ARIES Sweet Spot

Recovery Procedure

Recovery in an immediate update system is more complex than deferred update because we must handle two distinct problems:

Uncommitted changes in the database: Must be undone
Committed changes NOT in the database: Must be redone

The classic recovery algorithm uses two passes:

Pass 1: REDO (Forward Scan)

Pass 2: UNDO (Backward Scan)

Scan the log backwards, undoing all modifications from transactions that didn't commit. This removes the effects of incomplete transactions.

immediate_update_recovery.pseudo

Pseudocode

PROCEDURE Recover_Database():
    // Phase 1: Analysis - Identify transaction states
    committed := {}      // Transactions with COMMIT
    active := {}         // Transactions without COMMIT (will be undone)
    
    FOR EACH log_record IN log (forward):
        CASE log_record.type OF:
            START:
                active.add(log_record.txn_id)
            COMMIT:
                committed.add(log_record.txn_id)
                active.remove(log_record.txn_id)
            ABORT:
                active.remove(log_record.txn_id)
        END CASE
    END FOR
    
    // Phase 2: REDO - Replay all modifications (forward scan)
    // This restores database to pre-crash state
    FOR EACH log_record IN log (forward):
        IF log_record.type = WRITE THEN
            page := read_page(log_record.page_id)
            IF page.pageLSN < log_record.LSN THEN
                // Page doesn't have this update, apply it
                apply_redo(log_record, page)
                page.pageLSN := log_record.LSN
                write_page(page)
            END IF
            // If page.pageLSN >= record.LSN, update already present
        END IF
    END FOR
    
    // Phase 3: UNDO - Reverse uncommitted modifications (backward scan)
    undo_list := collect_write_records_for(active)
    
    WHILE undo_list is not empty:
        record := undo_list.pop_highest_LSN()
        page := read_page(record.page_id)
        
        // Apply UNDO using before-image
        apply_undo(record, page)
        
        // Write CLR to log (prevents re-undo on subsequent crash)
        write_CLR(record)
        
        write_page(page)
    END WHILE
    
    // All active transactions are now rolled back
    FOR EACH txn_id IN active:
        write_abort_record(txn_id)
    END FOR
    
    RETURN success

Recovery Example:

Consider this log state at crash time:

LSN=100: [START, T1]
LSN=101: [WRITE, T1, A, before=500, after=400]
LSN=102: [COMMIT, T1]
LSN=103: [START, T2]
LSN=104: [WRITE, T2, B, before=200, after=300]
LSN=105: [START, T3]
LSN=106: [WRITE, T3, C, before=100, after=150]
LSN=107: [WRITE, T2, D, before=50, after=75]
LSN=108: [COMMIT, T3]
--- CRASH ---

Recovery Analysis:

Transaction	Has COMMIT?	Action
T1	Yes	REDO if needed
T2	No	REDO, then UNDO
T3	Yes	REDO if needed

REDO Phase (forward): Apply all writes (LSN 101, 104, 106, 107)

UNDO Phase (backward): Undo T2's writes in reverse order:

Undo LSN 107: D restored to 50
Undo LSN 104: B restored to 200

Compensation Log Records (CLRs)

Optimization Considerations

The basic UNDO/REDO algorithm can be optimized in several ways. Understanding these optimizations provides insight into how production systems achieve practical recovery times.

LSN-Based REDO Optimization:

if page.pageLSN >= log_record.LSN:
    skip  // Already applied
else:
    apply_redo(log_record)  // Need to redo

This optimization is crucial because it means:

Pages flushed shortly before the crash need minimal redo
Hot pages that were updated frequently benefit most
Clean pages (unmodified since last flush) are untouched

Checkpointing to Bound Recovery:

Without checkpoints, recovery must scan the entire log—potentially gigabytes or terabytes of data. Checkpoints create 'safe points' that limit how far back we need to look:

Record all active transactions
Record all dirty pages and their minimum LSN
Write a CHECKPOINT record

During recovery:

Start REDO from the minimum recLSN of any dirty page at checkpoint time
Only consider transactions active at or after the checkpoint

Parallel REDO:

Modern systems parallelize REDO by:

Dividing the log into segments processed concurrently
Partitioning pages across recovery threads
Using page-level locking during recovery

UNDO Optimization:

UNDO is inherently serial for each transaction (must undo in reverse order), but:

Multiple transactions can be undone in parallel
Transactions with few modifications complete quickly
Most transactions are short, limiting UNDO work

Recovery Phase Characteristics
Phase	Direction	Scope	Parallelization
Analysis	Forward	Checkpoint to crash	Single-threaded (fast)
REDO	Forward	Min recLSN to crash	Highly parallelizable
UNDO	Backward	Per-transaction chains	Parallel across transactions

Recovery Time Objective (RTO)

Transaction Rollback (Normal Operation)

The Rollback Process:

Follow the transaction's log chain backwards (using prevLSN pointers)
For each WRITE record encountered, apply the UNDO operation using the before-image
Write a CLR for each undone operation
Write an ABORT record to finalize

The CLRs serve two purposes:

They record the rollback actions (for recovery if we crash mid-rollback)
They point backwards, indicating how far we've undone

transaction_rollback.pseudo

Pseudocode

PROCEDURE Rollback_Transaction(transaction T):
    // Start from T's most recent log record
    current_LSN := T.lastLSN
    
    WHILE current_LSN != NULL:
        record := read_log_record(current_LSN)
        
        IF record.type = WRITE THEN
            // Undo this modification
            page := read_page(record.page_id)
            page[record.offset] := record.before_image
            
            // Write CLR (Compensation Log Record)
            clr := {
                type: CLR,
                txn_id: T.id,
                page_id: record.page_id,
                offset: record.offset,
                undo_value: record.before_image,
                undoNextLSN: record.prevLSN  // Points to next record to undo
            }
            write_log_record(clr)
            
            // Flush page (respecting WAL)
            page.pageLSN := clr.LSN
            // Page may be flushed now or later
        END IF
        
        // Move to previous record in T's chain
        current_LSN := record.prevLSN
    END WHILE
    
    // Write ABORT record
    write_log_record({type: ABORT, txn_id: T.id})
    
    // Release locks
    release_all_locks(T)
    
    RETURN success

Why CLRs are Essential:

Consider what happens if the system crashes during a rollback:

Transaction T was being rolled back
Some WRITE records were undone, CLRs written
CRASH occurs
Recovery begins

Without CLRs, recovery would:

REDO all of T's writes (restoring T's modifications)
UNDO all of T's writes (since T never committed)
But some writes were already undone before the crash!

With CLRs:

REDO replays the CLRs, applying the undo operations
UNDO sees that these operations are already compensated
No double-undo occurs

The undoNextLSN Pointer:

Rollback Cost

Summary: Immediate Update

The immediate update (UNDO/REDO) algorithm provides the flexibility needed by modern, general-purpose database systems. Let's consolidate the key concepts:

Key Takeaways

•Core flexibility: Uncommitted changes may be written to the database, enabling unlimited transaction sizes
•WAL is essential: Log records must reach stable storage before corresponding database changes to ensure recoverability
•Log structure: Both before-images (for UNDO) and after-images (for REDO) are recorded
•STEAL policy: Buffer manager can evict dirty pages from uncommitted transactions
•NO-FORCE policy: Data pages needn't be forced at commit, only the log (fast commits)
•Two-phase recovery: REDO to restore pre-crash state, UNDO to remove uncommitted changes
•CLRs prevent double-undo: Compensation Log Records ensure crash-safe rollback
•Industry standard: The STEAL/NO-FORCE combination with UNDO/REDO is the foundation of ARIES and all major DBMS

What's Next:

Page Complete