Database Management SystemsRecovery Algorithms

Undo Phase in ARIES Recovery

LevelAdvanced

Duration75 mins

TopicRecovery Algorithms

1 / 5

Undoing Incomplete Transactions

The Final Phase of Recovery

After a system crash, the ARIES recovery algorithm has completed two critical phases: the Analysis Phase has reconstructed the system state at the time of failure, and the Redo Phase has replayed all logged operations to bring the database to its pre-crash state. But there's a fundamental problem remaining: some transactions were still active when the crash occurred.

These incomplete transactions—sometimes called "loser transactions"—represent a serious threat to database consistency. They may have modified data that they never committed, violating the atomicity guarantee: either all of a transaction's changes should be visible, or none of them should be. The Undo Phase exists to solve this problem by systematically reversing every change made by transactions that never reached the commit point.

What You Will Learn

By the end of this page, you will understand why incomplete transactions must be undone, how the undo phase identifies which transactions require reversal, the relationship between undo and the ACID atomicity property, and the fundamental mechanics of reversing changes through backward log processing.

Why Undo is Necessary

To understand why the undo phase is essential, we must first understand what happens during normal database operation and what state the database may be in after a crash.

The Steal Policy Problem:

ARIES uses a steal buffer management policy, which means dirty pages (pages modified by uncommitted transactions) can be written to disk before the transaction commits. This is done for performance reasons—waiting to flush all dirty pages until commit would create significant I/O bottlenecks.

However, this policy creates a critical challenge: after a crash, the disk may contain modifications from transactions that never committed. These "uncommitted changes on disk" must be removed to maintain atomicity.

Transaction States and Recovery Requirements
Transaction State at Crash	What Redo Does	What Undo Must Do
Committed before crash	Ensures all changes are on disk	Nothing (skip)
In progress at crash	Replays all logged changes	Must reverse ALL changes
Aborted before crash	Replays changes AND undo CLRs	Nothing (already handled)
Prepared (2PC) at crash	Replays all changes	Depends on coordinator decision

The Chain of Reasoning:

Redo is blind: The redo phase replays ALL logged operations, regardless of transaction outcome. This ensures no committed work is lost, but it also means uncommitted changes get replayed.
Uncommitted changes violate atomicity: After redo, the database may contain partial results from transactions that never finished. These partial results could be seen by future transactions, violating isolation and consistency.
Undo restores atomicity: By reversing all changes from uncommitted transactions, the undo phase ensures the database state reflects only complete, committed transactions.

This is the fundamental contract of undo: erase all evidence that an uncommitted transaction ever existed.

Why Not Just Skip Uncommitted Changes in Redo?

You might wonder: why replay uncommitted changes during redo just to undo them afterward? The answer is that redo must reconstruct the exact database state at crash time to correctly handle page-level recovery. Some committed transactions may have modified the same pages as uncommitted ones, and reconstructing the correct page state requires replaying all operations in order. Only after this state is reconstructed can undo safely reverse the uncommitted changes.

Identifying Loser Transactions

The term "loser transaction" refers to any transaction that was active at the time of the crash and did not have a commit record in the log. Identifying these losers is straightforward because the analysis phase has already built the Transaction Table (TT) that tracks the state of all transactions.

How Losers are Identified:

During the analysis phase, ARIES scans the log from the most recent checkpoint to the end, updating the transaction table as it encounters:

Begin records: Add transaction to the table with status ACTIVE
Update records: Update lastLSN for the transaction
Commit records: Change status to COMMITTED, then remove from table (or mark as complete)
Abort records: Change status to ABORTING
End records: Remove transaction from table entirely

At the end of analysis, any transaction remaining in the table with status ACTIVE or ABORTING is a loser that needs to be undone.

transaction_table_after_analysis.sql
Transaction Table State
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Transaction Table after Analysis Phase
-- This table was reconstructed by scanning the log
 
-- Sample Transaction Table Contents:
/*
TransactionID | Status     | LastLSN | FirstLSN | UndoNextLSN
--------------+------------+---------+----------+-------------
    T1        | COMMITTED  |  1050   |   1000   |    NULL      -- Winner: will be removed
    T2        | ACTIVE     |  1045   |   1010   |    1045      -- Loser: must undo
    T3        | ACTIVE     |  1060   |   1025   |    1060      -- Loser: must undo
    T4        | ABORTING   |  1055   |   1030   |    1048      -- Loser: continue undo
    T5        | COMMITTED  |  1040   |   1005   |    NULL      -- Winner: will be removed
*/
 
-- After filtering, Undo Phase works with:
-- Loser Set = {T2, T3, T4}
 
-- For each loser, we need:
-- 1. UndoNextLSN: The next log record to undo
-- 2. LastLSN: Pointer to most recent log record
-- 3. The ability to walk backward through the chain

Properties of Loser Transactions

•Status is ACTIVE or ABORTING — No commit record was written to stable storage before the crash
•May have modified multiple pages — Each modification is recorded in a log record with PrevLSN links
•May have partially undone — If aborting at crash time, some CLRs may already exist
•UndoNextLSN tracks progress — Points to the next log record that needs to be undone
•Form a linked list via PrevLSN — Each log record points to the previous one for the same transaction

The UndoNextLSN Field:

This critical field in the transaction table tracks undo progress. For a transaction that was simply active (never started aborting), UndoNextLSN equals LastLSN—we start from the most recent operation and work backward.

For a transaction that was already aborting when the crash occurred, UndoNextLSN may point somewhere in the middle of the log record chain, indicating how far the undo had progressed before the crash. This is essential for avoiding redundant work and ensuring correctness.

The Loser Set and Undo Topology

When the undo phase begins, it constructs a loser set from the transaction table—the set of all transactions requiring undo. But how should these transactions be processed? ARIES makes a crucial design decision here: rather than processing each loser transaction sequentially, it processes them in parallel through a single backward pass of the log.

Why Parallel Processing Matters:

Consider a scenario with three loser transactions that each modified different pages at different times. Sequential processing would require:

Walk T1's log chain backward (many disk seeks)
Then walk T2's log chain backward (more disk seeks)
Then walk T3's log chain backward (more disk seeks)

Instead, ARIES processes ALL losers in a single backward scan, undoing operations in reverse LSN order regardless of which transaction performed them.

Converting Mermaid diagram...

In the diagram above:

Red records are update records from loser transactions that need undoing
Green records are commit records (transactions T1, T5 are winners)
Yellow is an existing CLR (T4 was already aborting)
Teal records are transaction begin records

Single Backward Pass Efficiency:

ARIES maintains a ToUndo set containing the UndoNextLSN for each loser. It repeatedly:

Picks the maximum LSN from ToUndo
Reads that log record
Either undoes it (if it's an update) or skips (if it's a CLR)
Updates ToUndo with the appropriate next pointer
Continues until ToUndo is empty

This approach minimizes random I/O by processing the log in order.

The ToUndo Set

The ToUndo set is typically implemented as a max-heap (priority queue) ordered by LSN. This allows O(log n) extraction of the highest LSN where n is the number of active losers. For small numbers of concurrent transactions, even a simple sorted list works well.

The Mechanics of Undo

When the undo phase encounters an update log record from a loser transaction, it must reverse the effect of that update. This process involves several coordinated steps:

Step 1: Read the Log Record

Each update log record contains:

TransactionID: Which transaction performed the update
PageID: Which database page was modified
PrevLSN: Previous log record for this transaction
BeforeImage: The value before the update (for undo)
AfterImage: The value after the update (for redo)
Additional metadata (record ID, offset, length, etc.)

Step 2: Apply the Before Image

To undo an update, we restore the data to its before-image value. This is conceptually simple but requires care:

The page must be latched (locked in memory) during the update
The page's PageLSN might be examined (though undo is unconditional)
The before-image is written to the appropriate location on the page

undo_single_record.pseudo

Pseudo Code

PROCEDURE UndoSingleRecord(logRecord):
    // Step 1: Extract information from log record
    txnId = logRecord.transactionId
    pageId = logRecord.pageId
    beforeImage = logRecord.beforeImage
    afterImage = logRecord.afterImage
    recordOffset = logRecord.offset
    
    // Step 2: Fetch and latch the page
    page = bufferPool.FetchPage(pageId)
    page.AcquireExclusiveLatch()
    
    TRY:
        // Step 3: Apply the before-image (reverse the change)
        page.WriteAt(recordOffset, beforeImage)
        
        // Step 4: Write a Compensation Log Record (CLR)
        clrLSN = writeCompensationLogRecord(
            transactionId: txnId,
            pageId: pageId,
            undoNextLSN: logRecord.prevLSN,  // Skip over what we just undid
            redoInfo: beforeImage            // So redo can re-apply this undo
        )
        
        // Step 5: Update the page's LSN to the CLR's LSN
        page.SetPageLSN(clrLSN)
        
        // Step 6: Update transaction table
        transactionTable[txnId].undoNextLSN = logRecord.prevLSN
        transactionTable[txnId].lastLSN = clrLSN
    
    FINALLY:
        // Step 7: Release latch (page may still be dirty in buffer)
        page.ReleaseExclusiveLatch()
    
    RETURN logRecord.prevLSN  // Next record to undo for this txn

Step 3: Write a Compensation Log Record (CLR)

Critically, every undo operation is itself logged. This CLR serves multiple purposes:

Makes the undo operation durable (if we crash during undo, we can redo the CLR)
Tracks progress (the undoNextLSN field says what's left to undo)
Avoids duplicate work on subsequent crashes

We'll explore CLRs in depth on the next page.

Step 4: Update Page and Transaction Metadata

After the undo:

The page's PageLSN is updated to the CLR's LSN
The transaction table's undoNextLSN advances to the previous record
The transaction table's lastLSN updates to point to the CLR

Physical vs Logical Undo

ARIES uses physiological logging, so undo is physical within a page but logical across pages. The before-image captures the exact byte changes needed to reverse an update. Some systems use logical undo (re-executing inverse operations), but this is more complex and can fail if the database structure has changed.

Handling Different Log Record Types

Not all log records encountered during the backward scan require the same handling. The undo phase must distinguish between several types:

Update Records (Need Undo): These are the primary targets of the undo phase. Each update record from a loser transaction must be reversed by applying the before-image and writing a CLR.

CLR Records (Do Not Re-Undo): Compensation log records represent undos that already happened. When encountered, we don't undo them again—that would undo the undo! Instead, we follow the CLR's undoNextLSN to continue the undo chain from where it left off.

Begin Records (Transaction Complete): When we reach a transaction's begin record, that transaction's undo is complete. We write an END record and remove the transaction from the loser set.

Log Record Type Handling During Undo
Record Type	Action in Undo Phase	What Happens to ToUndo
UPDATE (loser txn)	Apply before-image, write CLR	Replace with PrevLSN (or remove if NULL)
CLR	Do NOT undo again	Replace with UndoNextLSN from CLR
BEGIN (loser txn)	Write END record for transaction	Remove transaction from ToUndo set
COMMIT	Should not appear (txn would be winner)	N/A
ABORT	Continue processing (txn was aborting)	No change (continue processing records)
CHECKPOINT	Ignore	No change

undo_phase_main_loop.pseudo

Undo Phase Algorithm

PROCEDURE UndoPhase(transactionTable):
    // Initialize ToUndo with UndoNextLSN for all losers
    ToUndo = MaxHeap()  // Ordered by LSN (highest first)
    
    FOR EACH txn IN transactionTable:
        IF txn.status IN {ACTIVE, ABORTING}:
            IF txn.undoNextLSN != NULL:
                ToUndo.Insert(txn.undoNextLSN, txn.transactionId)
    
    // Main undo loop: process in reverse LSN order
    WHILE NOT ToUndo.IsEmpty():
        (currentLSN, txnId) = ToUndo.ExtractMax()
        logRecord = Log.Read(currentLSN)
        
        IF logRecord.type == UPDATE:
            // Undo this update
            nextLSN = UndoSingleRecord(logRecord)
            IF nextLSN != NULL:
                ToUndo.Insert(nextLSN, txnId)
            ELSE:
                // Reached BEGIN - transaction fully undone
                Log.Write(END_RECORD, txnId)
                transactionTable.Remove(txnId)
                
        ELSE IF logRecord.type == CLR:
            // Don't re-undo; follow the undoNextLSN
            IF logRecord.undoNextLSN != NULL:
                ToUndo.Insert(logRecord.undoNextLSN, txnId)
            ELSE:
                // CLR points to nothing - transaction fully undone
                Log.Write(END_RECORD, txnId)
                transactionTable.Remove(txnId)
                
        ELSE IF logRecord.type == BEGIN:
            // Transaction fully undone
            Log.Write(END_RECORD, txnId)
            transactionTable.Remove(txnId)
    
    // Undo phase complete - all losers rolled back

Why CLRs Are Never Undone

CLRs are redo-only log records. They exist to record the fact that an undo occurred, allowing that undo to be redone if there's a subsequent crash. If we "undid" a CLR, we would be reversing a reversal—essentially re-applying the original change that we were trying to undo. The undoNextLSN pointer in CLRs allows us to skip them correctly during undo.

Maintaining Consistency During Undo

The undo phase operates while the database is being recovered—it's not yet open for normal operations. However, maintaining consistency during this phase is still critical because crashes can occur during recovery itself.

The Crash-During-Recovery Problem:

Imagine we're halfway through undoing a transaction when power fails again. What happens?

On the next restart, we run the full ARIES recovery again
Analysis phase sees the incomplete transaction (still in the log as active)
Redo phase replays all operations, including the CLRs we wrote before the second crash
Undo phase resumes, but now the CLRs tell us how far we got

The system correctly continues from where it left off, without re-undoing operations that were already undone.

Consistency Guarantees

•Idempotent redo: CLRs can be redone safely multiple times
•Progress tracking: UndoNextLSN ensures no duplicate undo
•WAL compliance: CLRs are force-written before page modifications
•Atomic completion: END record marks clean transaction termination

Potential Issues Without CLRs

•Double undo: Same operation reversed twice
•Lost progress: Restarting undo from the beginning
•Inconsistent state: Partial undo with no record
•Unbounded recovery: Recovery might never complete

The Role of the END Record:

When a transaction's undo is complete—either by reaching its BEGIN record or by processing a CLR with null undoNextLSN—an END record is written to the log. This serves as definitive proof that the transaction has been fully rolled back.

The END record is important for log truncation: once a transaction's END record is written, all of its log records (including CLRs) are eligible for eventual removal. Without END records, the system couldn't determine which portions of the log are safe to reclaim.

Concurrent Undo Considerations:

While the basic ARIES undo is single-threaded (one backward scan), some implementations parallelize undo by assigning different loser transactions to different worker threads. This requires careful coordination:

Different transactions should access different pages (or use latching)
CLR writing must still be serialized to maintain LSN order
The transaction table must be thread-safe

When Undo Completes

The undo phase terminates when the ToUndo set becomes empty, meaning every loser transaction has been fully rolled back. At this point:

Database State:

All modifications from committed transactions are present
All modifications from uncommitted transactions have been reversed
The database reflects a consistent state as if only committed transactions ever ran

Log State:

Contains END records for all loser transactions
Contains CLRs documenting all undo operations
Is ready for normal operation and future checkpointing

System State:

Transaction table is empty (or contains only committed entries)
Dirty page table reflects current buffer pool state
The system can begin accepting new transactions

recovery_complete.pseudo

Recovery Completion

PROCEDURE CompleteRecovery():
    // At this point, all three phases are done:
    // - Analysis: Reconstructed DPT and TT
    // - Redo: Replayed all logged operations
    // - Undo: Rolled back all loser transactions
    
    // Verify completion
    ASSERT transactionTable.IsEmpty() OR
           ALL(txn IN transactionTable: txn.status == COMMITTED)
    
    // Optional: force a checkpoint now
    // This reduces future recovery time
    IF config.CHECKPOINT_AFTER_RECOVERY:
        WriteCheckpoint()
    
    // Enable normal operations
    SetSystemStatus(ONLINE)
    
    // Begin accepting new connections
    OpenForTransactions()
    
    Log.Info("Recovery complete. Database is online.")
    Log.Info("Undo phase rolled back {count} loser transactions.")
    Log.Info("Recovery time: {elapsed} seconds.")

The ARIES Guarantee

After ARIES recovery completes, the database is in a state where: (1) Every committed transaction's effects are fully present, honoring durability. (2) Every uncommitted transaction's effects are completely absent, honoring atomicity. (3) The database satisfies all integrity constraints, honoring consistency. This is the fundamental promise of the recovery system.

Summary: Undoing Incomplete Transactions

This page established the foundational concepts for the undo phase of ARIES recovery. Let's consolidate the key insights:

Key Takeaways

•Undo restores atomicity — The steal policy allows uncommitted changes on disk; undo removes them to honor the all-or-nothing guarantee.
•Loser transactions are identified during analysis — Any transaction without a commit record is a loser that needs undo.
•Single backward scan processes all losers — Using a ToUndo set ordered by LSN minimizes disk I/O during undo.
•Before-images reverse updates — Each update log record contains the information needed to undo its effect.
•Different record types need different handling — Updates are undone, CLRs are followed (not undone), and BEGIN records signal completion.
•CLRs make undo crash-resistant — If a crash occurs during undo, the CLRs ensure progress is preserved.
•END records mark transaction completion — They enable log truncation and confirm full rollback.

What's Next:

Now that we understand the fundamentals of undoing incomplete transactions, the next page dives deep into Compensation Log Records (CLRs)—the mechanism that makes undo operations themselves durable and crash-proof. We'll explore their structure, why they're "redo-only," and how they form a chain that tracks undo progress.

Page Complete

You now understand why incomplete transactions must be undone, how loser transactions are identified, and the fundamental mechanics of the undo phase. The next page explores Compensation Log Records (CLRs) in detail—the key mechanism that makes undo operations themselves recoverable.

1 / 5

Loading learning content...

Database Management SystemsRecovery Algorithms

Undo Phase in ARIES Recovery

LevelAdvanced

Duration75 mins

TopicRecovery Algorithms

1 / 5

Undoing Incomplete Transactions

The Final Phase of Recovery

What You Will Learn

Why Undo is Necessary

To understand why the undo phase is essential, we must first understand what happens during normal database operation and what state the database may be in after a crash.

The Steal Policy Problem:

Transaction States and Recovery Requirements
Transaction State at Crash	What Redo Does	What Undo Must Do
Committed before crash	Ensures all changes are on disk	Nothing (skip)
In progress at crash	Replays all logged changes	Must reverse ALL changes
Aborted before crash	Replays changes AND undo CLRs	Nothing (already handled)
Prepared (2PC) at crash	Replays all changes	Depends on coordinator decision

The Chain of Reasoning:

Redo is blind: The redo phase replays ALL logged operations, regardless of transaction outcome. This ensures no committed work is lost, but it also means uncommitted changes get replayed.
Uncommitted changes violate atomicity: After redo, the database may contain partial results from transactions that never finished. These partial results could be seen by future transactions, violating isolation and consistency.
Undo restores atomicity: By reversing all changes from uncommitted transactions, the undo phase ensures the database state reflects only complete, committed transactions.

This is the fundamental contract of undo: erase all evidence that an uncommitted transaction ever existed.

Why Not Just Skip Uncommitted Changes in Redo?

Identifying Loser Transactions

How Losers are Identified:

During the analysis phase, ARIES scans the log from the most recent checkpoint to the end, updating the transaction table as it encounters:

Begin records: Add transaction to the table with status ACTIVE
Update records: Update lastLSN for the transaction
Commit records: Change status to COMMITTED, then remove from table (or mark as complete)
Abort records: Change status to ABORTING
End records: Remove transaction from table entirely

At the end of analysis, any transaction remaining in the table with status ACTIVE or ABORTING is a loser that needs to be undone.

transaction_table_after_analysis.sql
Transaction Table State
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
-- Transaction Table after Analysis Phase
-- This table was reconstructed by scanning the log
 
-- Sample Transaction Table Contents:
/*
TransactionID | Status     | LastLSN | FirstLSN | UndoNextLSN
--------------+------------+---------+----------+-------------
    T1        | COMMITTED  |  1050   |   1000   |    NULL      -- Winner: will be removed
    T2        | ACTIVE     |  1045   |   1010   |    1045      -- Loser: must undo
    T3        | ACTIVE     |  1060   |   1025   |    1060      -- Loser: must undo
    T4        | ABORTING   |  1055   |   1030   |    1048      -- Loser: continue undo
    T5        | COMMITTED  |  1040   |   1005   |    NULL      -- Winner: will be removed
*/
 
-- After filtering, Undo Phase works with:
-- Loser Set = {T2, T3, T4}
 
-- For each loser, we need:
-- 1. UndoNextLSN: The next log record to undo
-- 2. LastLSN: Pointer to most recent log record
-- 3. The ability to walk backward through the chain

Properties of Loser Transactions

•Status is ACTIVE or ABORTING — No commit record was written to stable storage before the crash
•May have modified multiple pages — Each modification is recorded in a log record with PrevLSN links
•May have partially undone — If aborting at crash time, some CLRs may already exist
•UndoNextLSN tracks progress — Points to the next log record that needs to be undone
•Form a linked list via PrevLSN — Each log record points to the previous one for the same transaction

The UndoNextLSN Field:

The Loser Set and Undo Topology

Why Parallel Processing Matters:

Consider a scenario with three loser transactions that each modified different pages at different times. Sequential processing would require:

Walk T1's log chain backward (many disk seeks)
Then walk T2's log chain backward (more disk seeks)
Then walk T3's log chain backward (more disk seeks)

Instead, ARIES processes ALL losers in a single backward scan, undoing operations in reverse LSN order regardless of which transaction performed them.

Converting Mermaid diagram...

In the diagram above:

Red records are update records from loser transactions that need undoing
Green records are commit records (transactions T1, T5 are winners)
Yellow is an existing CLR (T4 was already aborting)
Teal records are transaction begin records

Single Backward Pass Efficiency:

ARIES maintains a ToUndo set containing the UndoNextLSN for each loser. It repeatedly:

Picks the maximum LSN from ToUndo
Reads that log record
Either undoes it (if it's an update) or skips (if it's a CLR)
Updates ToUndo with the appropriate next pointer
Continues until ToUndo is empty

This approach minimizes random I/O by processing the log in order.

The ToUndo Set

The Mechanics of Undo

When the undo phase encounters an update log record from a loser transaction, it must reverse the effect of that update. This process involves several coordinated steps:

Step 1: Read the Log Record

Each update log record contains:

TransactionID: Which transaction performed the update
PageID: Which database page was modified
PrevLSN: Previous log record for this transaction
BeforeImage: The value before the update (for undo)
AfterImage: The value after the update (for redo)
Additional metadata (record ID, offset, length, etc.)

Step 2: Apply the Before Image

To undo an update, we restore the data to its before-image value. This is conceptually simple but requires care:

The page must be latched (locked in memory) during the update
The page's PageLSN might be examined (though undo is unconditional)
The before-image is written to the appropriate location on the page

undo_single_record.pseudo

Pseudo Code

PROCEDURE UndoSingleRecord(logRecord):
    // Step 1: Extract information from log record
    txnId = logRecord.transactionId
    pageId = logRecord.pageId
    beforeImage = logRecord.beforeImage
    afterImage = logRecord.afterImage
    recordOffset = logRecord.offset
    
    // Step 2: Fetch and latch the page
    page = bufferPool.FetchPage(pageId)
    page.AcquireExclusiveLatch()
    
    TRY:
        // Step 3: Apply the before-image (reverse the change)
        page.WriteAt(recordOffset, beforeImage)
        
        // Step 4: Write a Compensation Log Record (CLR)
        clrLSN = writeCompensationLogRecord(
            transactionId: txnId,
            pageId: pageId,
            undoNextLSN: logRecord.prevLSN,  // Skip over what we just undid
            redoInfo: beforeImage            // So redo can re-apply this undo
        )
        
        // Step 5: Update the page's LSN to the CLR's LSN
        page.SetPageLSN(clrLSN)
        
        // Step 6: Update transaction table
        transactionTable[txnId].undoNextLSN = logRecord.prevLSN
        transactionTable[txnId].lastLSN = clrLSN
    
    FINALLY:
        // Step 7: Release latch (page may still be dirty in buffer)
        page.ReleaseExclusiveLatch()
    
    RETURN logRecord.prevLSN  // Next record to undo for this txn

Step 3: Write a Compensation Log Record (CLR)

Critically, every undo operation is itself logged. This CLR serves multiple purposes:

Makes the undo operation durable (if we crash during undo, we can redo the CLR)
Tracks progress (the undoNextLSN field says what's left to undo)
Avoids duplicate work on subsequent crashes

We'll explore CLRs in depth on the next page.

Step 4: Update Page and Transaction Metadata

After the undo:

The page's PageLSN is updated to the CLR's LSN
The transaction table's undoNextLSN advances to the previous record
The transaction table's lastLSN updates to point to the CLR

Physical vs Logical Undo

Handling Different Log Record Types

Not all log records encountered during the backward scan require the same handling. The undo phase must distinguish between several types:

Update Records (Need Undo): These are the primary targets of the undo phase. Each update record from a loser transaction must be reversed by applying the before-image and writing a CLR.

Begin Records (Transaction Complete): When we reach a transaction's begin record, that transaction's undo is complete. We write an END record and remove the transaction from the loser set.

Log Record Type Handling During Undo
Record Type	Action in Undo Phase	What Happens to ToUndo
UPDATE (loser txn)	Apply before-image, write CLR	Replace with PrevLSN (or remove if NULL)
CLR	Do NOT undo again	Replace with UndoNextLSN from CLR
BEGIN (loser txn)	Write END record for transaction	Remove transaction from ToUndo set
COMMIT	Should not appear (txn would be winner)	N/A
ABORT	Continue processing (txn was aborting)	No change (continue processing records)
CHECKPOINT	Ignore	No change

undo_phase_main_loop.pseudo

Undo Phase Algorithm

PROCEDURE UndoPhase(transactionTable):
    // Initialize ToUndo with UndoNextLSN for all losers
    ToUndo = MaxHeap()  // Ordered by LSN (highest first)
    
    FOR EACH txn IN transactionTable:
        IF txn.status IN {ACTIVE, ABORTING}:
            IF txn.undoNextLSN != NULL:
                ToUndo.Insert(txn.undoNextLSN, txn.transactionId)
    
    // Main undo loop: process in reverse LSN order
    WHILE NOT ToUndo.IsEmpty():
        (currentLSN, txnId) = ToUndo.ExtractMax()
        logRecord = Log.Read(currentLSN)
        
        IF logRecord.type == UPDATE:
            // Undo this update
            nextLSN = UndoSingleRecord(logRecord)
            IF nextLSN != NULL:
                ToUndo.Insert(nextLSN, txnId)
            ELSE:
                // Reached BEGIN - transaction fully undone
                Log.Write(END_RECORD, txnId)
                transactionTable.Remove(txnId)
                
        ELSE IF logRecord.type == CLR:
            // Don't re-undo; follow the undoNextLSN
            IF logRecord.undoNextLSN != NULL:
                ToUndo.Insert(logRecord.undoNextLSN, txnId)
            ELSE:
                // CLR points to nothing - transaction fully undone
                Log.Write(END_RECORD, txnId)
                transactionTable.Remove(txnId)
                
        ELSE IF logRecord.type == BEGIN:
            // Transaction fully undone
            Log.Write(END_RECORD, txnId)
            transactionTable.Remove(txnId)
    
    // Undo phase complete - all losers rolled back

Why CLRs Are Never Undone

Maintaining Consistency During Undo

The Crash-During-Recovery Problem:

Imagine we're halfway through undoing a transaction when power fails again. What happens?

On the next restart, we run the full ARIES recovery again
Analysis phase sees the incomplete transaction (still in the log as active)
Redo phase replays all operations, including the CLRs we wrote before the second crash
Undo phase resumes, but now the CLRs tell us how far we got

The system correctly continues from where it left off, without re-undoing operations that were already undone.

Consistency Guarantees

•Idempotent redo: CLRs can be redone safely multiple times
•Progress tracking: UndoNextLSN ensures no duplicate undo
•WAL compliance: CLRs are force-written before page modifications
•Atomic completion: END record marks clean transaction termination

Potential Issues Without CLRs

•Double undo: Same operation reversed twice
•Lost progress: Restarting undo from the beginning
•Inconsistent state: Partial undo with no record
•Unbounded recovery: Recovery might never complete

The Role of the END Record:

Concurrent Undo Considerations:

Different transactions should access different pages (or use latching)
CLR writing must still be serialized to maintain LSN order
The transaction table must be thread-safe

When Undo Completes

The undo phase terminates when the ToUndo set becomes empty, meaning every loser transaction has been fully rolled back. At this point:

Database State:

All modifications from committed transactions are present
All modifications from uncommitted transactions have been reversed
The database reflects a consistent state as if only committed transactions ever ran

Log State:

Contains END records for all loser transactions
Contains CLRs documenting all undo operations
Is ready for normal operation and future checkpointing

System State:

Transaction table is empty (or contains only committed entries)
Dirty page table reflects current buffer pool state
The system can begin accepting new transactions

recovery_complete.pseudo

Recovery Completion

PROCEDURE CompleteRecovery():
    // At this point, all three phases are done:
    // - Analysis: Reconstructed DPT and TT
    // - Redo: Replayed all logged operations
    // - Undo: Rolled back all loser transactions
    
    // Verify completion
    ASSERT transactionTable.IsEmpty() OR
           ALL(txn IN transactionTable: txn.status == COMMITTED)
    
    // Optional: force a checkpoint now
    // This reduces future recovery time
    IF config.CHECKPOINT_AFTER_RECOVERY:
        WriteCheckpoint()
    
    // Enable normal operations
    SetSystemStatus(ONLINE)
    
    // Begin accepting new connections
    OpenForTransactions()
    
    Log.Info("Recovery complete. Database is online.")
    Log.Info("Undo phase rolled back {count} loser transactions.")
    Log.Info("Recovery time: {elapsed} seconds.")

The ARIES Guarantee

Summary: Undoing Incomplete Transactions

This page established the foundational concepts for the undo phase of ARIES recovery. Let's consolidate the key insights:

Key Takeaways

•Undo restores atomicity — The steal policy allows uncommitted changes on disk; undo removes them to honor the all-or-nothing guarantee.
•Loser transactions are identified during analysis — Any transaction without a commit record is a loser that needs undo.
•Single backward scan processes all losers — Using a ToUndo set ordered by LSN minimizes disk I/O during undo.
•Before-images reverse updates — Each update log record contains the information needed to undo its effect.
•Different record types need different handling — Updates are undone, CLRs are followed (not undone), and BEGIN records signal completion.
•CLRs make undo crash-resistant — If a crash occurs during undo, the CLRs ensure progress is preserved.
•END records mark transaction completion — They enable log truncation and confirm full rollback.

What's Next:

Page Complete

1 / 5