Loading learning content...
When a database system restarts after a crash, ARIES orchestrates recovery through three carefully choreographed phases: Analysis, Redo, and Undo. Each phase has a specific responsibility, operates on distinct data structures, and must complete before the next can begin. Together, they transform a potentially corrupted database into a consistent one—and they do so correctly even if the system crashes again during recovery.
This three-phase structure is the architectural backbone of ARIES. Understanding what each phase does, why it does it, and how the phases interact is essential for mastering database recovery.
By the end of this page, you will understand: (1) how the Analysis phase reconstructs the system state from the log, (2) how the Redo phase restores the exact crash-time database state, (3) how the Undo phase rolls back incomplete transactions while protecting against nested crashes, and (4) how these phases interconnect to guarantee correct recovery.
Before diving into each phase, let's understand their relationships and responsibilities at a high level.
Sequential Dependency
The three phases execute in strict sequence:
Each phase depends on outputs from the previous phase. Skipping or reordering phases would break correctness.
Distinct Responsibilities
| Phase | Question Answered | Primary Actions | Key Outputs |
|---|---|---|---|
| Analysis | What was the state at crash time? | Scan log forward from checkpoint | Transaction Table, Dirty Page Table, RedoLSN |
| Redo | How do we restore crash-time state? | Re-apply logged operations | Database restored to crash-time state |
| Undo | How do we roll back incomplete work? | Reverse uncommitted operations | Consistent database (only committed effects remain) |
Time Complexity Characteristics
Each phase has different time characteristics:
In practice, Analysis is usually the fastest phase, Redo is predictably bounded, and Undo can be the wildcard—a single long-running uncommitted transaction can mean extended undo time.
Analysis → Redo → Undo isn't arbitrary. Analysis determines WHAT to redo. Redo restores the state needed for correct undo. Undo requires the exact crash-time state because logical undo operations (like decrementing a counter) only work correctly if the page is in the expected state. Reordering would break these dependencies.
The Analysis phase answers a fundamental question: What was the state of the database system when it crashed?
At the moment of crash, we have no in-memory structures—all volatile state is lost. But the log on stable storage contains a complete record of operations. The Analysis phase reconstructs:
Starting Point: The Last Checkpoint
Analysis doesn't scan from the beginning of the log—that could be enormous. Instead, it starts from the last checkpoint record. A checkpoint contains snapshots of the Transaction Table and Dirty Page Table at checkpoint time. Analysis uses these as a starting point, then updates them by scanning forward through the log.
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970
// ARIES Analysis Phase Algorithmfunction AnalysisPhase(log: Log, lastCheckpoint: CheckpointRecord): AnalysisResult { // Initialize from checkpoint let TransactionTable = clone(lastCheckpoint.TransactionTable); let DirtyPageTable = clone(lastCheckpoint.DirtyPageTable); // Start scanning from checkpoint's LSN let currentLSN = lastCheckpoint.LSN; // Scan forward through the log to end while (hasMoreRecords(log, currentLSN)) { let record = log.read(currentLSN); if (record.type === "END") { // Transaction completed (committed or rolled back) // Remove from transaction table TransactionTable.remove(record.transactionId); } else if (record.transactionId !== null) { // Any record associated with a transaction // Ensure transaction is in the table if (!TransactionTable.has(record.transactionId)) { TransactionTable.add(record.transactionId, { state: "UNDO", // Assume needs undo until we see COMMIT lastLSN: record.LSN }); } else { // Update last LSN for this transaction TransactionTable.get(record.transactionId).lastLSN = record.LSN; } // Handle COMMIT specially if (record.type === "COMMIT") { TransactionTable.get(record.transactionId).state = "COMMIT"; } } // Track dirty pages if (record.type === "UPDATE" || record.type === "CLR") { let pageId = record.pageId; // Add to DPT if not already present // RecLSN is the FIRST log record that dirtied the page if (!DirtyPageTable.has(pageId)) { DirtyPageTable.add(pageId, { recLSN: record.LSN // First LSN that dirtied this page }); } // Note: we don't update recLSN if page already in DPT // because we want the OLDEST dirtying LSN } currentLSN = nextLSN(log, currentLSN); } // Calculate RedoLSN: earliest recLSN in DPT // This is where redo phase will start let RedoLSN = min(DirtyPageTable.values().map(e => e.recLSN)); if (RedoLSN === undefined) { RedoLSN = lastCheckpoint.LSN; // No dirty pages, start from checkpoint } return { TransactionTable, DirtyPageTable, RedoLSN };}Key Observations About Analysis
Forward scan only: Analysis reads the log from checkpoint to end, never backward.
Conservative dirty page tracking: If a page appears in an UPDATE or CLR record, it's added to the DPT. The page might actually be clean (already flushed to disk), but we don't know—we'll discover this during redo.
Transaction state inference: Any transaction without a COMMIT or END record is assumed to need undo. Even if the transaction was committing when the crash occurred, if END wasn't written, we treat it as uncommitted.
RecLSN captures first modification: The DPT's recLSN for each page is the first log record that dirtied it after the checkpoint. This determines where redo must start for that page.
RedoLSN is the minimum recLSN: Redo must start early enough to capture all necessary operations. The earliest recLSN in the DPT ensures this.
Analysis errs on the side of caution. It may include pages in the DPT that are actually clean (flushed to disk). It may start redo earlier than strictly necessary. This conservatism is intentional—it's always safe to redo an operation, but missing one could cause corruption. Redo phase will skip already-applied operations via LSN comparison.
Let's examine the two critical data structures that Analysis produces:
Transaction Table (TT)
The Transaction Table tracks every transaction that was active at crash time. For each transaction, it records:
| Field | Type | Description | Example Values |
|---|---|---|---|
| TransactionID | integer | Unique identifier for the transaction | T1, T2, T3... |
| State | enum | Current transaction state | UNDO (active), COMMIT (committed but not ended) |
| LastLSN | LSN | Most recent log record written by this transaction | LSN 5000, 7250, etc. |
| UndoNextLSN | LSN | Next record to undo for this transaction | Initially same as LastLSN |
State Transitions During Analysis
As Analysis scans forward:
After Analysis, only transactions that need undo (state = UNDO) or are in the process of committing (state = COMMIT, but no END yet) remain.
Dirty Page Table (DPT)
The Dirty Page Table identifies which pages may contain unflushed modifications:
| Field | Type | Description | Example Values |
|---|---|---|---|
| PageID | identifier | Unique identifier for the database page | Page 42, Page 1023, etc. |
| RecLSN | LSN | LSN of the first log record that dirtied this page (since checkpoint) | LSN 3500, 4200, etc. |
Why RecLSN Matters
The RecLSN is the crucial value in the DPT. It represents the earliest point from which redo might be necessary for this page. Consider:
The RecLSN answers: "What's the earliest log record that might not be reflected on this page's disk version?"
Computing RedoLSN
The RedoLSN is simply the minimum RecLSN across all entries in the DPT:
RedoLSN = min(RecLSN for all pages in DPT)
This ensures we start redo early enough to capture all necessary operations. Even if most pages need redo from LSN 5000, if one page might need it from LSN 3500, we start from 3500.
If the DPT is empty (no dirty pages), RedoLSN is the checkpoint's LSN—but redo will likely skip everything via LSN comparison.
The checkpoint record contains the Transaction Table and Dirty Page Table at checkpoint time. Analysis starts with these as initial values, then updates them by scanning forward. This is why checkpoints accelerate recovery—we don't reconstruct from the beginning of time, just from the last checkpoint.
The Redo phase implements ARIES's signature principle: Repeat History. Starting from RedoLSN, it re-applies logged operations to restore the database to its exact crash-time state. This includes operations from transactions that will be rolled back—we redo everything, then undo the losers.
Why Repeat History?
This approach might seem wasteful. Why redo work that we'll undo immediately after? The answer lies in page-level consistency:
Physical state restoration: A page might have been modified by multiple transactions. To correctly undo one transaction, the page must be in its crash-time state, including changes from other transactions.
CLR handling: If the system crashed during rollback, CLRs in the log record partial undo progress. Redo must apply these CLRs to restore the interrupted rollback state.
Simplicity and correctness: Redo doesn't need to track transaction status. It mechanically re-applies operations. This separation of concerns makes the algorithm easier to verify.
Redo Algorithm
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374
// ARIES Redo Phase Algorithmfunction RedoPhase( log: Log, RedoLSN: LSN, DirtyPageTable: Map<PageID, DPTEntry>, Buffer: BufferManager): void { let currentLSN = RedoLSN; // Scan forward from RedoLSN to end of log while (hasMoreRecords(log, currentLSN)) { let record = log.read(currentLSN); // Only redo UPDATE and CLR records // (COMMIT, ABORT, END, CHECKPOINT don't modify pages) if (record.type === "UPDATE" || record.type === "CLR") { let pageId = record.pageId; let shouldRedo = true; // Optimization 1: Skip if page not in DPT // If page isn't in DPT, it was clean at checkpoint // and not modified after, so no redo needed if (!DirtyPageTable.has(pageId)) { shouldRedo = false; } // Optimization 2: Check if record LSN < page's RecLSN // If this record is older than the first post-checkpoint // modification to this page, the page wasn't dirty from // this record's perspective else if (record.LSN < DirtyPageTable.get(pageId).recLSN) { shouldRedo = false; } // Optimization 3: Compare with PageLSN on disk // If PageLSN >= record LSN, this change is already on disk else { let page = Buffer.fetchPage(pageId); if (page.PageLSN >= record.LSN) { // Already applied, skip redo shouldRedo = false; } } // Actually perform redo if needed if (shouldRedo) { let page = Buffer.fetchPage(pageId); // Apply the redo information from the log record applyRedo(page, record.redoInfo); // Update the page's LSN to this record's LSN page.PageLSN = record.LSN; // Note: Page is NOT forced to disk here // Buffer manager will flush later per normal operation } } currentLSN = nextLSN(log, currentLSN); } // After redo: database is in exact crash-time state} // Helper: Apply redo information to a pagefunction applyRedo(page: Page, redoInfo: RedoData): void { // For physical logging: copy bytes directly // For physiological: re-execute operation // For logical: evaluate operation in current context // (Implementation depends on logging type)}The LSN Comparison Trick
The most elegant optimization in ARIES redo is the PageLSN comparison. Every database page carries a PageLSN in its header—the LSN of the last log record that modified it. During redo:
PageLSN >= RecordLSN, the modification is already on the page (was flushed to disk before crash)PageLSN < RecordLSN, the modification was lost (page had old version on disk)This single comparison replaces complex bookkeeping. We don't maintain which operations were applied—the page's LSN tells us directly.
Three-Level Redo Filtering
Redo uses three levels of filtering to minimize unnecessary work:
Even without these optimizations, redo would be correct—just slower. Applying the same operation twice produces the same result (idempotency). The optimizations prevent redundant work, but correctness doesn't depend on them.
Let's trace through a redo scenario to solidify understanding. Consider this initial state:
| Item | Value |
|---|---|
| Last Checkpoint LSN | 1000 |
| Crash occurred at LSN | 5000 |
| RedoLSN (from Analysis) | 2000 |
| PageID | RecLSN |
|---|---|
| Page A | 2000 |
| Page B | 3000 |
| Page C | 3500 |
| PageID | PageLSN on Disk | Interpretation |
|---|---|---|
| Page A | 2500 | Has changes up to LSN 2500, missing 3000-5000 |
| Page B | 4000 | Has changes up to LSN 4000, missing 4500-5000 |
| Page C | 3000 | Has changes up to LSN 3000, missing 3500-5000 |
Log Records from LSN 2000 to 5000:
| LSN | Type | Page | Action | Redo Decision |
|---|---|---|---|---|
| 2000 | UPDATE | Page A | Set X = 10 | RecLSN = 2000, PageLSN = 2500 ≥ 2000 → Skip |
| 2500 | UPDATE | Page A | Set Y = 20 | RecLSN = 2000, PageLSN = 2500 ≥ 2500 → Skip |
| 3000 | UPDATE | Page B | Set Z = 30 | RecLSN = 3000, PageLSN = 4000 ≥ 3000 → Skip |
| 3000 | UPDATE | Page A | Set X = 15 | RecLSN = 2000, PageLSN = 2500 < 3000 → REDO |
| 3500 | UPDATE | Page C | Set W = 40 | RecLSN = 3500, PageLSN = 3000 < 3500 → REDO |
| 4000 | UPDATE | Page B | Set Z = 35 | RecLSN = 3000, PageLSN = 4000 ≥ 4000 → Skip |
| 4500 | UPDATE | Page B | Set Z = 40 | RecLSN = 3000, PageLSN = 4000 < 4500 → REDO |
| 5000 | UPDATE | Page A | Set Y = 25 | RecLSN = 2000, PageLSN now = 3000 < 5000 → REDO |
Redo Trace:
After Redo:
The database is now in exactly the state it was at the moment of crash.
After redo, every page reflects every operation that was logged before the crash. This includes uncommitted transactions—we haven't judged what to keep yet. That's the Undo phase's job. Redo just repeats history faithfully.
The Undo phase completes recovery by rolling back all transactions that were active at crash time but hadn't committed. These are the loser transactions—their effects must be removed to achieve a consistent final state.
Identifying Loser Transactions
After Analysis, the Transaction Table contains entries for all transactions that were active at crash. Any transaction with state = UNDO (no COMMIT record) is a loser. These transactions' modifications are currently in the database (restored by redo) but must be removed.
Undo Direction: Backward
Unlike redo (which scans forward chronologically), undo proceeds backward—from most recent to oldest. We use the LastLSN from each transaction's TT entry as the starting point, then follow PrevLSN pointers backward through that transaction's log records.
Compensation Log Records (CLRs)
Here's where ARIES becomes truly resilient: when we undo an operation, we write a Compensation Log Record (CLR) describing the undo action. CLRs are crucial for crash-during-recovery robustness:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374
// ARIES Undo Phase Algorithmfunction UndoPhase( log: Log, TransactionTable: Map<TransactionID, TTEntry>, Buffer: BufferManager): void { // Collect all LSNs that need to be undone // These form the "ToUndo" set let ToUndo: MaxHeap<LSN> = new MaxHeap(); for (let [txnId, entry] of TransactionTable.entries()) { if (entry.state === "UNDO") { // Add the last LSN of each loser transaction ToUndo.insert(entry.lastLSN); } } // Process in reverse LSN order (largest first) while (!ToUndo.isEmpty()) { let lsnToUndo = ToUndo.extractMax(); let record = log.read(lsnToUndo); if (record.type === "CLR") { // CLR records are never undone themselves! // Just follow the UndoNextLSN pointer if (record.UndoNextLSN !== null) { ToUndo.insert(record.UndoNextLSN); } // If UndoNextLSN is null, this transaction's undo is complete } else if (record.type === "UPDATE") { // Undo this update // 1. Read the page let page = Buffer.fetchPage(record.pageId); // 2. Apply the undo operation applyUndo(page, record.undoInfo); // 3. Write a CLR for this undo action let clr = { type: "CLR", transactionId: record.transactionId, pageId: record.pageId, redoInfo: computeRedoForUndo(record), // How to redo this undo UndoNextLSN: record.PrevLSN, // Where to continue undoing PrevLSN: getLastLSN(record.transactionId) }; let clrLSN = log.append(clr); // 4. Update page LSN page.PageLSN = clrLSN; // 5. Add the PrevLSN to continue undoing this transaction if (record.PrevLSN !== null) { ToUndo.insert(record.PrevLSN); } } // Other record types (COMMIT, BEGIN, etc.) don't need undo } // Write END records for all loser transactions for (let [txnId, entry] of TransactionTable.entries()) { if (entry.state === "UNDO") { log.append({ type: "END", transactionId: txnId }); } } // Recovery complete!}The ToUndo Set and Processing Order
Undo maintains a set (typically a max-heap) of LSNs that need processing. Processing proceeds in descending LSN order—we undo the most recent operations first. This is important because:
Dependency ordering: Later operations might depend on earlier ones. Undoing in reverse order respects these dependencies.
Page state: An UPDATE might assume the page is in a certain state. That state was produced by earlier operations. Undoing the latest first maintains consistency.
CLR chaining: When we undo an operation and write a CLR, the CLR's UndoNextLSN points to the next record to undo. Processing in order ensures we can follow this chain correctly.
What About CLR Records?
When undo encounters a CLR record, it does not undo it. CLRs are redo-only records. Instead, undo follows the CLR's UndoNextLSN pointer to find the next record that needs undoing. This is how ARIES handles crash-during-recovery: CLRs bookmark our undo progress.
A critical invariant: we never undo a CLR. CLRs record that we've undone something—undoing them would mean re-doing the original operation, which contradicts the rollback. Instead, CLRs are redone if we crash during recovery, then we resume undo from where we left off via UndoNextLSN.
Let's continue our earlier example. After redo, the database is in crash-time state. Now we must undo the loser transactions.
| TransactionID | State | LastLSN |
|---|---|---|
| T1 | COMMIT | 4000 |
| T2 | UNDO | 5000 |
| T3 | UNDO | 4500 |
T1 is committed (we saw its COMMIT record), so it's a winner. T2 and T3 are losers—they must be undone.
Log Records for Loser Transactions:
| LSN | TxnID | Type | Page | PrevLSN | Action |
|---|---|---|---|---|---|
| 2000 | T2 | UPDATE | Page A | null | Set X from 0 to 10 |
| 2500 | T3 | UPDATE | Page B | null | Set Y from 0 to 20 |
| 3000 | T2 | UPDATE | Page A | 2000 | Set X from 10 to 15 |
| 4500 | T3 | UPDATE | Page B | 2500 | Set Y from 20 to 25 |
| 5000 | T2 | UPDATE | Page A | 3000 | Set X from 15 to 18 |
Undo Processing (Descending LSN Order):
Step 1: Process LSN 5000 (T2)
Step 2: Process LSN 4500 (T3)
Step 3: Process LSN 3000 (T2)
Step 4: Process LSN 2500 (T3)
Step 5: Process LSN 2000 (T2)
Step 6: Write END Records
Final State:
After undo, the database is consistent. Only T1's committed effects remain. T2 and T3's modifications have been rolled back. CLRs ensure that if we crash during this undo process, we can resume correctly without re-undoing already-undone work.
One of ARIES's defining strengths is nested crash handling—the ability to recover correctly even if the system crashes during recovery. Let's trace how this works.
Scenario: Crash During Undo
Suppose in our previous example, the system crashes after Step 3 (after writing CLR at LSN 5003, before completing T3's undo).
State at Second Crash:
Recovery from Second Crash:
Analysis Phase:
Redo Phase:
Undo Phase:
The Key Insight: CLRs as Undo Progress Markers
Notice what happened with CLRs during recovery from the nested crash:
CLRs act as undo progress markers. After a crash-during-recovery:
Why CLRs Can't Be Undone
If we tried to undo a CLR, we'd be undoing an undo—which means redoing the original operation. This would violate our goal of rolling back the transaction. The rule is absolute: CLRs are redo-only, never undone.
Because CLRs are redone and undo progress is preserved, nested crashes don't cause exponential work. Each recovery attempt makes progress. Even with arbitrary numbers of crashes during recovery, total work is bounded by the original log size plus the work needed for one complete undo pass.
We've explored the three phases of ARIES recovery in depth. Let's consolidate the key concepts:
What's Next
Now that we understand the three phases, we'll dive deeper into the Log Sequence Number (LSN) concept—the backbone identifier system that makes ARIES work efficiently. LSNs enable everything from redo skipping to CLR chaining to checkpoint optimization.
You now understand how ARIES's three phases work together to recover from crashes. Analysis determines state, Redo restores it, and Undo cleans up uncommitted work—all while handling nested crashes gracefully through CLRs.