The Analysis Phase transforms raw log data into a precise reconstruction of system state at crash time. This reconstructed state answers every question the Redo and Undo phases will ask: which pages might hold updates not yet on disk, where in the log replay must begin, and which transactions must be rolled back and from where.
This final page of the Analysis Phase module synthesizes everything we've learned. We examine how the Dirty Page Table, Transaction Table, and RedoLSN combine into a coherent recovery plan. We explore the invariants that must hold, the validation checks that ensure correctness, and the handoff to subsequent phases.
By the end of this page, you will understand the complete output of the Analysis Phase, how to validate reconstructed state for correctness, how this state integrates with Redo and Undo phases, and how to diagnose recovery issues by examining Analysis Phase output.
When the Analysis Phase completes, it produces a structured output that completely characterizes the recovery work needed. Let's examine each component.
```cpp
/**
 * Complete output of the Analysis Phase
 * This structure contains everything needed for Redo and Undo phases
 */
struct AnalysisPhaseResult {
    // === Dirty Page Table (Final State) ===
    // Maps PageID -> DirtyPageEntry
    // Contains all pages that might need redo
    HashMap<PageID, DirtyPageEntry> dirty_page_table;

    // === Transaction Table (Final State) ===
    // Maps TransactionID -> TransactionEntry
    // Contains all transactions that were active at crash
    HashMap<TransactionID, TransactionEntry> transaction_table;

    // === Redo Control ===
    LSN redo_lsn;        // Where Redo Phase starts scanning
    LSN end_of_log_lsn;  // Last valid LSN in the log (where analysis stopped)

    // === Undo Control ===
    vector<UndoTask> undo_list;   // Transactions requiring rollback, sorted
    set<TransactionID> winners;   // Transactions that committed (for verification)

    // === Metadata ===
    LSN checkpoint_lsn;            // The checkpoint we started from
    uint64 log_records_scanned;    // Number of records processed
    uint64 analysis_duration_ms;   // Time taken for analysis

    // === Validation ===
    bool is_valid;          // Did analysis complete successfully?
    string error_message;   // If not valid, what went wrong
};

/**
 * Each UndoTask specifies a transaction to undo and where to start
 */
struct UndoTask {
    TransactionID txn_id;
    LSN undo_next_lsn;       // Start undoing from this LSN
    int estimated_records;   // Approximate log records to process
};
```

| Component | Purpose | Used By |
|---|---|---|
| Dirty Page Table | Pages potentially needing redo | Redo Phase |
| Transaction Table | Active transactions at crash | Undo Phase |
| RedoLSN | Starting point for log replay | Redo Phase |
| Undo List | Transactions to roll back, ordered | Undo Phase |
| Winners Set | Committed transactions for verification | Validation, logging |
| Metadata | Performance metrics and diagnostics | Monitoring, tuning |
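The relationship between RedoLSN and the Dirty Page Table can be made concrete. The sketch below is a minimal illustration, not the production structure above: the DPT is reduced to a map from page ID to recLSN, and the function name `computeRedoLSN` is an assumption for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using LSN = std::uint64_t;
using PageID = std::uint64_t;

// RedoLSN is the minimum recLSN across all dirty pages: no update to any
// dirty page can predate it, so the Redo Phase may safely begin its
// forward scan there. An empty DPT means there is nothing to redo, so
// the scan would start (and immediately end) at the end of the log.
LSN computeRedoLSN(const std::unordered_map<PageID, LSN>& dpt, LSN end_of_log) {
    LSN redo_lsn = end_of_log;
    for (const auto& [pid, rec_lsn] : dpt) {
        (void)pid;  // only the recLSN matters here
        if (rec_lsn < redo_lsn) redo_lsn = rec_lsn;
    }
    return redo_lsn;
}
```

This is exactly why Invariant 1 below (RedoLSN ≤ every recLSN) must hold: the value is defined as the minimum, so any larger value signals a bug.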
The reconstructed state must satisfy certain invariants for recovery to proceed correctly. Validating these invariants catches bugs in both the logging subsystem and the recovery implementation.
Critical Invariants:

1. RedoLSN ≤ the minimum recLSN in the Dirty Page Table.
2. Each undo task's undoNextLSN ≤ that transaction's lastLSN.
3. The undo list and the winners set are disjoint.
4. RedoLSN ≤ the end of the log whenever the Dirty Page Table is non-empty.
5. Every transaction with logged work has a valid undoNextLSN.
```cpp
bool AnalysisPhaseResult::validate() {
    is_valid = true;
    error_message = "";

    // Invariant 1: RedoLSN <= min(recLSN)
    for (const auto& [pid, entry] : dirty_page_table) {
        if (redo_lsn > entry.recovery_lsn) {
            is_valid = false;
            error_message = format(
                "Invariant 1 violated: RedoLSN %d > recLSN %d for page %d",
                redo_lsn, entry.recovery_lsn, pid);
            return false;
        }
    }

    // Invariant 2: undoNextLSN <= lastLSN for all undo tasks
    for (const auto& task : undo_list) {
        auto it = transaction_table.find(task.txn_id);
        if (it != transaction_table.end()) {
            if (task.undo_next_lsn > it->second.last_lsn) {
                is_valid = false;
                error_message = format(
                    "Invariant 2 violated: T%d undoNextLSN %d > lastLSN %d",
                    task.txn_id, task.undo_next_lsn, it->second.last_lsn);
                return false;
            }
        }
    }

    // Invariant 3: Undo list and winners are disjoint
    for (const auto& task : undo_list) {
        if (winners.contains(task.txn_id)) {
            is_valid = false;
            error_message = format(
                "Invariant 3 violated: T%d in both undo list and winners",
                task.txn_id);
            return false;
        }
    }

    // Invariant 4: RedoLSN <= end of log
    if (redo_lsn > end_of_log_lsn && !dirty_page_table.empty()) {
        is_valid = false;
        error_message = format(
            "Invariant 4 violated: RedoLSN %d > end_of_log %d",
            redo_lsn, end_of_log_lsn);
        return false;
    }

    // Invariant 5: UndoNextLSN valid for all undo tasks
    for (const auto& task : undo_list) {
        auto it = transaction_table.find(task.txn_id);
        if (it != transaction_table.end() &&
            it->second.log_records > 0 &&
            task.undo_next_lsn == INVALID_LSN) {
            is_valid = false;
            error_message = format(
                "Invariant 5 violated: T%d has %d records but invalid undoNextLSN",
                task.txn_id, it->second.log_records);
            return false;
        }
    }

    log_info("Analysis result validation passed");
    return true;
}
```

If validation fails, the database MUST NOT proceed with recovery using this state. Invariant violations indicate serious bugs in logging, checkpointing, or analysis—proceeding could cause data corruption.
The DBA must investigate using log dumps and potentially restore from backup.
The reconstructed DPT at the end of Analysis Phase tells us everything about page-level recovery needs. Let's examine what we can learn from it.
DPT Characteristics:
```cpp
void AnalysisPhaseResult::analyzeDPT() {
    log_info("=== Dirty Page Table Analysis ===");
    log_info("Total dirty pages: %d", dirty_page_table.size());

    // Find LSN range
    LSN min_rec_lsn = MAX_LSN;
    LSN max_rec_lsn = 0;
    for (const auto& [pid, entry] : dirty_page_table) {
        min_rec_lsn = min(min_rec_lsn, entry.recovery_lsn);
        max_rec_lsn = max(max_rec_lsn, entry.recovery_lsn);
    }
    log_info("RecLSN range: [%d, %d]", min_rec_lsn, max_rec_lsn);
    log_info("Span: %d LSN units", max_rec_lsn - min_rec_lsn);

    // Categorize pages by age
    LSN quarter = (max_rec_lsn - min_rec_lsn) / 4;
    int old_pages = 0, medium_pages = 0, new_pages = 0, very_new_pages = 0;
    for (const auto& [pid, entry] : dirty_page_table) {
        LSN age = entry.recovery_lsn - min_rec_lsn;
        if (age < quarter)          old_pages++;
        else if (age < 2 * quarter) medium_pages++;
        else if (age < 3 * quarter) new_pages++;
        else                        very_new_pages++;
    }
    log_info("Page age distribution:");
    log_info("  Oldest quarter: %d pages", old_pages);
    log_info("  2nd quarter:    %d pages", medium_pages);
    log_info("  3rd quarter:    %d pages", new_pages);
    log_info("  Newest quarter: %d pages", very_new_pages);

    // Check for problematic patterns
    if (old_pages > dirty_page_table.size() / 2) {
        log_warn("Many old dirty pages - consider more aggressive flushing");
    }

    // Estimate redo work
    LSN redo_span = end_of_log_lsn - redo_lsn;
    log_info("Redo will scan %d LSN units", redo_span);
    log_info("Estimated redo time: %.1f seconds",
             estimateRedoTime(redo_span, dirty_page_table.size()));
}
```

| Pattern | Meaning | Action |
|---|---|---|
| Many old recLSNs | Long-dirty pages, poor flushing | Investigate flushing policy |
| All recent recLSNs | Aggressive checkpoint/flushing | Good configuration |
| Clustered pages | Hot tables/indexes | Consider partitioning |
| Scattered pages | Random workload | Normal for OLTP |
| Very large DPT | Large buffer pool, infrequent flush | May need tuning |
The Transaction Table at the end of Analysis contains only transactions that need attention—those that didn't fully complete before the crash. Let's analyze what this tells us.
Transaction Classifications:
```cpp
void AnalysisPhaseResult::analyzeTransactionTable() {
    log_info("=== Transaction Table Analysis ===");
    log_info("Total transactions requiring attention: %d", transaction_table.size());

    int active_count = 0;
    int aborting_count = 0;
    int committing_count = 0;

    for (const auto& [tid, entry] : transaction_table) {
        switch (entry.state) {
            case TransactionState::UNDO:
            case TransactionState::ACTIVE:
                active_count++;
                break;
            case TransactionState::ABORTING:
                aborting_count++;
                break;
            case TransactionState::COMMITTING:
                committing_count++;
                break;
            default:
                log_warn("Unexpected state %d for T%d", entry.state, tid);
        }
    }

    log_info("State distribution:");
    log_info("  Active (need undo):       %d", active_count);
    log_info("  Aborting (continue undo): %d", aborting_count);
    log_info("  Committing (need END):    %d", committing_count);

    // Analyze undo work
    log_info("\nUndo work estimation:");
    long total_undo_records = 0;
    for (const auto& task : undo_list) {
        auto it = transaction_table.find(task.txn_id);
        if (it != transaction_table.end()) {
            long records = it->second.log_records;
            total_undo_records += records;
            if (records > 1000) {
                log_info("  T%d: ~%d records (large transaction)",
                         task.txn_id, records);
            }
        }
    }
    log_info("Total estimated undo records: %d", total_undo_records);
    log_info("Estimated undo time: %.1f seconds",
             estimateUndoTime(total_undo_records));

    // Check for concerning patterns
    for (const auto& [tid, entry] : transaction_table) {
        // Very long-running transaction
        LSN age = end_of_log_lsn - entry.first_lsn;
        if (age > 1000000) {  // Arbitrary threshold
            log_warn("T%d is very old (span: %d LSN units) - may slow recovery",
                     tid, age);
        }
        // Transaction with many operations
        if (entry.log_records > 100000) {
            log_warn("T%d has %d log records - undo will be slow",
                     tid, entry.log_records);
        }
    }
}
```

Transactions are called 'winners' if they committed (a COMMIT record was seen) and 'losers' if they didn't. Only losers require undo.
The winners set isn't strictly needed for recovery but is useful for logging, auditing, and understanding what happened at crash time.
The undo list is the ordered set of transactions that must be rolled back. The ordering is critical for efficiency—ARIES processes undo in a specific manner to minimize log reads.
Undo Processing Order:
ARIES uses a single backward pass through the log for all undo operations. To enable this, the undo list is ordered by undoNextLSN in descending order. As undo proceeds, it encounters each transaction's next-to-undo record in sequence.
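The ordering can be demonstrated with a small self-contained sketch. This is an illustration only; the LSN values and the helper name `undoVisitOrder` are made up for this example, and the set-with-descending-comparator mirrors the technique the Undo Phase uses.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <set>
#include <vector>

using LSN = std::uint64_t;

// Visit undoNextLSNs in descending order, as the Undo Phase's single
// backward scan would. Returns the order in which LSNs are processed.
std::vector<LSN> undoVisitOrder(const std::vector<LSN>& undo_next_lsns) {
    // std::greater makes the set iterate largest-first
    std::set<LSN, std::greater<LSN>> to_undo(undo_next_lsns.begin(),
                                             undo_next_lsns.end());
    std::vector<LSN> order;
    while (!to_undo.empty()) {
        order.push_back(*to_undo.begin());   // largest remaining LSN
        to_undo.erase(to_undo.begin());
    }
    return order;
}
```

With undo tasks at LSNs 4500, 5000, and 4000, the visit order is 5000, 4500, 4000: each step moves strictly backward through the log, never forward.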
```cpp
vector<UndoTask> AnalysisPhaseResult::buildUndoList() {
    vector<UndoTask> undo_list;

    // Collect all transactions needing undo
    for (const auto& [tid, entry] : transaction_table) {
        bool needs_undo = false;

        switch (entry.state) {
            case TransactionState::ACTIVE:
            case TransactionState::UNDO:
                // Active transaction that never committed
                needs_undo = true;
                break;
            case TransactionState::ABORTING:
                // Transaction was in the middle of rollback
                needs_undo = true;
                break;
            case TransactionState::COMMITTING:
                // Transaction committed - just needs an END record,
                // not undo
                needs_undo = false;
                winners.insert(tid);  // Track as winner
                break;
            default:
                log_warn("Unexpected transaction state for T%d", tid);
        }

        if (needs_undo && entry.undo_next_lsn != INVALID_LSN) {
            UndoTask task;
            task.txn_id = tid;
            task.undo_next_lsn = entry.undo_next_lsn;
            task.estimated_records = entry.log_records;
            undo_list.push_back(task);
        }
    }

    // Sort by undoNextLSN DESCENDING
    // This enables an efficient single-pass backward log scan
    sort(undo_list.begin(), undo_list.end(),
         [](const UndoTask& a, const UndoTask& b) {
             return a.undo_next_lsn > b.undo_next_lsn;
         });

    log_info("Undo list built: %d transactions, max undoNextLSN=%d",
             undo_list.size(),
             undo_list.empty() ? 0 : undo_list[0].undo_next_lsn);
    return undo_list;
}
```

Processing in descending undoNextLSN order means we naturally encounter records in decreasing LSN order during a single backward scan. When we process T3's record at LSN 5000, we're positioned to then process T1's at 4500, then T5's at 4000, and so on. No random seeks are needed—just continuous backward reading.
With Analysis complete, the system transitions to the Redo Phase. The Analysis output provides everything Redo needs to efficiently reconstruct crash-time page states.
What Redo Receives:
```cpp
class RedoPhase {
public:
    void initialize(const AnalysisPhaseResult& analysis_result) {
        // Store references to analysis output
        this->dirty_page_table = analysis_result.dirty_page_table;
        this->redo_lsn = analysis_result.redo_lsn;
        this->end_of_log_lsn = analysis_result.end_of_log_lsn;

        // Pre-calculate redo scope
        this->redo_scope = end_of_log_lsn - redo_lsn;

        log_info("Redo Phase initialized:");
        log_info("  Start LSN: %d", redo_lsn);
        log_info("  End LSN:   %d", end_of_log_lsn);
        log_info("  Scope:     %d LSN units", redo_scope);
        log_info("  Dirty pages to consider: %d", dirty_page_table.size());
    }

    void run() {
        log_info("Starting Redo Phase...");

        LSN current_lsn = redo_lsn;
        int records_processed = 0;
        int records_applied = 0;

        while (current_lsn < end_of_log_lsn) {
            LogRecord record = log_manager.readLog(current_lsn);
            records_processed++;

            if (shouldRedo(record)) {
                applyRedo(record);
                records_applied++;
            }

            current_lsn = record.next_lsn;

            // Progress logging
            if (records_processed % 100000 == 0) {
                log_info("Redo progress: %d records processed, %d applied",
                         records_processed, records_applied);
            }
        }

        log_info("Redo Phase complete: %d records processed, %d applied",
                 records_processed, records_applied);
    }

private:
    bool shouldRedo(const LogRecord& record) {
        // Only consider redoable record types
        if (!record.isRedoable()) return false;

        PageID pid = record.page_id;

        // Check 1: Is this page in the DPT?
        if (!dirty_page_table.contains(pid)) {
            // Page not dirty - already on disk
            return false;
        }

        // Check 2: Is record LSN >= page's recLSN?
        const DirtyPageEntry& entry = dirty_page_table.get(pid);
        if (record.lsn < entry.recovery_lsn) {
            // Record predates this page becoming dirty - already on disk
            return false;
        }

        // Check 3: Read the page and check its pageLSN
        Page* page = buffer_manager.getPage(pid);
        if (page->getPageLSN() >= record.lsn) {
            // Page already contains this change
            return false;
        }

        // All checks passed - this record needs to be redone
        return true;
    }
};
```

Notice how the DPT enables two levels of filtering before even reading the page: records for clean pages are skipped entirely, and records before a page's recLSN are skipped. The final pageLSN check handles cases where the page was actually flushed but remained in the conservative DPT.
After Redo completes, the Undo Phase uses the transaction table and undo list to roll back incomplete transactions. The Analysis output ensures Undo knows exactly what work to perform.
What Undo Receives:
```cpp
class UndoPhase {
public:
    void initialize(const AnalysisPhaseResult& analysis_result) {
        this->transaction_table = analysis_result.transaction_table;
        this->undo_list = analysis_result.undo_list;

        // Calculate total undo work
        int total_records = 0;
        for (const auto& task : undo_list) {
            total_records += task.estimated_records;
        }

        log_info("Undo Phase initialized:");
        log_info("  Transactions to undo: %d", undo_list.size());
        log_info("  Estimated records:    %d", total_records);
    }

    void run() {
        log_info("Starting Undo Phase...");

        // Build "ToUndo" set of LSNs from all undo tasks
        // This enables an efficient single backward pass
        set<LSN, greater<LSN>> to_undo;  // Descending order
        for (const auto& task : undo_list) {
            if (task.undo_next_lsn != INVALID_LSN) {
                to_undo.insert(task.undo_next_lsn);
            }
        }

        int records_undone = 0;

        // Process in descending LSN order (single backward pass)
        while (!to_undo.empty()) {
            LSN lsn = *to_undo.begin();
            to_undo.erase(to_undo.begin());

            LogRecord record = log_manager.readLog(lsn);
            TransactionID tid = record.txn_id;

            // Determine the next LSN to undo for this transaction:
            // a CLR points past work already undone; any other record
            // chains back through prevLSN
            LSN next_lsn;
            if (record.type == LogType::CLR) {
                next_lsn = record.undo_next_lsn;
            } else if (record.isUndoable()) {
                // Undoable record: perform undo and log a CLR
                performUndo(record);

                LogRecord clr = createCLR(record);
                LSN clr_lsn = log_manager.appendLog(clr);

                // Update transaction's lastLSN
                transaction_table.get(tid).last_lsn = clr_lsn;
                records_undone++;
                next_lsn = record.prev_lsn;
            } else {
                // Non-undoable record (e.g., BEGIN): just follow the chain
                next_lsn = record.prev_lsn;
            }

            if (next_lsn != INVALID_LSN) {
                to_undo.insert(next_lsn);
            } else {
                // Transaction fully undone - write END
                writeEndRecord(tid);
            }
        }

        log_info("Undo Phase complete: %d records undone", records_undone);
    }
};
```

| Record Type | Action | Next LSN to Process |
|---|---|---|
| UPDATE | Apply inverse operation, log CLR | record.prevLSN |
| INSERT | Delete the inserted record, log CLR | record.prevLSN |
| DELETE | Re-insert the deleted record, log CLR | record.prevLSN |
| CLR | Nothing (already re-done if needed) | clr.undoNextLSN |
| BEGIN | Write END record, remove from undo list | Done |
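The CLR row in the table deserves emphasis: a CLR's undoNextLSN points at the record *before* the one it compensates, which is what makes undo restartable after a second crash. The sketch below illustrates that construction; the simplified `LogRecord` fields and the helper name `makeCLR` are assumptions for this example, not the module's exact definitions.

```cpp
#include <cassert>
#include <cstdint>

using LSN = std::uint64_t;
const LSN INVALID_LSN = 0;

enum class LogType { UPDATE, CLR };

struct LogRecord {
    LogType type;
    std::uint64_t txn_id;
    LSN lsn;
    LSN prev_lsn;       // previous record of the same transaction
    LSN undo_next_lsn;  // meaningful for CLRs only
};

// When undoing `rec`, the CLR's undoNextLSN is rec.prev_lsn: if we crash
// after writing the CLR, restart undo resumes at prev_lsn and never
// re-undoes rec. This bounds undo work to one CLR per original record,
// even across repeated crashes. The CLR's own LSN is assigned later,
// when the log manager appends it.
LogRecord makeCLR(const LogRecord& rec) {
    LogRecord clr{};
    clr.type = LogType::CLR;
    clr.txn_id = rec.txn_id;
    clr.undo_next_lsn = rec.prev_lsn;
    return clr;
}
```

Note the boundary case: undoing a transaction's first update (whose prevLSN is invalid) produces a CLR with an invalid undoNextLSN, which is exactly the signal to write the END record.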
Let's visualize how the Analysis Phase output flows through the complete ARIES recovery process.
```cpp
void RecoveryManager::performRecovery() {
    log_info("========================================");
    log_info("DATABASE RECOVERY STARTED");
    log_info("========================================");

    auto start_time = chrono::high_resolution_clock::now();

    // Phase 1: Analysis
    log_info("\n=== ANALYSIS PHASE ===");
    AnalysisPhase analysis;
    AnalysisPhaseResult result = analysis.run();

    if (!result.validate()) {
        log_error("Analysis validation failed: %s", result.error_message.c_str());
        throw RecoveryError("Invalid analysis result - cannot proceed");
    }

    auto analysis_end = chrono::high_resolution_clock::now();
    log_info("Analysis completed in %d ms",
             chrono::duration_cast<chrono::milliseconds>(analysis_end - start_time).count());

    // Phase 2: Redo
    log_info("\n=== REDO PHASE ===");
    if (result.redo_lsn == END_OF_LOG || result.dirty_page_table.empty()) {
        log_info("No redo required - database pages are current");
    } else {
        RedoPhase redo;
        redo.initialize(result);
        redo.run();
    }
    auto redo_end = chrono::high_resolution_clock::now();

    // Phase 3: Undo
    log_info("\n=== UNDO PHASE ===");
    if (result.undo_list.empty()) {
        log_info("No undo required - all transactions committed");
    } else {
        UndoPhase undo;
        undo.initialize(result);
        undo.run();
    }
    auto undo_end = chrono::high_resolution_clock::now();

    // Summary
    log_info("\n========================================");
    log_info("RECOVERY COMPLETE");
    log_info("========================================");
    log_info("Analysis time: %d ms",
             chrono::duration_cast<chrono::milliseconds>(analysis_end - start_time).count());
    log_info("Redo time:     %d ms",
             chrono::duration_cast<chrono::milliseconds>(redo_end - analysis_end).count());
    log_info("Undo time:     %d ms",
             chrono::duration_cast<chrono::milliseconds>(undo_end - redo_end).count());
    log_info("Total time:    %d ms",
             chrono::duration_cast<chrono::milliseconds>(undo_end - start_time).count());
    log_info("Transactions recovered: %d (winners), %d (losers)",
             result.winners.size(), result.undo_list.size());
}
```

The Analysis Phase reconstructs a complete picture of the database system at crash time. This information forms the foundation for efficient, correct recovery. Let's consolidate everything we've learned.
Module Complete:
You have now mastered the ARIES Analysis Phase—the critical first step of crash recovery that transforms raw log data into precise recovery instructions. With this knowledge, you understand how the log is scanned forward from the last checkpoint, how the Dirty Page Table and Transaction Table are reconstructed, how the RedoLSN bounds the Redo Phase's work, how the ordered undo list prepares the Undo Phase, and how invariant checks validate the result before recovery proceeds.
The next modules cover the Redo Phase (repeating history) and Undo Phase (rolling back losers), completing the ARIES recovery algorithm.
Congratulations! You have completed the Analysis Phase module of ARIES recovery. You now understand how database systems reconstruct their exact state after a crash, enabling the subsequent Redo and Undo phases to restore consistency efficiently. This knowledge is fundamental for database engineers, DBAs, and anyone working with production database systems.