Imagine a scenario that haunts every database administrator: a power outage strikes at the worst possible moment—right in the middle of a critical banking transaction transferring $50,000 between accounts. The money has been deducted from one account, but before it could be credited to the other, everything went dark. When the system restarts, how does the database know what happened? How does it recover to a consistent state without losing the customer's money or, worse, duplicating it?
The answer lies in one of the most elegant and essential mechanisms in database systems: the transaction log.
The transaction log is the unsung hero of database reliability. While indexes speed up queries and buffer pools optimize performance, it's the log that makes the fundamental promise of databases possible—that once data is committed, it will survive any failure, and if a transaction is incomplete, it will be as if it never happened.
By the end of this page, you will understand what a transaction log is, why it's considered the backbone of database recovery, how it records the complete history of database modifications, and why even the most sophisticated modern databases rely on this fundamental concept introduced decades ago.
A transaction log (also called a write-ahead log, redo log, or simply log) is a sequential, append-only file that records every modification made to the database. Before any change is applied to the actual data pages on disk, a record of that change is first written to the log. This seemingly simple principle—log first, then modify—is the foundation of all database recovery mechanisms.
Think of the transaction log as a meticulous diary that the database keeps. Every single operation—inserting a row, updating a value, deleting a record—is documented in this diary before it happens. If anything goes wrong, the database can read this diary to understand exactly what happened and determine how to restore consistency.
The transaction log answers one critical question: "What happened?" It provides a complete, ordered history of all database modifications, enabling the system to either replay operations (redo) or reverse them (undo) during recovery.
Key Characteristics of Transaction Logs:
Sequential Writing: Log records are written in strict chronological order, one after another. This sequential nature makes logging extremely fast—much faster than random disk I/O for data pages.
Append-Only Structure: New records are always added at the end. Existing records are never modified or overwritten (except during log truncation/archival). This immutability is crucial for reliability.
Durable Storage: Logs are written to stable storage (disk, SSD, or other persistent media) and often to multiple locations. The log must survive the same failures it's designed to recover from.
Sufficient Information: Each log record contains enough information to either redo the operation (for completed transactions) or undo it (for incomplete transactions).
| Characteristic | Transaction Log | Data Files |
|---|---|---|
| Write Pattern | Sequential (append-only) | Random (scattered pages) |
| Write Speed | Very fast (sequential I/O) | Slower (random I/O) |
| Content | Operations/changes performed | Current state of data |
| Modification | Append new records only | Update existing pages |
| Primary Purpose | Recovery and durability | Current data storage |
| Growth | Continuous until truncated | Based on data volume |
To understand why transaction logs are absolutely essential, we need to consider the fundamental challenge that databases face: the disparity between volatile memory and persistent storage.
Databases perform modifications in memory (in the buffer pool) for performance reasons; writing every change immediately to disk would be prohibitively slow. However, memory is volatile: it loses its contents when power is lost. This creates a dilemma: the database must defer disk writes to stay fast, yet it must also guarantee that committed changes survive a crash.
Without a transaction log, these two goals would be mutually exclusive. The log resolves this tension elegantly: a small sequential write to the log makes a change durable, while the modified data pages can be written to disk later.
Some systems offer 'unlogged' or 'nologging' modes for performance. While these can dramatically speed up bulk operations, they sacrifice durability. Data written without logging may be lost in a crash, and the database may be left in an inconsistent state. This trade-off is acceptable only for data that can be easily regenerated.
A transaction log is organized as a sequence of log records, each documenting a specific action performed by a transaction. Understanding the structure of these records is fundamental to understanding how recovery works.
Each log record contains identifying information that allows the recovery system to process it correctly. While specific implementations vary, conceptually every log record includes these core elements:
| Component | Description | Example |
|---|---|---|
| Log Sequence Number (LSN) | A unique, monotonically increasing identifier for each log record | LSN: 00045782 |
| Transaction ID | Identifies which transaction generated this record | TxnID: T-2847 |
| Operation Type | The type of action being recorded | UPDATE, INSERT, DELETE, COMMIT, ABORT, BEGIN |
| Object Identifier | Which data item is being modified | Table: accounts, Page: 42, Slot: 17 |
| Before Image | The value(s) before the modification (for undo) | balance: 5000.00 |
| After Image | The value(s) after the modification (for redo) | balance: 4500.00 |
| Previous LSN | Links to this transaction's previous log record | PrevLSN: 00045761 |
The LSN: The Log's Primary Key
The Log Sequence Number (LSN) deserves special attention. It serves multiple critical purposes:
Unique Identification: Every log record has a unique LSN that never repeats.
Ordering: LSNs are assigned in the order operations occur, providing a total ordering of all database modifications.
Page Tracking: Each data page stores the LSN of the last log record that modified it. This allows the recovery system to quickly determine if a page needs to be updated during redo.
Recovery Coordination: The LSN serves as a reference point for determining what work needs to be done during crash recovery.
The LSN is typically implemented as a combination of log file number and offset within that file, making it both unique and ordered.
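As a concrete illustration (an assumption for this page, not any particular engine's on-disk format), the file number can occupy the high bits of the LSN and the byte offset the low bits, so comparing two LSNs as plain integers also compares their positions in the log:

```c
#include <stdint.h>

// Illustrative LSN encoding only; real systems differ in field widths and layout.
#define LSN_OFFSET_BITS 32

uint64_t make_lsn(uint32_t file_no, uint32_t byte_offset) {
    return ((uint64_t)file_no << LSN_OFFSET_BITS) | byte_offset;
}

uint32_t lsn_file(uint64_t lsn)   { return (uint32_t)(lsn >> LSN_OFFSET_BITS); }
uint32_t lsn_offset(uint64_t lsn) { return (uint32_t)(lsn & 0xFFFFFFFFu); }
```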
```
// Conceptual structure of a log record
// (Actual implementation varies by database system)

struct LogRecord {
    // ===== Identification =====
    uint64_t lsn;               // Log Sequence Number (unique identifier)
    uint32_t transaction_id;    // ID of the transaction that generated this record
    uint64_t prev_lsn;          // LSN of this transaction's previous log record

    // ===== Operation Details =====
    OperationType operation;    // BEGIN, UPDATE, INSERT, DELETE, COMMIT, ABORT, etc.
    TableId   table_id;         // Which table is affected
    PageId    page_id;          // Which page is affected
    uint16_t  slot_number;      // Which record within the page

    // ===== Undo/Redo Information =====
    byte[]    before_image;     // Old value (for undo operations)
    byte[]    after_image;      // New value (for redo operations)

    // ===== Additional Metadata =====
    timestamp timestamp;        // When this operation occurred
    uint32_t  checksum;         // For integrity verification
};

// Example log records for a transfer transaction:
// LSN=001: [T1, BEGIN]
// LSN=002: [T1, UPDATE, accounts, page=5, slot=12,
//           before={balance:1000}, after={balance:500}]
// LSN=003: [T1, UPDATE, accounts, page=8, slot=3,
//           before={balance:2000}, after={balance:2500}]
// LSN=004: [T1, COMMIT]
```

One of the most powerful ways to understand the transaction log is to view it as a complete timeline of database history. Unlike data files, which only show the current state, the log preserves the sequence of events that led to that state.
Consider this analogy: If the database's data files are like a photograph—showing how things look right now—then the transaction log is like a video recording—capturing every moment of action that brought us to the present.
This timeline perspective reveals why the log is so valuable:
Replaying History
Because the log is a complete record of changes, you can theoretically start from an earlier copy of the database and replay the logged operations, in order, to reconstruct the database's state as of any later moment.
This capability is fundamental to point-in-time recovery (PITR), where administrators can restore a database to any specific moment—for example, "right before that accidental DELETE was executed."
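A rough sketch of that idea, using made-up types and helper names for illustration: restore an older backup, then replay logged changes in order, stopping just before the chosen target LSN (for example, the LSN of the accidental DELETE):

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical, stripped-down record type for this sketch.
typedef struct { uint64_t lsn; /* plus operation details and after image */ } LogRecord;

// Reapply one logged change to the restored copy of the database (details omitted).
static void redo(const LogRecord *r) { (void)r; }

// Replay history onto a restored backup, stopping just before target_lsn.
void replay_until(const LogRecord *log, size_t n, uint64_t target_lsn) {
    for (size_t i = 0; i < n && log[i].lsn < target_lsn; i++)
        redo(&log[i]);
}
```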
The Timeline in Practice:
When a crash occurs, the timeline nature of the log enables precise recovery: the database reads the log to see exactly how far each transaction got, replays the work of transactions that committed, and reverses the work of transactions that did not.
While the log provides a complete history, keeping it forever isn't practical. Logs are typically truncated after checkpoints (covered later) or archived for long-term retention. The balance between log retention and storage costs is a key administrative decision.
The transaction log's power derives from a strict protocol known as Write-Ahead Logging (WAL). This protocol establishes the fundamental guarantee that makes recovery possible:
The WAL Rule: Before any data modification is written to the database files on disk, the log record describing that modification must first be written to stable storage.
This rule is absolute—there are no exceptions. Every database system that provides durability guarantees enforces this rule rigorously.
By writing the log record first, the database ensures that even if a crash occurs immediately after modifying a data page, the log contains enough information to either complete the operation (redo) or reverse it (undo). Without the log record, the system would have no way to know what was being attempted.
The Flow of a Modification:
Transaction Modifies Data: A transaction updates a row in the buffer pool (in memory)
Log Record Created: A log record is created describing the change (what table, what page, old value, new value)
Log Record Written to Disk: The log record is written to the log buffer and then flushed to stable storage
Data Page Eventually Written: At some later point, the modified data page is written from the buffer pool to the data files on disk
The critical sequence is: Log record to disk → Data page to disk. Never the reverse.
Why This Order Matters:
If a modified data page reached disk before its log record and a crash followed, the database would contain a change it has no record of: nothing to redo it with, and no before image to undo it if the transaction never committed. With the log written first, the worst case is a logged change whose data page was lost, and that change can always be redone from the log. (A small code sketch below, after the commit protocol, shows this ordering in code.)
The Commit Protocol:
WAL has a specific requirement for commits:
When a transaction commits, its commit log record must be written to stable storage before the commit is acknowledged to the application.
This ensures that once an application receives confirmation that a transaction committed, the database guarantees that the transaction's effects will survive any subsequent failure. This is the durability promise in action.
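The following minimal sketch, with hypothetical names and structures rather than any real engine's API, shows how a storage engine might enforce both rules: a change is logged before the in-memory page is modified and marked dirty, the log is flushed at least up to a page's LSN before that page is written out, and a COMMIT record is forced to stable storage before success is reported.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structures for illustration only. */
typedef struct {
    uint64_t page_lsn;      /* LSN of the last log record that modified this page */
    int      dirty;
    char     data[8192];
} Page;

static uint64_t next_lsn    = 1;  /* next LSN to hand out                          */
static uint64_t flushed_lsn = 0;  /* highest LSN known to be on stable storage     */

/* Append a record to the in-memory log buffer; returns its LSN. */
static uint64_t log_append(const void *record, size_t len) {
    (void)record; (void)len;      /* copy into the log buffer (omitted)            */
    return next_lsn++;
}

/* Force the log to disk at least up to target_lsn (write + fsync in a real system). */
static void log_flush(uint64_t target_lsn) {
    if (target_lsn > flushed_lsn)
        flushed_lsn = target_lsn;
}

/* Steps 1-3: log the change first, then modify the page in the buffer pool. */
void apply_update(Page *p, const void *change, size_t len) {
    uint64_t lsn = log_append(change, len);  /* log record created                 */
    /* ... apply the change to p->data in memory ... */
    p->page_lsn = lsn;                       /* page remembers its last LSN        */
    p->dirty = 1;
}

/* Step 4 + WAL rule: before a dirty page goes to disk, its log must be durable. */
void write_page_to_disk(Page *p) {
    if (p->page_lsn > flushed_lsn)
        log_flush(p->page_lsn);              /* log record to disk first ...       */
    /* ... then the data page itself (write omitted) ... */
    p->dirty = 0;
}

/* Commit protocol: the COMMIT record is forced to disk before acknowledging. */
void commit_transaction(uint32_t txn_id) {
    uint64_t commit_lsn = log_append(&txn_id, sizeof txn_id);
    log_flush(commit_lsn);                   /* durability point                   */
    /* only now report success to the application */
}
```

Real systems add a log buffer, fsync batching (group commit), and concurrency control around these paths, but the ordering constraints are exactly the ones shown.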
Let's trace through a complete transaction to see how the log captures its activity. Consider a funds transfer from Account A to Account B:
```sql
-- Transaction T1: Transfer $500 from Account A to Account B
-- Initial state: Account A = $1000, Account B = $2000

BEGIN TRANSACTION;    -- Start of the atomic unit

UPDATE accounts
SET balance = balance - 500
WHERE account_id = 'A';
-- Account A: $1000 → $500

UPDATE accounts
SET balance = balance + 500
WHERE account_id = 'B';
-- Account B: $2000 → $2500

COMMIT;               -- Make changes permanent
```

The Log Records Generated:
As this transaction executes, the database generates a sequence of log records. Here's what the log would conceptually contain:
| LSN | Transaction | Operation | Details |
|---|---|---|---|
| 001 | T1 | BEGIN | Transaction T1 started |
| 002 | T1 | UPDATE | accounts, A: before={balance:1000}, after={balance:500} |
| 003 | T1 | UPDATE | accounts, B: before={balance:2000}, after={balance:2500} |
| 004 | T1 | COMMIT | Transaction T1 committed successfully |
What the Log Enables:
If a crash occurs after LSN 001: The transaction was just starting. There are no changes to undo.
If a crash occurs after LSN 002: The first update was logged. During recovery, this uncommitted change will be undone using the before image (Account A restored to $1000).
If a crash occurs after LSN 003: Both updates were logged but not the commit. During recovery, both changes will be undone (A → $1000, B → $2000).
If a crash occurs after LSN 004: The commit is on stable storage. During recovery, if the data pages weren't written yet, the changes will be redone using the after images. The commit survives.
This is the magic of log-based recovery: regardless of when the crash happens, the database can always reach a consistent state where all committed transactions are fully applied and all uncommitted transactions are fully reversed.
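As a highly simplified sketch of that recovery logic (illustrative only; production recovery algorithms use checkpoints, covered later, and more elaborate passes), the system can redo every logged update in forward order and then undo, in reverse order, the updates of transactions whose COMMIT record never reached the log:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal record shape for the sketch; real log records carry before/after images. */
typedef enum { OP_BEGIN, OP_UPDATE, OP_COMMIT, OP_ABORT } OpType;
typedef struct { uint64_t lsn; uint32_t txn_id; OpType op; } LogRecord;

/* Placeholders: reapply a change from its after image / reverse it from its before image. */
static void redo(const LogRecord *r) { (void)r; }
static void undo(const LogRecord *r) { (void)r; }

/* Did this transaction's COMMIT record reach the log before the crash? */
static bool committed(const LogRecord *log, size_t n, uint32_t txn_id) {
    for (size_t i = 0; i < n; i++)
        if (log[i].txn_id == txn_id && log[i].op == OP_COMMIT)
            return true;
    return false;
}

/* Simplified crash recovery over the surviving log. */
void recover(const LogRecord *log, size_t n) {
    for (size_t i = 0; i < n; i++)            /* redo pass: forward through history */
        if (log[i].op == OP_UPDATE)
            redo(&log[i]);

    for (size_t i = n; i-- > 0; )             /* undo pass: backward, newest first  */
        if (log[i].op == OP_UPDATE && !committed(log, n, log[i].txn_id))
            undo(&log[i]);
}
```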
The log ensures that no "partial" state ever persists. Either both updates happen (after commit) or neither happens (if crash before commit). The $500 is never lost or duplicated—atomicity is guaranteed.
Given the complexity that logging adds, you might wonder: are there alternatives? Why not just write every change directly to disk? Or keep backup copies? Let's examine why logging emerged as the dominant approach:
| Approach | Method | Drawbacks |
|---|---|---|
| Force-at-Commit | Write all modified data pages to disk before acknowledging commit | Extremely slow—random disk I/O for every commit; impractical for high-throughput systems |
| Shadow Paging | Maintain two copies of each page; swap pointers atomically | Complex space management; difficult to implement with buffer pools; poor performance for large transactions |
| Periodic Snapshots | Periodically save entire database state | Loses all work since last snapshot; long recovery times; impractical for large databases |
| WAL (Write-Ahead Logging) | Log all changes sequentially; write data pages lazily | Some overhead for log writes, but vastly superior performance and recovery characteristics |
Why WAL Wins:
Sequential I/O: Logging uses sequential writes, which are dramatically faster than the random writes needed for direct data page updates. On spinning disks the difference can reach 10-100x, and even on modern SSDs sequential writes remain meaningfully cheaper.
Minimal Commit Latency: A commit only requires writing a small log record, not potentially dozens of scattered data pages.
Buffer Pool Optimization: Data pages can remain in the buffer pool and be written opportunistically, potentially batching multiple modifications to the same page.
Complete History: The log provides capabilities beyond recovery—replication, auditing, point-in-time restore—that other approaches cannot match.
Proven at Scale: WAL has been tested and refined over 40+ years in systems handling billions of transactions. It works.
While SSDs have reduced the gap between sequential and random I/O, the fundamental advantages of WAL remain. The ability to minimize commit latency and maintain a complete history makes logging essential even on the fastest storage technologies.
We've established the fundamental concept of transaction logging—the cornerstone of database recovery. Let's consolidate the key insights:
The log records history, not state: It is a sequential, append-only record of every modification, written before the corresponding data pages are changed.
Each record carries enough to redo or undo: Before and after images, the transaction ID, and the LSN let recovery replay committed work and reverse incomplete work.
The WAL rule is absolute: A log record reaches stable storage before the data page it describes, and a COMMIT record reaches stable storage before the commit is acknowledged to the application.
This is what makes durability and atomicity possible: Committed transactions survive any failure; uncommitted transactions are undone as if they never happened.
What's Next:
Now that we understand the concept of transaction logging, we'll examine the specific types of log records that make up the log. We'll explore the different record types for data modifications, transaction boundaries, and system operations, and understand how each contributes to recovery.
You now understand the fundamental concept of transaction logging—the silent guardian that makes database durability and recovery possible. Next, we'll dive deeper into the specific types of log records and the rich information they contain.