What if we could achieve atomicity and durability without any logging at all? No WAL, no before-images, no after-images, no log force at commit. Shadow Paging offers exactly this—an elegant technique that makes recovery trivial by never modifying committed data in place.
The core idea is deceptively simple: instead of modifying existing pages and logging changes, write modifications to new pages (shadows), then atomically switch to the new version at commit. If a transaction fails, simply discard the shadow pages—the original data remains untouched.
By the end of this page, you will understand the shadow paging mechanism in detail, including dual page tables, atomic commit operations, garbage collection requirements, and the practical limitations that prevent shadow paging from being widely adopted in production DBMS implementations.
Shadow paging operates on a simple principle:
Never modify a page in place. Instead, write changes to a new page location (the shadow), and atomically switch pointers at commit.
This copy-on-write scheme provides built-in transaction atomicity: until commit, changes exist only in shadow pages, so the committed state is never at risk.
The Dual Page Table Architecture:
Shadow paging requires two page tables: the current page table, which reflects the last committed state, and the shadow page table, which tracks the transaction's in-progress modifications.
When a transaction begins, the shadow page table is initialized as a copy of the current page table. As the transaction modifies pages, the shadow table is updated to point to new page locations containing the modifications.
| Principle | Description | Benefit |
|---|---|---|
| Copy-on-write | Never modify existing pages in place | Original data always intact |
| Dual page tables | Current (committed) and shadow (in-progress) | Clear boundary between states |
| Atomic commit | Single pointer swap from current to shadow | Instant, failure-atomic commit |
| No logging | Page-level versioning replaces logging | Simpler conceptual model |
| Instant abort | Discard shadow table and pages | No undo operations required |
Shadow paging was popularized by System R at IBM in the 1970s. While System R ultimately moved to a log-based approach (which evolved into today's standard techniques), shadow paging influenced file systems (like ZFS and Btrfs) and some specialized database systems.
Let's trace through a transaction's lifecycle under shadow paging to understand the mechanisms in detail.
Transaction Start: the shadow page table is created as a copy of the current page table; the transaction reads and writes through this shadow table.
During Execution (First Write to a Page): a free physical page is allocated, the modified data is written there, and the shadow table entry is redirected to the new location. The original page is left untouched.
Subsequent Writes to the Same Page: the shadow copy already exists, so it is simply updated in place; no further allocation is needed.
```
PROCEDURE Begin_Transaction():
    // Create shadow page table as copy of current
    shadow_table := COPY(current_page_table)

    // Shadow table is now active for this transaction
    active_page_table := shadow_table

    RETURN transaction_handle

PROCEDURE Write_Page(txn, logical_page_id, data):
    shadow_entry := shadow_table[logical_page_id]
    current_entry := current_page_table[logical_page_id]

    IF shadow_entry.physical_page = current_entry.physical_page THEN
        // First write to this page - need to create shadow copy
        new_physical_page := allocate_free_page()

        // Write modified data to new location
        WRITE_DISK(new_physical_page, data)

        // Update shadow table to point to new location
        shadow_table[logical_page_id].physical_page := new_physical_page
    ELSE
        // Shadow already exists - update in place
        WRITE_DISK(shadow_entry.physical_page, data)
    END IF

    RETURN success

PROCEDURE Read_Page(txn, logical_page_id):
    // Always read through shadow table (sees transaction's modifications)
    physical_page := shadow_table[logical_page_id].physical_page
    data := READ_DISK(physical_page)
    RETURN data
```

Visual Example:
Consider modifying pages 1 and 3:
INITIAL STATE (before transaction):
Current Page Table Physical Disk
+-----------+--------+ +---+---+---+---+---+---+
| Logical | Physical| |P1 |P2 |P3 |P4 | | |
+-----------+--------+ +---+---+---+---+---+---+
| Page 1 | P1 | ^ ^
| Page 2 | P2 | | |
| Page 3 | P3 | +--- Current data ---+
| Page 4 | P4 |
+-----------+--------+
AFTER MODIFICATION (transaction active):
Current Page Table Physical Disk Shadow Page Table
+-----------+--------+ +---+---+---+---+---+---+ +-----------+--------+
| Page 1 | P1 | |P1 |P2 |P3 |P4 |P1'|P3'| | Page 1 | P1' |
| Page 2 | P2 | +---+---+---+---+---+---+ | Page 2 | P2 |
| Page 3 | P3 | ^ ^ ^ ^ | Page 3 | P3' |
| Page 4 | P4 | | | | | | Page 4 | P4 |
+-----------+--------+ | | +---+ +-----------+--------+
| | Shadow pages
+-------+
Original data (unchanged)
Pages 1 and 3 have shadow copies (P1' and P3'). The transaction sees the shadow versions, but the original database (via current page table) is unchanged.
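The write path described above can be condensed into a runnable sketch. This is an illustrative in-memory model, not a real pager: the "disk" is a dict of physical page ids, and names like `ShadowPager` and `next_free` are invented for the example.

```python
class ShadowPager:
    """Minimal in-memory model of the shadow-paging write path."""

    def __init__(self, num_pages):
        # "Disk": physical_page_id -> page contents
        self.disk = {i: b"original-%d" % i for i in range(num_pages)}
        self.next_free = num_pages          # simple bump allocator
        # Initially every logical page maps to a same-numbered physical page.
        self.current_table = {i: i for i in range(num_pages)}
        self.shadow_table = None

    def begin(self):
        # Shadow table starts as a copy of the current (committed) table.
        self.shadow_table = dict(self.current_table)

    def write(self, logical, data):
        if self.shadow_table[logical] == self.current_table[logical]:
            # First write to this page: allocate a fresh physical page
            # (copy-on-write) and redirect the shadow table entry to it.
            phys = self.next_free
            self.next_free += 1
            self.shadow_table[logical] = phys
        # Subsequent writes update the shadow page in place.
        self.disk[self.shadow_table[logical]] = data

    def read(self, logical):
        # Reads go through the shadow table, seeing uncommitted changes.
        return self.disk[self.shadow_table[logical]]
```

Writing logical page 1 twice allocates exactly one shadow page, and the committed mapping and original data stay untouched throughout.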
The elegance of shadow paging lies in its commit operation: a single atomic pointer swap makes all changes visible instantaneously.
The Commit Sequence:

1. Force all modified shadow pages to disk.
2. Force the shadow page table to disk.
3. Atomically overwrite the page table pointer in the database header so it points to the shadow table.
4. Mark the old pages and the old page table for garbage collection.
Why Is This Atomic?
The critical step (3) modifies a single disk block—the block containing the page table pointer. Disk writes of a single block are atomic at the hardware level (they either complete fully or not at all). This single-block atomicity provides transaction atomicity:
```
PROCEDURE Commit_Transaction(txn):
    // Step 1: Force all shadow pages to disk
    FOR EACH dirty_page IN buffer_pool WHERE page.is_shadow:
        FORCE_WRITE_TO_DISK(dirty_page)
    END FOR

    // Step 2: Force shadow page table to disk
    FORCE_WRITE_TO_DISK(shadow_table)

    // Step 3: Atomic pointer swap (THE COMMIT POINT)
    // This writes to a single disk block (e.g., superblock)
    // Hardware guarantees single-block writes are atomic
    ATOMIC_WRITE(db_header.current_page_table_ptr, shadow_table_address)

    // Transaction is now committed
    // Any crash after this point sees the committed state

    // Step 4: Mark old pages and old page table for garbage collection
    old_pages := pages_only_in(old_current_table, shadow_table)
    garbage_queue.add(old_pages)
    garbage_queue.add(old_current_table)

    RETURN success

PROCEDURE Abort_Transaction(txn):
    // Simply discard the shadow table and shadow pages
    shadow_pages := pages_only_in(shadow_table, current_page_table)

    // Return shadow pages to free pool
    FOR EACH page IN shadow_pages:
        free_page(page)
    END FOR

    // Discard shadow table
    free_page_table(shadow_table)

    // Current page table is still valid - no other action needed
    RETURN success
```

Commit After the Swap:
AFTER COMMIT:
New Current Page Table (was shadow) Physical Disk
+-----------+--------+ +---+---+---+---+---+---+
| Page 1 | P1' |<-- Current |P1 |P2 |P3 |P4 |P1'|P3'|
| Page 2 | P2 | +---+---+---+---+---+---+
| Page 3 | P3' | G G ^ ^
| Page 4 | P4 | | | | |
+-----------+--------+ +-Garbage---+ +---+
New data
Old Page Table (garbage): points to P1, P2, P3, P4
Pages P1 and P3 are now garbage (no table references them)
They can be reclaimed for future shadow pages
The entire transaction—potentially modifying hundreds of pages—commits atomically through a single pointer write. This is a powerful demonstration of reducing complex atomicity to a single atomic operation. The same principle appears in other systems (file system transactions, lock-free data structures).
Recovery in shadow paging is remarkably simple, arguably trivial. After a crash, the system simply reads the database header and resumes with the page table it points to.
There is no REDO. There is no UNDO. The current page table always reflects the last committed state.
Why Recovery Is So Simple:
If a transaction was in progress when the crash occurred, its shadow pages were never linked to the current table. They're orphaned on disk—garbage waiting to be reclaimed.
If a transaction had committed, the pointer swap completed atomically, so the current table correctly reflects the committed state.
There's no in-between state. The pointer swap is all-or-nothing.
```
PROCEDURE Recover_Database():
    // Read the database header (superblock)
    header := READ_DISK(SUPERBLOCK_ADDRESS)

    // The current page table pointer is always valid
    current_table := READ_DISK(header.current_page_table_ptr)

    // The database is now consistent and ready
    // current_table points to all committed data

    // Optional: Garbage collection of orphaned shadow pages
    // Can be done lazily after recovery completes
    SCHEDULE_BACKGROUND_GC()

    RETURN "Recovery complete"
    // That's it. Recovery is essentially a single disk read.

PROCEDURE Background_Garbage_Collection():
    // Find all allocated pages
    all_allocated := scan_allocation_map()

    // Find all pages referenced by current page table
    referenced_pages := collect_all_references(current_table)

    // Orphaned pages = allocated but not referenced
    orphaned := all_allocated - referenced_pages

    // Return orphaned pages to free pool
    FOR EACH page IN orphaned:
        mark_as_free(page)
    END FOR
```

Recovery Time Comparison:
| Algorithm | Recovery Operations | Typical Time |
|---|---|---|
| Deferred Update | Scan log, REDO committed | Minutes |
| Immediate Update | Scan log, REDO all, UNDO uncommitted | Minutes to hours |
| Shadow Paging | Read superblock | Milliseconds |
This near-instant recovery is shadow paging's most compelling advantage. After a crash, the database is available almost immediately.
Shadow paging provides the strongest possible recovery guarantee: the database is always consistent, always recoverable, and recovery takes constant time regardless of database size or transaction history. There is no log to scan, no REDO to apply, no UNDO to perform.
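The garbage-collection step of recovery reduces to a set difference: pages allocated on disk but unreachable from the committed page table are orphans. A minimal sketch (page ids and the `find_orphans` name are illustrative):

```python
def find_orphans(allocated_pages, current_table):
    """Orphaned shadow pages: allocated on disk but not referenced
    by the committed (current) page table."""
    referenced = set(current_table.values())
    return set(allocated_pages) - referenced


# Example: pages 0-5 were allocated. An in-flight transaction had
# shadowed logical pages 1 and 3 into physical pages 4 and 5, then
# crashed before commit. The committed table still maps 1->1 and 3->3,
# so physical pages 4 and 5 are orphans awaiting reclamation.
orphans = find_orphans(range(6), {0: 0, 1: 1, 2: 2, 3: 3})
```

Because the crashed transaction never linked its shadow pages into the current table, reclaiming them is safe at any point after recovery.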
Despite its conceptual elegance and instant recovery, shadow paging has significant practical limitations that explain why logging-based approaches dominate in production systems.
1. Data Fragmentation:
As transactions modify pages, shadow copies are written to wherever free space exists. Over time, logically sequential data becomes physically scattered across the disk, so sequential scans degrade into random I/O.
2. Concurrency Challenges:
The basic shadow paging model assumes a single transaction at a time. Supporting concurrent transactions is complex: each transaction needs its own shadow of the page table, and all commits must be funneled through the single pointer swap.
Solutions exist (multi-version shadow paging) but add significant complexity.
3. Commit Latency:
At commit time, shadow paging must force every modified page, plus the shadow page table itself, to disk before the pointer swap can occur.
This is significantly more I/O than log-based commits where only small log records are forced. A transaction modifying 1000 pages requires 1000 page writes before commit completes.
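A back-of-envelope comparison makes the gap concrete. The 100-byte average WAL record size is an assumption for illustration; real record sizes vary with the workload.

```python
PAGE_SIZE = 8 * 1024     # 8 KB pages
pages_modified = 1000

# Shadow paging: every dirty page must be forced before the pointer swap.
shadow_commit_bytes = pages_modified * PAGE_SIZE   # ~8 MB of scattered writes

# WAL: only log records are forced; assume ~100 bytes per update record.
wal_commit_bytes = pages_modified * 100            # ~100 KB, appended sequentially

ratio = shadow_commit_bytes // wal_commit_bytes    # roughly 80x more commit I/O
```

And the WAL bytes are a single sequential append, while the shadow pages land at scattered disk locations, compounding the difference.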
4. Page Table Size:
A 10 TB database with 8 KB pages has ~1.3 billion pages. The page table must track all of them, so the table itself runs to gigabytes and must be copied, at least logically, at every transaction start.
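The arithmetic behind that claim, assuming an 8-byte physical-page pointer per entry (a plausible size, not a figure from any particular system):

```python
DB_SIZE = 10 * 2**40          # 10 TB database
PAGE_SIZE = 8 * 2**10         # 8 KB pages
ENTRY_SIZE = 8                # assumed 8-byte pointer per page-table entry

entries = DB_SIZE // PAGE_SIZE        # ~1.3 billion page-table entries
table_bytes = entries * ENTRY_SIZE    # ~10 GB for the page table alone
# Copying a ~10 GB table at every transaction start is infeasible, so a
# practical design would need a paged, copy-on-write page table as well.
```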
Shadow paging trades log I/O for data I/O—but data writes are scattered random writes, while log writes are sequential appends. Sequential writes are far more efficient, especially on magnetic disks but also on SSDs. This makes the apparent simplicity of 'no logging' actually more expensive in practice.
While pure shadow paging is rare in general-purpose DBMS, its principles appear in several modern contexts where its trade-offs become acceptable or even advantageous.
Copy-on-Write File Systems:
File systems like ZFS, Btrfs, and APFS use shadow paging principles: blocks are never overwritten in place, and a consistent new state is published by atomically updating the root of the on-disk tree.
File systems benefit because snapshots fall out almost for free (keep an old root pointer alive) and they do not need fine-grained multi-statement transaction semantics.
In-Memory Databases:
Some in-memory databases use shadow paging because the fragmentation penalty disappears when reads are served from RAM, while copy-on-write still provides cheap, consistent on-disk snapshots for durability.
LMDB (Lightning Memory-Mapped Database):
LMDB is a notable example of a production database engine using copy-on-write (a form of shadow paging): its B+tree pages are never modified in place, readers see a stable snapshot without taking locks, and the single writer commits by atomically updating a meta page that points to the new tree root.
LMDB demonstrates that shadow paging can work excellently when the use case matches its strengths: embedded databases, single-writer scenarios, and applications valuing read performance and crash resilience over write throughput.
SQLite Rollback Journal Mode:
SQLite's default rollback journal mode is a variation of shadow paging: before a page is modified in place, its original content is copied into a separate journal file. Commit deletes the journal; abort or crash recovery copies the originals back.
This is 'inverse shadow paging'—shadowing the original instead of the modification.
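The rollback journal is easy to observe from Python's standard library `sqlite3` module. Inside a write transaction, the `-journal` file (holding the original page images) appears next to the database file, and it is removed once the transaction ends; the temporary path here is just for the demo.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(db_path, isolation_level=None)  # manual transaction control
conn.execute("CREATE TABLE t(x INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
# In the default journal mode, the original page images now sit in a
# rollback journal file next to the database:
journal_during = os.path.exists(db_path + "-journal")

conn.execute("ROLLBACK")  # originals are restored, journal is deleted
journal_after = os.path.exists(db_path + "-journal")
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```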
Many modern systems use hybrid approaches: logging for fine-grained transaction recovery combined with copy-on-write at larger granularities for snapshots and backups. Understanding both approaches helps in designing systems that leverage the best of each.
Understanding when to use shadow paging versus logging requires a clear comparison of their characteristics and trade-offs.
| Characteristic | Shadow Paging | Log-Based (WAL) |
|---|---|---|
| Recovery time | Near-instant (read superblock) | Proportional to log size/checkpoint interval |
| Commit latency | High (force all pages) | Low (force log only) |
| I/O pattern at commit | Random writes (scattered pages) | Sequential append (log) |
| Data locality | Degrades over time | Preserved (in-place updates) |
| Concurrent transactions | Complex/limited | Well-supported |
| Transaction size | Must fit in page table | Unlimited (with checkpoints) |
| Incremental backup | Difficult | Log shipping/archiving |
| Point-in-time recovery | Requires historical snapshots | Replay log to any point |
| Implementation complexity | Conceptually simple | More complex (UNDO/REDO) |
| Snapshot support | Natural (preserve pointers) | Requires additional mechanisms |
| Storage overhead | Shadow pages + old pages | Log (can be truncated) |
The Core Trade-off:
Shadow paging optimizes for recovery simplicity and speed at the cost of normal-operation write efficiency.
Logging optimizes for write efficiency during normal operation at the cost of recovery complexity.
Since systems spend vastly more time in normal operation than in recovery, log-based approaches provide better overall performance for most workloads. However, for systems where instant recovery is critical (embedded devices that crash frequently, safety-critical systems), shadow paging's trade-offs may be acceptable.
Choose shadow paging when: instant recovery is paramount, write workloads are light, concurrency is limited, and snapshot features are valuable. Choose logging when: write performance matters, concurrent transactions are needed, databases are large, and you need point-in-time recovery.
Shadow paging offers an elegant alternative to log-based recovery through copy-on-write semantics. The key concepts: dual page tables separating committed from in-progress state, commit as a single atomic pointer swap, abort and recovery as simple discard operations, and garbage collection to reclaim superseded pages, all weighed against fragmentation, commit latency, and limited concurrency.
What's Next:
In the next page, we'll introduce ARIES (Algorithms for Recovery and Isolation Exploiting Semantics)—the industry-standard recovery algorithm that combines immediate update's flexibility with sophisticated optimizations. ARIES represents the culmination of decades of recovery research and is implemented by virtually every major commercial DBMS.
You now understand shadow paging's mechanism, its elegant atomic commit, its trivial recovery, and its practical limitations. This knowledge helps you appreciate both why shadow paging isn't the dominant approach and why its principles persist in file systems and specialized databases.