Every database modification must be recorded in the log with enough information to either redo the change (if committed but not on disk) or undo the change (if not committed but on disk). But how exactly should we record this information?
At one extreme, physical logging records the exact byte-level changes to pages. At the other extreme, logical logging records the high-level operation ("insert row X into table T"). Neither extreme is ideal—physical logging wastes space while logical logging complicates concurrency.
Physiological logging represents ARIES's brilliant synthesis: use physical identification (specific page) combined with logical operation description (what operation, not what bytes). This hybrid captures the benefits of both approaches while avoiding their pitfalls.
By the end of this page, you will understand the full spectrum of logging approaches, why pure physical and pure logical logging are both problematic, how physiological logging works, its interaction with page-level locking and latching, and why this approach is essential to ARIES's efficiency.
To understand physiological logging, we must first understand what it replaces. Database systems have experimented with three fundamental approaches to recording modifications: physical logging, logical logging, and physiological logging.
Each approach makes different trade-offs between space efficiency, recovery complexity, and concurrency support.
| Aspect | Physical Logging | Logical Logging | Physiological Logging |
|---|---|---|---|
| What's recorded | Before/after byte images | Operation + parameters | Page ID + operation |
| Example | Page 42, offset 100, old: 'abc', new: 'xyz' | INSERT INTO orders VALUES (101, 'Widget', 50) | Page 42: insert slot 7, key=101 |
| Space efficiency | Poor (large images) | Excellent (compact) | Good (compact per page) |
| Redo complexity | Simple (apply bytes) | Complex (re-execute) | Simple (apply to page) |
| Undo complexity | Simple (restore bytes) | Complex (inverse op) | Simple (inverse to page) |
| Concurrency | Problematic (page conflicts) | Problematic (logical conflicts) | Excellent (page-level clarity) |
Let's examine each approach in detail to understand why ARIES chose physiological logging.
Physical logging records the exact before-image and after-image of every byte modified on a page. This is the most intuitive approach: to undo, restore the before-image; to redo, apply the after-image.
Example of a Physical Log Record:
```
LSN: 1001
Transaction: T42
Type: UPDATE
Page: 157
Offset: 2048
Length: 256
Before-Image: [256 bytes of original data]
After-Image: [256 bytes of new data]
```
Advantages of Physical Logging:
Idempotent Operations: Redo and undo are always idempotent. Applying the same after-image twice produces the same result. This simplifies recovery.
Simple Recovery Logic: Recovery doesn't need to understand the semantics of operations. Just copy bytes.
Format Independence: The log doesn't depend on higher-level data structures. If the record format changes, old physical logs still work.
No Re-execution: Redo doesn't re-execute complex operations; it directly applies byte changes.
Fundamental Problems with Physical Logging:
Space Inefficiency: A single row insert might modify the page header (updating free space pointers), a slot directory entry, and the actual row data. Physical logging records all these bytes, even though they could be derived from "insert this row."
Worse for Large Operations: Index splits that modify thousands of bytes across multiple pages generate enormous log records.
Byte-Level Conflicts: Two transactions modifying different rows on the same page might have overlapping byte ranges due to page reorganization. Physical logging makes conflict detection complex.
Recovery Sequencing: If Transaction A and Transaction B both modify the same page, their physical log records must be applied in exactly the original order. Applying them in any other order writes bytes at the wrong offsets.
Physical logging is extremely sensitive to exact byte positions. If Page 42 is modified by T1 (which shifts data) and then by T2, replaying T2's log record from a clean page would apply changes at wrong offsets. This makes physical logging incompatible with most concurrency scenarios.
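The sensitivity to byte positions can be illustrated with a minimal sketch. It assumes a toy page modeled as a string (one character per byte) and a hypothetical `PhysRecord` type; neither comes from ARIES or any real system.

```typescript
// Hypothetical byte-level record: overwrite the bytes at `offset` with `after`.
interface PhysRecord {
  offset: number;
  before: string; // one character stands in for one byte
  after: string;
}

// Physical redo copies the after-image onto the page; it knows nothing about rows.
function physicalRedo(pageBytes: string, rec: PhysRecord): string {
  return (
    pageBytes.slice(0, rec.offset) +
    rec.after +
    pageBytes.slice(rec.offset + rec.after.length)
  );
}

// Original page: Row A at offset 0, Row B at offset 4, free space at offset 8.
const originalPage = "AAAABBBB____";

// T1 inserts Row C after Row A, which shifts Row B from offset 4 to offset 8.
const t1: PhysRecord = { offset: 4, before: "BBBB____", after: "CCCCBBBB" };
// T2 then rewrites Row B at its new offset.
const t2: PhysRecord = { offset: 8, before: "BBBB", after: "bbbb" };

const inOrder = physicalRedo(physicalRedo(originalPage, t1), t2);
const t2Alone = physicalRedo(originalPage, t2);
const t2Twice = physicalRedo(inOrder, t2);

console.log(inOrder); // AAAACCCCbbbb
console.log(t2Alone); // AAAABBBBbbbb (Row B untouched, free space overwritten)
console.log(t2Twice === inOrder); // true: reapplying an after-image is harmless
```

Replaying the records in log order gives the right page, and reapplying an after-image changes nothing, but T2's record is only meaningful on a page that already contains T1's change.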
The Offset Problem with Physical Logging:

```
Initial Page State (Page 42):
┌─────────────────────────────────────────────────────────┐
│ Header │ Slot Dir │ Free Space │ Row A  │ Row B         │
│ (100B) │ (50B)    │ (500B)     │ (100B) │ (250B)        │
└─────────────────────────────────────────────────────────┘
                                         ↑
                    T1 inserts Row C here (at offset 650)

After T1's Insert:
┌──────────────────────────────────────────────────────────────────┐
│ Header │ Slot Dir │ Free Space │ Row A  │ Row C  │ Row B         │
│ (100B) │ (60B)    │ (350B)     │ (100B) │ (150B) │ (250B)        │
└──────────────────────────────────────────────────────────────────┘
                                                  ↑
                                   Row B shifted to offset 810

Now T2 updates Row B. Physical log records:
  T2 UPDATE: Page 42, offset 810, before=[...], after=[...]

If we replay T2's log on the ORIGINAL page:
  Offset 810 might be in free space or Row A - WRONG!

Physical logging requires replaying T1 first, always.
```

Logical logging swings to the opposite extreme: record the high-level database operation, not the physical changes. To redo, re-execute the operation; to undo, execute the inverse operation.
Example of a Logical Log Record:
```
LSN: 1001
Transaction: T42
Type: INSERT
Table: orders
Values: (order_id=101, product='Widget', quantity=50)
Index Updates: [derived from operation]
```
Advantages of Logical Logging:
Extreme Space Efficiency: A complex operation that modifies many pages is recorded once, compactly. An index rebuild affecting millions of bytes might be a single log record.
Operation Semantics: The log is human-readable and semantically meaningful. Debugging and auditing are easier.
Schema Independence: The log doesn't care about page layouts. Migration to different storage formats is simpler.
Natural for Replication: Logical logs (like MySQL's binlog in statement mode) can be replayed on different physical configurations.
Fundamental Problems with Logical Logging:
Concurrency Complexity: If T1 inserts a row and T2 updates it, logical replay must understand the dependency. What if T1 aborts? Does T2's update still make sense? Logical logging requires sophisticated application-level reasoning.
Non-Atomic Operations: A single SQL INSERT might touch multiple index pages. If the system crashes mid-operation, which pages were modified? Logical logging doesn't tell us; we'd need to examine all possibly-affected pages.
Undo Complexity: The inverse of "INSERT row" is "DELETE row," but what if another transaction modified the row in between? Logical undo requires reasoning about current state, not just log contents.
Non-Determinism: Some operations involve system state (current time, random values, sequence numbers). Replaying the operation might produce different results.
With pure logical logging, if a crash occurs during a multi-page operation, recovery doesn't know which pages were modified. It might need to examine the entire database to find incomplete changes. This is unacceptable for large databases.
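A rough sketch of why re-executing a logical record is risky after a partial flush. The `TinyDb` shape and `logicalRedoInsert` are hypothetical stand-ins for a table page and an index page, not any real engine's structures.

```typescript
interface TinyDb {
  tableRows: string[];    // stands in for the table page
  indexEntries: string[]; // stands in for an index page
}

// Logical redo re-executes the whole operation against whatever state survived.
function logicalRedoInsert(db: TinyDb, row: string): TinyDb {
  return {
    tableRows: db.tableRows.concat([row]),
    indexEntries: db.indexEntries.concat([row]),
  };
}

// Hypothetical on-disk state after a crash: table page flushed, index page not.
const afterCrash: TinyDb = { tableRows: ["order101"], indexEntries: [] };
const recoveredDb = logicalRedoInsert(afterCrash, "order101");

console.log(recoveredDb.tableRows.length);    // 2: the row is now duplicated
console.log(recoveredDb.indexEntries.length); // 1
```

Because the log record says nothing about individual pages, blind re-execution repairs the index but duplicates the row that had already reached disk.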
The Multi-Page Problem with Logical Logging:

```
Operation: INSERT INTO orders VALUES (101, 'Widget', 50)

This affects:
  - Table page (insert the row)
  - Primary index page (add entry)
  - Secondary index page 1 (product index)
  - Secondary index page 2 (order_id index)

Logical Log Record:
  INSERT orders (101, 'Widget', 50)

CRASH occurs after modifying table page and primary index,
but BEFORE secondary indexes.

Recovery sees: "INSERT orders (101, 'Widget', 50)"

Questions recovery cannot answer:
  - Which pages were already modified?
  - Was the table page written to disk?
  - Was the primary index written to disk?
  - Should we redo the entire operation?
  - How do we avoid inserting duplicate index entries?

Without page-level tracking, recovery is extremely complex.
```

The Core Issue: Action vs. Page Boundaries:
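Logical logging records actions (SQL statements, operations), but recovery needs to reason about pages (what's on disk vs. what should be). This mismatch is the root of the problems described above: recovery cannot tell which pages an action touched, or which of those changes reached disk.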
This is the insight that leads to physiological logging.
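Physiological logging combines the best of both worlds: each log record is physical to a page (it names the specific page it modifies) and logical within that page (it describes the operation, not the bytes).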
The term "physiological" captures this duality: physical targeting, logical content.
Example of a Physiological Log Record:
```
LSN: 1001
Transaction: T42
Type: INSERT
PageID: 157
Operation: insert_record
Slot: 7
Record: (101, 'Widget', 50)
```
Key Properties:
Page-Level Scope: Each log record affects exactly one page. No ambiguity about which pages were modified.
Logical Within Page: The operation (insert, delete, update) is described logically within the page's context. "Insert into slot 7" is meaningful whenever page 157 is in the state that preceded this record, regardless of where its rows sit at the byte level.
Slot-Based, Not Offset-Based: Instead of byte offsets, physiological logging uses slot numbers or key values. This survives page reorganization.
Idempotent With LSN Check: Redo checks pageLSN < log record LSN. If already applied, skip. This provides idempotency.
Operation Inverses Are Simple: Undo of "insert slot 7" is "delete slot 7." The inverse is well-defined within the page context.
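To make the last property concrete, here is a minimal sketch of deriving the undo operation from a page-scoped record. The `PageOp` type and `inverseOf` function are hypothetical names chosen for illustration; a real system would encode this in its log record format.

```typescript
// Hypothetical page-scoped operations; each names exactly one page and slot.
type PageOp =
  | { kind: "insert"; pageId: number; slot: number; row: string }
  | { kind: "delete"; pageId: number; slot: number; row: string }
  | { kind: "update"; pageId: number; slot: number; before: string; after: string };

// The inverse stays on the same page and slot; only the operation flips.
function inverseOf(op: PageOp): PageOp {
  switch (op.kind) {
    case "insert":
      return { kind: "delete", pageId: op.pageId, slot: op.slot, row: op.row };
    case "delete":
      return { kind: "insert", pageId: op.pageId, slot: op.slot, row: op.row };
    case "update":
      return {
        kind: "update",
        pageId: op.pageId,
        slot: op.slot,
        before: op.after,
        after: op.before,
      };
  }
}

const insertOp: PageOp = { kind: "insert", pageId: 157, slot: 7, row: "(101, 'Widget', 50)" };
const undoOp = inverseOf(insertOp);

console.log(undoOp.kind, undoOp.pageId, undoOp.slot); // delete 157 7
```

Note that a delete record has to carry the deleted row so that its inverse (an insert) has something to put back.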
Physiological Log Record Structure:

```
PHYSIOLOGICAL LOG RECORD
════════════════════════
LSN: 1001            ← Log Sequence Number (ordering)
PrevLSN: 987         ← Previous LSN for this transaction
TransactionID: T42   ← Transaction identifier
Type: UPDATE         ← Record type

┌────────── PHYSICAL PORTION ──────────┐
│ PageID: 157                          │ ← Specific page affected
└──────────────────────────────────────┘

┌────────── LOGICAL PORTION ───────────┐
│ Operation: update_record             │ ← What operation
│ Slot: 7                              │ ← Slot-level addressing
│ Before: (100, 'Old', 25)             │ ← For UNDO
│ After: (100, 'New', 30)              │ ← For REDO
└──────────────────────────────────────┘

UndoNextLSN: null    ← For CLRs, points to next undo LSN

WHY THIS WORKS:
═══════════════
1. Physical Portion (PageID: 157):
   - Recovery knows EXACTLY which page to fetch
   - No ambiguity about affected pages
   - Can check pageLSN to determine if redo needed

2. Logical Portion (Slot 7):
   - Survives page compaction and reorganization
   - Row can move within page; slot remains valid
   - Simple inverse operation (update_record → restore before)
```

Pages frequently reorganize internally, compacting free space and shifting rows. Slot numbers remain stable through these reorganizations. A row at 'slot 7' might move from byte offset 2048 to byte offset 1536, but it's still 'slot 7'. Physiological logging uses this stable addressing.
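The stability of slot addressing can be sketched with a toy slotted page. Everything here is an assumption made for illustration: fixed-length rows, a page body modeled as a string, and the hypothetical helpers `readSlot`, `updateSlot`, and `compact`.

```typescript
const ROW_LEN = 4; // fixed-length rows keep the sketch short (an assumption)

interface SlottedPage {
  offsets: number[]; // offsets[slot] = byte offset where that slot's row starts
  bytes: string;     // page body, one character per byte
}

function offsetOf(page: SlottedPage, slot: number): number {
  const off = page.offsets[slot];
  if (off === undefined) throw new Error("no such slot: " + slot);
  return off;
}

function readSlot(page: SlottedPage, slot: number): string {
  const off = offsetOf(page, slot);
  return page.bytes.slice(off, off + ROW_LEN);
}

// Slot-addressed update: look the offset up at apply time instead of logging it.
function updateSlot(page: SlottedPage, slot: number, row: string): SlottedPage {
  const off = offsetOf(page, slot);
  return {
    offsets: page.offsets.slice(),
    bytes: page.bytes.slice(0, off) + row + page.bytes.slice(off + ROW_LEN),
  };
}

// Compaction packs rows to the front in slot order; slot numbers do not change.
function compact(page: SlottedPage): SlottedPage {
  let bytes = "";
  const offsets: number[] = [];
  page.offsets.forEach((off) => {
    offsets.push(bytes.length);
    bytes += page.bytes.slice(off, off + ROW_LEN);
  });
  return { offsets: offsets, bytes: bytes };
}

const scattered: SlottedPage = { offsets: [8, 0], bytes: "bbbb____aaaa" };
const packed = compact(scattered);

const viaScattered = readSlot(updateSlot(scattered, 0, "AAAA"), 0);
const viaPacked = readSlot(updateSlot(packed, 0, "AAAA"), 0);

console.log(offsetOf(scattered, 0), offsetOf(packed, 0)); // 8 0
console.log(viaScattered, viaPacked); // AAAA AAAA
```

The same "update slot 0" request lands on the right row in both layouts, even though the row's byte offset changed from 8 to 0.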
Physiological logging provides significant advantages over both pure physical and pure logical approaches. These advantages explain why ARIES and many modern database systems use this technique.
Space Efficiency Example:
Consider inserting a 100-byte row into a page:
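Physical Logging Records: before- and after-images of every byte range the insert touches (page header, slot directory entry, and row data).
Total Physical Log Size: ~240 bytes
Physiological Logging Records: the page ID, the slot, and the inserted row.
Total Physiological Log Size: ~120 bytes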
Savings: 50% for this simple case, more for complex operations
Concurrency Advantage:
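Suppose two transactions modify different rows on the same page concurrently, for example one inserting a row while the other changes an existing row.
With physiological logging, each change produces its own log record addressed by page and slot.
These log records are independent: neither refers to byte positions that the other may have changed. Redo still applies them in LSN order, but either transaction can be undone without disturbing the other's change, because slot-based addressing prevents conflicts.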
With byte-level physical logging, the two operations might have overlapping byte ranges (if the INSERT caused row shifts), creating complex dependencies.
During redo, ARIES fetches the page and checks: is pageLSN < log record LSN? If yes, the log record's effect isn't on disk yet—apply it. If no, the effect is already on disk—skip it. This simple comparison provides idempotency without inspecting actual data values.
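The pageLSN test can be sketched as follows. This is a minimal illustration, not ARIES itself: `LoggedPage`, `InsertRecord`, and `redoInsert` are hypothetical names, and the operation is a slot insert that would duplicate entries if applied twice without the check.

```typescript
interface LoggedPage {
  pageLSN: number;
  slots: string[];
}

interface InsertRecord {
  lsn: number;
  slot: number;
  value: string;
}

// Redo one record: skip it when the page already reflects this LSN.
function redoInsert(page: LoggedPage, rec: InsertRecord): LoggedPage {
  if (page.pageLSN >= rec.lsn) return page; // effect already on the page
  const slots = page.slots.slice();
  slots.splice(rec.slot, 0, rec.value); // not idempotent on its own
  return { pageLSN: rec.lsn, slots: slots };
}

// Apply every record in LSN order.
function replay(page: LoggedPage, redoLog: InsertRecord[]): LoggedPage {
  let current = page;
  for (const rec of redoLog) {
    current = redoInsert(current, rec);
  }
  return current;
}

const redoLog: InsertRecord[] = [
  { lsn: 1001, slot: 0, value: "k10" },
  { lsn: 1002, slot: 1, value: "k20" },
];

// Hypothetical page found on disk: it already contains the effect of LSN 1001.
const diskPage: LoggedPage = { pageLSN: 1001, slots: ["k10"] };

const once = replay(diskPage, redoLog);
const twice = replay(once, redoLog);

console.log(once.slots.join(","), once.pageLSN);   // k10,k20 1002
console.log(twice.slots.join(","), twice.pageLSN); // k10,k20 1002
```

Running the redo pass a second time changes nothing, which is what makes recovery safe to restart after a crash during recovery.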
Physiological logging has an important interaction with ARIES's page latching model. Understanding this interaction is crucial for appreciating how ARIES achieves high concurrency.
The Key Insight: Page-Level Atomicity
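ARIES treats each physiological log record as atomic with respect to the page it modifies. The page is latched exclusively while the log record is generated and the change is applied, so the two happen as a single step from the point of view of other transactions.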
This provides a clear boundary: either the modification happened completely, or it didn't happen at all—there's no in-between state visible to other transactions or to recovery.
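```typescript
// Physiological update with proper latching.
// The types and interfaces below are illustrative stand-ins for the page,
// buffer manager, and log manager that the sketch relies on.
type PageId = number;
type Slot = number;
type Value = string;
type LSN = number;

interface Page {
  latchExclusive(): void;
  unlatchExclusive(): void;
  getSlotValue(slot: Slot): Value;
  setSlotValue(slot: Slot, value: Value): void;
  setPageLSN(lsn: LSN): void;
  markDirty(): void;
}

interface UpdateLogRecord {
  transactionId: number;
  pageId: PageId;
  slot: Slot;
  before: Value;
  after: Value;
}

interface BufferManager {
  getPage(pageId: PageId): Page;
}

interface LogManager {
  append(record: UpdateLogRecord): LSN;
}

function updateRow(
  bufferManager: BufferManager,
  logManager: LogManager,
  transactionId: number,
  pageId: PageId,
  slot: Slot,
  newValue: Value
): LSN {
  // Step 1: Acquire page latch (short-term, no deadlock concern)
  const page = bufferManager.getPage(pageId);
  page.latchExclusive();

  try {
    // Step 2: Read current value (for undo information)
    const oldValue = page.getSlotValue(slot);

    // Step 3: Generate log record WHILE HOLDING LATCH
    // This ensures no other modification can interleave
    const lsn = logManager.append({
      transactionId: transactionId,
      pageId: pageId,
      slot: slot,
      before: oldValue,
      after: newValue,
    });

    // Step 4: Apply modification
    page.setSlotValue(slot, newValue);
    page.setPageLSN(lsn); // Record that this LSN is now reflected
    page.markDirty();
    return lsn;
  } finally {
    // Step 5: Release latch
    page.unlatchExclusive();
  }
}

// CRITICAL INVARIANT:
// The modification and log record creation are atomic from the
// perspective of other transactions and recovery.
//
// No one can see the modification without the log record existing,
// and no one can see a partial modification.
```

Why This Matters: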
Recovery Correctness: Recovery can reason about complete operations. A page is either in the state before the log record, or after—never in between.
Concurrency Safety: Other transactions either see the complete effect or no effect. No partial visibility.
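No Partial Operations in Memory: Because each change is applied under the latch, a page image in the buffer pool never reflects half of an operation. Latching alone does not guard against torn pages (a page only partly written to disk when the system crashes). Physiological redo assumes an internally consistent page, so systems usually pair it with a separate safeguard such as page checksums or full-page images.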
Simple Reasoning: Each log record is self-contained for its page. No need to group multiple log records atomically.
Latches vs. Locks:
It's important to distinguish latches from locks:
| Aspect | Latches | Locks |
|---|---|---|
| Purpose | Physical consistency | Logical isolation |
| Duration | Microseconds | Milliseconds to seconds |
| Deadlock | Avoided by coding convention | Detected and resolved |
| Scope | Pages, buffers | Rows, tables |
| Held across I/O? | No | Yes |
Latches protect the physical integrity of data structures during modification. Locks protect the logical isolation of transactions. Physiological logging works at the latch level—each log record represents a single latch-protected modification.
When a page is modified, its pageLSN is updated to the log record's LSN while the page latch is held. This creates an unbreakable relationship: if you see pageLSN = 1001, you know log record 1001's effect is on the page. If the copy of the page on disk has pageLSN = 1000, log record 1001's effect is not in that copy, and redo will apply it.
A single high-level operation often modifies multiple pages. Consider an index split: the original page is modified, a new page is allocated, and a parent page is updated. How does physiological logging handle this?
The Answer: Multiple Log Records
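Each page modification generates its own physiological log record. A B+-tree index split might generate one record that fills the newly allocated page, one or more records that modify the original page, and one record that updates the parent page.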
Each record is independently undo-able and redo-able at the page level.
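Because every record names one page, redo can decide record by record which parts of a multi-page operation still need applying. The sketch below assumes a four-record split touching pages A, B, and a parent P, plus a made-up set of on-disk pageLSNs after a crash; `SplitRecord` and `recordsToRedo` are hypothetical names.

```typescript
interface SplitRecord {
  lsn: number;
  pageId: string;
  op: string;
}

// Return the LSNs whose effects are missing from the on-disk pages.
function recordsToRedo(
  splitLog: SplitRecord[],
  diskPageLSN: { [pageId: string]: number }
): number[] {
  const needed: number[] = [];
  for (const rec of splitLog) {
    const onDisk = diskPageLSN[rec.pageId];
    // A page that was never written counts as older than every record.
    if (onDisk === undefined || onDisk < rec.lsn) {
      needed.push(rec.lsn);
    }
  }
  return needed;
}

const splitLog: SplitRecord[] = [
  { lsn: 1001, pageId: "B", op: "insert_slot_range" },
  { lsn: 1002, pageId: "A", op: "delete_slot_range" },
  { lsn: 1003, pageId: "A", op: "insert_slot" },
  { lsn: 1004, pageId: "P", op: "insert_separator" },
];

// Hypothetical crash state: B and A were flushed part-way through, P was not.
const diskState: { [pageId: string]: number } = { A: 1002, B: 1001, P: 990 };

const pendingLSNs = recordsToRedo(splitLog, diskState);
console.log(pendingLSNs.join(",")); // 1003,1004
```

Only the second change to page A and the parent update are replayed; the records whose effects already reached disk are skipped.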
B+-Tree Page Split: Physiological Logging

```
Before Split:
        ┌─────────────┐
        │  Parent P   │
        │  [key: 50]  │
        └──────┬──────┘
               │
        ┌──────▼──────┐
        │   Page A    │
        │ [10,20,30,  │
        │  40,50,60,  │
        │  70,80,90]  │  ← OVERFLOW!
        └─────────────┘

After Split (insert key 55):
             ┌─────────────────┐
             │    Parent P     │
             │  [key: 50, 60]  │  ← New separator added
             └────────┬────────┘
            ┌─────────┴─────────┐
     ┌──────▼──────┐     ┌──────▼──────┐
     │   Page A    │     │   Page B    │
     │ [10,20,30,  │     │ [60,70,80,  │
     │  40,50,55]  │     │  90]        │
     └─────────────┘     └─────────────┘

Log Records Generated (in order):

LSN 1001: [Page B] insert_slot_range(slots 0-3, keys [60,70,80,90])
          UndoInfo: delete_slot_range(slots 0-3)

LSN 1002: [Page A] delete_slot_range(slots 5-8, keys [60,70,80,90])
          UndoInfo: insert_slot_range(slots 5-8, keys [60,70,80,90])

LSN 1003: [Page A] insert_slot(slot 5, key 55)
          UndoInfo: delete_slot(slot 5)

LSN 1004: [Parent P] insert_separator(key 60, child Page B)
          UndoInfo: delete_separator(key 60)

RECOVERY PROPERTIES:
════════════════════
- Each log record is self-contained for one page
- Can redo any subset based on pageLSN comparisons
- Can undo in reverse order for transaction rollback
- Crash at any point leaves each page in consistent state
```

Nested Top Actions (Optimization):
Some multi-page operations should not be undone even if the containing transaction aborts. For example, if we allocate a new page for an index split, we don't want transaction abort to "un-allocate" the page—that would corrupt the index structure.
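ARIES supports Nested Top Actions for this case: the log records of the structural change are followed by a dummy compensation log record (CLR) whose UndoNextLSN points to the record written just before the nested top action began. If the transaction later rolls back, undo follows that pointer and skips the structural change.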
This allows certain structural operations to commit independently of the containing transaction.
Index splits and page allocations affect structural consistency (the index is navigable). Row inserts affect data consistency (the data is current). ARIES can protect structural changes even when data changes are rolled back, using nested top actions and careful log record ordering.
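A simplified sketch of how rollback can skip a nested top action by following UndoNextLSN. The record layout and the `rollback` function are illustrative assumptions; a real ARIES rollback also writes its own CLRs and applies the undo operations to pages, which is omitted here.

```typescript
interface TxnLogRecord {
  lsn: number;
  prevLSN: number | null;     // previous record of the same transaction
  kind: "update" | "clr";
  undoNextLSN: number | null; // only meaningful for CLRs
  desc: string;
}

// Walk the transaction's chain backwards and list what would be undone.
function rollback(txnLog: TxnLogRecord[], lastLSN: number): string[] {
  const byLSN: { [lsn: number]: TxnLogRecord } = {};
  txnLog.forEach((r) => {
    byLSN[r.lsn] = r;
  });

  const undone: string[] = [];
  let cur: number | null = lastLSN;
  while (cur !== null) {
    const rec = byLSN[cur];
    if (rec === undefined) throw new Error("missing log record " + cur);
    if (rec.kind === "clr") {
      cur = rec.undoNextLSN; // jump over everything the CLR covers
    } else {
      undone.push(rec.desc);
      cur = rec.prevLSN;
    }
  }
  return undone;
}

const txnLog: TxnLogRecord[] = [
  { lsn: 1000, prevLSN: null, kind: "update", undoNextLSN: null, desc: "insert row into table page" },
  { lsn: 1001, prevLSN: 1000, kind: "update", undoNextLSN: null, desc: "split: fill page B" },
  { lsn: 1002, prevLSN: 1001, kind: "update", undoNextLSN: null, desc: "split: trim page A" },
  { lsn: 1003, prevLSN: 1002, kind: "update", undoNextLSN: null, desc: "split: add separator to parent" },
  { lsn: 1004, prevLSN: 1003, kind: "clr", undoNextLSN: 1000, desc: "dummy CLR ending nested top action" },
  { lsn: 1005, prevLSN: 1004, kind: "update", undoNextLSN: null, desc: "insert key into page A" },
];

const undoneOps = rollback(txnLog, 1005);
console.log(undoneOps.join(" | "));
// insert key into page A | insert row into table page
```

The key insert and the row insert are undone, while the three split records are skipped, so the index keeps its new structure after the abort.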
Physiological logging represents a fundamental innovation in database recovery design. By combining physical page addressing with logical operation description, it achieves the advantages of both approaches while avoiding their pitfalls.
What's Next:
With steal/no-force and physiological logging in place, ARIES needs an efficient way to bound recovery time. The next page examines fuzzy checkpoints—ARIES's approach to checkpointing that avoids disrupting normal processing while still providing a recovery starting point.
You now understand physiological logging—the hybrid approach that ARIES uses to record database modifications efficiently while maintaining clear recovery semantics. This technique is fundamental to achieving both space efficiency and recovery simplicity. Next, we'll explore fuzzy checkpoints.