Loading learning content...
The classic transaction model presents an all-or-nothing choice: either all of a transaction's changes commit, or none of them do. But real-world applications often need finer control. Consider a complex order-processing transaction that fails during shipping calculation—should the entire order be discarded, or just the shipping step?
Nested transactions and savepoints provide this finer granularity. ARIES supports partial rollback through sophisticated extensions to its logging and undo mechanisms. Understanding these capabilities reveals how ARIES evolved from a recovery algorithm into a comprehensive transaction management framework.
By the end of this page, you will understand the concepts of nested transactions and savepoints, how ARIES logs and recovers partial rollbacks, the role of Compensation Log Records (CLRs) in multi-level undo, implementation considerations and limitations, and real-world use cases for nested transaction support.
Traditional flat transactions treat all operations within a transaction as a single atomic unit. This simplicity has costs in real applications.
Scenario: Order Processing
BEGIN TRANSACTION
1. Reserve inventory (succeeds)
2. Calculate pricing (succeeds)
3. Apply discounts (succeeds)
4. Calculate shipping (FAILS - carrier API down)
With flat transactions, the failure at step 4 forces rollback of steps 1-3. The inventory reservation is released, pricing is discarded, and the user must start over. But what if we could retry just step 4?
Savepoints provide the solution:
BEGIN TRANSACTION
1. Reserve inventory (succeeds)
2. Calculate pricing (succeeds)
SAVEPOINT after_pricing
3. Apply discounts (succeeds)
4. Calculate shipping (FAILS)
ROLLBACK TO after_pricing -- undo steps 3-4 only
-- Steps 1-2 are preserved
-- Retry or proceed differently
This allows applications to structure their transactions with fallback points, improving both user experience and system resilience.
| Scenario | Flat Transaction | With Savepoints |
|---|---|---|
| Failure mid-transaction | Complete rollback | Rollback to savepoint |
| Exploratory processing | Must complete fully | Can try alternatives |
| Error handling | Abort entire operation | Recover and continue |
| Long transactions | All-or-nothing risk | Checkpoint progress |
| Complex workflows | Monolithic structure | Modular recovery |
Standard SQL supports savepoints with SAVEPOINT name and ROLLBACK TO [SAVEPOINT] name. These are widely implemented in PostgreSQL, Oracle, SQL Server, and MySQL/InnoDB—all using ARIES-style recovery under the hood.
Nested transactions extend the transaction model to allow transactions within transactions. There are several flavors of nesting, with different semantics and complexity.
Types of Transaction Nesting:
| Type | Description | Commit Behavior | ARIES Support |
|---|---|---|---|
| Flat | No nesting; all-or-nothing | Single commit point | Full support |
| Savepoints | Named rollback points within transaction | Single commit; partial rollback | Full support |
| Closed Nested | Sub-transactions that commit to parent only | Sub-commit to parent; parent commits atomically | Full support |
| Open Nested | Sub-transactions that commit independently | Sub-commit is durable even if parent aborts | Partial support (via nested top actions) |
| Saga (Compensating) | Long-lived with compensating actions | Each step durable; compensate on failure | Application-level typically |
Savepoints (Most Common Form):
A savepoint is a named marker within a transaction that establishes a partial rollback point:
Savepoints can be nested: you can have savepoint B within the scope of savepoint A. Rolling back to A also undoes B's work.
1234567891011121314151617181920212223242526272829303132333435363738394041
-- Nested savepoints exampleBEGIN TRANSACTION; INSERT INTO accounts (id, balance) VALUES (1, 1000);SAVEPOINT sp1; INSERT INTO transactions (account_id, amount) VALUES (1, 100); UPDATE accounts SET balance = balance + 100 WHERE id = 1; SAVEPOINT sp2; INSERT INTO audit_log (action) VALUES ('transfer started'); -- Something goes wrong here ROLLBACK TO SAVEPOINT sp2; -- audit_log insert is undone, but account update remains INSERT INTO audit_log (action) VALUES ('transfer completed'); COMMIT;-- Final state: account 1 has balance 1100, transactions has 1 row,-- audit_log has 'transfer completed' only ------------------------------------------------------------------------------- -- Alternative: rolling back to sp1 undoes everything after sp1BEGIN TRANSACTION; INSERT INTO accounts (id, balance) VALUES (1, 1000);SAVEPOINT sp1; INSERT INTO transactions (account_id, amount) VALUES (1, 100); UPDATE accounts SET balance = balance + 100 WHERE id = 1; -- Oops, wrong account ROLLBACK TO SAVEPOINT sp1; -- Now: account exists with balance 1000, no transactions row INSERT INTO transactions (account_id, amount) VALUES (1, 200); UPDATE accounts SET balance = balance + 200 WHERE id = 1; COMMIT;-- Final state: account 1 has balance 1200, transactions has 1 row (200)Savepoints only affect undo—they don't create isolation barriers. Other transactions can see uncommitted changes (depending on isolation level) regardless of savepoint structure. Savepoints are about rollback granularity, not visibility.
ARIES's logging mechanism naturally supports partial rollback because each log record is independently undo-able. The key is tracking which log records were generated after each savepoint.
Savepoint Implementation:
When a savepoint is created:
When rolling back to a savepoint:
Transaction Log with Savepoints: Transaction T42 Log Records:═══════════════════════════════════════════════════════════════════════════ LSN PrevLSN Type Page Operation──── ─────── ────────────── ──── ─────────────────────────────────1000 null BEGIN - Transaction T42 begins1005 1000 INSERT P1 Insert row (id=1, val=100)1010 1005 UPDATE P2 Update row (id=5, old=50 new=60)1015 1010 SAVEPOINT - *** Savepoint 'sp1' → LSN 1015 ***1020 1015 INSERT P3 Insert row (id=2, val=200)1025 1020 INSERT P1 Insert row (id=3, val=300)1030 1025 SAVEPOINT - *** Savepoint 'sp2' → LSN 1030 ***1035 1030 UPDATE P2 Update row (id=5, old=60 new=70)1040 1035 DELETE P1 Delete row (id=1) ═══════════════════════════════════════════════════════════════════════════ ROLLBACK TO sp2:- Undo LSN 1040: Restore row (id=1) to P1- Undo LSN 1035: Restore row (id=5) to value 60- Stop at LSN 1030 (sp2's LSN)- CLRs are written for each undo operation ROLLBACK TO sp1:- Undo LSN 1040: Restore row (id=1)- Undo LSN 1035: Restore row (id=5) to value 60 - Undo LSN 1030: (SAVEPOINT record - no undo needed)- Undo LSN 1025: Delete row (id=3) from P1- Undo LSN 1020: Delete row (id=2) from P3- Stop at LSN 1015 (sp1's LSN) After rollback to sp1:- Rows at 1005 and 1010 remain- Everything after 1015 is undoneSavepoint Log Records:
ARIES can log savepoints explicitly or track them implicitly:
Explicit Approach:
Implicit Approach:
Most systems use the explicit approach for critical savepoints, implicit for transient ones.
Savepoint state is typically not persistent across crashes. If a crash occurs, the entire transaction is rolled back during recovery—not just to the last savepoint. Savepoints facilitate application-level error handling, not crash recovery.
Compensation Log Records (CLRs) are critical to nested transaction support. They ensure that undo operations are not repeated if a crash occurs during rollback—whether partial or complete.
The Problem CLRs Solve:
Suppose we're rolling back to savepoint sp1 and have undone 5 of 10 log records. A crash occurs. During recovery:
Without CLRs, we might undo the same operation twice, potentially corrupting data.
CLR Structure:
A CLR records that an undo operation was performed:
CLR:
LSN: 1042
Type: CLR (Compensation Log Record)
TransactionID: T42
PageID: P1
Operation: Redo-only insert (restoring deleted row)
UndoNextLSN: 1035 ← KEY FIELD
The UndoNextLSN points to the next log record to undo if rolling back continues. It essentially says: "I have compensated for record X; if you need to undo more, continue from Y."
CLRs are redo-only: They have no undo information because we never want to undo an undo. If we crash after writing a CLR, recovery redoes it (if needed), ensuring the compensation is complete.
CLR Chain During Partial Rollback: Original Log:═══════════════════════════════════════════════════════════════════════LSN PrevLSN Type UndoNextLSN Operation──── ─────── ──────── ─────────── ────────────────────────1000 null BEGIN - Transaction begins1005 1000 UPDATE - X = X + 101010 1005 SAVEPOINT - sp1 created1015 1010 UPDATE - Y = Y + 201020 1015 UPDATE - Z = Z + 30═══════════════════════════════════════════════════════════════════════ ROLLBACK TO sp1: Step 1: Undo LSN 1020 (Z = Z + 30) Write CLR: LSN PrevLSN Type UndoNextLSN Operation1025 1020 CLR 1015 Z = Z - 30 (redo-only) Step 2: Undo LSN 1015 (Y = Y + 20) Write CLR: 1030 1025 CLR 1010 Y = Y - 20 (redo-only) Step 3: Reach sp1 (LSN 1010) - STOP ═══════════════════════════════════════════════════════════════════════Final Log State:═══════════════════════════════════════════════════════════════════════LSN PrevLSN Type UndoNextLSN Operation──── ─────── ──────── ─────────── ────────────────────────1000 null BEGIN - Transaction begins1005 1000 UPDATE - X = X + 101010 1005 SAVEPOINT - sp1 created1015 1010 UPDATE - Y = Y + 201020 1015 UPDATE - Z = Z + 301025 1020 CLR 1015 Z = Z - 30 (compensation)1030 1025 CLR 1010 Y = Y - 20 (compensation) If CRASH occurs now and recovery runs:- Analysis: T42 is active- Redo: Apply 1025 and 1030 CLRs if needed- Undo: Follow UndoNextLSN chain 1030 → UndoNextLSN=1010 (skip to 1010) 1010 is SAVEPOINT, then 1005... Only undoes operations BEFORE sp1 (like 1005)The UndoNextLSN in a CLR 'jumps over' the log record it compensated. This prevents any attempt to undo an already-undone operation. The chain of UndoNextLSN values precisely defines what remains to be undone.
Some operations should not be undone even if the containing transaction aborts. Consider allocating a new page for a B+-tree index split:
Should step 3 be undone? If we "un-allocate" the page, the B+-tree becomes structurally inconsistent. But if we keep the page, T's row shouldn't be in the index.
Nested Top Actions solve this: certain operations commit independently of the containing transaction.
How Nested Top Actions Work:
If the containing transaction aborts:
The key insight: the dummy CLR makes the NTA invisible to undo.
Nested Top Action for Page Split: Transaction T42 inserts row causing page split:═══════════════════════════════════════════════════════════════════════LSN Type UndoNextLSN Operation──── ────────── ─────────── ────────────────────────────────────1000 BEGIN - Transaction T42 begins1005 INSERT - Row inserted (triggers overflow) ------ NESTED TOP ACTION START ------1010 ALLOCATE - Allocate new page P991015 SPLIT - Redistribute entries, P42 → P42 + P991020 UPDATE - Update parent with new separator 1025 DUMMY_CLR 1005 ← UndoNextLSN jumps to BEFORE NTA------ NESTED TOP ACTION END -------- 1030 INSERT - Original row goes to appropriate page═══════════════════════════════════════════════════════════════════════ If T42 ABORTS:1. Start undo at lastLSN (1030)2. Undo 1030: Remove the inserted row3. Follow PrevLSN to 1025 (DUMMY_CLR)4. UndoNextLSN = 1005, skip 1020, 1015, 1010 entirely!5. Undo 1005: Remove original row6. Done Result:- T42's row insertions are undone ✓- Page split (NTA) is preserved ✓- Index structure remains consistent ✓ WHY THIS IS SAFE:═════════════════The page split is a purely structural change. Even if we undo T42's data changes, the split doesn't corrupt anything—we just have a pagewith more free space than necessary. If we undid the split, we'd haveto acquire locks and reorganize the tree, risking deadlocks.Nested top actions must be designed carefully. The operation must be semantically independent—its persistence shouldn't violate any invariants even if surrounding operations are undone. Page allocations and index structure changes fit this criterion; data modifications do not.
ARIES recovery naturally handles nested transactions through its standard three-phase process, with special attention to CLRs and their UndoNextLSN chains.
Analysis Phase:
Redo Phase:
Undo Phase:
Recovery Scenario: Crash During Partial Rollback Before crash, T42 was rolling back to savepoint sp1.Had completed 2 of 4 undo operations when crash occurred. Log at crash time:═══════════════════════════════════════════════════════════════════════LSN Type PrevLSN UndoNextLSN Status──── ────────── ─────── ─────────── ──────────────────────────1000 BEGIN null - 1005 UPDATE 1000 - X = X + 101010 SAVEPOINT 1005 - sp11015 UPDATE 1010 - Y = Y + 201020 UPDATE 1015 - Z = Z + 301025 UPDATE 1020 - W = W + 401030 UPDATE 1025 - V = V + 50 -- Partial rollback started (ROLLBACK TO sp1)1035 CLR 1030 1025 V = V - 50 (compensated)1040 CLR 1035 1020 W = W - 40 (compensated) *** CRASH HERE ***═══════════════════════════════════════════════════════════════════════ RECOVERY PROCESS: 1. ANALYSIS: - T42 is active (no commit record) - T42's lastLSN = 1040 - T42's UndoNextLSN = 1020 (from CLR at 1040) 2. REDO: - Apply 1035 if needed (check pageLSN < 1035) - Apply 1040 if needed (check pageLSN < 1040) - CLRs ensure compensation is complete 3. UNDO: - Start at T42's UndoNextLSN = 1020 (NOT lastLSN!) - Undo 1020: Z = Z - 30, write CLR UndoNextLSN=1015 - Undo 1015: Y = Y - 20, write CLR UndoNextLSN=1010 - Reach 1010 (SAVEPOINT) - this was the target But wait - savepoint state was lost in crash! Recovery continues through savepoint: - Undo 1005: X = X - 10, write CLR UndoNextLSN=1000 - Complete transaction abort NOTE: Savepoint state is NOT preserved across crashes.Partial rollback in progress converts to full rollback.If a crash occurs during a transaction (even after a partial rollback to a savepoint), recovery performs a complete transaction rollback. Savepoints are a runtime convenience for application error handling, not a recovery mechanism.
Implementing nested transaction support involves careful management of savepoint state, CLR generation, and interaction with locking.
Savepoint Management:
Lock Handling During Partial Rollback:
A subtle issue: when rolling back to a savepoint, should locks acquired after the savepoint be released?
Conservative Approach (Most Common):
Aggressive Approach (Rare):
The Safe Rule: Locks are held until transaction commit (or full abort), regardless of savepoints.
Performance Considerations:
CLR Overhead: Each undo operation writes a CLR, consuming log space. Frequent partial rollbacks increase log volume.
Undo Work: Partial rollback performs real work (fetching pages, applying inverse operations). Heavy use can impact throughput.
Memory for Savepoints: Each savepoint consumes memory for tracking. Deep nesting or many savepoints adds overhead.
Recovery Time: More CLRs means more log records to process during recovery. However, CLRs are redo-only, so overhead is limited.
Savepoints are for error handling, not normal flow. Design applications so that savepoints are rarely rolled back. If you're frequently rolling back to savepoints, consider restructuring the transaction logic.
ARIES's support for nested transactions and savepoints extends the basic recovery algorithm into a comprehensive transaction management framework. Through CLRs, UndoNextLSN chains, and nested top actions, ARIES provides fine-grained control over rollback while maintaining correct recovery behavior.
What's Next:
We've examined the major ARIES features individually. The final page of this module explores performance optimizations—techniques that ARIES uses to maximize throughput while maintaining correct recovery behavior, including group commit, log compression, and parallel recovery.
You now understand how ARIES supports nested transactions through savepoints, CLRs, and nested top actions. This capability enables applications to handle errors gracefully without sacrificing correctness. Next, we'll explore performance optimizations in ARIES.