Database Management SystemsBackup Strategies

Point-in-Time Recovery

LevelAdvanced

Duration75 mins

TopicBackup Strategies

4 / 5

Target Time Specification: Precision in Point-in-Time Recovery

The Difference a Second Makes

The batch job that corrupted your database ran from 14:31:47.234 to 14:31:48.891. In that 1.657 seconds, it executed 47 transactions that left your data in an inconsistent state. But the job itself was triggered by a legitimate operation at 14:31:47.119—which you need to preserve.

Your recovery target must be between 14:31:47.119 (include) and 14:31:47.234 (exclude).

That's a window of 115 milliseconds.

This scenario illustrates why target time specification is both an art and a science. It requires understanding not just the technical mechanisms of recovery targeting, but also the investigation techniques to identify the precise moment that separates "good" from "bad" in your database's history.

This page explores the full depth of recovery target specification: the different modes available, the critical considerations around time zones and transaction boundaries, and the investigative techniques that help you identify the optimal recovery point.

What You Will Learn

By the end of this page, you will understand how to specify recovery targets with precision, navigate timezone complexities, work with transaction boundaries, use alternative target modes (LSN, XID, named restore points), and investigate incidents to determine the optimal recovery moment.

Recovery Target Fundamentals

A recovery target defines the point in the database's transaction history where replay should stop. Understanding the fundamental mechanics ensures accurate target specification.

How Target Matching Works

When you specify a recovery target, the database recovery process:

Starts from the base backup checkpoint
Replays each WAL record in sequence
For each transaction commit record, compares against the target
Stops when target condition is met (or after, depending on inclusive setting)

The critical insight: targets are evaluated against transaction commit records, not individual operations.

This means:

A long-running transaction with many operations is either fully included or fully excluded
The "time" of a transaction is its commit time, not when individual statements executed
Rolled-back transactions leave no commit record and are never "visible" at any recovery target

The Inclusive/Exclusive Decision

Most databases offer control over whether the target transaction is included:

Exclusive (Default in PostgreSQL)

recovery_target_inclusive = false

Recovery stops before the target transaction
The transaction whose commit matches the target is NOT applied
Safe default: ensures the "bad" transaction isn't included

Inclusive

recovery_target_inclusive = true

Recovery stops after applying the target transaction
The transaction whose commit matches the target IS applied
Use when you want to include a specific target transaction

Example:

Target: LSN 5/2A000400

WAL Stream:
LSN 5/29FFF800 - Transaction T1 COMMIT
LSN 5/2A000400 - Transaction T2 COMMIT  <- Target LSN
LSN 5/2A000800 - Transaction T3 COMMIT

With inclusive=false: Recovered state includes T1, excludes T2, T3
With inclusive=true:  Recovered state includes T1, T2, excludes T3

Default to Exclusive

When recovering from a known bad transaction, always use exclusive mode (the default). This ensures the problematic transaction isn't applied. Only use inclusive mode when you specifically need to include a particular transaction's effects.

Target Action Options

What should happen when the target is reached?

pause (Recommended for Production)

recovery_target_action = 'pause'

Recovery pauses with database in read-only mode
Allows verification before finalizing
Can adjust target and resume if needed
Use this in production environments

promote

recovery_target_action = 'promote'

Automatically promotes to read-write mode
No verification opportunity
Use only when automation handles verification

shutdown

recovery_target_action = 'shutdown'

Shuts down database cleanly at target
Requires manual restart
Useful for scripted recovery workflows

Timezone Considerations

Timezone handling in recovery targets is a frequent source of errors. Understanding how timestamps are interpreted and stored is essential for accurate recovery.

How Timestamps Are Stored in WAL

Internally, PostgreSQL (and most databases) store commit timestamps in UTC. When a transaction commits, the timestamp recorded is the current UTC time, regardless of the session's timezone setting.

Specifying Recovery Target Time

When you specify a recovery target time, you must account for timezone:

Explicit Timezone (Recommended)

recovery_target_time = '2024-01-15 14:30:00 America/New_York'
recovery_target_time = '2024-01-15 14:30:00-05:00'
recovery_target_time = '2024-01-15 19:30:00 UTC'

No Timezone (Dangerous)

recovery_target_time = '2024-01-15 14:30:00'

Will be interpreted according to database server's timezone
If server timezone differs from your expectation, recovery will be wrong
Never do this in production

The Timezone Trap

A recovery target of '2024-01-15 14:30:00' without a timezone will be interpreted using the recovered database's timezone setting—which may have been different before the incident or between environments. Always specify an explicit timezone.

Daylight Saving Time Complications

DST transitions create ambiguous or non-existent times:

Non-Existent Times (Spring Forward)

In US Eastern, March 10, 2024 jumps from 1:59 AM to 3:00 AM. The time "2:30 AM" doesn't exist.

-- This time doesn't exist!
recovery_target_time = '2024-03-10 02:30:00 America/New_York'

Behavior varies by database. Most will interpret as 3:30 AM or reject the timestamp.

Ambiguous Times (Fall Back)

In US Eastern, November 3, 2024 repeats 1:00 AM to 1:59 AM twice. The time "1:30 AM" occurs twice.

-- Which 1:30 AM? First or second occurrence?
recovery_target_time = '2024-11-03 01:30:00 America/New_York'

Best Practice: During DST transitions, use UTC or explicit offset:

-- Clear and unambiguous
recovery_target_time = '2024-11-03 05:30:00 UTC'
recovery_target_time = '2024-11-03 01:30:00-04:00'  -- Before fall-back
recovery_target_time = '2024-11-03 01:30:00-05:00'  -- After fall-back

Timezone Specification Methods
Method	Example	Risk Level	Recommendation
Explicit zone name	'2024-01-15 14:30:00 America/New_York'	Low	Preferred for operational use
Explicit offset	'2024-01-15 14:30:00-05:00'	Low	Preferred when offset is known
UTC	'2024-01-15 19:30:00 UTC'	Low	Preferred for automation
Implicit (no zone)	'2024-01-15 14:30:00'	High	Never use in production
With 'Z' suffix	'2024-01-15 19:30:00Z'	Low	ISO format, clear UTC indication

Verifying Timezone Interpretation

Before executing recovery, verify how your target time will be interpreted:

-- Check database timezone setting (from backup)
-- This is how timestamps without zones will be interpreted
SHOW timezone;

-- Convert your intended target to UTC to verify
SELECT '2024-01-15 14:30:00 America/New_York'::timestamptz;
-- Expected: 2024-01-15 19:30:00+00

-- If you have a UTC time, convert to local for verification
SELECT '2024-01-15 19:30:00 UTC'::timestamptz 
    AT TIME ZONE 'America/New_York';
-- Expected: 2024-01-15 14:30:00

Transaction Boundaries and Recovery Precision

Recovery targets operate at transaction boundaries—you can never recover to the middle of a transaction. Understanding this constraint is essential for setting realistic expectations.

The Atomic Transaction Constraint

Transactions are atomic: they either happen completely or not at all. This ACID property extends to PITR:

You cannot recover to a state where a transaction is "half-completed"
All operations within a transaction share the same commit timestamp
A transaction's "time" for PITR purposes is when it committed

Example Timeline:

14:30:00.000 - Transaction T1 BEGINS
14:30:00.500 - T1: INSERT row A
14:30:01.000 - T1: UPDATE row B  
14:30:01.500 - T1: DELETE row C
14:30:02.000 - Transaction T1 COMMITS  <- This is T1's time

Recovery to 14:30:01.000:
  - T1 is NOT included (didn't commit until 14:30:02)
  - Row A: not inserted
  - Row B: not updated
  - Row C: not deleted

Recovery to 14:30:02.500:
  - T1 IS included (committed at 14:30:02)
  - Row A: inserted
  - Row B: updated  
  - Row C: deleted

Long-Running Transactions Reduce Precision

Long-running transactions create large 'gaps' in recovery precision. If a transaction runs for 10 minutes, you cannot recover to any point within that 10-minute window—you get either the pre-transaction or post-transaction state.

The Commit Timestamp Resolution

Commit timestamps have limited resolution, typically microseconds. In a high-throughput system, multiple transactions may share the same millisecond.

Practical Implications:

Commit timestamps at 14:30:00.123456:
  - T1 COMMIT at 14:30:00.123456 (LSN: 100)
  - T2 COMMIT at 14:30:00.123456 (LSN: 101) <- Target transaction
  - T3 COMMIT at 14:30:00.123456 (LSN: 102)

Recovery to '14:30:00.123456' with inclusive=true:
  - Which transaction gets included?
  - Behavior may vary by database

When Precision Matters: Use LSN-based targeting:

-- Know exactly which transactions get included
recovery_target_lsn = '5/2A000800'

Identifying Transaction Boundaries

Before specifying a recovery target, investigate the transaction landscape around your intended time:

-- PostgreSQL: Find commits near a timestamp (requires pg_stat_statements/logging)
-- From your application logs or audit tables:
SELECT 
    xact_id,
    commit_time,
    LEAD(commit_time) OVER (ORDER BY commit_time) AS next_commit,
    LEAD(commit_time) OVER (ORDER BY commit_time) - commit_time AS gap_to_next
FROM transaction_audit_log
WHERE commit_time BETWEEN '2024-01-15 14:29:00' AND '2024-01-15 14:32:00'
ORDER BY commit_time;

This reveals:

The exact commit timestamps
Gaps between commits (recovery precision windows)
Whether your target falls between commits or on one

The 'Snap-To' Behavior

When you specify a recovery target time, recovery doesn't stop exactly at that time. It stops at the nearest transaction boundary before that time (with inclusive=false):

Target: 14:30:00.500

Transaction Commits:
T1 at 14:29:59.800  <- Last commit BEFORE target
T2 at 14:30:00.600  <- First commit AFTER target

Recovery stops at 14:29:59.800 (T1 commit time)
Actual recovered state: 14:29:59.800, NOT 14:30:00.500

Documenting Actual Recovery Point:

After recovery pauses, always check the actual recovered point:

SELECT pg_last_xact_replay_timestamp();
-- Returns the actual recovery point, which may differ from target

Alternative Target Modes

While time-based recovery is intuitive, other target modes offer advantages in specific scenarios.

LSN-Based Recovery

Specifying a Log Sequence Number provides the highest precision:

recovery_target_lsn = '5/2A3B7C00'

Advantages:

Exact, reproducible recovery point
No timezone ambiguity
No timestamp resolution issues
Multiple transactions at same timestamp are distinguished

Challenges:

Requires knowing the target LSN
Not intuitive for operators thinking in terms of time
Requires log inspection to determine target

Finding the Target LSN:

-- From WAL inspection tools (pg_waldump)
-- Example output:
rmgr: Transaction len (rec/tot):     34/    34, 
      tx:     123456, lsn: 5/2A3B7C00, 
      prev 5/2A3B7B80, desc: COMMIT 2024-01-15 14:30:00.123456 UTC

-- The lsn value is your target

# PostgreSQL: Dump WAL to find LSNs
pg_waldump /archive/wal/000000010000000500000002A \
    --start=5/2A000000 --end=5/2B000000 \
    | grep -E "(COMMIT|ABORT)" \
    | head -20

Transaction ID (XID) Based Recovery

Target a specific transaction by its internal identifier:

recovery_target_xid = '12847294'

Advantages:

Precise targeting of a specific transaction
Useful when you know exactly which transaction to include/exclude

Challenges:

XIDs may not be easily discoverable without special logging
XIDs wrap around (32-bit counter) and may not be unique over long periods
Not supported by all databases

When to Use:

Forensic recovery where the problematic XID is known
Precise targeting after detailed log analysis

Named Restore Point Recovery

Proactively create restore points for planned operations:

-- Before a risky operation, create a restore point
SELECT pg_create_restore_point('before_schema_migration_v5.2');

-- Then perform the risky operation
ALTER TABLE customers ADD COLUMN loyalty_tier VARCHAR(20);
-- ... more migration steps ...

-- If something goes wrong, recover to the named point
recovery_target_name = 'before_schema_migration_v5.2'

Advantages:

Clear, semantic names instead of cryptic timestamps or LSNs
Created proactively, when you have full information
Optimal precision (exactly at the restore point)

Best Practices for Restore Points:

Name descriptively: Include operation, version, date in name
Document in change management: Record restore point names in deployment tickets
Create before every significant change: Migrations, bulk updates, imports
Verify creation: Confirm the restore point was recorded:

SELECT * FROM pg_catalog.pg_replication_slots;
SELECT pg_current_wal_lsn();  -- Note LSN for reference

Recovery Target Mode Comparison
Target Mode	Precision	Usability	Best For
Time	Snap to nearest transaction	Very intuitive	Most incident responses
LSN	Exact transaction boundary	Requires investigation	Forensic recovery, precise control
XID	Exact transaction	Requires special logging	Specific transaction targeting
Named Point	Exact named position	Very clear	Planned operations

Incident Investigation Techniques

Determining the optimal recovery target requires investigation. The goal is to identify the precise moment that separates "good" state from "bad" state.

The Investigation Framework

Step 1: Establish the Symptom Timeline

Gather all available information about when symptoms appeared:

When was the problem first reported?
When did monitoring alerts fire?
When did application errors begin?

Step 2: Work Backward to Root Cause

Symptoms often appear after the causal event:

Causal Transaction: 14:30:00
  → Database corruption occurs
  → Application reads corrupted data: 14:30:15
  → User reports problem: 14:32:00
  → Support ticket created: 14:45:00

Step 3: Identify the Causal Transaction

Use all available sources:

-- Check application audit tables
SELECT *
FROM audit_log
WHERE table_name = 'affected_table'
  AND change_time BETWEEN '2024-01-15 14:25:00' AND '2024-01-15 14:35:00'
ORDER BY change_time;

-- Check for bulk operations
SELECT *
FROM pg_stat_statements
WHERE query LIKE '%affected_table%'
  AND rows > 1000;  -- Bulk operations

WAL Analysis for Target Identification

For detailed investigation, analyze the WAL directly:

Using pg_waldump (PostgreSQL):

# Find transactions affecting a specific table
pg_waldump /archive/wal/000000010000000500000002A \
    --start=5/2A000000 \
    --rmgr=Heap \
    | grep -E "(INSERT|UPDATE|DELETE)" \
    | grep "rel 16384/16385/12345"  # OID of affected table

Correlating Log Entries with LSN:

# Find the exact LSN of suspicious activity
pg_waldump /archive/wal/* \
    --start=5/29000000 --end=5/2B000000 \
    | grep -B2 -A2 "COMMIT" \
    | less

# Example output interpretation:
# rmgr: Heap   len (rec/tot): 54/54, tx: 123456, lsn: 5/2A3B7B00
#   desc: INSERT off 5
# rmgr: Transaction len: 34, tx: 123456, lsn: 5/2A3B7C00
#   desc: COMMIT 2024-01-15 14:30:00.123
#
# Transaction 123456 commits at LSN 5/2A3B7C00
# To exclude this transaction: recovery_target_lsn = '5/2A3B7B00'

target_investigation.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
#!/usr/bin/env python3
"""
PITR Target Investigation Tool
Helps identify optimal recovery targets by analyzing transaction history
"""
 
import subprocess
import re
from datetime import datetime, timedelta
from dataclasses import dataclass
from typing import Optional, List
import argparse
 
@dataclass
class TransactionCommit:
    """Represents a transaction commit from WAL analysis"""
    lsn: str
    xid: int
    timestamp: datetime
    table_oids: List[str]
    operation_count: int
    
@dataclass
class RecoveryWindow:
    """Represents a valid recovery window"""
    start_lsn: str
    end_lsn: str
    start_time: datetime
    end_time: datetime
    transactions_in_window: int
    
    @property
    def duration(self) -> timedelta:
        return self.end_time - self.start_time
 
class PITRInvestigator:
    """Investigates transaction history to identify PITR targets"""
    
    def __init__(self, archive_path: str, start_time: str, end_time: str):
        self.archive_path = archive_path
        self.start_time = datetime.fromisoformat(start_time)
        self.end_time = datetime.fromisoformat(end_time)
        self.commits: List[TransactionCommit] = []
        
    def analyze_wal_segment(self, segment_path: str) -> List[TransactionCommit]:
        """Parse a WAL segment for transaction commits"""
        result = subprocess.run(
            ['pg_waldump', segment_path, '--rmgr=Transaction'],
            capture_output=True, text=True
        )
        
        commits = []
        commit_pattern = re.compile(
            r'lsn: ([dA-F/]+).*tx:s+(d+).*COMMITs+(d{4}-d{2}-d{2}s+[d:.]+)'
        )
        
        for line in result.stdout.split('\n'):
            match = commit_pattern.search(line)
            if match:
                commit = TransactionCommit(
                    lsn=match.group(1),
                    xid=int(match.group(2)),
                    timestamp=datetime.fromisoformat(match.group(3)),
                    table_oids=[],
                    operation_count=0
                )
                commits.append(commit)
                
        return commits
    
    def find_recovery_windows(
        self, 
        bad_transaction_start: datetime,
        min_window_seconds: float = 1.0
    ) -> List[RecoveryWindow]:
        """
        Find valid recovery windows before a known bad transaction.
        Returns windows where recovery could safely stop.
        """
        windows = []
        sorted_commits = sorted(self.commits, key=lambda c: c.timestamp)
        
        for i, commit in enumerate(sorted_commits):
            if commit.timestamp >= bad_transaction_start:
                break
                
            # Find the next commit
            if i + 1 < len(sorted_commits):
                next_commit = sorted_commits[i + 1]
                gap = (next_commit.timestamp - commit.timestamp).total_seconds()
                
                if gap >= min_window_seconds:
                    window = RecoveryWindow(
                        start_lsn=commit.lsn,
                        end_lsn=next_commit.lsn,
                        start_time=commit.timestamp,
                        end_time=next_commit.timestamp,
                        transactions_in_window=0
                    )
                    windows.append(window)
                    
        return windows
    
    def recommend_target(
        self, 
        bad_transaction_time: datetime,
        safety_margin_seconds: float = 5.0
    ) -> Optional[str]:
        """
        Recommend a recovery target based on analysis.
        Returns recommended LSN target with safety margin.
        """
        target_time = bad_transaction_time - timedelta(seconds=safety_margin_seconds)
        
        # Find last commit before target time
        sorted_commits = sorted(self.commits, key=lambda c: c.timestamp)
        candidate = None
        
        for commit in sorted_commits:
            if commit.timestamp <= target_time:
                candidate = commit
            else:
                break
                
        if candidate:
            return candidate.lsn
        return None
 
def main():
    parser = argparse.ArgumentParser(
        description='PITR Target Investigation Tool'
    )
    parser.add_argument('--archive-path', required=True, 
                        help='Path to WAL archive')
    parser.add_argument('--incident-time', required=True,
                        help='Approximate incident time (ISO format)')
    parser.add_argument('--search-window-minutes', type=int, default=30,
                        help='Minutes before/after incident to analyze')
    
    args = parser.parse_args()
    
    # Initialize investigator
    incident = datetime.fromisoformat(args.incident_time)
    start = incident - timedelta(minutes=args.search_window_minutes)
    end = incident + timedelta(minutes=5)
    
    investigator = PITRInvestigator(
        args.archive_path,
        start.isoformat(),
        end.isoformat()
    )
    
    # Analyze and recommend
    # ... (analysis logic)
    
    print(f"\nRecommended recovery target:")
    print(f"  Time-based: {target_time.isoformat()}")
    print(f"  LSN-based: {recommended_lsn}")
 
if __name__ == "__main__":
    main()

Safety Margins and Best Practices

Even with thorough investigation, uncertainty remains. Safety margins protect against incomplete information.

Why Safety Margins Matter

Several factors can cause the actual problem to occur slightly earlier than estimated:

Clock Skew: Server time may differ from reference time
Transaction Duration: Long transactions commit later than they start
Replication Lag: If analyzing a replica's logs, timing may differ from primary
Investigation Uncertainty: Human error in determining incident timeline

Recommended Safety Margins

Scenario	Minimum Margin	Recommended Margin
High-confidence timing from logs	10 seconds	30 seconds
Estimated from user reports	2 minutes	5 minutes
Unknown timing, narrow corruption	5 minutes	10 minutes
Major incident, unclear scope	15 minutes	30 minutes

Iterative Recovery Strategy

When uncertain about the target, use an iterative approach:

Round 1: Recover with large safety margin

Verify corruption is absent
Note actual recovered point

Round 2 (if needed): Narrow the target

Move target closer to incident
Recover more recent data
Verify still before corruption

Round 3: Final recovery point selection

Balance data recovery vs. safety
May accept some data loss for certainty

# Iterative recovery example
# Round 1: Safe target (5 min before estimated incident)
recovery_target_time = '2024-01-15 14:25:00'
# Result: Clean, but 5 minutes of data lost

# Round 2: Closer target (2 min before)
recovery_target_time = '2024-01-15 14:28:00'
# Result: Still clean, recovered 3 more minutes

# Round 3: Final target (30 sec before)
recovery_target_time = '2024-01-15 14:29:30'
# Result: Clean, optimal recovery

Target Specification Best Practices

•Always use timezone-aware timestamps — Never specify a target without explicit timezone
•Default to exclusive mode — Unless you specifically need to include the target transaction
•Use pause action for verification — Verify state before finalizing recovery
•Include safety margin — Better to lose extra data than to recover past corruption
•Document the chosen target — Record reasoning for post-incident review
•Verify actual recovery point — Check pg_last_xact_replay_timestamp() after recovery
•Consider LSN for precision — When milliseconds matter, use LSN instead of time
•Create restore points proactively — Before risky operations, not after

Common Target Specification Errors

Learning from common mistakes helps avoid them during high-pressure recovery situations.

Error 1: Timezone Mismatch

Scenario: Administrator in New York specifies 2:30 PM thinking it's local time, but server is configured for UTC.

What happens: Recovery goes to 9:30 PM local time instead—hours past the incident.

Prevention:

-- Always verify interpretation
SELECT '2024-01-15 14:30:00'::timestamp AT TIME ZONE current_setting('TimeZone');

-- Always specify timezone
recovery_target_time = '2024-01-15 14:30:00 America/New_York'

Error 2: Confusion About Inclusive Behavior

Scenario: Administrator wants to recover up to and including a specific transaction, but uses default exclusive mode.

What happens: The target transaction is NOT included.

Prevention:

-- Be explicit about inclusive behavior
recovery_target_inclusive = true  -- If you WANT the target transaction
recovery_target_inclusive = false -- If you want to EXCLUDE target (default)

Error 3: Target Before Base Backup

Scenario: Recovery target is earlier than the base backup's checkpoint.

What happens: Recovery cannot proceed—there's no starting point.

Prevention:

# Always check backup's LSN/time before specifying target
cat /backup/backup_label
# START WAL LOCATION: 5/1A000028 (file 000000010000000500000001A)
# START TIME: 2024-01-15 00:00:00 UTC

# Target must be AFTER 2024-01-15 00:00:00 UTC

Common Target Specification Errors
Error	Symptom	Prevention
Timezone mismatch	Recovery at wrong time	Always specify explicit timezone
Inclusive confusion	Target transaction included/excluded unexpectedly	Be explicit about inclusive setting
Target before backup	Recovery fails to start	Verify target is after backup timestamp
Target in future	Recovery completes all WAL (no stop point)	Verify target is in the past
Missing WAL gap	Recovery stops at gap, not target	Verify archive continuity to target
Wrong target mode	Unexpected recovery point	Choose appropriate mode for scenario

Always Verify Before Finalizing

No matter how confident you are about the target, always use recovery_target_action = 'pause' and verify the recovered state before promoting. This is your last opportunity to detect target specification errors.

Summary: Mastering Target Specification

Precise target specification is where PITR theory meets practice. Let's consolidate the key concepts:

Key Takeaways

•Targets operate on transaction boundaries — Recovery precision is limited by transaction commit granularity, not wall-clock time.
•Timezone handling is critical — Always specify explicit timezones; implicit interpretation errors cause major recovery mistakes.
•Multiple target modes exist — Time is intuitive but LSN offers precision; named restore points provide clarity for planned operations.
•Investigation determines the target — Systematic analysis of logs, WAL, and audit data reveals the optimal recovery point.
•Safety margins protect against uncertainty — When in doubt, err toward recovering less data rather than including corruption.
•Verification before finalization is essential — Always pause at target and verify before promoting out of recovery.
•Common errors are preventable — Timezone mismatches, inclusive confusion, and pre-backup targets are avoided with careful specification.

What's next:

Having mastered target specification, we'll examine PITR Limitations. The final page explores the inherent constraints of Point-in-Time Recovery, including RPO boundaries, storage requirements, performance impacts, and scenarios where PITR alone is insufficient—along with complementary strategies that address these limitations.

Page Complete

You now understand the nuances of recovery target specification. You've learned about timezone handling, transaction boundaries, alternative target modes, investigation techniques, and safety margins. Next, we'll explore the inherent limitations of PITR and how to address them.

4 / 5

Loading learning content...

Database Management SystemsBackup Strategies

Point-in-Time Recovery

LevelAdvanced

Duration75 mins

TopicBackup Strategies

4 / 5

Target Time Specification: Precision in Point-in-Time Recovery

The Difference a Second Makes

Your recovery target must be between 14:31:47.119 (include) and 14:31:47.234 (exclude).

That's a window of 115 milliseconds.

What You Will Learn

Recovery Target Fundamentals

A recovery target defines the point in the database's transaction history where replay should stop. Understanding the fundamental mechanics ensures accurate target specification.

How Target Matching Works

When you specify a recovery target, the database recovery process:

Starts from the base backup checkpoint
Replays each WAL record in sequence
For each transaction commit record, compares against the target
Stops when target condition is met (or after, depending on inclusive setting)

The critical insight: targets are evaluated against transaction commit records, not individual operations.

This means:

A long-running transaction with many operations is either fully included or fully excluded
The "time" of a transaction is its commit time, not when individual statements executed
Rolled-back transactions leave no commit record and are never "visible" at any recovery target

The Inclusive/Exclusive Decision

Most databases offer control over whether the target transaction is included:

Exclusive (Default in PostgreSQL)

recovery_target_inclusive = false

Recovery stops before the target transaction
The transaction whose commit matches the target is NOT applied
Safe default: ensures the "bad" transaction isn't included

Inclusive

recovery_target_inclusive = true

Recovery stops after applying the target transaction
The transaction whose commit matches the target IS applied
Use when you want to include a specific target transaction

Example:

Target: LSN 5/2A000400

WAL Stream:
LSN 5/29FFF800 - Transaction T1 COMMIT
LSN 5/2A000400 - Transaction T2 COMMIT  <- Target LSN
LSN 5/2A000800 - Transaction T3 COMMIT

With inclusive=false: Recovered state includes T1, excludes T2, T3
With inclusive=true:  Recovered state includes T1, T2, excludes T3

Default to Exclusive

Target Action Options

What should happen when the target is reached?

pause (Recommended for Production)

recovery_target_action = 'pause'

Recovery pauses with database in read-only mode
Allows verification before finalizing
Can adjust target and resume if needed
Use this in production environments

promote

recovery_target_action = 'promote'

Automatically promotes to read-write mode
No verification opportunity
Use only when automation handles verification

shutdown

recovery_target_action = 'shutdown'

Shuts down database cleanly at target
Requires manual restart
Useful for scripted recovery workflows

Timezone Considerations

Timezone handling in recovery targets is a frequent source of errors. Understanding how timestamps are interpreted and stored is essential for accurate recovery.

How Timestamps Are Stored in WAL

Specifying Recovery Target Time

When you specify a recovery target time, you must account for timezone:

Explicit Timezone (Recommended)

recovery_target_time = '2024-01-15 14:30:00 America/New_York'
recovery_target_time = '2024-01-15 14:30:00-05:00'
recovery_target_time = '2024-01-15 19:30:00 UTC'

No Timezone (Dangerous)

recovery_target_time = '2024-01-15 14:30:00'

Will be interpreted according to database server's timezone
If server timezone differs from your expectation, recovery will be wrong
Never do this in production

The Timezone Trap

Daylight Saving Time Complications

DST transitions create ambiguous or non-existent times:

Non-Existent Times (Spring Forward)

In US Eastern, March 10, 2024 jumps from 1:59 AM to 3:00 AM. The time "2:30 AM" doesn't exist.

-- This time doesn't exist!
recovery_target_time = '2024-03-10 02:30:00 America/New_York'

Behavior varies by database. Most will interpret as 3:30 AM or reject the timestamp.

Ambiguous Times (Fall Back)

In US Eastern, November 3, 2024 repeats 1:00 AM to 1:59 AM twice. The time "1:30 AM" occurs twice.

-- Which 1:30 AM? First or second occurrence?
recovery_target_time = '2024-11-03 01:30:00 America/New_York'

Best Practice: During DST transitions, use UTC or explicit offset:

-- Clear and unambiguous
recovery_target_time = '2024-11-03 05:30:00 UTC'
recovery_target_time = '2024-11-03 01:30:00-04:00'  -- Before fall-back
recovery_target_time = '2024-11-03 01:30:00-05:00'  -- After fall-back

Timezone Specification Methods
Method	Example	Risk Level	Recommendation
Explicit zone name	'2024-01-15 14:30:00 America/New_York'	Low	Preferred for operational use
Explicit offset	'2024-01-15 14:30:00-05:00'	Low	Preferred when offset is known
UTC	'2024-01-15 19:30:00 UTC'	Low	Preferred for automation
Implicit (no zone)	'2024-01-15 14:30:00'	High	Never use in production
With 'Z' suffix	'2024-01-15 19:30:00Z'	Low	ISO format, clear UTC indication

Verifying Timezone Interpretation

Before executing recovery, verify how your target time will be interpreted:

-- Check database timezone setting (from backup)
-- This is how timestamps without zones will be interpreted
SHOW timezone;

-- Convert your intended target to UTC to verify
SELECT '2024-01-15 14:30:00 America/New_York'::timestamptz;
-- Expected: 2024-01-15 19:30:00+00

-- If you have a UTC time, convert to local for verification
SELECT '2024-01-15 19:30:00 UTC'::timestamptz 
    AT TIME ZONE 'America/New_York';
-- Expected: 2024-01-15 14:30:00

Transaction Boundaries and Recovery Precision

Recovery targets operate at transaction boundaries—you can never recover to the middle of a transaction. Understanding this constraint is essential for setting realistic expectations.

The Atomic Transaction Constraint

Transactions are atomic: they either happen completely or not at all. This ACID property extends to PITR:

You cannot recover to a state where a transaction is "half-completed"
All operations within a transaction share the same commit timestamp
A transaction's "time" for PITR purposes is when it committed

Example Timeline:

14:30:00.000 - Transaction T1 BEGINS
14:30:00.500 - T1: INSERT row A
14:30:01.000 - T1: UPDATE row B  
14:30:01.500 - T1: DELETE row C
14:30:02.000 - Transaction T1 COMMITS  <- This is T1's time

Recovery to 14:30:01.000:
  - T1 is NOT included (didn't commit until 14:30:02)
  - Row A: not inserted
  - Row B: not updated
  - Row C: not deleted

Recovery to 14:30:02.500:
  - T1 IS included (committed at 14:30:02)
  - Row A: inserted
  - Row B: updated  
  - Row C: deleted

Long-Running Transactions Reduce Precision

The Commit Timestamp Resolution

Commit timestamps have limited resolution, typically microseconds. In a high-throughput system, multiple transactions may share the same millisecond.

Practical Implications:

Commit timestamps at 14:30:00.123456:
  - T1 COMMIT at 14:30:00.123456 (LSN: 100)
  - T2 COMMIT at 14:30:00.123456 (LSN: 101) <- Target transaction
  - T3 COMMIT at 14:30:00.123456 (LSN: 102)

Recovery to '14:30:00.123456' with inclusive=true:
  - Which transaction gets included?
  - Behavior may vary by database

When Precision Matters: Use LSN-based targeting:

-- Know exactly which transactions get included
recovery_target_lsn = '5/2A000800'

Identifying Transaction Boundaries

Before specifying a recovery target, investigate the transaction landscape around your intended time:

-- PostgreSQL: Find commits near a timestamp (requires pg_stat_statements/logging)
-- From your application logs or audit tables:
SELECT 
    xact_id,
    commit_time,
    LEAD(commit_time) OVER (ORDER BY commit_time) AS next_commit,
    LEAD(commit_time) OVER (ORDER BY commit_time) - commit_time AS gap_to_next
FROM transaction_audit_log
WHERE commit_time BETWEEN '2024-01-15 14:29:00' AND '2024-01-15 14:32:00'
ORDER BY commit_time;

This reveals:

The exact commit timestamps
Gaps between commits (recovery precision windows)
Whether your target falls between commits or on one

The 'Snap-To' Behavior

When you specify a recovery target time, recovery doesn't stop exactly at that time. It stops at the nearest transaction boundary before that time (with inclusive=false):

Target: 14:30:00.500

Transaction Commits:
T1 at 14:29:59.800  <- Last commit BEFORE target
T2 at 14:30:00.600  <- First commit AFTER target

Recovery stops at 14:29:59.800 (T1 commit time)
Actual recovered state: 14:29:59.800, NOT 14:30:00.500

Documenting Actual Recovery Point:

After recovery pauses, always check the actual recovered point:

SELECT pg_last_xact_replay_timestamp();
-- Returns the actual recovery point, which may differ from target

Alternative Target Modes

While time-based recovery is intuitive, other target modes offer advantages in specific scenarios.

LSN-Based Recovery

Specifying a Log Sequence Number provides the highest precision:

recovery_target_lsn = '5/2A3B7C00'

Advantages:

Exact, reproducible recovery point
No timezone ambiguity
No timestamp resolution issues
Multiple transactions at same timestamp are distinguished

Challenges:

Requires knowing the target LSN
Not intuitive for operators thinking in terms of time
Requires log inspection to determine target

Finding the Target LSN:

-- From WAL inspection tools (pg_waldump)
-- Example output:
rmgr: Transaction len (rec/tot):     34/    34, 
      tx:     123456, lsn: 5/2A3B7C00, 
      prev 5/2A3B7B80, desc: COMMIT 2024-01-15 14:30:00.123456 UTC

-- The lsn value is your target

# PostgreSQL: Dump WAL to find LSNs
pg_waldump /archive/wal/000000010000000500000002A \
    --start=5/2A000000 --end=5/2B000000 \
    | grep -E "(COMMIT|ABORT)" \
    | head -20

Transaction ID (XID) Based Recovery

Target a specific transaction by its internal identifier:

recovery_target_xid = '12847294'

Advantages:

Precise targeting of a specific transaction
Useful when you know exactly which transaction to include/exclude

Challenges:

XIDs may not be easily discoverable without special logging
XIDs wrap around (32-bit counter) and may not be unique over long periods
Not supported by all databases

When to Use:

Forensic recovery where the problematic XID is known
Precise targeting after detailed log analysis

Named Restore Point Recovery

Proactively create restore points for planned operations:

-- Before a risky operation, create a restore point
SELECT pg_create_restore_point('before_schema_migration_v5.2');

-- Then perform the risky operation
ALTER TABLE customers ADD COLUMN loyalty_tier VARCHAR(20);
-- ... more migration steps ...

-- If something goes wrong, recover to the named point
recovery_target_name = 'before_schema_migration_v5.2'

Advantages:

Clear, semantic names instead of cryptic timestamps or LSNs
Created proactively, when you have full information
Optimal precision (exactly at the restore point)

Best Practices for Restore Points:

Name descriptively: Include operation, version, date in name
Document in change management: Record restore point names in deployment tickets
Create before every significant change: Migrations, bulk updates, imports
Verify creation: Confirm the restore point was recorded:

SELECT * FROM pg_catalog.pg_replication_slots;
SELECT pg_current_wal_lsn();  -- Note LSN for reference

Recovery Target Mode Comparison
Target Mode	Precision	Usability	Best For
Time	Snap to nearest transaction	Very intuitive	Most incident responses
LSN	Exact transaction boundary	Requires investigation	Forensic recovery, precise control
XID	Exact transaction	Requires special logging	Specific transaction targeting
Named Point	Exact named position	Very clear	Planned operations

Incident Investigation Techniques

Determining the optimal recovery target requires investigation. The goal is to identify the precise moment that separates "good" state from "bad" state.

The Investigation Framework

Step 1: Establish the Symptom Timeline

Gather all available information about when symptoms appeared:

When was the problem first reported?
When did monitoring alerts fire?
When did application errors begin?

Step 2: Work Backward to Root Cause

Symptoms often appear after the causal event:

Causal Transaction: 14:30:00
  → Database corruption occurs
  → Application reads corrupted data: 14:30:15
  → User reports problem: 14:32:00
  → Support ticket created: 14:45:00

Step 3: Identify the Causal Transaction

Use all available sources:

-- Check application audit tables
SELECT *
FROM audit_log
WHERE table_name = 'affected_table'
  AND change_time BETWEEN '2024-01-15 14:25:00' AND '2024-01-15 14:35:00'
ORDER BY change_time;

-- Check for bulk operations
SELECT *
FROM pg_stat_statements
WHERE query LIKE '%affected_table%'
  AND rows > 1000;  -- Bulk operations

WAL Analysis for Target Identification

For detailed investigation, analyze the WAL directly:

Using pg_waldump (PostgreSQL):

# Find transactions affecting a specific table
pg_waldump /archive/wal/000000010000000500000002A \
    --start=5/2A000000 \
    --rmgr=Heap \
    | grep -E "(INSERT|UPDATE|DELETE)" \
    | grep "rel 16384/16385/12345"  # OID of affected table

Correlating Log Entries with LSN:

# Find the exact LSN of suspicious activity
pg_waldump /archive/wal/* \
    --start=5/29000000 --end=5/2B000000 \
    | grep -B2 -A2 "COMMIT" \
    | less

# Example output interpretation:
# rmgr: Heap   len (rec/tot): 54/54, tx: 123456, lsn: 5/2A3B7B00
#   desc: INSERT off 5
# rmgr: Transaction len: 34, tx: 123456, lsn: 5/2A3B7C00
#   desc: COMMIT 2024-01-15 14:30:00.123
#
# Transaction 123456 commits at LSN 5/2A3B7C00
# To exclude this transaction: recovery_target_lsn = '5/2A3B7B00'

target_investigation.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
#!/usr/bin/env python3
"""
PITR Target Investigation Tool
Helps identify optimal recovery targets by analyzing transaction history
"""
 
import subprocess
import re
from datetime import datetime, timedelta
from dataclasses import dataclass
from typing import Optional, List
import argparse
 
@dataclass
class TransactionCommit:
    """Represents a transaction commit from WAL analysis"""
    lsn: str
    xid: int
    timestamp: datetime
    table_oids: List[str]
    operation_count: int
    
@dataclass
class RecoveryWindow:
    """Represents a valid recovery window"""
    start_lsn: str
    end_lsn: str
    start_time: datetime
    end_time: datetime
    transactions_in_window: int
    
    @property
    def duration(self) -> timedelta:
        return self.end_time - self.start_time
 
class PITRInvestigator:
    """Investigates transaction history to identify PITR targets"""
    
    def __init__(self, archive_path: str, start_time: str, end_time: str):
        self.archive_path = archive_path
        self.start_time = datetime.fromisoformat(start_time)
        self.end_time = datetime.fromisoformat(end_time)
        self.commits: List[TransactionCommit] = []
        
    def analyze_wal_segment(self, segment_path: str) -> List[TransactionCommit]:
        """Parse a WAL segment for transaction commits"""
        result = subprocess.run(
            ['pg_waldump', segment_path, '--rmgr=Transaction'],
            capture_output=True, text=True
        )
        
        commits = []
        commit_pattern = re.compile(
            r'lsn: ([dA-F/]+).*tx:s+(d+).*COMMITs+(d{4}-d{2}-d{2}s+[d:.]+)'
        )
        
        for line in result.stdout.split('\n'):
            match = commit_pattern.search(line)
            if match:
                commit = TransactionCommit(
                    lsn=match.group(1),
                    xid=int(match.group(2)),
                    timestamp=datetime.fromisoformat(match.group(3)),
                    table_oids=[],
                    operation_count=0
                )
                commits.append(commit)
                
        return commits
    
    def find_recovery_windows(
        self, 
        bad_transaction_start: datetime,
        min_window_seconds: float = 1.0
    ) -> List[RecoveryWindow]:
        """
        Find valid recovery windows before a known bad transaction.
        Returns windows where recovery could safely stop.
        """
        windows = []
        sorted_commits = sorted(self.commits, key=lambda c: c.timestamp)
        
        for i, commit in enumerate(sorted_commits):
            if commit.timestamp >= bad_transaction_start:
                break
                
            # Find the next commit
            if i + 1 < len(sorted_commits):
                next_commit = sorted_commits[i + 1]
                gap = (next_commit.timestamp - commit.timestamp).total_seconds()
                
                if gap >= min_window_seconds:
                    window = RecoveryWindow(
                        start_lsn=commit.lsn,
                        end_lsn=next_commit.lsn,
                        start_time=commit.timestamp,
                        end_time=next_commit.timestamp,
                        transactions_in_window=0
                    )
                    windows.append(window)
                    
        return windows
    
    def recommend_target(
        self, 
        bad_transaction_time: datetime,
        safety_margin_seconds: float = 5.0
    ) -> Optional[str]:
        """
        Recommend a recovery target based on analysis.
        Returns recommended LSN target with safety margin.
        """
        target_time = bad_transaction_time - timedelta(seconds=safety_margin_seconds)
        
        # Find last commit before target time
        sorted_commits = sorted(self.commits, key=lambda c: c.timestamp)
        candidate = None
        
        for commit in sorted_commits:
            if commit.timestamp <= target_time:
                candidate = commit
            else:
                break
                
        if candidate:
            return candidate.lsn
        return None
 
def main():
    parser = argparse.ArgumentParser(
        description='PITR Target Investigation Tool'
    )
    parser.add_argument('--archive-path', required=True, 
                        help='Path to WAL archive')
    parser.add_argument('--incident-time', required=True,
                        help='Approximate incident time (ISO format)')
    parser.add_argument('--search-window-minutes', type=int, default=30,
                        help='Minutes before/after incident to analyze')
    
    args = parser.parse_args()
    
    # Initialize investigator
    incident = datetime.fromisoformat(args.incident_time)
    start = incident - timedelta(minutes=args.search_window_minutes)
    end = incident + timedelta(minutes=5)
    
    investigator = PITRInvestigator(
        args.archive_path,
        start.isoformat(),
        end.isoformat()
    )
    
    # Analyze and recommend
    # ... (analysis logic)
    
    print(f"\nRecommended recovery target:")
    print(f"  Time-based: {target_time.isoformat()}")
    print(f"  LSN-based: {recommended_lsn}")
 
if __name__ == "__main__":
    main()

Safety Margins and Best Practices

Even with thorough investigation, uncertainty remains. Safety margins protect against incomplete information.

Why Safety Margins Matter

Several factors can cause the actual problem to occur slightly earlier than estimated:

Clock Skew: Server time may differ from reference time
Transaction Duration: Long transactions commit later than they start
Replication Lag: If analyzing a replica's logs, timing may differ from primary
Investigation Uncertainty: Human error in determining incident timeline

Recommended Safety Margins

Scenario	Minimum Margin	Recommended Margin
High-confidence timing from logs	10 seconds	30 seconds
Estimated from user reports	2 minutes	5 minutes
Unknown timing, narrow corruption	5 minutes	10 minutes
Major incident, unclear scope	15 minutes	30 minutes

Iterative Recovery Strategy

When uncertain about the target, use an iterative approach:

Round 1: Recover with large safety margin

Verify corruption is absent
Note actual recovered point

Round 2 (if needed): Narrow the target

Move target closer to incident
Recover more recent data
Verify still before corruption

Round 3: Final recovery point selection

Balance data recovery vs. safety
May accept some data loss for certainty

# Iterative recovery example
# Round 1: Safe target (5 min before estimated incident)
recovery_target_time = '2024-01-15 14:25:00'
# Result: Clean, but 5 minutes of data lost

# Round 2: Closer target (2 min before)
recovery_target_time = '2024-01-15 14:28:00'
# Result: Still clean, recovered 3 more minutes

# Round 3: Final target (30 sec before)
recovery_target_time = '2024-01-15 14:29:30'
# Result: Clean, optimal recovery

Target Specification Best Practices

•Always use timezone-aware timestamps — Never specify a target without explicit timezone
•Default to exclusive mode — Unless you specifically need to include the target transaction
•Use pause action for verification — Verify state before finalizing recovery
•Include safety margin — Better to lose extra data than to recover past corruption
•Document the chosen target — Record reasoning for post-incident review
•Verify actual recovery point — Check pg_last_xact_replay_timestamp() after recovery
•Consider LSN for precision — When milliseconds matter, use LSN instead of time
•Create restore points proactively — Before risky operations, not after

Common Target Specification Errors

Learning from common mistakes helps avoid them during high-pressure recovery situations.

Error 1: Timezone Mismatch

Scenario: Administrator in New York specifies 2:30 PM thinking it's local time, but server is configured for UTC.

What happens: Recovery goes to 9:30 PM local time instead—hours past the incident.

Prevention:

-- Always verify interpretation
SELECT '2024-01-15 14:30:00'::timestamp AT TIME ZONE current_setting('TimeZone');

-- Always specify timezone
recovery_target_time = '2024-01-15 14:30:00 America/New_York'

Error 2: Confusion About Inclusive Behavior

Scenario: Administrator wants to recover up to and including a specific transaction, but uses default exclusive mode.

What happens: The target transaction is NOT included.

Prevention:

-- Be explicit about inclusive behavior
recovery_target_inclusive = true  -- If you WANT the target transaction
recovery_target_inclusive = false -- If you want to EXCLUDE target (default)

Error 3: Target Before Base Backup

Scenario: Recovery target is earlier than the base backup's checkpoint.

What happens: Recovery cannot proceed—there's no starting point.

Prevention:

# Always check backup's LSN/time before specifying target
cat /backup/backup_label
# START WAL LOCATION: 5/1A000028 (file 000000010000000500000001A)
# START TIME: 2024-01-15 00:00:00 UTC

# Target must be AFTER 2024-01-15 00:00:00 UTC

Common Target Specification Errors
Error	Symptom	Prevention
Timezone mismatch	Recovery at wrong time	Always specify explicit timezone
Inclusive confusion	Target transaction included/excluded unexpectedly	Be explicit about inclusive setting
Target before backup	Recovery fails to start	Verify target is after backup timestamp
Target in future	Recovery completes all WAL (no stop point)	Verify target is in the past
Missing WAL gap	Recovery stops at gap, not target	Verify archive continuity to target
Wrong target mode	Unexpected recovery point	Choose appropriate mode for scenario

Always Verify Before Finalizing

Summary: Mastering Target Specification

Precise target specification is where PITR theory meets practice. Let's consolidate the key concepts:

Key Takeaways

•Targets operate on transaction boundaries — Recovery precision is limited by transaction commit granularity, not wall-clock time.
•Timezone handling is critical — Always specify explicit timezones; implicit interpretation errors cause major recovery mistakes.
•Multiple target modes exist — Time is intuitive but LSN offers precision; named restore points provide clarity for planned operations.
•Investigation determines the target — Systematic analysis of logs, WAL, and audit data reveals the optimal recovery point.
•Safety margins protect against uncertainty — When in doubt, err toward recovering less data rather than including corruption.
•Verification before finalization is essential — Always pause at target and verify before promoting out of recovery.
•Common errors are preventable — Timezone mismatches, inclusive confusion, and pre-backup targets are avoided with careful specification.

What's next:

Page Complete

4 / 5