Every backup you create raises a fundamental question: How long should you keep it?
Retention policy is where data protection meets economic reality. Keep backups forever, and storage costs spiral into unsustainability. Delete backups too aggressively, and you lose the ability to recover from issues discovered weeks or months after they occurred.
A well-designed retention policy provides the Goldilocks balance—keeping backups long enough to meet recovery needs while managing costs, compliance requirements, and operational complexity. The wrong retention policy either exposes the organization to unrecoverable data loss or wastes millions on unnecessary storage.
By the end of this page, you will understand how to design retention policies that satisfy regulatory requirements, support various recovery scenarios, optimize storage utilization, and automate backup lifecycle management. You will learn from enterprise retention strategies managing decades of data across hybrid storage tiers.
A retention policy defines the lifecycle of backup data from creation through eventual deletion. It answers critical questions: which backups to keep, for how long, on which storage tier, and when they may be safely deleted.
Retention vs. Recovery Window:
These terms are often confused but represent different concepts: the retention period is how long backup files are kept, while the recovery window is the span of time to which you can actually restore.
A retention period of 30 days doesn't guarantee a 30-day recovery window if incremental chains are broken or base backups are missing.
| Component | Definition | Example | Impact |
|---|---|---|---|
| Short-term Retention | High-granularity backups for recent recovery | 7-14 days of daily backups | Fast recovery from recent issues, higher storage cost per day |
| Medium-term Retention | Weekly/monthly points for broader recovery | 4-12 weekly backups | Balance between granularity and storage efficiency |
| Long-term Retention | Archive backups for compliance or rare recovery | 7-10 years of annual backups | Lower storage cost, slower access, compliance-driven |
| Transaction Log Retention | Continuous log backups for point-in-time recovery | 24-72 hours of archived logs | Enables granular recovery between backup points |
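The transaction-log row above is what enables recovery between backup points. A small sketch (a hypothetical helper with simplified clock handling) of how log retention bounds the point-in-time recovery window:

```python
from datetime import datetime, timedelta

def pitr_window(last_base_backup: datetime,
                log_retention_hours: int,
                now: datetime) -> tuple[datetime, datetime]:
    """Span of timestamps recoverable via point-in-time recovery.

    Archived logs replay forward from a base backup, but only as far
    back as the oldest retained log, so the window starts at the
    later of the two.
    """
    oldest_log = now - timedelta(hours=log_retention_hours)
    return max(last_base_backup, oldest_log), now
```

With 72 hours of archived logs and a day-old base backup, the window reaches back to the base; cut log retention to 12 hours and the window shrinks accordingly, regardless of how long the base backup is kept.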
The retention hierarchy:
Modern retention policies implement tiered retention that mirrors the Grandfather-Father-Son (GFS) backup schedule:
┌─────────────────────────────────────────────────────────┐
│ RETENTION PYRAMID │
├─────────────────────────────────────────────────────────┤
│ │
│ ▲ Annual (7+ years) - Compliance archives │
│ ▲▲▲ Monthly (12-24 months) - Long-term recovery │
│ ▲▲▲▲▲ Weekly (4-8 weeks) - Medium-term recovery │
│ ▲▲▲▲▲▲▲ Daily (7-14 days) - Operational recovery │
│▲▲▲▲▲▲▲▲▲ Hourly/Continuous - Immediate recovery │
│ │
│ Granularity ↑ Storage Cost ↓ │
└─────────────────────────────────────────────────────────┘
As backups age, retention policy typically prunes granular backups while preserving periodic checkpoints. A 90-day-old backup point might exist, even though day-to-day granularity from that period was deleted weeks ago.
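This thinning behaviour can be sketched as a keep/prune decision per backup (illustrative thresholds and checkpoint rules, not any specific product's defaults):

```python
from datetime import date

def should_keep(backup_date: date, today: date,
                daily_days: int = 14,
                weekly_weeks: int = 8,
                monthly_months: int = 12) -> bool:
    """Decide whether a backup survives tiered pruning.

    Keep everything for `daily_days`; beyond that keep only Sunday
    backups for `weekly_weeks` weeks; beyond that keep only
    first-of-month backups for roughly `monthly_months` months.
    """
    age = (today - backup_date).days
    if age <= daily_days:
        return True
    if age <= weekly_weeks * 7:
        return backup_date.weekday() == 6   # Sunday checkpoint survives
    if age <= monthly_months * 30:
        return backup_date.day == 1         # monthly checkpoint survives
    return False
```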
With incremental backup strategies, deleting a backup may break the recovery chain for dependent backups. Retention policies must understand backup dependencies—don't delete a full backup while its incrementals are still retained, or those incrementals become useless.
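A retention job can guard against this with a dependency check before every delete. A minimal sketch, assuming hypothetical catalog fields `id` and `parent_id`:

```python
def safe_to_delete(backup_id: str, catalog: list[dict]) -> bool:
    """A backup may be deleted only if no retained backup still
    names it as its parent (i.e., no incremental depends on it)."""
    return not any(b.get('parent_id') == backup_id for b in catalog)
```

A full backup with live incrementals stays blocked until those incrementals themselves expire.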
For many organizations, retention policy is not optional—it's legally mandated. Regulatory frameworks specify minimum retention periods, and failure to comply can result in severe penalties, legal exposure, and reputational damage.
Understanding regulatory requirements:
Regulations typically specify minimum retention periods, the data types covered, accessibility requirements, and audit obligations:
| Regulation | Industry | Retention Requirement | Key Data Types |
|---|---|---|---|
| SOX (Sarbanes-Oxley) | Public Companies (US) | 7 years | Financial records, audit trails, email communications |
| HIPAA | Healthcare (US) | 6 years from creation/last effective date | Patient records, PHI, access logs |
| GDPR | Any handling EU data | As long as necessary (minimize) | Personal data (with deletion rights) |
| PCI-DSS | Payment Card Industry | 1 year minimum | Cardholder data, transaction logs, access logs |
| SEC Rule 17a-4 | Broker-Dealers | 6 years (first 2 readily accessible) | Trading records, communications |
| FINRA | Financial Services | 3-6 years depending on record type | Customer records, trading data, communications |
| IRS Requirements | All US businesses | 7 years | Tax records, financial documentation |
| Basel III | Banking | 5 years minimum | Risk data, trading records, audit trails |
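When several frameworks cover the same records, the strictest (longest) minimum governs. A minimal sketch using an illustrative subset of the table above:

```python
# Illustrative minimum retention periods in years, drawn from the table above
REGULATORY_MINIMUM_YEARS = {
    'sox': 7,
    'hipaa': 6,
    'pci_dss': 1,
    'irs': 7,
}

def required_retention_years(applicable: list[str]) -> int:
    """When multiple regulations apply, the longest minimum wins."""
    return max(REGULATORY_MINIMUM_YEARS[reg] for reg in applicable)
```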
GDPR creates interesting tension with traditional retention policies. While other regulations mandate minimum retention, GDPR mandates data minimization—you must delete personal data when no longer necessary. Organizations must balance deletion requirements against backup retention, potentially implementing backup-level personal data removal or accepting that some backup data falls outside deletion requests.
Legal hold and litigation requirements:
Beyond standard retention, organizations must implement legal hold capabilities—the ability to preserve data indefinitely when litigation is anticipated or underway, even if normal retention would delete it.
Legal hold requirements include:
Implementation considerations:
-- Example: Tracking backup holds for legal compliance
CREATE TABLE backup_legal_holds (
hold_id UUID PRIMARY KEY,
matter_name VARCHAR(255) NOT NULL,
hold_date TIMESTAMP NOT NULL DEFAULT NOW(),
release_date TIMESTAMP,
created_by VARCHAR(100) NOT NULL,
scope_description TEXT,
status VARCHAR(50) DEFAULT 'active'
);
CREATE TABLE backup_hold_items (
hold_id UUID REFERENCES backup_legal_holds(hold_id),
backup_id UUID REFERENCES backup_catalog(backup_id),
original_expiry_date TIMESTAMP,
added_date TIMESTAMP DEFAULT NOW(),
PRIMARY KEY (hold_id, backup_id)
);
-- Modify retention job to respect holds, for example:
-- DELETE FROM backup_catalog
-- WHERE expiry_date < NOW()
--   AND backup_id NOT IN (
--       SELECT bhi.backup_id
--       FROM backup_hold_items bhi
--       JOIN backup_legal_holds lh ON lh.hold_id = bhi.hold_id
--       WHERE lh.status = 'active'
--   );
Retention policy and storage tiering are inseparable. As backups age, their access requirements change, enabling migration to lower-cost storage tiers without compromising recovery capability.
The storage tier hierarchy:
| Storage Tier | Cost (per TB/month) | Retrieval Time | Use Case | Durability |
|---|---|---|---|---|
| Enterprise SSD | $200-400 | Milliseconds | Active recovery, recent backups | 99.999% (RAID) |
| Standard HDD Array | $30-80 | Seconds | Operational backups (2-4 weeks) | 99.99% (RAID) |
| S3 Standard | $23 | Milliseconds | Cloud backup, cross-region DR | 99.999999999% |
| S3 Infrequent Access | $12.50 | Milliseconds | Monthly backups, 30-90 days | 99.999999999% |
| S3 Glacier | $4 | 1-5 minutes | Quarterly/annual archives | 99.999999999% |
| S3 Glacier Deep Archive | $1 | 12-48 hours | Compliance, 7+ year retention | 99.999999999% |
| LTO-9 Tape | $5-10 + handling | Minutes to days | Air-gapped archive, disaster recovery | 99.99%+ (proper storage) |
Lifecycle policy automation:
Modern storage systems automate tier migration based on age or access patterns:
// AWS S3 Lifecycle Policy Example (expiration at 2,555 days ≈ 7 years)
{
"Rules": [
{
"ID": "BackupRetentionPolicy",
"Status": "Enabled",
"Filter": { "Prefix": "backups/" },
"Transitions": [
{
"Days": 30,
"StorageClass": "STANDARD_IA"
},
{
"Days": 90,
"StorageClass": "GLACIER"
},
{
"Days": 365,
"StorageClass": "DEEP_ARCHIVE"
}
],
"Expiration": {
"Days": 2555 // ~7 years
}
}
]
}
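The same rule can be built programmatically. The sketch below constructs the configuration dict in the shape boto3's `put_bucket_lifecycle_configuration` expects; the bucket name in the commented call is an assumption for illustration:

```python
def lifecycle_policy(prefix: str,
                     transitions: list[tuple[int, str]],
                     expire_days: int) -> dict:
    """Build an S3 lifecycle configuration mirroring the JSON above."""
    return {
        "Rules": [{
            "ID": "BackupRetentionPolicy",
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},
            "Transitions": [
                {"Days": days, "StorageClass": storage_class}
                for days, storage_class in transitions
            ],
            "Expiration": {"Days": expire_days},
        }]
    }

policy = lifecycle_policy(
    "backups/",
    [(30, "STANDARD_IA"), (90, "GLACIER"), (365, "DEEP_ARCHIVE")],
    expire_days=2555,  # ~7 years
)
# Applied with (bucket name hypothetical):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-backup-bucket", LifecycleConfiguration=policy)
```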
Cost optimization calculations:
Consider a 10 TB database with daily incrementals (~500 GB/day):
Without tiering (all hot storage @ $200/TB/month):
- Daily incrementals (14 days): 7 TB → $1,400/month
- Weekly fulls (4 weeks × 10 TB): 40 TB → $8,000/month
- Monthly fulls (12 months × 10 TB): 120 TB → $24,000/month
- Total: ~$33,400/month
With tiering:
- Daily incrementals (hot, 14 days): 7 TB @ $200 → $1,400/month
- Weekly fulls (warm, 4 weeks): 40 TB @ $50 → $2,000/month
- Monthly fulls (cool, 12 months): 120 TB @ $12.50 → $1,500/month
- Total: ~$4,900/month (~85% savings)
Archive storage is cheap to store but expensive to retrieve. S3 Glacier Deep Archive charges $0.02 per GB retrieved plus per-request fees. Retrieving a 10 TB backup could cost $200+ plus hours of wait time. Factor retrieval costs into RTO calculations.
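The storage and retrieval arithmetic above folds into a small cost model (rates are the illustrative ones from the tier table—7 TB of daily incrementals plus four and twelve 10 TB fulls—and real pricing varies by provider and region):

```python
def monthly_storage_cost(tiers: list[tuple[float, float]]) -> float:
    """Sum of capacity_tb * price_per_tb_month across storage tiers."""
    return sum(tb * price for tb, price in tiers)

def retrieval_cost(size_gb: float, per_gb: float = 0.02) -> float:
    """Archive retrieval at a flat per-GB rate (request fees omitted)."""
    return size_gb * per_gb

# Everything on hot storage vs. the tiered layout described above
hot_only = monthly_storage_cost([(7, 200), (40, 200), (120, 200)])
tiered = monthly_storage_cost([(7, 200), (40, 50), (120, 12.50)])
savings = 1 - tiered / hot_only

# Restoring a 10 TB backup from deep archive at $0.02/GB
deep_archive_restore = retrieval_cost(10 * 1024)
```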
Effective retention policy design balances multiple, often competing requirements. The process requires input from IT, legal, compliance, and business stakeholders.
The retention policy design framework:
Common retention policy patterns:
| Scenario | Short-term (Days) | Medium-term (Weeks) | Long-term (Months) | Archive (Years) |
|---|---|---|---|---|
| E-commerce | 14 daily | 4 weekly | 12 monthly | 7 annual (financial) |
| Healthcare (HIPAA) | 7 daily | 4 weekly | 24 monthly | 7+ annual (6 yr minimum) |
| Financial Services | 30 daily (trading) | 12 weekly | 24 monthly | 10 annual (SEC) |
| SaaS Platform | 7 daily | 4 weekly | 6 monthly | None (beyond compliance) |
| Government | 30 daily | 8 weekly | 24 monthly | Permanent archive (some records) |
# Enterprise Backup Retention Policy Configuration
# This YAML defines retention rules for automated enforcement

version: "1.0"
policy_name: "Enterprise Database Retention Policy"
effective_date: "2024-01-01"
approved_by: "CTO, Legal, Compliance"

# Default retention (applies unless overridden)
default_retention:
  daily_backups: 14
  weekly_backups: 8
  monthly_backups: 12
  annual_backups: 7
  transaction_logs: 72  # hours

# Data classification-specific retention
classifications:
  tier1_critical:
    description: "Mission-critical production databases"
    databases:
      - "production_orders"
      - "production_customers"
      - "production_financial"
    retention:
      daily_backups: 30
      weekly_backups: 12
      monthly_backups: 24
      annual_backups: 10
      transaction_logs: 168  # 7 days
    storage_tiers:
      - age_days: 0
        storage_class: "hot_ssd"
      - age_days: 14
        storage_class: "standard_hdd"
      - age_days: 90
        storage_class: "cloud_archive"
      - age_days: 365
        storage_class: "deep_archive"

  tier2_operational:
    description: "Business operational databases"
    databases:
      - "operations_inventory"
      - "operations_shipping"
      - "analytics_warehouse"
    retention:
      daily_backups: 14
      weekly_backups: 8
      monthly_backups: 12
      annual_backups: 7
      transaction_logs: 72

  tier3_development:
    description: "Development and test databases"
    databases:
      - "dev_*"
      - "test_*"
      - "staging_*"
    retention:
      daily_backups: 7
      weekly_backups: 2
      monthly_backups: 0  # No monthly retention
      annual_backups: 0
      transaction_logs: 24

# Regulatory overrides (supersede classification defaults)
regulatory_requirements:
  sox_financial:
    applies_to: ["production_financial", "production_orders"]
    minimum_retention_years: 7
    audit_trail_required: true
  pci_dss:
    applies_to: ["production_payment*"]
    minimum_retention_years: 1
    encryption_required: true
    access_logging_required: true

# Legal hold configuration
legal_hold:
  enabled: true
  notification_email: "legal-team@company.com"
  hold_database: "backup_management.legal_holds"

# Expiration and deletion rules
deletion:
  grace_period_days: 7  # Warning before deletion
  require_approval_for:
    - "tier1_critical"
    - files_larger_than_gb: 100
  verification_required: true  # Confirm deletion successful
Manual retention management is unsustainable at scale. Automation ensures consistent policy enforcement, frees administrator time, and reduces human error in critical deletion decisions.
Retention automation components:
#!/usr/bin/env python3
"""
Enterprise Backup Retention Automation System
Implements automated lifecycle management based on retention policies
"""

import os
import yaml
import logging
from datetime import datetime, timedelta
from dataclasses import dataclass
from typing import List, Optional
from enum import Enum
import boto3  # For S3 tier migration example

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


class StorageClass(Enum):
    HOT_SSD = "hot_ssd"
    STANDARD_HDD = "standard_hdd"
    CLOUD_STANDARD = "STANDARD"
    CLOUD_IA = "STANDARD_IA"
    CLOUD_GLACIER = "GLACIER"
    DEEP_ARCHIVE = "DEEP_ARCHIVE"


@dataclass
class Backup:
    backup_id: str
    database_name: str
    backup_type: str  # full, incremental, transaction_log
    created_at: datetime
    size_bytes: int
    storage_class: StorageClass
    expiration_date: Optional[datetime]
    legal_hold: bool = False
    parent_backup_id: Optional[str] = None


@dataclass
class RetentionAction:
    backup_id: str
    action: str  # migrate, expire, hold
    target_storage_class: Optional[StorageClass]
    reason: str
    scheduled_date: datetime


class RetentionPolicyEngine:
    """
    Evaluates backups against retention policies and generates actions
    """

    def __init__(self, policy_path: str):
        with open(policy_path, 'r') as f:
            self.policy = yaml.safe_load(f)
        self.s3_client = boto3.client('s3')

    def get_classification(self, database_name: str) -> dict:
        """Determine which classification applies to a database"""
        for class_name, class_config in self.policy['classifications'].items():
            for pattern in class_config['databases']:
                if self._matches_pattern(database_name, pattern):
                    return class_config['retention']
        return self.policy['default_retention']

    def _matches_pattern(self, name: str, pattern: str) -> bool:
        """Simple wildcard pattern matching"""
        if pattern.endswith('*'):
            return name.startswith(pattern[:-1])
        return name == pattern

    def calculate_expiration(self, backup: Backup) -> datetime:
        """Calculate when a backup should expire based on policy"""
        retention = self.get_classification(backup.database_name)

        if backup.backup_type == 'transaction_log':
            hours = retention.get('transaction_logs', 72)
            return backup.created_at + timedelta(hours=hours)

        # Determine if this is a daily, weekly, monthly, or annual backup
        # This is simplified - real implementation would check backup schedule
        age_days = (datetime.now() - backup.created_at).days

        if age_days < 30:
            return backup.created_at + timedelta(days=retention['daily_backups'])
        elif age_days < 90:
            return backup.created_at + timedelta(weeks=retention['weekly_backups'])
        elif age_days < 365:
            return backup.created_at + timedelta(days=30 * retention['monthly_backups'])
        else:
            return backup.created_at + timedelta(days=365 * retention['annual_backups'])

    def evaluate_tier_migration(self, backup: Backup) -> Optional[RetentionAction]:
        """Determine if backup should be migrated to different storage tier"""
        age_days = (datetime.now() - backup.created_at).days

        # Find appropriate tier based on age
        classification = self._get_classification_config(backup.database_name)
        storage_tiers = classification.get('storage_tiers', [])

        target_tier = None
        for tier in sorted(storage_tiers, key=lambda x: x['age_days'], reverse=True):
            if age_days >= tier['age_days']:
                target_tier = StorageClass(tier['storage_class'])
                break

        if target_tier and target_tier != backup.storage_class:
            return RetentionAction(
                backup_id=backup.backup_id,
                action='migrate',
                target_storage_class=target_tier,
                reason=f"Age {age_days} days exceeds threshold for current tier",
                scheduled_date=datetime.now()
            )
        return None

    def _get_classification_config(self, database_name: str) -> dict:
        """Get full classification config for a database"""
        for class_name, class_config in self.policy['classifications'].items():
            for pattern in class_config['databases']:
                if self._matches_pattern(database_name, pattern):
                    return class_config
        return {'retention': self.policy['default_retention'], 'storage_tiers': []}

    def check_legal_holds(self, backup: Backup, holds: List[dict]) -> bool:
        """Check if backup is under any legal hold"""
        for hold in holds:
            if hold['status'] == 'active':
                if backup.database_name in hold.get('scope_databases', []):
                    return True
                if backup.created_at >= hold['hold_date']:
                    # Backup created after hold initiated
                    if backup.database_name in hold.get('scope_databases', []):
                        return True
        return False

    def generate_actions(self, backups: List[Backup],
                         legal_holds: List[dict]) -> List[RetentionAction]:
        """Generate all retention actions for a list of backups"""
        actions = []

        for backup in backups:
            # Check legal holds first - they override everything
            if self.check_legal_holds(backup, legal_holds):
                logger.info(f"Backup {backup.backup_id} under legal hold, skipping")
                continue

            # Check for expiration
            expiration = self.calculate_expiration(backup)
            if expiration < datetime.now():
                # Check for dependent backups before expiring
                if not self._has_dependents(backup, backups):
                    actions.append(RetentionAction(
                        backup_id=backup.backup_id,
                        action='expire',
                        target_storage_class=None,
                        reason=f"Exceeded retention: expired {expiration}",
                        scheduled_date=datetime.now()
                    ))
                else:
                    logger.warning(
                        f"Backup {backup.backup_id} expired but has dependents"
                    )

            # Check for tier migration
            migration = self.evaluate_tier_migration(backup)
            if migration:
                actions.append(migration)

        return actions

    def _has_dependents(self, backup: Backup, all_backups: List[Backup]) -> bool:
        """Check if any backups depend on this one (for incremental chains)"""
        if backup.backup_type != 'full':
            return False
        return any(
            b.parent_backup_id == backup.backup_id
            for b in all_backups
            if b.backup_id != backup.backup_id
        )


class RetentionExecutor:
    """
    Executes retention actions and logs results
    """

    def __init__(self, dry_run: bool = True):
        self.dry_run = dry_run
        self.s3_client = boto3.client('s3')

    def execute(self, actions: List[RetentionAction]) -> dict:
        """Execute a list of retention actions"""
        results = {'succeeded': 0, 'failed': 0, 'skipped': 0}

        for action in actions:
            try:
                if action.action == 'expire':
                    self._execute_expiration(action)
                elif action.action == 'migrate':
                    self._execute_migration(action)
                results['succeeded'] += 1
            except Exception as e:
                logger.error(f"Action failed: {action.backup_id}: {e}")
                results['failed'] += 1

        return results

    def _execute_expiration(self, action: RetentionAction):
        """Delete expired backup"""
        logger.info(f"{'[DRY RUN] ' if self.dry_run else ''}Expiring backup: {action.backup_id}")
        if not self.dry_run:
            # Actual deletion logic here
            pass

    def _execute_migration(self, action: RetentionAction):
        """Migrate backup to new storage tier"""
        logger.info(
            f"{'[DRY RUN] ' if self.dry_run else ''}"
            f"Migrating {action.backup_id} to {action.target_storage_class}"
        )
        if not self.dry_run:
            # S3 storage class change example
            # self.s3_client.copy_object(...)
            pass


# Main execution
if __name__ == "__main__":
    engine = RetentionPolicyEngine("retention_policy.yaml")
    executor = RetentionExecutor(dry_run=True)

    # In practice, backups would come from backup catalog database
    backups = []  # Load from catalog
    legal_holds = []  # Load from legal hold table

    actions = engine.generate_actions(backups, legal_holds)
    logger.info(f"Generated {len(actions)} retention actions")

    results = executor.execute(actions)
    logger.info(f"Execution results: {results}")
Always run retention automation in dry-run mode first, especially when implementing new policies. Review proposed actions before enabling automatic execution. A misconfigured retention policy can delete critical backups permanently.
Retention policies require ongoing monitoring and auditing. Reports demonstrate compliance, identify policy violations, and forecast storage requirements.
Essential retention reports:
-- Retention Policy Compliance Dashboard Queries

-- 1. Overall Retention Compliance Status
SELECT
    database_classification,
    COUNT(*) AS total_databases,
    SUM(CASE WHEN compliant = true THEN 1 ELSE 0 END) AS compliant_count,
    ROUND(100.0 * SUM(CASE WHEN compliant = true THEN 1 ELSE 0 END) / COUNT(*), 1) AS compliance_pct
FROM (
    SELECT
        d.database_name,
        d.classification AS database_classification,
        CASE WHEN MAX(b.created_at) > NOW() - INTERVAL '2 days'
             THEN true ELSE false END AS compliant
    FROM databases d
    LEFT JOIN backup_catalog b ON d.database_name = b.database_name
    GROUP BY d.database_name, d.classification
) compliance
GROUP BY database_classification;

-- 2. Storage Consumption by Retention Tier
SELECT
    CASE
        WHEN age_days <= 14 THEN 'Short-term (0-14 days)'
        WHEN age_days <= 90 THEN 'Medium-term (15-90 days)'
        WHEN age_days <= 365 THEN 'Long-term (91-365 days)'
        ELSE 'Archive (365+ days)'
    END AS retention_tier,
    COUNT(*) AS backup_count,
    pg_size_pretty(SUM(size_bytes)) AS total_size,
    pg_size_pretty(AVG(size_bytes)) AS avg_backup_size,
    storage_class
FROM (
    SELECT *, EXTRACT(DAY FROM NOW() - created_at) AS age_days
    FROM backup_catalog
) aged_backups
GROUP BY
    CASE
        WHEN age_days <= 14 THEN 'Short-term (0-14 days)'
        WHEN age_days <= 90 THEN 'Medium-term (15-90 days)'
        WHEN age_days <= 365 THEN 'Long-term (91-365 days)'
        ELSE 'Archive (365+ days)'
    END,
    storage_class
ORDER BY MIN(age_days);

-- 3. Expiration Forecast
SELECT
    DATE(expiration_date) AS expiration_day,
    COUNT(*) AS backups_expiring,
    pg_size_pretty(SUM(size_bytes)) AS storage_reclaimed,
    STRING_AGG(DISTINCT database_name, ', ') AS affected_databases
FROM backup_catalog
WHERE expiration_date BETWEEN NOW() AND NOW() + INTERVAL '30 days'
  AND NOT legal_hold
GROUP BY DATE(expiration_date)
ORDER BY expiration_day;

-- 4. Legal Hold Impact Report
SELECT
    lh.matter_name,
    lh.hold_date,
    COUNT(bhi.backup_id) AS backups_held,
    pg_size_pretty(SUM(b.size_bytes)) AS storage_consumed,
    MIN(b.created_at) AS oldest_backup_held,
    MAX(b.expiration_date) AS original_expiration_latest
FROM backup_legal_holds lh
JOIN backup_hold_items bhi ON lh.hold_id = bhi.hold_id
JOIN backup_catalog b ON bhi.backup_id = b.backup_id
WHERE lh.status = 'active'
GROUP BY lh.hold_id, lh.matter_name, lh.hold_date
ORDER BY SUM(b.size_bytes) DESC;

-- 5. Deletion Audit Trail
SELECT
    deleted_at,
    backup_id,
    database_name,
    backup_type,
    original_created_at,
    original_size_bytes,
    deletion_reason,
    deleted_by,
    verification_status
FROM backup_deletion_audit
WHERE deleted_at > NOW() - INTERVAL '30 days'
ORDER BY deleted_at DESC;
Deletion audit logs must be immutable and protected from tampering. Store audit logs separately from backup systems, use append-only storage, and consider third-party audit log services for compliance-critical environments. If someone can delete audit logs, the entire audit trail is unreliable.
Retention policies are not 'set and forget'—they require ongoing governance to remain effective, compliant, and aligned with organizational needs.
Governance framework:
Policy versioning and documentation:
Maintain complete history of policy changes:
retention_policy:
version: "2.3"
effective_date: "2024-07-01"
supersedes: "2.2"
change_log:
- version: "2.3"
date: "2024-07-01"
changes:
- "Extended Tier 1 daily retention from 14 to 30 days"
- "Added GDPR data deletion procedures"
approved_by: "CTO, General Counsel"
reason: "Regulatory audit finding, incident recovery assessment"
- version: "2.2"
date: "2024-01-15"
changes:
- "Added transaction log retention requirements"
- "Defined legal hold procedures"
approved_by: "CTO, Compliance Officer"
reason: "SOX audit preparation"
Stakeholder responsibilities:
| Role | Responsibilities |
|---|---|
| IT/DBA Team | Implement and operate retention automation; report compliance status |
| Legal | Define legal hold requirements; advise on regulatory interpretation |
| Compliance | Audit policy adherence; report to regulators |
| Business Owners | Define business recovery requirements; approve data classification |
| Security | Ensure encryption key retention; access control auditing |
| Finance | Budget allocation for storage; approve major storage investments |
Retention policy documentation should be stored in a controlled document management system with version control, access logging, and approval workflows. Avoid keeping authoritative policies in wikis, shared drives, or email attachments where version control is unreliable.
Retention policy is the bridge between backup creation and eventual deletion. A well-designed policy ensures compliance, optimizes costs, and maintains recovery capability across the backup lifecycle.
What's next:
With retention policies defined, we move to offsite storage—the practice of maintaining backup copies in physically separate locations. Offsite storage protects against site-level disasters and is a cornerstone of robust data protection strategy.
You now understand how to design, implement, and govern retention policies that balance compliance requirements, recovery needs, and storage costs. Next, we'll explore offsite storage strategies for geographic resilience.