Backup storage is where data protection becomes tangible—the physical and logical systems that store your backup data and determine whether recovery is possible when disaster strikes. A sophisticated backup strategy with world-class tools means nothing if the storage infrastructure fails, becomes inaccessible, or cannot scale with data growth.
Backup storage decisions have far-reaching implications: they affect recovery time objectives (RTO), determine operational costs, influence data retention capabilities, and establish your resilience against various disaster scenarios. This page provides comprehensive coverage of backup storage technologies, architectures, and best practices for designing robust backup infrastructure.
By the end of this page, you will understand the spectrum of backup storage technologies, master storage architecture patterns for different requirements, design retention and tiering strategies, leverage cloud storage effectively, and build storage infrastructure that supports your recovery objectives.
Backup storage spans a range of technologies, each with distinct performance characteristics, cost profiles, and use cases.
Hard Disk Drives (HDD)
Traditional spinning disk remains the workhorse of backup storage:
Characteristics:
Best for: Primary backup landing zone, mid-tier storage, high-capacity requirements
Solid State Drives (SSD)
Flash storage for performance-sensitive backup needs:
Characteristics:
Best for: Backup landing zones requiring fast ingestion, instant recovery staging, deduplication indexes
| Media Type | Cost/TB | Write Speed | Read Speed | Durability | Best Use Case |
|---|---|---|---|---|---|
| HDD (Enterprise) | $15-25 | 200 MB/s | 250 MB/s | Excellent | Primary backup storage |
| SSD (NVMe) | $80-150 | 5,000 MB/s | 7,000 MB/s | Very Good | Fast recovery staging |
| Tape (LTO-9) | $5-8 | 400 MB/s | 400 MB/s | Excellent | Long-term archive |
| Object Storage | $2-23/mo | Variable | Variable | Excellent | Cloud backup, archive |
| Optical (BDXL) | $10-15 | 36 MB/s | 72 MB/s | Excellent | Compliance archive |
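The throughput numbers above translate directly into backup windows. As a rough, illustrative calculation (assuming a hypothetical 10 TB full backup written as a single sequential stream at the table's rates):

```bash
# Rough backup-window estimate from the write speeds above (a sketch; real
# throughput also depends on source read speed, network, and compression).
BACKUP_SIZE_GB=10240   # 10 TB, hypothetical dataset

for media in "HDD:200" "SSD-NVMe:5000" "LTO-9:400"; do
  name=${media%%:*}
  mbps=${media##*:}
  # seconds = (GB * 1024 MB/GB) / (MB/s); hours = seconds / 3600
  hours=$(echo "scale=1; $BACKUP_SIZE_GB * 1024 / $mbps / 3600" | bc)
  echo "$name: ~${hours} hours to write ${BACKUP_SIZE_GB} GB"
done
```

At these rates a 10 TB full backup ties up a single HDD stream for more than half a day but finishes in well under an hour on NVMe, which is why fast landing zones are commonly paired with cheaper retention tiers.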
Tape Storage
Magnetic tape remains relevant for large-scale archival:
Modern Tape (LTO-9):
Advantages:
Considerations:
Object Storage
Cloud and on-premises object storage is increasingly popular:
Public Cloud (S3, Azure Blob, GCS):
On-Premises (MinIO, Ceph, Dell ECS):
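Because on-premises object stores speak the same S3 API, existing backup tooling usually needs only a custom endpoint. A minimal sketch, assuming a hypothetical MinIO endpoint and bucket name:

```bash
# Point standard S3 tooling at an on-premises S3-compatible store.
# The endpoint URL and bucket name below are placeholders for illustration.
aws s3 cp /backup/daily_backup.tar.gz \
  s3://onprem-backups/daily/ \
  --endpoint-url https://minio.example.com:9000

# pgBackRest can target the same endpoint via repo1-s3-endpoint and
# repo1-s3-uri-style=path (path-style addressing is common for MinIO/Ceph).
```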
Backup storage architecture determines performance, resilience, and operational characteristics. Several patterns address different requirements.
Direct-Attached Storage (DAS)
Storage directly connected to the backup server:
Configuration:
Advantages:
Limitations:
Network-Attached Storage (NAS)
File-based storage accessed over the network:
Configuration:
Advantages:
Limitations:
```bash
#!/bin/bash
# Backup Storage Configuration Examples

# =====================================
# DAS: RAID Configuration with mdadm
# =====================================

# Create RAID 6 array for backup storage
# 6 data disks + 2 parity = tolerates 2 disk failures
sudo mdadm --create /dev/md0 \
  --level=6 \
  --raid-devices=8 \
  /dev/sd[b-i]

# Create filesystem
sudo mkfs.xfs -L backup_storage /dev/md0

# Mount for backup use
sudo mkdir -p /backup
sudo mount /dev/md0 /backup

# Add to fstab for persistence
echo "LABEL=backup_storage /backup xfs defaults,noatime 0 2" | sudo tee -a /etc/fstab

# =====================================
# NAS: NFS Mount for Backup
# =====================================

# Install NFS client
sudo apt-get install nfs-common

# Mount NFS share for backup
sudo mkdir -p /backup/nfs
sudo mount -t nfs -o hard,intr,rsize=1048576,wsize=1048576 \
  nas.example.com:/vol/backups /backup/nfs

# Persistent mount in fstab
echo "nas.example.com:/vol/backups /backup/nfs nfs hard,intr,rsize=1048576,wsize=1048576 0 0" | sudo tee -a /etc/fstab

# =====================================
# SAN: iSCSI Target Mount
# =====================================

# Discover iSCSI targets
sudo iscsiadm -m discovery -t sendtargets -p san.example.com:3260

# Login to target
sudo iscsiadm -m node -T iqn.2024-01.com.example:backup001 -p san.example.com:3260 -l

# Create filesystem on iSCSI LUN
sudo mkfs.xfs /dev/sdc  # Newly discovered iSCSI disk

# Mount iSCSI volume
sudo mkdir -p /backup/san
sudo mount /dev/sdc /backup/san

# =====================================
# Object Storage: S3-Compatible Mount
# =====================================

# Install s3fs for S3 mounting (not recommended for primary backup)
sudo apt-get install s3fs

# Create credentials file
echo "ACCESS_KEY:SECRET_KEY" > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs

# Mount S3 bucket
s3fs mybucket /backup/s3 -o passwd_file=~/.passwd-s3fs,url=https://s3.amazonaws.com

# Better approach: Use backup tool's native S3 support
# pgBackRest with S3
cat >> /etc/pgbackrest/pgbackrest.conf <<EOF
[global]
repo1-type=s3
repo1-s3-bucket=backup-bucket
repo1-s3-endpoint=s3.amazonaws.com
repo1-s3-region=us-east-1
repo1-path=/pgbackrest
repo1-s3-key=ACCESS_KEY
repo1-s3-key-secret=SECRET_KEY
EOF
```

Storage Area Network (SAN)
Block-level storage over a dedicated network:
Configuration:
Advantages:
Considerations:
Deduplication Appliances
Purpose-built backup storage with inline deduplication:
Products:
Advantages:
Considerations:
Most production environments benefit from hybrid architectures: fast DAS or SAN for backup landing zones, NAS or deduplication appliances for retention, and cloud or tape for archive. Data automatically moves through tiers based on age and access patterns, optimizing both performance and cost.
Cloud storage has transformed backup architecture, offering scalability, durability, and geographic distribution that would be prohibitively expensive on-premises.
Cloud Storage Classes
Cloud providers offer multiple storage tiers optimized for different access patterns:
| Tier (AWS) | Cost/GB/Month | Retrieval Time | Retrieval Cost | Best For |
|---|---|---|---|---|
| S3 Standard | $0.023 | Immediate | None | Frequently accessed backups |
| S3 Standard-IA | $0.0125 | Immediate | $0.01/GB | Infrequent restore, 30+ day retention |
| S3 Glacier Instant | $0.004 | Milliseconds | $0.03/GB | Archive with instant access needs |
| S3 Glacier Flexible | $0.0036 | 1-12 hours | $0.03/GB | Archive, flexible retrieval OK |
| S3 Glacier Deep Archive | $0.00099 | 12-48 hours | $0.02/GB | Long-term archive, rare access |
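At scale, the per-GB differences above dominate the storage bill. A quick sketch using the table's rates for a hypothetical 50 TB backup set (prices vary by region and change over time):

```bash
# Monthly storage cost for 50 TB at each tier's per-GB rate from the table.
SIZE_GB=51200   # 50 TB, illustrative

for tier in "Standard:0.023" "Standard-IA:0.0125" "Glacier-Instant:0.004" \
            "Glacier-Flexible:0.0036" "Deep-Archive:0.00099"; do
  name=${tier%%:*}
  rate=${tier##*:}
  cost=$(echo "scale=2; $SIZE_GB * $rate" | bc)
  echo "$name: \$${cost}/month"
done
# Roughly $1,177/month on Standard vs about $51/month on Deep Archive --
# retrieval time and retrieval fees are the trade-off.
```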
Cloud Storage Integration Strategies
1. Direct Cloud Backup
Backup directly to cloud storage:
Advantages:
Considerations:
```bash
#!/bin/bash
# Cloud Storage Integration Examples

# =====================================
# AWS S3 Lifecycle Policy
# =====================================

# Create lifecycle policy for backup tiering
cat > lifecycle-policy.json <<EOF
{
  "Rules": [
    {
      "ID": "BackupLifecycle",
      "Status": "Enabled",
      "Filter": { "Prefix": "backups/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER_IR" },
        { "Days": 365, "StorageClass": "DEEP_ARCHIVE" }
      ],
      "Expiration": { "Days": 2555 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 30 }
    }
  ]
}
EOF

aws s3api put-bucket-lifecycle-configuration \
  --bucket my-backup-bucket \
  --lifecycle-configuration file://lifecycle-policy.json

# =====================================
# Backup to S3 with Intelligent-Tiering
# =====================================

# Upload backup with intelligent tiering
aws s3 cp /backup/daily_backup.tar.gz \
  s3://my-backup-bucket/daily/ \
  --storage-class INTELLIGENT_TIERING

# Sync entire backup directory
aws s3 sync /backup/postgresql s3://my-backup-bucket/postgresql/ \
  --storage-class INTELLIGENT_TIERING \
  --only-show-errors

# =====================================
# Cross-Region Replication for DR
# =====================================

# Enable versioning (required for replication)
aws s3api put-bucket-versioning \
  --bucket my-backup-bucket \
  --versioning-configuration Status=Enabled

# Create replication configuration
cat > replication-config.json <<EOF
{
  "Role": "arn:aws:iam::123456789:role/S3ReplicationRole",
  "Rules": [
    {
      "ID": "DR-Replication",
      "Status": "Enabled",
      "Priority": 1,
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Filter": { "Prefix": "" },
      "Destination": {
        "Bucket": "arn:aws:s3:::my-backup-bucket-dr",
        "StorageClass": "STANDARD_IA"
      }
    }
  ]
}
EOF

aws s3api put-bucket-replication \
  --bucket my-backup-bucket \
  --replication-configuration file://replication-config.json

# =====================================
# Azure Blob Storage Integration
# =====================================

# Create storage account with geo-redundancy
az storage account create \
  --name mybackupstorage \
  --resource-group backups \
  --location eastus \
  --sku Standard_GRS \
  --kind StorageV2

# Create container with blob-level tiering
az storage container create \
  --name backups \
  --account-name mybackupstorage

# Upload with access tier
az storage blob upload \
  --account-name mybackupstorage \
  --container-name backups \
  --name daily/backup.tar.gz \
  --file /backup/daily_backup.tar.gz \
  --tier Cool

# Set lifecycle management policy
# (Azure uses its own policy JSON schema, distinct from the S3 policy above)
az storage account management-policy create \
  --account-name mybackupstorage \
  --resource-group backups \
  --policy @azure-lifecycle-policy.json
```

2. Hybrid Cloud Backup
Local backup with cloud replication:
Pattern:
Advantages:
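A minimal sketch of this local-plus-cloud pattern, with placeholder paths, bucket, and database name: back up to the local landing zone first (fast restores for the common case), then copy the finished artifact off-site.

```bash
#!/bin/bash
# Hybrid backup sketch: local landing zone + cloud copy (illustrative names).
set -euo pipefail

LOCAL_DIR="/backup/daily"
BUCKET="s3://my-backup-bucket/offsite"
STAMP=$(date +%Y%m%d)

# 1. Local backup (fast restore path)
pg_dump mydb | gzip > "$LOCAL_DIR/${STAMP}_mydb.sql.gz"

# 2. Replicate to cloud only after the local backup completed successfully
aws s3 cp "$LOCAL_DIR/${STAMP}_mydb.sql.gz" "$BUCKET/" \
  --storage-class STANDARD_IA
```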
3. Cloud-Native Database Backup
For cloud databases (RDS, Cloud SQL, etc.):
Built-in features:
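For managed databases the provider's snapshot facility is usually the primary mechanism. The sketch below shows the general shape with the AWS CLI for RDS; the instance and snapshot identifiers, account number, and DR region are placeholders:

```bash
# Take a manual RDS snapshot in addition to the automated backups
aws rds create-db-snapshot \
  --db-instance-identifier prod-db \
  --db-snapshot-identifier prod-db-manual-20240115

# Wait until the snapshot is available
aws rds wait db-snapshot-available \
  --db-snapshot-identifier prod-db-manual-20240115

# Copy the snapshot to a second region for DR (run against the target region)
aws rds copy-db-snapshot \
  --source-db-snapshot-identifier arn:aws:rds:us-east-1:123456789012:snapshot:prod-db-manual-20240115 \
  --target-db-snapshot-identifier prod-db-manual-20240115-dr \
  --region us-west-2
```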
Cloud storage is inexpensive but cloud egress (downloading data) is not. Restoring 10 TB from S3 Standard costs ~$900 in egress fees. Factor egress costs into disaster recovery planning. Strategies to mitigate: (1) choose cloud regions with lower egress costs, (2) use Direct Connect/ExpressRoute for large restores, (3) test restores periodically to understand actual costs, (4) consider keeping recent backups locally for common restore scenarios.
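It is worth running the egress math before an incident rather than during one. A sketch assuming a typical internet egress rate of about $0.09/GB (actual rates vary by provider, region, and volume tier):

```bash
# Rough egress cost for restoring a backup set over the internet.
RESTORE_GB=10240      # 10 TB restore
EGRESS_PER_GB=0.09    # approximate S3 internet egress rate; check current pricing

echo "scale=2; $RESTORE_GB * $EGRESS_PER_GB" | bc
# ~921.60 USD -- close to the ~$900 figure above; Direct Connect or keeping
# recent backups local avoids this cost during routine restores.
```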
Backup storage must be protected against both hardware failures and malicious attacks (especially ransomware). Multiple protection mechanisms work together.
RAID Protection
RAID provides first-line protection against disk failures:
Replication
Copying backup data to secondary locations:
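For file-based backup repositories, replication can be as simple as a scheduled copy to a second host or site; object-storage replication was shown earlier with S3 cross-region replication. A minimal sketch using rsync over SSH, with placeholder hostnames and paths:

```bash
# Replicate the local backup repository to a secondary site (e.g. nightly cron).
# Add --delete if the replica should mirror retention-driven deletions;
# omit it if the replica must retain copies longer than the primary.
rsync -az --partial --log-file=/var/log/backup_replication.log \
  /backup/ \
  backup-replica.example.com:/backup-mirror/
```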
Immutability and WORM (Write Once Read Many)
Critical protection against ransomware and accidental deletion:
Why immutability matters:
Implementation options:
```bash
#!/bin/bash
# Backup Immutability Configuration Examples

# =====================================
# AWS S3 Object Lock
# =====================================

# Create bucket with object lock enabled
# (us-east-1 is the default region; other regions also need
#  --create-bucket-configuration LocationConstraint=<region>)
aws s3api create-bucket \
  --bucket my-immutable-backup \
  --object-lock-enabled-for-bucket

# Set default retention policy (Governance mode)
aws s3api put-object-lock-configuration \
  --bucket my-immutable-backup \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "GOVERNANCE",
        "Days": 30
      }
    }
  }'

# Upload with compliance retention (cannot be deleted even by root)
aws s3api put-object \
  --bucket my-immutable-backup \
  --key backups/critical-backup.tar.gz \
  --body /backup/critical-backup.tar.gz \
  --object-lock-mode COMPLIANCE \
  --object-lock-retain-until-date "2024-12-31T00:00:00Z"

# =====================================
# Azure Immutable Blob Storage
# =====================================

# Create container with legal hold
az storage container create \
  --name immutable-backups \
  --account-name mybackupstorage

# Set time-based retention policy (locked after confirmation)
az storage container immutability-policy create \
  --resource-group backups \
  --account-name mybackupstorage \
  --container-name immutable-backups \
  --period 90

# Lock the policy (irreversible!)
az storage container immutability-policy lock \
  --resource-group backups \
  --account-name mybackupstorage \
  --container-name immutable-backups

# =====================================
# Linux: Immutable File Attribute
# =====================================

# Make backup file immutable (cannot be modified/deleted while the flag is set)
sudo chattr +i /backup/critical_backup.tar.gz

# Check immutable status
lsattr /backup/critical_backup.tar.gz
# Output: ----i--------e-- /backup/critical_backup.tar.gz

# Remove immutable (required for deletion/rotation)
sudo chattr -i /backup/critical_backup.tar.gz

# Automated immutable backup script
backup_file="/backup/$(date +%Y%m%d)_backup.tar.gz"
tar -czf "$backup_file" /data
sudo chattr +i "$backup_file"

# Schedule immutability removal for retention
echo "sudo chattr -i $backup_file && rm -f $backup_file" | \
  at "now + 30 days"

# =====================================
# ZFS: Snapshot Holds
# =====================================

# Create hold on snapshot (prevents destruction)
zfs hold compliance_hold zpool/backups@2024-01-15

# List holds
zfs holds zpool/backups@2024-01-15

# Release hold (when retention period expires)
zfs release compliance_hold zpool/backups@2024-01-15
```

Backup retention determines how long backups are kept and how storage is managed over time. Well-designed retention policies balance recovery requirements, compliance obligations, and storage costs.
Retention Strategies
Grandfather-Father-Son (GFS)
Classic rotation scheme maintaining multiple recovery points:
Total backups retained: 7 + 4 + 12 + 7 = 30 backups
Maximum recovery granularity: depends on the age of the data
| Backup Type | Frequency | Retention | Count | Purpose |
|---|---|---|---|---|
| Daily Full | Every day | 7 days | 7 | Recent point-in-time recovery |
| Weekly Full | Every Sunday | 4 weeks | 4 | Last month recovery |
| Monthly Full | 1st of month | 12 months | 12 | Historical recovery |
| Yearly Full | Jan 1st | 7 years | 7 | Compliance, audit |
Tiered Retention with Storage Optimization
Move backups through storage tiers as they age:
```bash
#!/bin/bash
# Backup Retention and Lifecycle Management

# =====================================
# GFS Retention Implementation
# =====================================

BACKUP_DIR="/backup/postgresql"
TODAY=$(date +%u)            # Day of week (1-7, Monday=1)
DAY_OF_MONTH=$(date +%d)
MONTH=$(date +%m)

# Daily backup (keep 7 days) -- compress on the way out
pg_dump mydb | gzip > "$BACKUP_DIR/daily/$(date +%Y%m%d).sql.gz"
find "$BACKUP_DIR/daily" -name "*.sql.gz" -mtime +7 -delete

# Weekly backup on Sunday (keep 4 weeks)
if [ "$TODAY" -eq 7 ]; then
  cp "$BACKUP_DIR/daily/$(date +%Y%m%d).sql.gz" "$BACKUP_DIR/weekly/"
  find "$BACKUP_DIR/weekly" -name "*.sql.gz" -mtime +28 -delete
fi

# Monthly backup on 1st (keep 12 months)
if [ "$DAY_OF_MONTH" -eq "01" ]; then
  cp "$BACKUP_DIR/daily/$(date +%Y%m%d).sql.gz" "$BACKUP_DIR/monthly/"
  find "$BACKUP_DIR/monthly" -name "*.sql.gz" -mtime +365 -delete
fi

# Yearly backup on Jan 1st (keep 7 years)
if [ "$DAY_OF_MONTH" -eq "01" ] && [ "$MONTH" -eq "01" ]; then
  cp "$BACKUP_DIR/daily/$(date +%Y%m%d).sql.gz" "$BACKUP_DIR/yearly/"
  find "$BACKUP_DIR/yearly" -name "*.sql.gz" -mtime +2555 -delete
fi

# =====================================
# Tiered Storage Migration
# =====================================

FAST_TIER="/backup/ssd"       # 0-7 days
CAPACITY_TIER="/backup/hdd"   # 7-30 days
ARCHIVE_TIER="/backup/s3"     # 30+ days

# Move from fast to capacity tier (7+ days old)
find "$FAST_TIER" -name "*.backup" -mtime +7 -exec mv {} "$CAPACITY_TIER/" \;

# Archive to cloud (30+ days old)
find "$CAPACITY_TIER" -name "*.backup" -mtime +30 | while read -r file; do
  # Compress before archive
  gzip "$file"
  # Upload to S3 with lifecycle; remove local copy only after a successful upload
  aws s3 cp "${file}.gz" s3://backup-bucket/archive/ \
    --storage-class STANDARD_IA \
  && rm "${file}.gz"
done

# =====================================
# pgBackRest Retention Configuration
# =====================================

cat >> /etc/pgbackrest/pgbackrest.conf <<EOF
[global]
# Full backup retention (keep 4 full backups)
repo1-retention-full=4
repo1-retention-full-type=count

# Differential retention (keep 2 per full)
repo1-retention-diff=2

# Archive retention (match full backup retention)
repo1-retention-archive=4
repo1-retention-archive-type=full
EOF

# Expire old backups based on retention
pgbackrest --stanza=main expire

# =====================================
# Database-Level Retention Tracking
# =====================================

# Create backup catalog table
psql <<EOF
CREATE TABLE IF NOT EXISTS backup_catalog (
    backup_id SERIAL PRIMARY KEY,
    backup_date TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    backup_type VARCHAR(20),        -- full, diff, incr, log
    backup_path TEXT,
    backup_size BIGINT,
    retention_class VARCHAR(20),    -- daily, weekly, monthly, yearly
    expire_date DATE,
    status VARCHAR(20) DEFAULT 'active'
);

-- Insert today's backup
INSERT INTO backup_catalog (backup_type, backup_path, backup_size, retention_class, expire_date)
VALUES ('full', '/backup/daily/20240115.sql.gz', 1073741824, 'daily', CURRENT_DATE + 7);

-- Find expired backups
SELECT backup_path FROM backup_catalog
WHERE expire_date < CURRENT_DATE AND status = 'active';

-- Mark as expired
UPDATE backup_catalog SET status = 'expired'
WHERE expire_date < CURRENT_DATE;
EOF
```

Retention policies are often driven by compliance requirements: HIPAA (6 years), SOX (7 years), GDPR (varies by data type), PCI-DSS (1 year typically). Document your retention policies, ensure they meet regulatory requirements, and maintain audit trails of backup creation and deletion.
Backup storage must scale with data growth while maintaining performance. Proactive capacity planning prevents backup failures.
Capacity Estimation
Calculate required storage for your retention policy:
| Backup Type | Size | Count | Total | Notes |
|---|---|---|---|---|
| Daily Full (compressed) | 300 GB | 7 | 2.1 TB | 30% compression typical |
| Weekly Full | 300 GB | 4 | 1.2 TB | Separate from daily |
| Monthly Full | 300 GB | 12 | 3.6 TB | Full year history |
| Yearly Full | 300 GB | 7 | 2.1 TB | Compliance retention |
| Transaction Logs (daily) | 50 GB | 30 | 1.5 TB | For PITR |
| Total Required | | | 10.5 TB | Plus 20% headroom |
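Making the estimate a script keeps it repeatable as retention settings change. The sketch below mirrors the table, assuming a 300 GB compressed full backup and 50 GB/day of transaction logs:

```bash
# Capacity estimate mirroring the table above, plus 20% headroom.
FULL_GB=300; LOG_GB=50

# daily(7) + weekly(4) + monthly(12) + yearly(7) fulls, plus 30 days of logs
total_gb=$(( FULL_GB*7 + FULL_GB*4 + FULL_GB*12 + FULL_GB*7 + LOG_GB*30 ))

echo "Raw requirement: ${total_gb} GB (~$(echo "scale=1; $total_gb/1000" | bc) TB)"
echo "With 20% headroom: ~$(echo "scale=1; $total_gb*1.2/1000" | bc) TB"
# 2100 + 1200 + 3600 + 2100 + 1500 = 10500 GB, i.e. 10.5 TB raw, ~12.6 TB with headroom
```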
Growth Projection
Account for data growth:
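Growth compounds, so a plan sized exactly for today's requirement fills up quickly. A sketch of the projection, assuming an illustrative 25% annual growth rate (substitute your own measured trend):

```bash
# Project backup storage needs assuming a constant annual growth rate.
CURRENT_TB=12.6      # today's requirement including headroom
GROWTH_RATE=0.25     # 25%/year, illustrative assumption

projected=$CURRENT_TB
for year in 1 2 3; do
  projected=$(echo "scale=2; $projected * (1 + $GROWTH_RATE)" | bc)
  echo "Year $year: ~${projected} TB"
done
# At 25% annual growth the requirement nearly doubles within three years,
# so plan procurement or cloud tiering ahead of the curve.
```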
```bash
#!/bin/bash
# Backup Storage Monitoring and Alerting

# =====================================
# Storage Capacity Monitoring
# =====================================

BACKUP_PATH="/backup"
THRESHOLD_WARNING=80
THRESHOLD_CRITICAL=90

# Check usage percentage
usage=$(df "$BACKUP_PATH" | awk 'NR==2 {print $5}' | tr -d '%')

if [ "$usage" -ge "$THRESHOLD_CRITICAL" ]; then
  echo "CRITICAL: Backup storage at ${usage}%" | mail -s "CRITICAL: Backup Storage Space" dba@example.com
  exit 2
elif [ "$usage" -ge "$THRESHOLD_WARNING" ]; then
  echo "WARNING: Backup storage at ${usage}%" | mail -s "WARNING: Backup Storage Space" dba@example.com
  exit 1
fi

# =====================================
# Backup Size Trending
# =====================================

# Log daily backup sizes
cat >> /var/log/backup_sizes.log <<EOF
$(date +%Y-%m-%d),$(du -sb /backup/daily/$(date +%Y%m%d)* 2>/dev/null | awk '{sum+=$1} END {print sum}')
EOF

# Detect abnormal backup size (>50% deviation from average)
recent_avg=$(tail -7 /var/log/backup_sizes.log | awk -F, '{sum+=$2; count++} END {print sum/count}')
today_size=$(tail -1 /var/log/backup_sizes.log | cut -d, -f2)
deviation=$(echo "scale=2; ($today_size - $recent_avg) / $recent_avg * 100" | bc)

if [ "$(echo "$deviation > 50 || $deviation < -50" | bc)" -eq 1 ]; then
  echo "Unusual backup size detected: ${deviation}% deviation" | mail -s "Backup Size Anomaly" dba@example.com
fi

# =====================================
# Backup Success Monitoring
# =====================================

# Check for today's backup
expected_backup="/backup/daily/$(date +%Y%m%d)_backup.sql.gz"
if [ ! -f "$expected_backup" ]; then
  echo "CRITICAL: Missing backup for $(date +%Y-%m-%d)" | mail -s "MISSING BACKUP" dba@example.com
  exit 2
fi

# Verify backup is not empty/corrupt
min_size=$((100 * 1024 * 1024))  # 100 MB minimum
actual_size=$(stat -c%s "$expected_backup")
if [ "$actual_size" -lt "$min_size" ]; then
  echo "CRITICAL: Backup too small (${actual_size} bytes)" | mail -s "BACKUP SIZE CRITICAL" dba@example.com
  exit 2
fi

# =====================================
# Prometheus/Grafana Metrics
# =====================================

# Expose metrics for Prometheus scraping
cat > /var/lib/node_exporter/textfile_collector/backup.prom <<EOF
# HELP backup_storage_bytes Total backup storage used
# TYPE backup_storage_bytes gauge
backup_storage_bytes $(du -sb /backup | awk '{print $1}')

# HELP backup_storage_available_bytes Available backup storage
# TYPE backup_storage_available_bytes gauge
backup_storage_available_bytes $(df -B1 /backup | awk 'NR==2 {print $4}')

# HELP backup_last_success_timestamp Last successful backup timestamp
# TYPE backup_last_success_timestamp gauge
backup_last_success_timestamp $(stat -c%Y "$expected_backup" 2>/dev/null || echo 0)

# HELP backup_size_bytes Size of most recent backup
# TYPE backup_size_bytes gauge
backup_size_bytes $actual_size
EOF
```

Backup storage is the foundation of data protection. Let's consolidate the key principles:
| Requirement | Recommended Architecture |
|---|---|
| Fast local recovery | SSD/NVMe landing zone + HDD retention |
| Ransomware protection | Immutable cloud + air-gapped tape |
| Compliance archive | WORM storage (cloud or tape) |
| Geographic DR | Cross-region cloud replication |
| Cost optimization | Tiered storage with lifecycle policies |
| Large-scale (PB) | Deduplication + tape archive |
Congratulations! You've completed the Backup Implementation module. You now understand how to implement online and offline backups, ensure backup consistency, select appropriate tools, and design robust backup storage infrastructure. These skills form the foundation for protecting enterprise data against loss and enabling business continuity in the face of disasters.