Provisioning storage is just the beginning. Production environments require sophisticated patterns for data protection, recovery, migration, and optimization. The difference between a development cluster and a production-ready system often lies in how data persistence is handled.
This page covers the patterns that transform basic Kubernetes storage into a production-grade data platform: backup and restore strategies, disaster recovery architectures, data migration techniques, storage tiering, local volume optimization, and operational best practices that protect your most critical asset.
By the end of this page, you will understand backup and restore patterns for Kubernetes data, disaster recovery architectures, data migration strategies, storage tiering approaches, local volume patterns for high performance, and operational patterns for production data management.
Kubernetes backup strategies must address multiple layers: cluster state, application configuration, and persistent data. Each layer requires different approaches and tools.
Backup layers:
| Layer | What to Backup | Tools/Methods | Frequency |
|---|---|---|---|
| Cluster State | etcd, control plane configs | etcdctl snapshot, kubeadm | Daily + before upgrades |
| Kubernetes Objects | Deployments, Services, ConfigMaps, Secrets | kubectl, Velero, Kasten | After changes, hourly |
| Application Config | Helm values, GitOps manifests | Git repository | Continuous (version control) |
| Persistent Data | PVC contents, databases | Volume snapshots, app-native backup | Hourly/daily (RPO-dependent) |
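For the cluster-state layer, an etcd snapshot can be automated in-cluster. A minimal sketch, assuming a kubeadm-style control plane with etcd serving on localhost and certificates under `/etc/kubernetes/pki/etcd`; the schedule, image tag, node label, and backup path are illustrative assumptions:

```yaml
# Sketch: daily etcd snapshot via a CronJob pinned to a control-plane node
# Assumes kubeadm defaults; adjust cert paths and tolerations for your cluster
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 2 * * *"  # Daily at 02:00
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists
              effect: NoSchedule
          hostNetwork: true
          containers:
            - name: etcd-backup
              image: registry.k8s.io/etcd:3.5.9-0  # Ships with etcdctl
              command:
                - /bin/sh
                - -c
                - |
                  ETCDCTL_API=3 etcdctl snapshot save /backup/etcd-$(date +%Y%m%d).db \
                    --endpoints=https://127.0.0.1:2379 \
                    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                    --cert=/etc/kubernetes/pki/etcd/server.crt \
                    --key=/etc/kubernetes/pki/etcd/server.key
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
                - name: backup
                  mountPath: /backup
          restartPolicy: OnFailure
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
            - name: backup
              hostPath:
                path: /var/backups/etcd  # Ship off-node (e.g. to S3) in practice
```

A snapshot on the node itself is only step one; copy it off-node so a control-plane disk failure does not take the backup with it.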
Volume snapshot-based backup:
Volume snapshots are the most straightforward approach for backing up PVCs:
```yaml
# Comprehensive backup with Velero (backup application + data)
# Install Velero first: velero install --provider <cloud> ...

# Backup policy: namespace with all resources and PVCs
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: production-backup
  namespace: velero
spec:
  # What to backup
  includedNamespaces:
    - production
    - payment-service
  includedResources:
    - '*'
  # Exclude temporary resources
  excludedResources:
    - events
    - events.events.k8s.io
  # Snapshot PVCs using the volume snapshotter
  snapshotVolumes: true
  # Label selector for specific pods
  labelSelector:
    matchLabels:
      backup: enabled
  # Time-to-live for the backup
  ttl: 720h  # 30 days
  # Storage location
  storageLocation: default
---
# Scheduled backup (hourly)
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: hourly-backup
  namespace: velero
spec:
  schedule: "0 * * * *"  # Every hour
  template:
    includedNamespaces:
      - production
    snapshotVolumes: true
    ttl: 168h  # 7 days
    storageLocation: default
---
# Cross-region backup location
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
  name: dr-region
  namespace: velero
spec:
  provider: aws
  objectStorage:
    bucket: velero-backups-dr
    prefix: production
  config:
    region: us-west-2  # Different from primary (us-east-1)
---
# Volume snapshot location for DR
apiVersion: velero.io/v1
kind: VolumeSnapshotLocation
metadata:
  name: dr-snapshots
  namespace: velero
spec:
  provider: aws
  config:
    region: us-west-2
```

Volume snapshots are crash-consistent, not application-consistent. For databases, use application-native backup tools (pg_dump, mysqldump, mongodump) or freeze I/O before snapshots. Velero supports pre-backup hooks for this purpose.
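The restore half is a Velero `Restore` object that references a completed backup. A minimal sketch; the backup name and the target namespace in the mapping are illustrative:

```yaml
# Restore the production namespace from a named Velero backup
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: production-restore
  namespace: velero
spec:
  backupName: production-backup  # Must match an existing Backup
  includedNamespaces:
    - production
  restorePVs: true  # Recreate PVs from volume snapshots
  # Optionally restore into a different namespace:
  namespaceMapping:
    production: production-restored
```

Restoring into a mapped namespace is a safe way to rehearse recovery without touching the live workload.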
Application-native backup patterns:
For databases and stateful applications, application-native backups provide stronger consistency guarantees:
```yaml
# CronJob for PostgreSQL backup to S3
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: production
spec:
  schedule: "0 */6 * * *"  # Every 6 hours
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: postgres:15
              env:
                - name: PGHOST
                  value: "postgresql.production.svc.cluster.local"
                - name: PGUSER
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: username
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: password
                - name: AWS_ACCESS_KEY_ID
                  valueFrom:
                    secretKeyRef:
                      name: aws-s3-credentials
                      key: access-key
                - name: AWS_SECRET_ACCESS_KEY
                  valueFrom:
                    secretKeyRef:
                      name: aws-s3-credentials
                      key: secret-key
              command:
                - /bin/bash
                - -c
                - |
                  set -e
                  TIMESTAMP=$(date +%Y%m%d_%H%M%S)
                  BACKUP_FILE="postgres_backup_$TIMESTAMP.sql.gz"
                  echo "Starting PostgreSQL backup..."
                  pg_dumpall | gzip > /tmp/$BACKUP_FILE
                  echo "Uploading to S3..."
                  apt-get update && apt-get install -y awscli
                  aws s3 cp /tmp/$BACKUP_FILE s3://database-backups/postgres/$BACKUP_FILE
                  echo "Backup completed: $BACKUP_FILE"
          restartPolicy: OnFailure
---
# Velero pre-backup hook for a consistent MySQL snapshot
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
  namespace: production
spec:
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
      annotations:
        # Velero reads backup hooks from the *pod* template annotations
        pre.hook.backup.velero.io/container: mysql
        pre.hook.backup.velero.io/command: '["/bin/sh", "-c", "mysql -u root -p$MYSQL_ROOT_PASSWORD -e \"FLUSH TABLES WITH READ LOCK; SYSTEM sleep 5;\""]'
        pre.hook.backup.velero.io/timeout: 30s
        post.hook.backup.velero.io/container: mysql
        post.hook.backup.velero.io/command: '["/bin/sh", "-c", "mysql -u root -p$MYSQL_ROOT_PASSWORD -e \"UNLOCK TABLES;\""]'
    spec:
      # ... pod spec (mysql container, volumes)
```

Disaster recovery (DR) for Kubernetes storage requires planning for failure scenarios ranging from individual node failures to complete region outages. DR strategies are characterized by two key metrics: RPO (Recovery Point Objective), the maximum acceptable data loss measured backward in time from the failure, and RTO (Recovery Time Objective), the maximum acceptable time to restore service.
DR strategies by RPO/RTO:
| Strategy | RPO | RTO | Cost | Complexity |
|---|---|---|---|---|
| Backup & Restore | Hours | Hours | Low | Low |
| Pilot Light | Minutes-Hours | Minutes | Medium | Medium |
| Warm Standby | Minutes | Minutes | High | High |
| Active-Active | Zero/Near-zero | Seconds | Very High | Very High |
Pattern 1: Backup and Restore (Cold DR)
The simplest approach: keep backups in the DR region and restore on demand when disaster strikes.
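A sketch of the cold-DR loop, assuming Velero is configured with the cross-region `dr-region` backup location from the Velero example above; the schedule name and TTL are illustrative:

```yaml
# Daily backup written directly to the DR region's storage location
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-dr-backup
  namespace: velero
spec:
  schedule: "0 3 * * *"  # Daily at 03:00 - RPO is up to 24h
  template:
    includedNamespaces:
      - production
    snapshotVolumes: true
    storageLocation: dr-region  # Cross-region BackupStorageLocation
    volumeSnapshotLocations:
      - dr-snapshots
    ttl: 720h
---
# On disaster: build a fresh cluster in the DR region, install Velero
# pointed at the same bucket, then restore:
#   velero restore create --from-backup daily-dr-backup-<timestamp>
# RTO is dominated by cluster build + restore time (hours, not minutes)
```

The low cost of this pattern comes directly from accepting the long RTO: nothing runs in the DR region until it is needed.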
Pattern 2: Pilot Light (Warm Infrastructure)
Core infrastructure running in DR region, scaled down, ready to activate:
```yaml
# DR region: minimal cluster with critical data sync
# Primary region: us-east-1, DR region: us-west-2

# StatefulSet in DR region - scaled to 0, ready to activate
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
  namespace: production
  annotations:
    dr.example.com/role: standby
spec:
  replicas: 0  # Scaled down - no pods running
  serviceName: database
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
        - name: postgres
          image: postgres:15
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
---
# Data sync - replicate snapshots to the DR region
# AWS example: cross-region EBS snapshot copy
apiVersion: batch/v1
kind: CronJob
metadata:
  name: snapshot-replicator
  namespace: velero
spec:
  schedule: "*/15 * * * *"  # Every 15 minutes
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: replicate
              image: amazon/aws-cli:latest
              env:
                - name: SOURCE_REGION
                  value: "us-east-1"
                - name: DEST_REGION
                  value: "us-west-2"
              command:
                - /bin/sh
                - -c
                - |
                  # Get the latest snapshots for production PVC volumes
                  SNAPSHOTS=$(aws ec2 describe-snapshots \
                    --region $SOURCE_REGION \
                    --filters "Name=tag:kubernetes.io/created-for/pvc/namespace,Values=production" \
                    --query 'Snapshots[*].SnapshotId' --output text)
                  for SNAP in $SNAPSHOTS; do
                    echo "Copying snapshot $SNAP to $DEST_REGION"
                    aws ec2 copy-snapshot \
                      --source-region $SOURCE_REGION \
                      --source-snapshot-id $SNAP \
                      --destination-region $DEST_REGION \
                      --description "DR copy of $SNAP"
                  done
          restartPolicy: OnFailure
---
# DR activation runbook (conceptual - trigger via CI/CD or manually)
# 1. Scale up the StatefulSet: kubectl scale sts database --replicas=3
# 2. Update DNS to the DR region
# 3. Scale up application deployments
# 4. Verify connectivity and data integrity
# 5. Monitor and validate
```

Pattern 3: Active-Active (Multi-Region)
Both regions actively serving traffic with synchronized data:
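Storage-level replication alone rarely delivers active-active; the synchronization usually lives in the database itself. A conceptual sketch using a database with built-in multi-region consensus replication (CockroachDB here is illustrative, as are the image tag and the `db-west.example.com` join address; any consensus-replicated store works), with each region running its own StatefulSet on region-local disks:

```yaml
# Region A cluster (us-east-1) - an identical StatefulSet runs in region B
# Cross-region sync is handled by the database's own consensus replication,
# not by the storage layer; each region uses fast regional volumes
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cockroachdb
  namespace: production
spec:
  serviceName: cockroachdb
  replicas: 3
  selector:
    matchLabels:
      app: cockroachdb
  template:
    metadata:
      labels:
        app: cockroachdb
    spec:
      containers:
        - name: cockroachdb
          image: cockroachdb/cockroach:v23.1.11  # Illustrative version
          command:
            - /cockroach/cockroach
            - start
            - --insecure                   # Demo only - use certificates in production
            - --locality=region=us-east-1  # Region B sets region=us-west-2
            # Join addresses span both regions (requires cross-region networking)
            - --join=cockroachdb-0.cockroachdb.production.svc.cluster.local,db-west.example.com
          volumeMounts:
            - name: data
              mountPath: /cockroach/cockroach-data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: fast-ssd
        resources:
          requests:
            storage: 100Gi
```

The storage layer stays simple (plain regional PVCs); the complexity and cost move into cross-region networking, latency-aware data placement, and global load balancing.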
Calculate the cost of downtime ($/hour) and of data loss ($/MB lost), then compare against DR infrastructure costs to find the right balance. Most applications don't need active-active; pilot light with good automation often achieves an acceptable RTO at much lower cost.
Data migration in Kubernetes is required for various scenarios: moving between storage classes, migrating to different cloud providers, resizing volumes, or consolidating data. Each scenario requires different approaches.
Migration scenarios:
| Scenario | Challenge | Recommended Approach |
|---|---|---|
| Storage class change | PVC storageClassName is immutable | Clone to new PVC or restore from snapshot |
| Volume resize (shrink) | Cannot shrink PVCs | Create smaller PVC, copy data, swap |
| Cross-namespace migration | PVC is namespace-scoped | Clone via snapshot, create in new namespace |
| Cross-cluster migration | No direct PVC transfer | Snapshot copy or application-level backup/restore |
| Cloud provider migration | Different storage backends | Application-level backup, object storage transfer |
Pattern 1: Clone via Volume Snapshot
The cleanest approach when changing storage class or migrating within the same cluster/cloud:
```yaml
# Step 1: Create a snapshot of the source PVC
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: migration-snapshot
  namespace: production
spec:
  volumeSnapshotClassName: ebs-snapshot-class
  source:
    persistentVolumeClaimName: old-database-data
---
# Step 2: Create a new PVC from the snapshot with a different storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: new-database-data
  namespace: production
spec:
  storageClassName: fast-ssd  # New storage class
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi  # Can be larger than the original
  dataSource:
    name: migration-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
---
# Step 3: Update the workload to use the new PVC
# Option A: Direct switch (downtime)
# Option B: Blue-green deployment with the new PVC

# Step 4: After the migration is verified, delete the old PVC
# kubectl delete pvc old-database-data
```

Pattern 2: rsync-based migration
For cross-cluster or cross-cloud migrations where snapshots aren't viable:
```yaml
# Job to sync data between PVCs using rsync
# Use case: cross-cluster migration, fine-grained control
apiVersion: batch/v1
kind: Job
metadata:
  name: data-migration
  namespace: production
spec:
  template:
    spec:
      containers:
        - name: rsync
          image: instrumentisto/rsync-ssh:latest
          command:
            - /bin/sh
            - -c
            - |
              echo "Starting rsync migration..."
              # Option 1: Local rsync between mounted volumes
              rsync -avz --progress /source/ /destination/
              # Option 2: Remote rsync (if migrating to a different cluster)
              # rsync -avz --progress /source/ user@remote-host:/destination/
              echo "Migration complete. Verifying..."
              diff -r /source /destination && echo "Verified OK" || echo "MISMATCH DETECTED"
          volumeMounts:
            - name: source
              mountPath: /source
              readOnly: true
            - name: destination
              mountPath: /destination
      restartPolicy: OnFailure
      volumes:
        - name: source
          persistentVolumeClaim:
            claimName: old-pvc
        - name: destination
          persistentVolumeClaim:
            claimName: new-pvc
---
# For live migration with minimal downtime, rsync in multiple passes:
# 1. Initial sync while the application runs (can take hours for large data)
# 2. Quiesce the application (stop writes)
# 3. Final incremental sync (seconds to minutes)
# 4. Switch the application to the new PVC
# 5. Resume the application
```

Most migration patterns require application downtime during the final cutover, so plan migrations during maintenance windows. For zero downtime, use database-native replication to synchronize data, then fail over at the database level rather than the storage level.
Not all data is equal—hot data needs fast storage, while cold data can tolerate slower, cheaper options. Storage tiering optimizes costs while maintaining performance where it matters.
Tiering strategies:
```yaml
# Tiered storage class hierarchy
# Hot tier: io2 for production databases
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: hot-tier
  labels:
    tier: hot
    cost-per-gb: high
provisioner: ebs.csi.aws.com
parameters:
  type: io2
  iops: "16000"
  encrypted: "true"
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
---
# Warm tier: gp3 for general workloads
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: warm-tier
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  labels:
    tier: warm
    cost-per-gb: medium
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# Cold tier: st1 for archives and logs
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cold-tier
  labels:
    tier: cold
    cost-per-gb: low
provisioner: ebs.csi.aws.com
parameters:
  type: st1
  encrypted: "true"
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
# Example: multi-tier application deployment
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: log-processor
spec:
  # ... standard spec
  volumeClaimTemplates:
    # Hot data: active processing
    - metadata:
        name: hot-data
      spec:
        storageClassName: hot-tier
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi
    # Cold data: processed logs
    - metadata:
        name: cold-data
      spec:
        storageClassName: cold-tier
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 500Gi
---
# Resource quota per tier (prevent budget overruns)
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-tier-quotas
  namespace: production
spec:
  hard:
    # Limit hot tier to 1TB per namespace
    hot-tier.storageclass.storage.k8s.io/requests.storage: 1Ti
    # Limit warm tier to 5TB
    warm-tier.storageclass.storage.k8s.io/requests.storage: 5Ti
    # Cold tier is more flexible
    cold-tier.storageclass.storage.k8s.io/requests.storage: 20Ti
```

Some storage systems (NetApp, Portworx, Robin) offer automated data tiering, moving data between tiers based on access patterns. For cloud-native setups, consider using application logic to archive old data to object storage (S3, GCS) and keep only hot data on PVCs.
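The archive-to-object-storage approach can be as simple as a CronJob that moves files older than a threshold off the hot PVC. A sketch assuming an AWS CLI image with credentials available (e.g. via IRSA); the PVC name `hot-data`, the bucket name, and the 30-day threshold are illustrative:

```yaml
# Move files untouched for 30+ days from the hot-tier PVC to S3,
# then delete the local copy
apiVersion: batch/v1
kind: CronJob
metadata:
  name: archive-cold-data
  namespace: production
spec:
  schedule: "0 1 * * 0"  # Weekly, Sunday 01:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: archiver
              image: amazon/aws-cli:latest
              command:
                - /bin/sh
                - -c
                - |
                  # Find old files and ship them to S3, preserving paths
                  find /data -type f -mtime +30 | while read -r f; do
                    aws s3 cp "$f" "s3://archive-bucket/cold${f#/data}" && rm -f "$f"
                  done
              volumeMounts:
                - name: hot-data
                  mountPath: /data
          restartPolicy: OnFailure
          volumes:
            - name: hot-data
              persistentVolumeClaim:
                claimName: hot-data  # Illustrative PVC name
```

Pair this with an S3 lifecycle policy (e.g. transition to Glacier) so the archived objects themselves tier down over time.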
Local volumes use storage directly attached to nodes, bypassing network storage overhead. They offer the highest performance but sacrifice the flexibility of network-attached storage.
Local volume characteristics:

- Highest throughput and lowest latency: no network hop between pod and disk
- Pods are pinned to the node holding the volume via PV nodeAffinity
- No built-in replication: a failed node or disk means that replica's data is gone
- Static provisioning by default; an external provisioner can automate disk discovery
- Volumes cannot follow a pod to another node, so rescheduling waits on the original node
Local Persistent Volume configuration:
```yaml
# Local volume StorageClass (no provisioner - static only)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme
provisioner: kubernetes.io/no-provisioner  # No dynamic provisioning
volumeBindingMode: WaitForFirstConsumer    # Essential for local volumes
---
# Local PersistentVolume for an NVMe disk
# Must be created manually per disk/node
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-nvme-node1-disk1
  labels:
    node: node-1
    disk: nvme0n1
spec:
  capacity:
    storage: 1Ti
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-nvme
  local:
    path: /mnt/nvme-disk1  # Pre-formatted and mounted
  # Critical: nodeAffinity ties the PV to a specific node
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1
---
# PVC for local storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-local-storage
  namespace: production
spec:
  storageClassName: local-nvme
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Gi
---
# StatefulSet with local storage and pod anti-affinity
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: high-perf-database
spec:
  serviceName: database
  replicas: 3
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      # Schedule to nodes labeled as having local storage
      nodeSelector:
        storage.kubernetes.io/local-nvme: "true"
      # Spread across nodes (each node has its own local storage)
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: database
              topologyKey: kubernetes.io/hostname
      containers:
        - name: database
          image: scylladb/scylla:5.2
          volumeMounts:
            - name: data
              mountPath: /var/lib/scylla
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        storageClassName: local-nvme
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Ti
```

Local volume provisioner:
The sig-storage-local-static-provisioner can automate local PV creation by discovering and managing local disks:
```yaml
# Local volume provisioner ConfigMap
# Discovers disks in specified directories and creates PVs automatically
apiVersion: v1
kind: ConfigMap
metadata:
  name: local-provisioner-config
  namespace: kube-system
data:
  storageClassMap: |
    local-nvme:
      hostDir: /mnt/disks
      mountDir: /mnt/disks
      volumeMode: Filesystem
      fsType: ext4
      blockCleanerCommand:
        - "/scripts/shred.sh"
        - "2"
# The provisioner discovers disks in /mnt/disks/ and creates PVs
# Symlink format: /mnt/disks/<uniquename> -> /dev/nvme0n1p1
```

Use local volumes for latency-sensitive databases (ScyllaDB, Cassandra, Redis), applications with built-in replication that can handle node failures, caching layers where data is reconstructible, and analytics workloads that need local shuffle space. Always pair local volumes with application-level replication for HA.
Not all data needs persistence. Ephemeral storage patterns optimize for temporary data that can be regenerated or is only needed during pod lifetime.
Ephemeral volume types:
| Type | Backing | Persistence | Sharing | Use Case |
|---|---|---|---|---|
| emptyDir | Memory or disk | Pod lifetime | Same-pod containers | Scratch space, temp files |
| emptyDir (memory) | tmpfs (RAM) | Pod lifetime | Same-pod containers | In-memory cache, secrets |
| configMap | etcd → memory | Pod lifetime | Read-only | Configuration files |
| secret | etcd → memory (tmpfs) | Pod lifetime | Read-only | Credentials, certificates |
| Generic Ephemeral | StorageClass provisioned | Pod lifetime | Same-pod containers | Per-pod scratch with specific characteristics |
```yaml
# Pattern 1: emptyDir for scratch space
apiVersion: v1
kind: Pod
metadata:
  name: data-processor
spec:
  containers:
    - name: processor
      image: processor:v1
      volumeMounts:
        # Disk-backed scratch space
        - name: scratch
          mountPath: /scratch
        # Memory-backed for fastest access
        - name: cache
          mountPath: /cache
  volumes:
    # Standard emptyDir (uses node disk)
    - name: scratch
      emptyDir:
        sizeLimit: 5Gi  # Limit to prevent node resource exhaustion
    # Memory-backed emptyDir (tmpfs)
    - name: cache
      emptyDir:
        medium: Memory
        sizeLimit: 512Mi  # Counts against the container memory limit
---
# Pattern 2: Sidecar communication via emptyDir
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  containers:
    # Main application
    - name: app
      image: myapp:v1
      volumeMounts:
        - name: shared
          mountPath: /shared
    # Log shipper sidecar
    - name: log-shipper
      image: fluent/fluent-bit:latest
      volumeMounts:
        - name: shared
          mountPath: /logs
          readOnly: true
  volumes:
    - name: shared
      emptyDir: {}
---
# Pattern 3: Generic ephemeral volume (CSI-provisioned per-pod storage)
# Useful when you need StorageClass features but no persistence
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  containers:
    - name: trainer
      image: tensorflow/tensorflow:latest-gpu
      volumeMounts:
        - name: training-scratch
          mountPath: /data/scratch
  volumes:
    - name: training-scratch
      ephemeral:
        volumeClaimTemplate:
          metadata:
            labels:
              type: ephemeral-scratch
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: fast-ssd
            resources:
              requests:
                storage: 500Gi
  # The PVC is auto-created with the pod, auto-deleted when the pod terminates
---
# Pattern 4: Resource limits for ephemeral storage
apiVersion: v1
kind: Pod
metadata:
  name: bounded-ephemeral
spec:
  containers:
    - name: app
      image: myapp:v1
      volumeMounts:
        - name: scratch
          mountPath: /scratch
      resources:
        requests:
          # Ephemeral-storage requests cover emptyDir + the container writable layer
          ephemeral-storage: 2Gi
        limits:
          ephemeral-storage: 5Gi  # Pod is evicted if it exceeds this
  volumes:
    - name: scratch
      emptyDir: {}
```

Kubernetes can evict pods that exceed ephemeral storage limits or when nodes come under disk pressure. Always set ephemeral-storage resource limits and monitor the kubelet's imagefs and nodefs pressure signals.
Proactive monitoring prevents storage-related outages. Key metrics, alerts, and dashboards are essential for production storage operations.
Critical storage metrics:
- `kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes`: alert at 80%, critical at 90%
- `kubelet_volume_stats_inodes_used / kubelet_volume_stats_inodes`: alert at 80%; many small files can exhaust inodes before capacity
- `kube_persistentvolumeclaim_status_phase`: alert on Pending > 5 minutes, on Lost immediately
- `kube_persistentvolume_status_phase`: alert on Failed, and on Released without an action plan

```yaml
# Prometheus alerting rules for Kubernetes storage
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: storage-alerts
  namespace: monitoring
spec:
  groups:
    - name: kubernetes-storage
      rules:
        # Capacity alerts
        - alert: VolumeAlmostFull
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.80
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Volume {{ $labels.persistentvolumeclaim }} is over 80% full"
            description: "Volume in namespace {{ $labels.namespace }} is {{ humanizePercentage $value }} full"
            runbook_url: "https://runbooks.example.com/volume-almost-full"
        - alert: VolumeCriticallyFull
          expr: |
            (kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.90
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "CRITICAL: Volume {{ $labels.persistentvolumeclaim }} is over 90% full"
        # Inode alerts
        - alert: VolumeInodesExhausting
          expr: |
            (kubelet_volume_stats_inodes_used / kubelet_volume_stats_inodes) > 0.80
          for: 10m
          labels:
            severity: warning
          annotations:
            summary: "Volume {{ $labels.persistentvolumeclaim }} inodes over 80%"
        # PVC issues
        - alert: PVCPendingTooLong
          expr: |
            kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} stuck in Pending"
            description: "Check StorageClass provisioner, quotas, and cloud API limits"
        - alert: PVCLost
          expr: |
            kube_persistentvolumeclaim_status_phase{phase="Lost"} == 1
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "CRITICAL: PVC {{ $labels.persistentvolumeclaim }} is LOST"
            description: "Underlying PV is no longer available. Data may be lost."
        # PV issues
        - alert: PVFailed
          expr: |
            kube_persistentvolume_status_phase{phase="Failed"} == 1
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "CRITICAL: PV {{ $labels.persistentvolume }} is FAILED"
        # Orphaned PVCs (bound but not used by any pod)
        - alert: OrphanedPVC
          expr: |
            kube_persistentvolumeclaim_status_phase{phase="Bound"} == 1
            unless on(namespace, persistentvolumeclaim)
            kube_pod_spec_volumes_persistentvolumeclaims_info
          for: 24h
          labels:
            severity: info
          annotations:
            summary: "PVC {{ $labels.persistentvolumeclaim }} is bound but unused for 24h"
            description: "Consider cleanup if no longer needed to save costs"
```

Data persistence patterns transform basic Kubernetes storage into production-grade infrastructure. These patterns address the full lifecycle of data: protection, recovery, migration, optimization, and operational visibility.
Module complete:
You have now completed the Kubernetes Storage module. You understand the full stack, from Persistent Volumes through Storage Classes, StatefulSet storage patterns, cloud provider integration, and production data persistence patterns. With this foundation you can design, deploy, and operate storage architectures that meet the performance, availability, and cost requirements of enterprise workloads.