While Deployments excel at managing stateless applications, they lack the guarantees required by databases, message queues, and other stateful systems. These applications need stable network identities that survive rescheduling, dedicated persistent storage that follows each replica, and ordered, graceful deployment and scaling.
StatefulSets provide these guarantees by combining stable pod identities with persistent volume management. Unlike Deployments where pods are interchangeable, StatefulSet pods have predictable names (pod-0, pod-1, pod-2) and each gets a dedicated PVC that follows it through restarts and reschedules.
This page explores how StatefulSets manage storage—the volumeClaimTemplates mechanism, storage identity guarantees, scaling behaviors, and patterns for running production databases on Kubernetes.
By the end of this page, you will understand volumeClaimTemplates, stable storage identity, the relationship between pod and PVC lifecycles, scaling behaviors for storage, PVC retention policies introduced in Kubernetes 1.27+, and production patterns for stateful applications.
StatefulSets use a fundamentally different storage model than Deployments. Understanding this model is crucial for designing reliable stateful applications.
The core difference:
Deployments: All pods share PVCs defined in the pod spec. Pods are fungible—any can attach to any available PVC (if using RWX) or they compete for RWO volumes.
StatefulSets: Each pod gets its own PVC, automatically created from a template. The PVC name embeds the pod name, and therefore its ordinal, creating a permanent bond between a specific pod identity and a specific volume.
The identity binding:
| Component | Pattern | Example | Persists Across |
|---|---|---|---|
| Pod Name | <statefulset>-<ordinal> | mysql-0, mysql-1, mysql-2 | Reschedules, restarts |
| PVC Name | <volumeClaimTemplate>-<statefulset>-<ordinal> | data-mysql-0, data-mysql-1 | Pod deletion, scale down |
| DNS Name | <pod>.<service>.<namespace>.svc.cluster.local | mysql-0.mysql-headless.db.svc.cluster.local | Reschedules, restarts |
| Ordinal Index | 0, 1, 2, ... (0-indexed) | 0 is primary, 1+ are replicas | Pod lifetime |
The storage guarantee:
When mysql-0 is deleted and recreated (due to node failure, manual deletion, or rolling update), the replacement pod is bound to the same claim, data-mysql-0, and starts with exactly the data its predecessor wrote. This guarantee is what makes StatefulSets suitable for databases: the primary (mysql-0) always gets the primary's data, regardless of which physical node runs it.
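You can observe this binding directly. The following sketch (assuming a StatefulSet named mysql with a volumeClaimTemplate named data) shows that a recreated pod re-attaches the same claim rather than getting a new one:

```bash
# Show which PVC the pod mounts - the claim name embeds the pod's ordinal
kubectl get pod mysql-0 \
  -o jsonpath='{.spec.volumes[?(@.name=="data")].persistentVolumeClaim.claimName}'
# data-mysql-0

# Delete the pod; the StatefulSet controller recreates it with the same name
kubectl delete pod mysql-0
kubectl wait --for=condition=Ready pod/mysql-0 --timeout=5m

# The PVC's AGE predates the new pod - it was never recreated, only re-attached
kubectl get pvc data-mysql-0
```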
By default, StatefulSet PVCs are NOT deleted when pods are deleted or the StatefulSet is scaled down. This is intentional—data preservation is the priority. Manual cleanup or PVC retention policies (Kubernetes 1.27+) are required to remove PVCs.
The volumeClaimTemplates field is the mechanism by which StatefulSets create PVCs. It's a list of PVC specifications that act as templates—for each pod, Kubernetes creates one PVC per template.
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgresql
  namespace: database
spec:
  serviceName: postgresql-headless  # Required: headless service for DNS
  replicas: 3
  selector:
    matchLabels:
      app: postgresql
  # Pod template - standard pod spec
  template:
    metadata:
      labels:
        app: postgresql
    spec:
      containers:
      - name: postgresql
        image: postgres:15
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        # Mount the data volume (from volumeClaimTemplate)
        - name: data
          mountPath: /var/lib/postgresql/data
        # Mount WAL volume for separate WAL storage
        - name: wal
          mountPath: /var/lib/postgresql/wal
        resources:
          requests:
            memory: "1Gi"
            cpu: "500m"
          limits:
            memory: "4Gi"
            cpu: "2"
      # Init container for permissions
      initContainers:
      - name: init-permissions
        image: busybox
        command: ['sh', '-c', 'chown -R 999:999 /var/lib/postgresql/data /var/lib/postgresql/wal']
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
        - name: wal
          mountPath: /var/lib/postgresql/wal
  # VolumeClaimTemplates - PVC templates for each pod
  volumeClaimTemplates:
  # Primary data volume - high-performance SSD
  - metadata:
      name: data
      labels:
        app: postgresql
        component: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 100Gi
  # WAL volume - separate disk for write-ahead logs
  # Improves performance by isolating sequential WAL writes
  - metadata:
      name: wal
      labels:
        app: postgresql
        component: wal
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 20Gi
---
# Headless service for stable DNS
apiVersion: v1
kind: Service
metadata:
  name: postgresql-headless
  namespace: database
spec:
  clusterIP: None  # Headless - no load balancing
  selector:
    app: postgresql
  ports:
  - port: 5432
    targetPort: 5432
    name: postgres
```

PVC creation mechanics:
When the above StatefulSet is created with 3 replicas, Kubernetes creates six PVCs: data-postgresql-0, data-postgresql-1, and data-postgresql-2 from the first template, plus wal-postgresql-0, wal-postgresql-1, and wal-postgresql-2 from the second.
Each pod gets exactly one PVC per template, named <template-name>-<statefulset-name>-<ordinal>.
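For example, with the data and wal templates above and 3 replicas, listing claims in the namespace shows two PVCs per ordinal (output is illustrative):

```bash
kubectl get pvc -n database -l app=postgresql
# NAME                STATUS   CAPACITY   ACCESS MODES   STORAGECLASS
# data-postgresql-0   Bound    100Gi      RWO            fast-ssd
# data-postgresql-1   Bound    100Gi      RWO            fast-ssd
# data-postgresql-2   Bound    100Gi      RWO            fast-ssd
# wal-postgresql-0    Bound    20Gi       RWO            fast-ssd
# wal-postgresql-1    Bound    20Gi       RWO            fast-ssd
# wal-postgresql-2    Bound    20Gi       RWO            fast-ssd
```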
VolumeClaimTemplates cannot be modified after StatefulSet creation. Changing storage size or class requires creating a new StatefulSet or using manual PVC resizing. Plan storage requirements carefully before deployment.
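As a sketch of the manual resizing path mentioned above (assuming the StorageClass sets allowVolumeExpansion: true), you can grow the existing PVCs directly and, if future pods should also get the larger size, recreate the StatefulSet object without touching its pods:

```bash
# Grow each existing claim - requires allowVolumeExpansion: true on the StorageClass
for i in 0 1 2; do
  kubectl -n database patch pvc data-postgresql-$i \
    --type merge -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
done

# Some CSI drivers finish the filesystem resize only after a pod restart;
# check for the FileSystemResizePending condition
kubectl -n database describe pvc data-postgresql-0

# To change the template itself, delete the StatefulSet object but keep its pods,
# then re-apply a manifest with the updated volumeClaimTemplates
kubectl -n database delete statefulset postgresql --cascade=orphan
kubectl apply -f postgresql-statefulset.yaml   # hypothetical manifest with the 200Gi template
```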
StatefulSets provide ordered, graceful deployment that's critical for distributed systems where startup order matters (e.g., database clusters where primaries must initialize before replicas).
Default ordered behavior (OrderedReady): pods are created strictly in ordinal order (0, 1, 2, ...). Each pod's PVCs must bind and the pod must pass its readiness probe before the next pod is created; termination and scale-down proceed in reverse order.
The startup sequence:
```
# StatefulSet with 3 replicas - startup sequence

Time   Action                                  Status
────────────────────────────────────────────────────────────────────
T+0    Create StatefulSet
T+1    Create PVC data-app-0                   Pending (waiting for binding)
T+2    PVC data-app-0 bound                    Bound
T+3    Create Pod app-0                        Pending (waiting for scheduling)
T+4    Pod app-0 scheduled, containers starting
T+5    Pod app-0 Running                       Running (not yet Ready)
T+6    Pod app-0 passes readiness probe        Running, Ready ✓
       ──── pod-0 Ready, proceed to pod-1 ────
T+7    Create PVC data-app-1                   Pending → Bound
T+8    Create Pod app-1                        Pending → Running
T+9    Pod app-1 Ready                         Running, Ready ✓
       ──── pod-1 Ready, proceed to pod-2 ────
T+10   Create PVC data-app-2                   Pending → Bound
T+11   Create Pod app-2                        Pending → Running
T+12   Pod app-2 Ready                         Running, Ready ✓
       ──── All replicas ready ────
```

Parallel pod management:
For workloads that don't require strict ordering (e.g., sharded databases where each shard is independent), you can use parallel pod management:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sharded-cache
spec:
  serviceName: sharded-cache
  replicas: 10
  # Parallel pod management - all pods start simultaneously
  podManagementPolicy: Parallel
  selector:
    matchLabels:
      app: sharded-cache
  template:
    metadata:
      labels:
        app: sharded-cache
    spec:
      containers:
      - name: cache
        image: redis:7
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 10Gi
```

| Policy | Creation Order | Update Order | Use Case |
|---|---|---|---|
| OrderedReady | Sequential (0→N) | Reverse (N→0) | Primary-replica databases, consensus systems |
| Parallel | Simultaneous | Reverse (N→0); the policy does not affect updates | Sharded systems, stateless-like workloads with stable IDs |
Understanding how StatefulSets handle scaling is critical for capacity planning and cost management. The behavior differs significantly from Deployments.
Scale up behavior: new pods are created at the next ordinals, each with freshly provisioned PVCs from the volumeClaimTemplates (one at a time under OrderedReady).

Scale down behavior (critical to understand): pods are removed in reverse ordinal order, but by default their PVCs are left behind, still bound and still consuming storage, as the example below shows.
```bash
# Scale up from 3 to 5 replicas
kubectl scale statefulset mysql --replicas=5

# Result:
# Pods: mysql-0, mysql-1, mysql-2, mysql-3, mysql-4
# PVCs: data-mysql-0, data-mysql-1, data-mysql-2, data-mysql-3, data-mysql-4

# Scale down from 5 to 2 replicas
kubectl scale statefulset mysql --replicas=2

# Result:
# Pods: mysql-0, mysql-1 (mysql-2,3,4 deleted)
# PVCs: data-mysql-0, data-mysql-1, data-mysql-2, data-mysql-3, data-mysql-4
#       ↑ ALL PVCs still exist!

# The orphaned PVCs still consume storage
kubectl get pvc -l app=mysql
# NAME           STATUS   VOLUME   CAPACITY   ACCESS MODES
# data-mysql-0   Bound    pv-xxx   100Gi      RWO
# data-mysql-1   Bound    pv-yyy   100Gi      RWO
# data-mysql-2   Bound    pv-zzz   100Gi      RWO   ← Orphaned!
# data-mysql-3   Bound    pv-aaa   100Gi      RWO   ← Orphaned!
# data-mysql-4   Bound    pv-bbb   100Gi      RWO   ← Orphaned!

# Manual cleanup when data is no longer needed
kubectl delete pvc data-mysql-2 data-mysql-3 data-mysql-4
```

Orphaned PVCs from scale-down operations continue incurring cloud storage costs. Implement monitoring for orphaned PVCs and regular cleanup procedures. Some organizations use controllers to automatically notify on or delete stale PVCs.
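One way to detect orphans is to compare each claim's ordinal against the StatefulSet's current replica count. A minimal sketch, assuming the default <template>-<statefulset>-<ordinal> naming (the sts, ns, and template values are placeholders to adjust):

```bash
#!/usr/bin/env bash
# Flag PVCs whose ordinal is >= the StatefulSet's current replica count.
sts=mysql        # StatefulSet name
ns=default       # namespace
template=data    # volumeClaimTemplate name

replicas=$(kubectl -n "$ns" get statefulset "$sts" -o jsonpath='{.spec.replicas}')

kubectl -n "$ns" get pvc -o name |
  grep "^persistentvolumeclaim/${template}-${sts}-" |
  while read -r pvc; do
    ordinal=${pvc##*-}                    # trailing ordinal from the claim name
    if [ "$ordinal" -ge "$replicas" ]; then
      echo "orphaned: $pvc"
    fi
  done
```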
Kubernetes 1.27 enabled StatefulSet PVC auto-deletion by default as a beta feature (it graduated to stable in Kubernetes 1.32), providing automated control over the PVC lifecycle relative to pods and the StatefulSet.
The persistentVolumeClaimRetentionPolicy:
This field controls when PVCs are automatically deleted:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: disposable-workers
spec:
  serviceName: workers
  replicas: 5
  # PVC retention policy - controls automatic PVC deletion
  persistentVolumeClaimRetentionPolicy:
    # What happens when the StatefulSet is deleted
    whenDeleted: Delete   # Options: Retain (default), Delete
    # What happens when the pod is scaled down
    whenScaled: Delete    # Options: Retain (default), Delete
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
      - name: worker
        image: worker:v1
        volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumeClaimTemplates:
  - metadata:
      name: scratch
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 50Gi
---
# Production database - preserve data on scale down and delete
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: production-db
spec:
  serviceName: production-db
  replicas: 3
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # Keep PVCs even if StatefulSet deleted
    whenScaled: Retain    # Keep PVCs even if pods scaled down
  # ... rest of spec
```

| Scenario | whenScaled | whenDeleted | Result |
|---|---|---|---|
| Preserve all data (default) | Retain | Retain | PVCs never auto-deleted, manual cleanup required |
| Clean up on scale down | Delete | Retain | Scaling down deletes PVCs; StatefulSet deletion preserves them |
| Clean up on deletion | Retain | Delete | PVCs kept during scaling, deleted with StatefulSet |
| Ephemeral storage | Delete | Delete | PVCs always auto-deleted when pods go away |
Use Delete policies for: workers with scratch storage, CI/CD runners with build caches, and ML training jobs whose checkpoints become irrelevant. Always use Retain for production databases, message queue state, and any data that needs backup before deletion.
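Unlike volumeClaimTemplates, the retention policy can be changed on an existing StatefulSet. A sketch, assuming a cluster where the feature is available (on by default since 1.27):

```bash
# Delete PVCs when the StatefulSet itself is deleted, but keep them on scale-down
kubectl patch statefulset disposable-workers --type merge \
  -p '{"spec":{"persistentVolumeClaimRetentionPolicy":{"whenDeleted":"Delete","whenScaled":"Retain"}}}'

# Verify the policy took effect
kubectl get statefulset disposable-workers \
  -o jsonpath='{.spec.persistentVolumeClaimRetentionPolicy}'
```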
The stable storage identity provided by StatefulSets enables powerful recovery patterns. Understanding these patterns is essential for designing highly available stateful systems.
Pod failure recovery:
When a StatefulSet pod fails (node crash, OOM kill, etc.), the controller creates a replacement pod with the same name and ordinal, possibly on a different node. The replacement re-attaches the same PVCs and starts with the existing data.
For block storage (EBS, GCE PD), this may require waiting for the volume to detach from the failed node—a process that can take several minutes if the node is unresponsive.
```yaml
# Scenario 1: Pod crash recovery
# Pod mysql-0 crashes on node-1
# Timeline:
# T+0: mysql-0 crashes (OOM, application error)
# T+1: kubelet detects, reports to API server
# T+2: StatefulSet controller creates replacement mysql-0
# T+3: Scheduler selects node-2 (node-1 may be viable again)
# T+5: PVC data-mysql-0 attached to mysql-0 on node-2
# T+6: mysql-0 starts with existing data
#
# Total recovery: ~10-30 seconds for healthy nodes

# Scenario 2: Node failure recovery
# node-1 fails (hardware, network partition)
# Timeline:
# T+0: node-1 becomes unresponsive
# T+5min: kubelet heartbeat fails, node marked NotReady
# T+10min: pod.spec.tolerations node.kubernetes.io/not-ready:NoExecute expires
# T+10min: Pod evicted from failed node
# T+10min: StatefulSet creates replacement pod
# T+12min: Volume detach timeout, force detach issued
# T+13min: PVC attached to new pod
# T+14min: Pod running with data
#
# Total recovery: ~10-15 minutes worst case

# Decrease recovery time with pod disruption budget and
# volume attachment tuning:
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  minAvailable: 2  # Maintain quorum during disruptions
  selector:
    matchLabels:
      app: mysql
---
# Force detach timeout (CSI driver configuration example)
# Decrease time before force-detaching from failed nodes
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: ebs.csi.aws.com
spec:
  attachRequired: true
  podInfoOnMount: false
  # volumeLifecycleModes: ["Persistent", "Ephemeral"]
```

Manual recovery patterns:
Sometimes automated recovery isn't enough. Manual intervention patterns include:
- Force delete: `kubectl delete pod mysql-0 --force --grace-period=0` when pods won't terminate

Force deleting pods bypasses graceful shutdown. For databases, this can cause data corruption. Only use force delete when you're certain the original pod is unreachable and cannot perform writes. Combine with application-level checks (verify the primary is truly dead before promoting a replica).
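One way to gain that certainty before force-deleting is to confirm the node is unreachable and inspect its volume attachments; a sketch (the node name is a placeholder):

```bash
# Find the node that was running the stuck pod
kubectl get pod mysql-0 -o wide

# Confirm the node is genuinely gone (expect STATUS NotReady / unreachable)
kubectl get node <node-name>
kubectl describe node <node-name> | grep -A5 Conditions

# See whether the pod's volume is still attached to the dead node
kubectl get volumeattachments | grep <node-name>

# Only then force delete; the StatefulSet controller recreates mysql-0
kubectl delete pod mysql-0 --force --grace-period=0
```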
Production stateful applications often benefit from multiple volumes with different characteristics. VolumeClaimTemplates support this pattern natively.
Common multi-volume patterns:
| Pattern | Volumes | Rationale |
|---|---|---|
| Data + WAL | Primary data, write-ahead logs | Isolate sequential WAL writes from random data I/O |
| Data + Logs | Application data, application logs | Different retention, separate backup strategies |
| Hot + Cold | Fast SSD, cheap HDD | Tiered storage within same application |
| Data + Config | Persistent data, configuration files | Different update patterns, security considerations |
| Data + Temp | Persistent data, ephemeral scratch | Scratch space doesn't need persistence |
```yaml
# Elasticsearch with optimized multi-volume layout
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: elasticsearch:8.10.0
        env:
        - name: path.data
          value: /usr/share/elasticsearch/data
        - name: path.logs
          value: /var/log/elasticsearch
        volumeMounts:
        # Primary data volume - fast SSD
        - name: data
          mountPath: /usr/share/elasticsearch/data
        # Logs volume - standard storage, can tolerate loss
        - name: logs
          mountPath: /var/log/elasticsearch
        # Snapshots volume - cheaper storage for backups
        - name: snapshots
          mountPath: /snapshots
        resources:
          requests:
            memory: "4Gi"
            cpu: "1"
          limits:
            memory: "8Gi"
            cpu: "2"
  volumeClaimTemplates:
  # Data: Fast SSD, highest performance tier
  - metadata:
      name: data
      labels:
        tier: premium
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 500Gi
  # Logs: Standard storage, acceptable to lose
  - metadata:
      name: logs
      labels:
        tier: standard
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 50Gi
  # Snapshots: Cold storage for backups
  - metadata:
      name: snapshots
      labels:
        tier: archive
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: cold-storage
      resources:
        requests:
          storage: 1000Gi
```

For temporary data that doesn't need persistence (caches, temp files, scratch space), use emptyDir volumes instead of volumeClaimTemplates. This avoids unnecessary PVC creation and storage costs.
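A minimal sketch of that pattern: a hypothetical StatefulSet that keeps durable data on a PVC while using an emptyDir for scratch space (name, image, and sizes are illustrative):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cache-worker            # hypothetical workload
spec:
  serviceName: cache-worker
  replicas: 3
  selector:
    matchLabels:
      app: cache-worker
  template:
    metadata:
      labels:
        app: cache-worker
    spec:
      containers:
      - name: worker
        image: redis:7
        volumeMounts:
        - name: data            # persistent, from volumeClaimTemplates
          mountPath: /data
        - name: scratch         # ephemeral, recreated empty with every pod
          mountPath: /tmp/scratch
      volumes:
      # emptyDir: no PVC is created, no storage cost beyond node-local disk
      - name: scratch
        emptyDir:
          sizeLimit: 10Gi
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: standard
      resources:
        requests:
          storage: 20Gi
```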
Running stateful applications in production requires attention to several operational concerns beyond basic StatefulSet configuration.
```yaml
# Production-ready StatefulSet configuration
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: production-mysql
spec:
  serviceName: mysql
  replicas: 3
  # Conservative update strategy - manual control
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      # partition: 2  # Uncomment to update only pods >= ordinal 2
  # Preserve PVCs on all operations
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain
    whenScaled: Retain
  # Minimum ready time before available
  minReadySeconds: 30
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      # Graceful termination time
      terminationGracePeriodSeconds: 120
      # Spread pods across nodes
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: mysql
            topologyKey: kubernetes.io/hostname
      containers:
      - name: mysql
        image: mysql:8.0
        lifecycle:
          preStop:
            exec:
              # Graceful flush before shutdown
              command: ["/bin/sh", "-c", "mysqladmin shutdown -uroot -p$MYSQL_ROOT_PASSWORD"]
        readinessProbe:
          exec:
            command: ["mysqladmin", "ping", "-uroot", "-p$(MYSQL_ROOT_PASSWORD)"]
          initialDelaySeconds: 15
          periodSeconds: 5
          timeoutSeconds: 3
        livenessProbe:
          exec:
            command: ["mysqladmin", "ping", "-uroot", "-p$(MYSQL_ROOT_PASSWORD)"]
          initialDelaySeconds: 60
          periodSeconds: 30
          timeoutSeconds: 5
        resources:
          requests:
            memory: "2Gi"
            cpu: "1"
          limits:
            memory: "8Gi"
            cpu: "4"
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 200Gi
---
# PDB to maintain quorum
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mysql-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: mysql
```

StatefulSet storage provides the stable, persistent storage foundation that stateful applications require in Kubernetes. The combination of volumeClaimTemplates, ordered operations, and stable identity enables reliable database and stateful service deployments.
What's next:
We'll explore cloud provider integration for Kubernetes storage—how AWS EBS, Google Persistent Disk, Azure Disk, and associated CSI drivers integrate with Storage Classes and PVs. Understanding cloud-specific behavior is essential for production deployments.
You now understand StatefulSet storage comprehensively—from volumeClaimTemplates through ordered provisioning, scaling behaviors, PVC retention policies, recovery patterns, and production considerations. This knowledge enables reliable stateful application deployments in Kubernetes.