Containers were designed to be ephemeral—born, executed, and destroyed without leaving traces. The same characteristic that makes containers so powerful for stateless workloads becomes a critical challenge when applications need to persist data beyond the container lifecycle.
The core tension: Kubernetes orchestrates containers across a dynamic, distributed cluster where pods can be scheduled on any node, moved during failures, and scaled across the infrastructure. How do we provide durable, consistent storage to workloads that may run anywhere in the cluster, potentially on different physical machines than where their data resides?
Persistent Volumes (PVs) represent Kubernetes' solution to this fundamental challenge. They abstract the storage infrastructure, decouple storage provisioning from consumption, and enable stateful workloads to operate reliably in a containerized environment.
By the end of this page, you will understand the complete Persistent Volume architecture, including PV/PVC binding mechanics, access modes and their implications, reclaim policies for data lifecycle management, volume phases and state transitions, and production best practices for reliable storage in Kubernetes.
Before diving into Persistent Volumes, we must understand the storage challenges that containers inherently face. This context is essential for appreciating why the PV abstraction exists and how it solves these problems.
Container filesystem fundamentals:
When a container starts, it receives a layered filesystem constructed from its image. The base layers are read-only, shared across all containers using that image. A thin writable layer (the container layer) sits on top, capturing any modifications made during runtime.
This architecture has profound implications:

- All runtime writes land in the container layer, which is destroyed with the container. Restart or reschedule the pod, and everything written at runtime is gone.
- The writable layer lives on one node's local disk, so data cannot follow a pod to another node.
- Each container gets its own writable layer; data is not shared between containers, even ones created from the same image.
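You can see this ephemerality directly (a minimal sketch; the pod name and file path are illustrative):

```bash
# Write a file into a container's writable layer, then destroy the container
kubectl run scratch --image=busybox --restart=Never -- sh -c 'echo hello > /data.txt'
kubectl delete pod scratch

# A fresh pod from the same image gets a brand-new writable layer;
# /data.txt no longer exists anywhere
kubectl run scratch --image=busybox --restart=Never -- cat /data.txt
```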
The node scheduling challenge:
Consider a database pod running on Node A with its data stored locally. If Node A fails, Kubernetes must reschedule the pod to Node B. Without a persistent storage abstraction:

- The replacement pod on Node B starts with an empty filesystem.
- The data remains stranded on Node A's disk, and is lost entirely if the node never recovers.

This scenario is unacceptable for production stateful workloads. We need storage that:

- Survives container and pod restarts.
- Is reachable from any node the scheduler may choose.
- Has a lifecycle independent of any single pod or node.
While the "cattle, not pets" philosophy encourages stateless design, production systems inevitably require state: databases, message queues, file storage, session data, and more. Kubernetes must accommodate these workloads with robust storage primitives.
Persistent Volumes introduce a two-layer abstraction that decouples storage provisioning from consumption. This separation of concerns is fundamental to Kubernetes' storage model.
The PV/PVC model:
Kubernetes uses two distinct API resources for persistent storage:
Persistent Volume (PV): A cluster-level resource representing a piece of physical storage. PVs are provisioned by administrators (or dynamically by Storage Classes) and exist independently of any pod. They represent the actual storage capacity—an NFS share, an AWS EBS volume, a GCE Persistent Disk, or any supported storage backend.
Persistent Volume Claim (PVC): A namespace-scoped request for storage by a user or workload. PVCs specify the desired storage characteristics (size, access mode, storage class) without requiring knowledge of the underlying storage infrastructure. Pods mount PVCs, not PVs directly.
```yaml
# Persistent Volume (PV) - Cluster-level storage resource
# Provisioned by administrators or dynamically via StorageClass
apiVersion: v1
kind: PersistentVolume
metadata:
  name: database-pv-001
  labels:
    type: ssd
    environment: production
    app: postgresql
spec:
  # Storage capacity - must match or exceed PVC requests
  capacity:
    storage: 100Gi
  # Volume mode: Filesystem (default) or Block
  volumeMode: Filesystem
  # Access modes determine how the volume can be mounted
  # This volume supports ReadWriteOnce (single node read-write)
  accessModes:
    - ReadWriteOnce
  # Reclaim policy: what happens when PVC is deleted
  # Retain: keep data, manual cleanup required
  # Delete: automatically delete backing storage
  # Recycle: deprecated, basic scrub (rm -rf)
  persistentVolumeReclaimPolicy: Retain
  # Storage class for binding with PVCs
  # Empty string means this PV only binds to PVCs requesting ""
  storageClassName: fast-ssd
  # Mount options passed to the mount command
  mountOptions:
    - hard
    - nfsvers=4.1
  # NFS storage backend specification
  # Other backends: awsElasticBlockStore, gcePersistentDisk,
  # azureDisk, csi, hostPath, local, etc.
  nfs:
    path: /exports/database-001
    server: nfs-server.storage.svc.cluster.local
    readOnly: false
```

Key architectural properties:
| Aspect | Persistent Volume (PV) | Persistent Volume Claim (PVC) |
|---|---|---|
| Scope | Cluster-wide resource | Namespace-scoped resource |
| Created by | Administrators or dynamically | Developers/workload owners |
| Represents | Actual storage infrastructure | Request for storage |
| Contains | Backend-specific configuration | Desired storage characteristics |
| Lifecycle | Independent of pods/workloads | Bound to application lifecycle |
| Knowledge required | Storage infrastructure details | Only capacity and access needs |
Why this separation matters:
This architecture enables several critical capabilities:
Role separation: Storage administrators manage infrastructure without application knowledge. Developers request storage without infrastructure expertise.
Portability: Applications reference PVCs by name. The same pod specification works across environments where PVCs are satisfied by different underlying storage.
Pre-provisioning: PVs can be created ahead of workloads, ensuring storage is available when applications deploy.
Resource management: PVs can be managed, monitored, and maintained independently of the workloads consuming them.
Persistent Volume Claims are the interface through which applications request storage. Understanding PVC specification and the binding process is essential for reliable storage management.
PVC specification anatomy:
```yaml
# Persistent Volume Claim (PVC) - Request for storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgresql-data
  namespace: database
  labels:
    app: postgresql
    component: primary
spec:
  # Access modes requested (must be supported by bound PV)
  accessModes:
    - ReadWriteOnce
  # Volume mode: Filesystem or Block
  volumeMode: Filesystem
  # Resource request (minimum storage capacity)
  resources:
    requests:
      storage: 50Gi
  # Storage class to use for binding/provisioning
  # - Specific class name: bind to PV with matching class
  # - Empty string "": bind only to PVs without class
  # - Omitted: use cluster default StorageClass
  storageClassName: fast-ssd
  # Optional: Label selector to narrow PV selection
  selector:
    matchLabels:
      type: ssd
      environment: production
    matchExpressions:
      - key: app
        operator: In
        values:
          - postgresql
          - mysql
  # Optional: Request binding to specific PV by name
  # This bypasses normal matching logic
  # volumeName: database-pv-001
---
# Pod consuming the PVC
apiVersion: v1
kind: Pod
metadata:
  name: postgresql
  namespace: database
spec:
  containers:
    - name: postgresql
      image: postgres:15
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
          # Optional: mount a subdirectory of the volume
          subPath: pgdata
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: postgresql-data
        # Optional: mount as read-only
        readOnly: false
```

The binding algorithm:
When a PVC is created, the Kubernetes PV controller attempts to bind it to an appropriate PV. The binding process follows these rules:
1. **StorageClass matching:** The PVC `storageClassName` must match the PV `storageClassName` exactly. If the PVC omits the field, the cluster default class is applied.
2. **Access mode compatibility:** The PV must support all access modes requested by the PVC.
3. **Capacity satisfaction:** The PV capacity must be greater than or equal to the PVC request.
4. **Selector evaluation:** If the PVC specifies a selector, the PV's labels must satisfy all of its conditions.
5. **Volume name:** If the PVC specifies `volumeName`, only that specific PV is considered.
6. **Best fit selection:** Among qualifying PVs, Kubernetes selects the smallest PV that satisfies the request, to minimize waste.
Binding is exclusive: Each PV binds to exactly one PVC. Once bound, the PV is unavailable to other claims.
PV-PVC binding is a one-to-one relationship. Once bound, a PV cannot be bound to a different PVC until the current binding is released (PVC deleted). Even if the PVC requests 10Gi and binds to a 100Gi PV, no other claim can use the remaining 90Gi.
Binding scenarios and troubleshooting:
| Symptom | Cause | Resolution |
|---|---|---|
| PVC stuck in Pending | No PV matches requirements | Create appropriate PV or adjust PVC spec |
| PVC stuck in Pending | StorageClass has no provisioner | Use a class with dynamic provisioning or pre-create PVs |
| PVC stuck in Pending | Requested capacity exceeds available PVs | Create larger PV or reduce PVC request |
| PVC stuck in Pending | Access mode mismatch | Ensure PV supports requested access modes |
| PVC stuck in Pending | Selector matches no PVs | Verify PV labels match selector criteria |
| Wrong PV bound | Multiple PVs qualified, smallest picked | Use selectors or volumeName for precise control |
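When a claim is stuck, the PVC's events usually name the failing rule. A few commands worth keeping at hand (the claim name and namespace come from the earlier example):

```bash
# Events on the claim explain binding failures (no matching PV, missing class, etc.)
kubectl describe pvc postgresql-data -n database

# Survey candidate PVs: phase, capacity, access modes, class, bound claim
kubectl get pv -o wide

# Confirm the requested StorageClass exists and has a provisioner
kubectl get storageclass
```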
Access modes define how a Persistent Volume can be mounted by pods across the cluster. Understanding access modes is critical for designing storage architectures that match workload requirements while respecting infrastructure constraints.
The three access modes:

- **ReadWriteOnce (RWO):** The volume can be mounted read-write by a single node. Multiple pods on that node may share the mount.
- **ReadOnlyMany (ROX):** The volume can be mounted read-only by many nodes simultaneously.
- **ReadWriteMany (RWX):** The volume can be mounted read-write by many nodes simultaneously.
Access modes are constraints on how storage can be mounted, not guarantees about data consistency. ReadWriteMany doesn't provide distributed locking or prevent write conflicts. Applications must implement their own concurrency control when sharing writable storage.
ReadWriteOncePod (RWOP):
Kubernetes 1.22+ introduced a fourth access mode:
ReadWriteOncePod (RWOP): The volume can be mounted as read-write by a single pod cluster-wide. This is stronger than RWO—even pods on the same node cannot share the mount. Use RWOP when you need to guarantee exclusive access for data integrity.
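Requesting it is a one-line change on the claim (a minimal sketch; the claim name and storage class are illustrative). Note that only CSI drivers support ReadWriteOncePod:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: exclusive-data
spec:
  accessModes:
    - ReadWriteOncePod  # cluster-wide exclusivity: only one pod may mount this volume
  resources:
    requests:
      storage: 10Gi
  storageClassName: fast-ssd
```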
Storage backend support matrix:
Not all storage backends support all access modes. This table shows common patterns:
| Storage Backend | RWO | ROX | RWX | RWOP | Notes |
|---|---|---|---|---|---|
| AWS EBS | ✓ | ✗ | ✗ | ✓ | Block storage, single-AZ |
| GCE Persistent Disk | ✓ | ✓* | ✗ | ✓ | *ROX requires disk type configuration |
| Azure Disk | ✓ | ✗ | ✗ | ✓ | Block storage, premium SSD available |
| NFS | ✓ | ✓ | ✓ | ✗ | Network filesystem, widely compatible |
| CephFS | ✓ | ✓ | ✓ | ✗ | Distributed filesystem |
| AWS EFS | ✓ | ✓ | ✓ | ✗ | Managed NFS, multi-AZ |
| Azure Files | ✓ | ✓ | ✓ | ✗ | SMB/NFS file shares |
| GCE Filestore | ✓ | ✓ | ✓ | ✗ | Managed NFS |
| Local PV | ✓ | ✗ | ✗ | ✓ | Node-local storage only |
| HostPath | ✓ | ✗ | ✗ | ✗ | Single-node, testing only |
Choosing access modes:
The access mode decision impacts architecture significantly:

- **RWO** suits single-writer workloads such as databases, and keeps high-performance block backends (EBS, GCE PD, Azure Disk) available.
- **RWX** is required when pods on multiple nodes must write to the same volume (shared uploads, CMS assets), which restricts you to file-based backends such as NFS, CephFS, EFS, or Azure Files.
- **ROX** fits distributing read-only reference data to many consumers across the cluster.
- **RWOP** is for workloads where even same-node sharing would threaten data integrity.
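For example, a claim for assets shared by replicas on different nodes might look like this (a sketch; `nfs-shared` is an assumed name standing in for any RWX-capable StorageClass):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-assets
spec:
  accessModes:
    - ReadWriteMany  # every replica, on any node, mounts read-write
  resources:
    requests:
      storage: 100Gi
  storageClassName: nfs-shared  # assumed: an NFS/EFS-backed class that supports RWX
```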
The reclaim policy determines what happens to a Persistent Volume's backing storage when its claim is deleted. This is a critical configuration for data safety, cost management, and operational procedures.
The volume lifecycle phases:
Persistent Volumes progress through defined phases:

- **Available:** The volume is free and not yet bound to any claim.
- **Bound:** The volume is bound to a PVC.
- **Released:** The claim has been deleted, but the volume has not yet been reclaimed. It is not available for new claims.
- **Failed:** Automatic reclamation failed and administrator intervention is required.
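The STATUS column of `kubectl get pv` reports the current phase, so the transitions are easy to observe:

```bash
# Follow phase transitions as claims come and go
kubectl get pv -w
# With Retain: Available -> Bound -> (PVC deleted) -> Released
# With Delete: Available -> Bound -> (PVC deleted) -> PV and backing storage removed
```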
Reclaim policies:
The `persistentVolumeReclaimPolicy` field controls what happens once a volume enters the Released state:
| Policy | Behavior | Use Case | Data Safety |
|---|---|---|---|
| Retain | PV enters Released state, data preserved, manual cleanup required | Production databases, audit data, any critical data | Highest—data preserved until manual action |
| Delete | PV and backing storage automatically deleted | Ephemeral workloads, development environments, auto-provisioned volumes | None—data destroyed automatically |
| Recycle (deprecated) | Basic rm -rf on volume, PV made Available again | Legacy compatibility only | Low—simple delete, no secure wipe |
```yaml
# Retain Policy: For production databases
# Data is preserved even after PVC deletion
apiVersion: v1
kind: PersistentVolume
metadata:
  name: production-db-pv
spec:
  capacity:
    storage: 500Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # Data survives PVC deletion
  storageClassName: production-ssd
  awsElasticBlockStore:
    volumeID: vol-0123456789abcdef0
    fsType: ext4
---
# Delete Policy: For ephemeral workloads
# Volume and data automatically cleaned up
apiVersion: v1
kind: PersistentVolume
metadata:
  name: dev-workspace-pv
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete  # Auto-cleanup on PVC deletion
  storageClassName: dev-standard
  gcePersistentDisk:
    pdName: dev-workspace-disk
    fsType: ext4
```

Reclaim policy can be changed on existing PVs: `kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'`. This is useful for upgrading development volumes to production protection without recreating storage.
Reclaiming Released volumes:
When using the Retain policy, Released PVs require manual steps to make them Available again:

1. **Delete the claimRef:** Remove the reference to the old PVC:
   `kubectl patch pv <pv-name> -p '{"spec":{"claimRef": null}}'`
2. **Clean up data if needed:** Depending on the use case, you may need to wipe, back up, or preserve the data.
3. **Update labels/selectors:** Adjust metadata if the PV will serve a different purpose.
4. **Verify the Available state:** `kubectl get pv <pv-name>` should show status `Available`.
This manual process protects against accidental data loss but requires operational runbooks for volume recycling.
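A consolidated sketch of that runbook (the PV name is a placeholder, and the data-handling step depends entirely on your retention policy):

```bash
#!/usr/bin/env bash
# Recycle a Released PV (Retain policy) back to Available.
# Assumes the data has already been backed up or is safe to reuse.
PV_NAME="database-pv-001"  # placeholder

# 1. Confirm the volume is actually Released
kubectl get pv "$PV_NAME" -o jsonpath='{.status.phase}{"\n"}'

# 2. Drop the stale claim reference so the binder will consider the PV again
kubectl patch pv "$PV_NAME" -p '{"spec":{"claimRef": null}}'

# 3. Verify the volume has returned to Available
kubectl get pv "$PV_NAME"
```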
Kubernetes supports two volume modes that determine how storage is presented to pods. The choice between filesystem and block mode has significant implications for performance, flexibility, and compatibility.
Volume modes explained:

- **Filesystem (default):** Kubernetes ensures the volume carries a filesystem (formatting it if empty) and mounts it into the pod as a directory. The application works with ordinary files.
- **Block:** The volume is exposed to the pod as a raw block device with no filesystem. The application performs its own device I/O.
```yaml
# Block mode PV - raw device access
apiVersion: v1
kind: PersistentVolume
metadata:
  name: high-performance-block-pv
spec:
  capacity:
    storage: 200Gi
  volumeMode: Block  # Explicitly request block mode
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: high-iops
  local:
    path: /dev/nvme0n1  # Raw NVMe device
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - database-node-01
---
# PVC requesting block mode
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-block-storage
spec:
  volumeMode: Block  # Must match PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 200Gi
  storageClassName: high-iops
---
# Pod using block volume
apiVersion: v1
kind: Pod
metadata:
  name: database-with-block-storage
spec:
  containers:
    - name: database
      image: oracle-database:19c
      # Block volumes use volumeDevices, not volumeMounts
      volumeDevices:
        - name: data
          devicePath: /dev/xvda  # Device path inside container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: database-block-storage
```

Block volume mode requires: (1) storage backend support for raw blocks, (2) `volumeMode: Block` on both the PV and the PVC, (3) `volumeDevices` instead of `volumeMounts` in the pod spec, and (4) an application that handles raw device I/O. Not all CSI drivers support block mode.
When to use block mode:

- Databases and storage engines that manage their own on-disk layout (for example, Oracle ASM or Ceph OSDs).
- Latency-sensitive workloads that benefit from bypassing the filesystem layer.
- Applications that need direct control over device I/O patterns.
When to use filesystem mode:

- Virtually everything else: most applications expect files and directories.
- Workloads that rely on filesystem features such as permissions, directory structures, and `subPath` mounts.
- Any case where operational simplicity outweighs the marginal performance gain of raw device access.
Persistent Volumes can be created through two mechanisms: static provisioning (administrator creates PVs manually) or dynamic provisioning (PVs created automatically in response to PVCs). Understanding both approaches and their trade-offs is essential for production operations.
Static provisioning:
With static provisioning, cluster administrators pre-create Persistent Volumes before workloads request them. PVCs then bind to existing PVs that match their requirements.
Dynamic provisioning:
With dynamic provisioning, PVs are created automatically when a PVC is submitted. A Storage Class defines the provisioner and parameters for volume creation.
| Aspect | Static Provisioning | Dynamic Provisioning |
|---|---|---|
| Administrator involvement | Required for every volume | Only StorageClass setup |
| Lead time | PVs must exist before workloads | PVs created with workload deployment |
| Capacity planning | Pre-plan total storage needs | Scale as demands grow |
| Resource efficiency | Potential over-provisioning | Exact-fit allocation |
| Legacy storage support | Full support for any backend | Requires a CSI driver |
| Naming control | Explicit, predictable names | Auto-generated names |
| Recovery/migration | Easier to manage known PVs | More automation available |
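A minimal sketch of the dynamic path (Storage Classes are covered in depth on the next page): the PVC below triggers automatic creation of a matching PV. The provisioner name is an assumption for an AWS cluster and varies by platform.

```yaml
# StorageClass defining how volumes are provisioned on demand
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-ssd
provisioner: ebs.csi.aws.com  # assumed CSI driver; varies by platform
parameters:
  type: gp3
reclaimPolicy: Delete  # dynamically provisioned PVs default to Delete
volumeBindingMode: WaitForFirstConsumer
---
# PVC referencing the class; no pre-created PV is needed
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 30Gi
  storageClassName: standard-ssd
```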
Production environments often use both: dynamic provisioning for general workloads and self-service, static provisioning for databases and critical systems requiring specific storage placement. When an Available static PV already matches a claim's class and requirements, the binder uses it rather than provisioning a new volume.
Operating Persistent Volumes in production requires attention to data safety, operational procedures, and monitoring. These best practices represent lessons learned from production incidents across the industry.
Default production volumes to `persistentVolumeReclaimPolicy: Retain`; the Delete policy should only be used with explicit intent.

Deleting a PVC whose PV carries the Delete reclaim policy immediately triggers destruction of the backing storage. There is no confirmation, no grace period, no undo. Implement admission webhooks or OPA policies to require additional approval for PVC deletion in production namespaces.
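One way to build that approval gate is a CEL-based ValidatingAdmissionPolicy (a sketch assuming Kubernetes 1.30+, where the API is GA; the namespace label and message are illustrative, and a real policy would include an exception path for approved deletions):

```yaml
# Deny PVC deletion in namespaces labeled environment=production
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: guard-pvc-delete
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["DELETE"]
        resources: ["persistentvolumeclaims"]
  validations:
    - expression: "false"  # deny every matched request; refine with an exception rule
      message: "PVC deletion in production requires the storage-admin exception process."
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: guard-pvc-delete-production
spec:
  policyName: guard-pvc-delete
  validationActions: ["Deny"]
  matchResources:
    namespaceSelector:
      matchLabels:
        environment: production
```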
Monitoring and alerting checklist:
```yaml
# Essential PV/PVC metrics to monitor
metrics:
  - name: kube_persistentvolume_status_phase
    alert_on: [Failed, Released]  # Immediate attention for Failed
    description: Volume phase status
  - name: kube_persistentvolumeclaim_status_phase
    alert_on: [Pending > 5m, Lost]
    description: Claim status
  - name: kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes
    alert_on: "> 0.8 for 15m"
    description: Volume utilization percentage
  - name: kubelet_volume_stats_inodes_used / kubelet_volume_stats_inodes
    alert_on: "> 0.9"
    description: Inode exhaustion (filesystem only)
```
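Translated into a Prometheus alerting rule, the utilization check might look like this (a sketch; thresholds and labels should match your own conventions, and the metrics assume kubelet volume stats are being scraped):

```yaml
groups:
  - name: persistent-volume-alerts
    rules:
      - alert: PersistentVolumeFillingUp
        expr: |
          kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.8
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.persistentvolumeclaim }} in {{ $labels.namespace }} is over 80% full"
```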
Persistent Volumes represent Kubernetes' answer to the fundamental challenge of stateful workloads in a containerized, distributed environment. Mastering PVs requires understanding both the API abstractions and the operational realities of production storage.
What's next:
With a solid understanding of Persistent Volumes, we'll explore Storage Classes—the mechanism that automates PV provisioning, defines storage tiers, and enables self-service storage for Kubernetes users. Storage Classes transform PVs from static infrastructure into dynamic, on-demand resources.
You now understand Persistent Volumes comprehensively—from container storage challenges through PV/PVC architecture, access modes, reclaim policies, volume modes, and production operations. This foundation prepares you for advanced storage patterns with Storage Classes.