What is a StorageClass, and how does dynamic provisioning work?

6 min | advanced | storageclass, dynamic-provisioning, storage

Quick Answer

A StorageClass defines a category of storage that a cluster can provision on demand. It names a provisioner (e.g., the AWS EBS CSI driver) and the parameters (disk type, IOPS, filesystem type) that describe *how* to create matching storage. When a PersistentVolumeClaim references a StorageClass and no existing PersistentVolume satisfies the claim, the provisioner creates the real underlying storage resource (e.g., by calling the cloud provider's API to create a disk) along with a PersistentVolume object that represents it, so an administrator does not have to pre-provision storage by hand.

Detailed Answer

Defining a StorageClass

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com     # which CSI driver actually creates the storage
parameters:
  type: gp3
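  iopsPerGB: "50"                 # IOPS requested per GiB; parameter values are strings
reclaimPolicy: Delete             # delete the PV and its backing disk when the PVC is deleted
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod using the PVC is scheduled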

Requesting storage from it

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-pvc
spec:
  storageClassName: fast-ssd      # references the StorageClass above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi

What happens when this PVC is created

  1. The PersistentVolume controller (part of kube-controller-manager) looks for an existing, unbound PV that matches the request. With dynamic provisioning there usually is none, because PVs aren't pre-created, so the claim goes to provisioning. Because fast-ssd uses WaitForFirstConsumer, the PVC stays Pending at this point until a Pod that uses it has been scheduled.
  2. The fast-ssd StorageClass's provisioner, the CSI (Container Storage Interface) driver ebs.csi.aws.com, is invoked and calls the cloud provider API to create a real 100Gi gp3 EBS volume.
  3. A new PersistentVolume object is created automatically to represent the newly created disk.
  4. The PVC is bound to this new PV. The whole process is automatic; no administrator had to anticipate this specific request ahead of time.
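
For illustration, the PersistentVolume created in step 3 might look roughly like the sketch below. The object name, namespace, volume ID, filesystem type, and zone are placeholders rather than values from a real cluster, and the exact topology key depends on the driver and its version.

```yaml
# Sketch of a dynamically provisioned PV; all concrete values are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-1b2c3d4e-0000-0000-0000-000000000000   # generated name (placeholder)
  annotations:
    pv.kubernetes.io/provisioned-by: ebs.csi.aws.com
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete    # copied from the StorageClass
  storageClassName: fast-ssd
  claimRef:                                # the PVC this PV is bound to
    namespace: default                     # assumed namespace
    name: database-pvc
  csi:
    driver: ebs.csi.aws.com
    volumeHandle: vol-0123456789abcdef0    # placeholder EBS volume ID
    fsType: ext4                           # assumed filesystem
  nodeAffinity:                            # limits use to nodes in the volume's zone
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone   # key may differ by driver version
              operator: In
              values:
                - us-east-1a               # placeholder zone
```

The nodeAffinity section is how the scheduler learns that only nodes in the volume's zone can run a Pod that mounts it.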

The Container Storage Interface (CSI) — another standard plugin interface

Similar in spirit to CRI (for container runtimes) and CNI (for networking), CSI standardizes how Kubernetes talks to storage systems, so any storage vendor can write a CSI driver that Kubernetes can use without storage-vendor-specific code baked into Kubernetes core. This is what lets the same StorageClass mechanism work uniformly whether the underlying provisioner is AWS EBS, GCP Persistent Disk, Azure Disk, or an on-prem storage system's CSI driver.
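
As a sketch of that uniformity, a class playing the same role on GCP would change only the provisioner and its driver-specific parameters. The class name is reused from the earlier example for illustration; the type parameter belongs to the GCP Persistent Disk CSI driver and should be checked against that driver's documentation.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                     # same name, so existing PVCs keep working
provisioner: pd.csi.storage.gke.io   # GCP Persistent Disk CSI driver
parameters:
  type: pd-ssd                       # driver-specific parameter
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```

A PVC that references fast-ssd by name does not need to change between the two clusters.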

volumeBindingMode — an important, easy-to-miss setting

WaitForFirstConsumer (the recommended choice for topology-constrained storage) delays binding and provisioning until a Pod that uses the PVC has been scheduled. This matters because a volume can have topology constraints: an EBS volume, for example, can only be attached to a node in the availability zone where it was created. Provisioning before the scheduler has picked a node can place the volume in a zone where the Pod cannot run, leaving the Pod unschedulable. Immediate, which is the default when volumeBindingMode is not set, provisions the volume as soon as the PVC is created, without waiting to see where the consuming Pod lands. That makes it a real source of "PVC bound to a volume in the wrong availability zone" issues if not configured carefully.
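
With WaitForFirstConsumer, the claim stays Pending until a Pod such as the following is scheduled. This is a minimal sketch: the Pod name, image, command, and mount path are illustrative assumptions; only claimName comes from the PVC defined earlier.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: database                 # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36        # placeholder image
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data       # assumed mount path
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: database-pvc  # scheduling this Pod triggers provisioning
```

Once the scheduler has chosen a node, the volume is provisioned in a zone that node can reach, and the PVC moves from Pending to Bound.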

Default StorageClass

A cluster can designate one StorageClass as the default by setting the annotation storageclass.kubernetes.io/is-default-class: "true" on it. That class is then used automatically for any PVC that doesn't specify storageClassName. This is worth keeping in mind when a PVC's storage seems to "just work" without an explicit StorageClass reference: it is still using one, just implicitly.
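
A minimal sketch of a default class and a claim that relies on it follows. The annotation key is the standard storageclass.kubernetes.io/is-default-class; the class name, claim name, and size are hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard                 # hypothetical class name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: scratch-pvc              # hypothetical name
spec:
  # storageClassName is omitted, so the default class is applied
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

Note that omitting the field is different from setting storageClassName: "" explicitly, which asks for no class at all and so skips dynamic provisioning.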

Cluster operators typically define a small number of StorageClasses that represent meaningful tiers (e.g., fast-ssd for databases needing high IOPS, standard for general-purpose storage, cold-archive for infrequently accessed data). Application teams then reference the appropriate one by name in their PVCs. This keeps the "what kind of storage, from which provider, with which performance characteristics" decision centralized and consistent across the cluster.
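
As one possible sketch of such a tier, a cold-archive class on AWS could map to a throughput-oriented HDD volume type. The choice of sc1 and of the Retain policy are assumptions made for illustration, not a recommendation from the text above.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cold-archive
provisioner: ebs.csi.aws.com
parameters:
  type: sc1                      # EBS Cold HDD; assumed mapping for this tier
reclaimPolicy: Retain            # keep the disk after the PVC is deleted
volumeBindingMode: WaitForFirstConsumer
```

Application teams only see the name cold-archive; the operator can later change what backs it without touching any PVC.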