What's the difference between emptyDir, hostPath, and a persistent volume-backed volume?
Quick Answer
`emptyDir` is node-local, ephemeral scratch space that is created empty when a Pod starts on a node and permanently deleted when the Pod is removed. It suits temporary data or sharing files between containers in the same Pod, never anything that must survive. `hostPath` mounts a path from the underlying node's filesystem directly into the Pod; it is powerful but couples the Pod (often problematically) to whichever node it happens to land on. A PersistentVolumeClaim-backed volume is durable and independent of any single Pod's lifecycle, and usually of any single node: it is backed by persistent storage (a cloud disk, a network share) that survives Pod deletion and rescheduling.
Detailed Answer
emptyDir — ephemeral, Pod-scoped scratch space
```yaml
volumes:
  - name: scratch-space
    emptyDir: {}
```
An `emptyDir` is created empty when the Pod starts on a node, exists only as long as the Pod runs there, and is permanently deleted when the Pod itself is deleted. A container restarting within the Pod does not lose the contents; only removal of the Pod does. Good for: a temporary cache, scratch space for a batch computation, or the shared medium between a main container and a sidecar or init container in the same Pod (see the sidecar and init container questions). It can optionally be backed by RAM (`emptyDir: {medium: Memory}`) for faster, tmpfs-based scratch space; files written there count against the memory usage of the container that writes them.
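As a rough illustration of the sharing pattern described above, the following sketch shows a Pod in which two containers mount the same `emptyDir`, plus a second, RAM-backed `emptyDir` capped with `sizeLimit`. The Pod name, container names, images, paths, and the 64Mi limit are all assumptions chosen for the example, not part of the original answer.

```yaml
# Illustrative sketch: all names, images, and sizes here are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo
spec:
  containers:
    - name: writer
      image: busybox:1.36
      # Appends a timestamp to a file on the shared volume every 5 seconds.
      command: ["sh", "-c", "while true; do date >> /work/out.log; sleep 5; done"]
      volumeMounts:
        - name: scratch-space
          mountPath: /work
        - name: fast-cache
          mountPath: /cache
    - name: reader
      image: busybox:1.36
      # Waits for the writer's file to appear, then follows it.
      command: ["sh", "-c", "until [ -f /work/out.log ]; do sleep 1; done; tail -f /work/out.log"]
      volumeMounts:
        - name: scratch-space
          mountPath: /work
          readOnly: true
  volumes:
    # Disk-backed scratch space shared by both containers.
    - name: scratch-space
      emptyDir: {}
    # tmpfs-backed scratch space; usage counts against container memory.
    - name: fast-cache
      emptyDir:
        medium: Memory
        sizeLimit: 64Mi
```

If either container crashes and restarts, `/work/out.log` should still be there; deleting the Pod discards both volumes.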
hostPath — a specific node's own filesystem, mounted in
```yaml
volumes:
  - name: node-logs
    hostPath:
      path: /var/log/containers
      type: Directory
```
Mounts a path from the filesystem of whichever node the Pod is scheduled onto. This gives direct access to node-level resources, which certain infrastructure DaemonSets genuinely need (for example, a log collector reading /var/log). It comes with real caveats: the data is tied to that one node and does not follow the Pod if it is rescheduled elsewhere; different nodes may have different content or permissions at the same path; and it is a meaningful security risk if used carelessly, since it gives the Pod direct access to the host's filesystem, potentially including sensitive host-level files. Because of this risk, the Baseline and Restricted Pod Security Standards forbid `hostPath` volumes, so namespaces where Pod Security Admission enforces either profile reject Pods that use them (see the security topic).
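For context, a node-level log collector of the kind mentioned above might be declared roughly as follows. This is a minimal sketch: the DaemonSet name, labels, image, and in-container mount path are placeholders, and the mount is marked `readOnly` on the assumption that the collector only needs to read.

```yaml
# Illustrative sketch: name, labels, and image are hypothetical placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
        - name: collector
          image: example.com/log-collector:1.0   # placeholder, not a real image
          volumeMounts:
            - name: node-logs
              mountPath: /host/var/log/containers
              readOnly: true   # limit the Pod to reading host files
      volumes:
        - name: node-logs
          hostPath:
            path: /var/log/containers
            type: Directory   # fail if the path is missing rather than create it
```

A DaemonSet runs one such Pod per node, so each copy sees only its own node's files, which is the per-node coupling described above used deliberately. On many nodes the entries under /var/log/containers are symlinks into /var/log/pods, so a real collector often mounts /var/log as a whole instead.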
PersistentVolumeClaim-backed volume — genuinely durable, node-independent storage
```yaml
volumes:
  - name: data
    persistentVolumeClaim:
      claimName: my-pvc
```
Backed by persistent storage (a cloud block volume, an NFS share) that exists independently of any particular Pod. If the Pod is deleted and recreated, even on a completely different node, it reattaches to the same underlying data, provided the storage backend and access mode allow the volume to be attached there. This is the only one of the three that provides durability across Pod rescheduling, which is why it is the correct choice for anything that must outlive a single Pod (databases, uploaded files, any real application data).
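The `claimName` above refers to a PersistentVolumeClaim object that has to exist in the same namespace. A minimal sketch of such a claim and a Pod that mounts it could look like this; the 10Gi size, the `ReadWriteOnce` access mode, the Pod name, image, and mount path are assumptions for illustration, and the claim relies on the cluster having a default StorageClass.

```yaml
# Illustrative sketch: size, access mode, Pod name, and image are assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce        # mountable read-write by a single node at a time
  resources:
    requests:
      storage: 10Gi
  # storageClassName is omitted, assuming a default StorageClass exists.
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data   # files written here live on the claimed volume
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc
```

Deleting and recreating the `app` Pod leaves `my-pvc` and its data in place; the data is only released when the claim itself is deleted, subject to the volume's reclaim policy.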
Side-by-side summary
| | emptyDir | hostPath | PVC-backed |
|---|---|---|---|
| Survives container restart (same Pod) | Yes | Yes | Yes |
| Survives Pod deletion | No | Yes, but tied to that node | Yes, node-independent |
| Tied to a specific node | Yes, but moot: it is deleted with the Pod, and a Pod never moves between nodes | Yes | Usually no (depends on the storage backend and access mode) |
| Typical use | Scratch space, inter-container sharing within a Pod | Node-level infrastructure access (logs, host metrics) | Real application/database data |
Default to a PVC-backed volume for anything that must actually persist; use emptyDir for genuinely temporary, disposable data; and treat hostPath as a specialized tool reserved for infrastructure-level DaemonSets that have a real, deliberate need to access the host filesystem — not a general-purpose storage option for application workloads.