What is a StatefulSet, and when do you need one instead of a Deployment?

Detailed Answer

Why Deployments are wrong for stateful applications

A Deployment's Pods are interchangeable: they get randomly suffixed names (web-7d8f9c-x2k4p) and no stable identity. If the Pod template references a PersistentVolumeClaim, every replica references that same single PVC; if it uses an ephemeral volume such as emptyDir instead, each Pod starts with a fresh, empty volume. Either way, there is no built-in way to give each replica its own dedicated, durable, individually tracked storage that follows that specific replica across restarts.
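
A minimal sketch of the shared-PVC case, assuming a pre-created claim with the hypothetical name shared-data. Every replica created from this Deployment template mounts that one claim, so there is no per-replica storage. With a ReadWriteOnce claim, replicas scheduled onto other nodes may not be able to start at all.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: myapp:1.0
          volumeMounts:
            - name: data
              mountPath: /data        # illustrative path
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: shared-data    # hypothetical claim; all 3 replicas reference this same PVC
```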

What a StatefulSet provides instead

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "web"     # must reference a headless Service (see the networking topic)
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
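          image: myapp:1.0
          volumeMounts:
            - name: data           # matches the volumeClaimTemplates entry below
              mountPath: /data     # illustrative path; use the directory your app writes to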
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
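
The serviceName field above refers to a headless Service. A minimal sketch of one, assuming the application listens on port 80 (the port and its name are illustrative, not taken from the manifest above):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web            # must match spec.serviceName in the StatefulSet
spec:
  clusterIP: None      # "None" is what makes the Service headless
  selector:
    app: web           # matches the Pod template labels
  ports:
    - name: http       # illustrative port name
      port: 80         # assumed application port
```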

Stable, predictable Pod names: web-0, web-1, web-2 — not random suffixes. If web-1 is deleted, its replacement is created with the exact same name web-1, not a new random one.
Stable network identity: combined with a headless Service, each Pod gets a predictable, individually-addressable DNS name (web-0.web.default.svc.cluster.local) — essential for applications where peers need to address a specific other instance by name (e.g., a database replica connecting to a specific primary).
Per-replica persistent storage (volumeClaimTemplates): each replica gets its own PVC (data-web-0, data-web-1, data-web-2). Critically, if web-1's Pod is deleted and recreated, even on a different node, it is reattached to the same data-web-1 PVC, provided the underlying volume can be attached from that node. Its data survives, tied to its identity, not to whichever node happened to run it.
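Ordered, sequential deployment and scaling: by default, StatefulSet Pods are handled one at a time in ordinal order. They are created in ascending order (web-0, then web-1, then web-2), each waiting for the previous Pod to be Running and Ready, and they are terminated on scale-down and replaced during rolling updates in descending order (web-2 first). This matters for applications with ordering dependencies (e.g., a database's designated primary must come up before the replicas that need to connect to it).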
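
The ordering described above is the default and can be adjusted. As a sketch of the relevant StatefulSet spec fields (a fragment, not a complete manifest; the partition value is only an example):

```yaml
spec:
  podManagementPolicy: Parallel   # default is OrderedReady; Parallel creates and deletes Pods without waiting on ordinals
  updateStrategy:
    type: RollingUpdate           # the default update strategy
    rollingUpdate:
      partition: 2                # example value: only Pods with ordinal >= 2 are updated on a template change
```

Note that podManagementPolicy affects only scaling operations; rolling updates still proceed one Pod at a time.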

When you actually need this

Databases and distributed data stores run directly on Kubernetes (PostgreSQL, MongoDB, Cassandra, Elasticsearch): each replica typically holds its own copy or shard of the data and needs a stable identity to know its role and reconnect to its own data after a restart.
Distributed coordination systems (ZooKeeper, etcd itself, when run on Kubernetes) where each member needs a stable identity to participate correctly in a consensus protocol.
Any application where "which replica am I" is meaningful to the application's own logic, not just an interchangeable unit of horizontal scale.
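
One way an application can learn "which replica am I" is to read its own Pod name, which in a StatefulSet ends in the ordinal. A sketch of a Pod template fragment using the downward API (the variable name POD_NAME is an arbitrary choice):

```yaml
spec:
  template:
    spec:
      containers:
        - name: web
          image: myapp:1.0
          env:
            - name: POD_NAME                 # arbitrary variable name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name   # resolves to web-0, web-1, or web-2
```

The application can then parse the trailing ordinal, for example treating ordinal 0 as the initial primary. That convention is an application-level assumption, not something the StatefulSet enforces.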

When you don't

Stateless web servers, API services, or workers that don't care which specific instance handles a given request, and don't need to persist state tied to a specific replica's identity — these are the common case, and a Deployment (simpler, with more flexible rollout behavior) is the right default. Reaching for a StatefulSet when a Deployment would do adds real operational complexity (slower, ordered rollouts; PVC lifecycle management) for no corresponding benefit.
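
Part of the PVC lifecycle management mentioned above: by default, PVCs created from volumeClaimTemplates are kept when the StatefulSet is scaled down or deleted, so they must be cleaned up separately. Recent Kubernetes versions expose this behavior as a spec field. A sketch of the fragment, assuming a cluster version that supports it:

```yaml
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # default: keep PVCs when the StatefulSet is deleted
    whenScaled: Delete    # non-default: remove a replica's PVC when that replica is scaled away
```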

An important caveat

Running genuinely stateful, data-critical systems like production databases directly on Kubernetes (rather than using a managed cloud database service) is itself a significant operational commitment. A StatefulSet solves the identity, storage, and ordering problems, but backup, failover, and data consistency logic for the actual stateful application usually still needs to be handled by an Operator (see the extensibility topic) or the application's own clustering logic, not by the StatefulSet primitive alone.