What is a PodDisruptionBudget, and why does it matter during voluntary disruptions?

Detailed Answer

The problem it solves

Deployment "web" has 3 replicas, spread across 3 nodes.
An administrator needs to drain (empty and take offline) 2 of those 3 nodes
for maintenance, one after another.

Without a PDB: nothing stops the drain from evicting Pods on both nodes in
quick succession, potentially leaving only 1 (or even 0, if timed unluckily
with a rolling restart) of the 3 replicas available at once -- a real,
avoidable capacity/availability hit during routine maintenance.

Defining a PDB

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2          # at least 2 of the "web" Pods must remain available
  # or: maxUnavailable: 1  # at most 1 may be unavailable at a time
  selector:
    matchLabels:
      app: web

With minAvailable: 2 on a Deployment with 3 replicas, a node drain (which uses the Eviction API to voluntarily remove Pods) will only be allowed to evict one web Pod at a time — attempting to evict a second concurrently, while the first's replacement isn't yet up and Ready, is blocked until enough Pods are available again.
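
One way to see this budget in action is the status that the control plane records on the PDB object itself. The fragment below is an illustrative sketch of the status section that kubectl get pdb web-pdb -o yaml might show while all three Pods are healthy. The values are assumed for this scenario, not captured from a real cluster.

```yaml
# Illustrative status for web-pdb with 3 healthy Pods (values assumed, not real output)
status:
  expectedPods: 3        # Pods counted by this budget
  currentHealthy: 3      # matched Pods that are currently healthy (Ready)
  desiredHealthy: 2      # minimum healthy Pods required, derived from minAvailable: 2
  disruptionsAllowed: 1  # currentHealthy - desiredHealthy, never below 0
```

Once one Pod has been evicted and its replacement is not yet Ready, currentHealthy drops to 2 and disruptionsAllowed to 0, which is why a second concurrent eviction is refused.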

Voluntary vs. involuntary disruptions — the key distinction

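A PDB only governs voluntary disruptions, meaning actions requested through the Eviction API, which respects PDBs: kubectl drain, a cluster autoscaler scaling down a node, a manual eviction. Deleting a Pod or its Deployment directly bypasses the Eviction API, and therefore the PDB. A PDB cannot prevent involuntary disruptions: a node crashing unexpectedly, a kernel panic, a hardware failure, or the node simply becoming unreachable. Such failures still count against the budget, though, so Pods lost to a crash reduce the number of voluntary evictions allowed until their replacements are Ready. There's no way to "budget" for a sudden, unplanned failure; PDBs are specifically about giving routine, planned maintenance operations a safety constraint to respect.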
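
For reference, a request to the Eviction API is a small Eviction object posted to the target Pod's eviction subresource. The sketch below shows the shape of that object in YAML for readability (clients normally send it as JSON). The Pod name and namespace are hypothetical placeholders.

```yaml
# Body of an eviction request, POSTed to the Pod's "eviction" subresource:
#   /api/v1/namespaces/default/pods/web-example-pod/eviction
apiVersion: policy/v1
kind: Eviction
metadata:
  name: web-example-pod   # hypothetical name of the Pod to evict
  namespace: default      # assumed namespace
```

If granting the eviction would violate a PDB, the API server rejects the request with 429 Too Many Requests, and the client is expected to retry later.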

Why this matters for cluster upgrades and autoscaling

Cluster upgrades typically work by draining and replacing nodes one at a time (or in small batches) — a properly configured PDB is what allows this process to proceed automatically and safely without an administrator needing to manually watch and time each node's drain to avoid taking down too much of any one application's capacity simultaneously. Similarly, a Cluster Autoscaler (see the scheduling topic) scaling down underutilized nodes respects PDBs when deciding which nodes it's safe to drain and remove.

A common misconfiguration

Setting minAvailable equal to the total replica count, or setting maxUnavailable: 0, blocks all voluntary disruptions. No Pod of that application can be evicted, since evicting even one would violate the budget, so any node running one of those Pods cannot be drained. kubectl drain keeps retrying the blocked eviction until it succeeds or times out, which can silently stall cluster upgrades or node maintenance. That is usually not the intended outcome. PDBs should allow some disruption while still preserving real availability, e.g., "keep at least 2 of 3 available," not "keep all 3 available at all times".
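
A budget that avoids this trap might look like the sketch below. It reuses the web-pdb name and app: web label from the earlier example. The choice of maxUnavailable: 1 is an assumption about what the application can tolerate, not a universal recommendation.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  maxUnavailable: 1   # permits one eviction at a time while all Pods are healthy
  selector:
    matchLabels:
      app: web
```

Expressing the budget as maxUnavailable rather than a fixed minAvailable means the allowance follows the replica count: if the Deployment is later scaled from 3 replicas down to 2, the budget still permits one eviction, whereas minAvailable: 2 would then block every drain.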

Define a PDB for any application where losing more than a small number of replicas at once would cause a real availability problem — this is a cheap, low-effort safeguard that pays for itself the first time a node drain or upgrade would otherwise have accidentally taken down too many replicas of a critical service simultaneously.