How do you drain a node safely for maintenance?

Detailed Answer

Step 1: cordon — stop new Pods from landing here

kubectl cordon node-1

Marks the node as SchedulingDisabled — the scheduler will no longer consider it a candidate for new Pods, but existing Pods already running on it are completely unaffected by cordon alone. This is a good first step even before you're ready to actually drain, since it prevents the situation from getting worse (more Pods landing on a node you're about to take offline) while you prepare.
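To confirm that the cordon took effect, one option is to read the node's spec.unschedulable field. The helper below is a minimal sketch; the function name is_cordoned is made up for illustration, and node-1 is the example node used above.

```shell
# Succeeds (exit status 0) if the named node is currently cordoned.
# .spec.unschedulable is "true" on a cordoned node and unset otherwise.
is_cordoned() {
  [ "$(kubectl get node "$1" -o jsonpath='{.spec.unschedulable}')" = "true" ]
}

# Usage:
#   is_cordoned node-1 && echo "node-1 is cordoned"
```

The same state appears as SchedulingDisabled in the STATUS column of kubectl get nodes.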

Step 2: drain — evict existing Pods safely

kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data

drain first cordons the node if it isn't cordoned already, then uses the Kubernetes Eviction API to gracefully remove the Pods running on it (DaemonSet-managed Pods and static mirror Pods are left in place). Eviction respects each Pod's terminationGracePeriodSeconds, allowing a clean shutdown, and, importantly, respects PodDisruptionBudgets (see that question): an eviction that would violate a Pod's PDB is refused rather than forced through.

Two flags are commonly required to avoid the drain command refusing to proceed:

--ignore-daemonsets: DaemonSet-managed Pods (see the workload controllers topic) are tied to running on every node by design, and the DaemonSet controller ignores the unschedulable mark, so it would immediately recreate any such Pod on this node. They can't be meaningfully "evicted and rescheduled elsewhere" the way a Deployment's Pod can. Without this flag, drain refuses to proceed if any DaemonSet-managed Pod is on the node; with it, drain leaves those Pods in place and evicts everything else.
--delete-emptydir-data (named --delete-local-data before kubectl 1.20): Pods using emptyDir volumes (see the storage topic) lose that data when evicted, since emptyDir is node-local and ephemeral by design. Without the flag, drain refuses to proceed, forcing you to consciously accept the (usually expected and fine) data loss rather than letting it happen silently.
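Before running the real drain, it can help to look at what is on the node and what drain would touch. The following is a sketch rather than a required step; the function names pods_on_node and drain_preview are hypothetical, while the kubectl flags themselves are standard.

```shell
# List every Pod currently scheduled on the given node, across all namespaces.
pods_on_node() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" -o wide
}

# Print what drain would do without evicting anything (client-side dry run).
drain_preview() {
  kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data --dry-run=client
}

# Usage:
#   pods_on_node node-1
#   drain_preview node-1
```

A client-side dry run is only a preview, so it should not be read as proof that every eviction will succeed when the drain runs for real.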

If evicting a Pod would violate its PodDisruptionBudget, drain waits and retries the eviction rather than forcing it, and by default it keeps retrying indefinitely unless you pass --timeout. This can make a drain appear "stuck," which is often a signal that a PDB is configured too strictly for the current situation (e.g., minAvailable equal to the total replica count, which blocks every eviction; see that question's common misconfiguration).
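When a drain looks stuck, a quick way to find the likely culprit is to list the PodDisruptionBudgets that currently allow zero disruptions. This is a rough sketch: blocking_pdbs is a made-up helper name, and it relies on the status.disruptionsAllowed field that each PDB reports.

```shell
# Print "namespace/name" for every PodDisruptionBudget whose
# status.disruptionsAllowed is currently 0, i.e. any eviction of a
# Pod it covers would be refused right now.
blocking_pdbs() {
  kubectl get pdb --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" "}{.status.disruptionsAllowed}{"\n"}{end}' |
    awk 'NF == 2 && $2 == 0 { print $1 }'
}

# Usage:
#   blocking_pdbs
```

A PDB showing up here is not necessarily misconfigured; it may just mean the workload is temporarily short of healthy replicas.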

Step 3: perform the actual maintenance

With the node cordoned and drained (nothing left running on it except DaemonSet-managed Pods, and no new Pods landing there), it's now safe to perform whatever maintenance is needed: an OS patch, a kubelet upgrade, hardware maintenance, or simply decommissioning the node entirely.
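Before starting the maintenance itself, you may want to double-check that nothing unexpected is still running on the node. The sketch below lists Pods on the node whose owner is neither a DaemonSet nor the Node itself (static mirror Pods are assumed here to be owned by their Node). The name leftover_pods is hypothetical.

```shell
# Print "namespace/name" for Pods still on the node that drain was
# expected to remove. DaemonSet-owned Pods and Node-owned mirror Pods
# are filtered out; Pods with no owner show up as "<none>" and are kept.
leftover_pods() {
  kubectl get pods --all-namespaces --field-selector "spec.nodeName=$1" --no-headers \
    -o custom-columns='NS:.metadata.namespace,NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind' |
    awk 'NF == 3 && $3 != "DaemonSet" && $3 != "Node" { print $1 "/" $2 }'
}

# Usage:
#   leftover_pods node-1
```

Empty output suggests the drain did what was intended; anything printed deserves a look before the node goes offline.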

Step 4: uncordon — return it to service

kubectl uncordon node-1

Marks the node schedulable again — the scheduler will now consider it a normal candidate for new Pods. If the node is being decommissioned entirely rather than returning to service, you'd instead remove it from the cluster (kubectl delete node node-1, alongside actually decommissioning the underlying machine/VM) rather than uncordoning it.

What makes this "safe" in the first place

The entire process depends on the application workloads being drained actually being resilient to losing one node's worth of capacity — sufficient replica counts spread appropriately across nodes (ideally reinforced with pod anti-affinity — see that question), and PodDisruptionBudgets configured to prevent too many replicas being evicted simultaneously. Draining a node running the only replica of a critical, non-redundant application will cause a real outage for that application regardless of how carefully you follow the cordon/drain/uncordon sequence — the safety of node maintenance is ultimately a property of how the workloads themselves are architected, not just of the drain command's mechanics.
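One way to sanity-check this before draining is to see how an application's replicas are spread across nodes. The sketch below counts Pods per node for a label selector in the current namespace; replicas_per_node is a made-up name and app=web is a placeholder label, not something taken from a real cluster.

```shell
# Print "<node> <count>" for the Pods matching a label selector,
# so you can see whether the node you plan to drain holds all
# (or most) replicas of an application.
replicas_per_node() {
  kubectl get pods -l "$1" --no-headers -o custom-columns='NODE:.spec.nodeName' |
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage:
#   replicas_per_node app=web
```

If the only line printed is the node about to be drained, the drain will take that application down however carefully the sequence is followed.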

Automation at scale

For clusters with many nodes needing regular maintenance/upgrades (see the cluster-upgrade question), this cordon/drain/uncordon sequence is typically automated rather than run manually node-by-node — cloud-managed node pool upgrade features, or cluster-lifecycle tools, commonly implement exactly this sequence with configurable batching and safety checks built in.
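To make the shape of such automation concrete, here is a deliberately simple sketch that processes nodes one at a time and stops at the first failure. It is not how any particular managed service or lifecycle tool works. The function names and the 10m timeout are assumptions, and the maintenance command is whatever you supply.

```shell
# Cordon, drain, run a maintenance command, then uncordon one node.
# If any step fails, the later steps are skipped and the node stays
# cordoned so a human can inspect it.
maintain_node() {
  node=$1; shift
  kubectl cordon "$node" &&
    kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data --timeout=10m &&
    "$@" &&
    kubectl uncordon "$node"
}

# Run the maintenance command against each node in turn (batch size 1).
# The command is called with the node name as its only argument.
rolling_maintenance() {
  maint_cmd=$1; shift
  for n in "$@"; do
    maintain_node "$n" "$maint_cmd" "$n" || {
      echo "stopping: maintenance failed on $n" >&2
      return 1
    }
  done
}

# Usage (patch_node is a hypothetical script of your own):
#   rolling_maintenance patch_node node-1 node-2 node-3
```

The explicit cordon is redundant with drain, which cordons on its own, but it keeps the sequence readable. Real tooling typically adds health checks between nodes and larger, configurable batches.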