What are taints and tolerations, and how do they work together?
Quick Answer
A taint is applied to a **node** and repels Pods from being scheduled there unless they explicitly tolerate it. A toleration is applied to a **Pod** and allows (but does not force) it to be scheduled onto nodes with a matching taint. This is the inverse of affinity: with affinity, a Pod declares which nodes it is attracted to; with taints and tolerations, a node repels every Pod that has not explicitly opted in. The pair is commonly used to reserve specialized nodes (GPU nodes, control-plane nodes) for the specific workloads that need them.
Detailed Answer
Applying a taint to a node
kubectl taint nodes gpu-node-1 dedicated=gpu-workloads:NoSchedule
This taint (key=dedicated, value=gpu-workloads, effect=NoSchedule) means that no Pod will be scheduled onto this node unless it carries a matching toleration. Ordinary Pods, with no toleration specified, simply won't be placed here, even if the node has ample free CPU and memory: the taint rules the node out regardless of how well the Pod would fit on resources.
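Once applied, the taint is stored on the Node object itself, under spec.taints. As a rough sketch, the relevant excerpt of the manifest printed by kubectl get node gpu-node-1 -o yaml would look something like this, with every unrelated field omitted:

```yaml
# Excerpt only: a real Node object carries many more fields.
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1
spec:
  taints:
  - key: dedicated          # taint key from the kubectl command
    value: gpu-workloads    # taint value
    effect: NoSchedule      # taint effect
```

Running the same kubectl taint command with a trailing hyphen after the effect (dedicated=gpu-workloads:NoSchedule-) removes the taint again.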
Adding a matching toleration to a Pod
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu-workloads"
    effect: "NoSchedule"
  containers:
  - name: ml-training
    image: ml-trainer:1.0
This Pod tolerates the dedicated=gpu-workloads:NoSchedule taint, so it is now eligible to be scheduled on gpu-node-1. A toleration only removes the repulsion, though; it does not attract the Pod there, and the scheduler may still place it on any untainted node. If you want ML workloads to actually land on GPU nodes (not merely be allowed there), combine this toleration with node affinity (see that question) targeting nodes labeled for GPU capability: a required affinity rule restricts the Pod to those nodes, while a preferred rule only favors them. Taints/tolerations and affinity are complementary and commonly used together.
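A minimal sketch of that combination follows. It assumes the GPU nodes carry a label hardware-type=gpu; that label key and value are hypothetical and chosen here purely for illustration, so substitute whatever label your cluster actually uses.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: ml-training
spec:
  # Toleration: permits scheduling onto nodes tainted
  # dedicated=gpu-workloads:NoSchedule.
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu-workloads"
    effect: "NoSchedule"
  # Required node affinity: restricts the Pod to nodes with the
  # (assumed, hypothetical) label hardware-type=gpu.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: "hardware-type"
            operator: In
            values:
            - "gpu"
  containers:
  - name: ml-training
    image: ml-trainer:1.0
```

With both in place, the taint keeps other Pods off the GPU nodes and the affinity rule keeps this Pod off everything else. Swapping the required rule for preferredDuringSchedulingIgnoredDuringExecution would turn the restriction into a soft preference.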
The three taint effects
| Effect | Behavior for non-tolerating Pods |
|---|---|
| NoSchedule | New Pods won't be scheduled here; already-running Pods are unaffected |
| PreferNoSchedule | The scheduler tries to avoid placing new Pods here, but it's a soft preference, not a hard rule |
| NoExecute | New Pods won't be scheduled here, and existing Pods already running here without a matching toleration are evicted |
NoExecute is the strongest effect: it doesn't just prevent future scheduling, it actively removes Pods that are already there and don't tolerate it. This is the mechanism used when a node becomes NotReady, for example. The control plane automatically applies the node.kubernetes.io/not-ready taint with the NoExecute effect. Pods with no matching toleration are evicted immediately, while Pods whose toleration sets tolerationSeconds stay bound to the node for that long before being evicted. By default, Kubernetes adds such a toleration to Pods with tolerationSeconds set to 300, which is where the familiar five-minute grace period comes from.
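To illustrate how tolerationSeconds bounds that window, here is a hedged sketch of a Pod that tolerates the not-ready taint for only 60 seconds. The Pod name, container name, image, and the 60-second value are all hypothetical choices for this example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fast-failover-app        # hypothetical name
spec:
  tolerations:
  - key: "node.kubernetes.io/not-ready"
    operator: "Exists"           # match the taint whatever its value
    effect: "NoExecute"
    tolerationSeconds: 60        # stay bound for 60s after the taint appears, then be evicted
  containers:
  - name: app                    # hypothetical container
    image: example-app:1.0       # hypothetical image
```

A shorter window lets a workload be rescheduled sooner when its node goes unhealthy, at the cost of more churn during brief node blips. Omitting tolerationSeconds on a NoExecute toleration means the Pod tolerates the taint indefinitely.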
Common real-world uses
- Reserving specialized hardware: tainting GPU nodes so only ML/GPU-requiring workloads (which explicitly tolerate the taint) land there, keeping expensive specialized nodes from being consumed by ordinary workloads.
- Control-plane node protection: control-plane nodes are commonly tainted (node-role.kubernetes.io/control-plane:NoSchedule) to keep ordinary application workloads off them by default; only specifically-tolerating Pods (often infrastructure DaemonSets, see that question) run there.
- Automatic node-condition taints: Kubernetes itself automatically applies taints for conditions like node.kubernetes.io/not-ready, node.kubernetes.io/memory-pressure, and similar, which is how the system automatically starts repelling (and, for NoExecute, evicting) Pods from an unhealthy node without needing an administrator to intervene manually.
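As one example of the control-plane case, an infrastructure DaemonSet that should also run on control-plane nodes might carry a toleration along these lines. This is a sketch only; the DaemonSet name, labels, and image are hypothetical.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent               # hypothetical name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      tolerations:
      # The control-plane taint has a key and effect but no value,
      # so operator Exists (rather than Equal) is the natural match.
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: node-agent
        image: node-agent:1.0    # hypothetical image
```

Without this toleration the DaemonSet would still run on every worker node; it would simply skip the tainted control-plane nodes.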
The key distinction from affinity, restated
Node affinity is something a Pod declares about which nodes it wants. Taints are something a node declares about which Pods it's willing to accept. They solve related but inverted problems, and real cluster designs commonly combine both — a taint to keep ordinary workloads off a specialized node by default, plus affinity on the specialized workload's Pods to actively steer them onto that same node.