What is node affinity/anti-affinity, and how does it differ from pod affinity/anti-affinity?
Quick Answer
Node affinity/anti-affinity constrains which **nodes** a Pod can be scheduled onto, based on node labels (e.g., "only nodes with an SSD," "prefer nodes in this availability zone"). Pod affinity/anti-affinity constrains scheduling based on **other Pods already running** on candidate nodes (e.g., "schedule near Pods from the same application, for locality," or "never schedule two replicas of this Pod on the same node, for availability"). Both come in a `required` (hard constraint — must be satisfied) and `preferred` (soft constraint — best-effort) variant.
Detailed Answer
Node affinity — based on node labels
```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
      preferredDuringSchedulingIgnoredDuringExecution:  # soft preference
      - weight: 80
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]
```
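This says: the Pod must land on a node labeled `disktype=ssd` (a hard requirement: the Pod stays Pending if no such node has room) and, among the nodes that satisfy that, the scheduler should prefer (but not require) one in zone `us-east-1a`. Node affinity is a more expressive successor to the simpler `nodeSelector` field: it supports richer operators (`In`, `NotIn`, `Exists`, `DoesNotExist`, `Gt`, `Lt`) and the required/preferred distinction. There is no separate `nodeAntiAffinity` field; node anti-affinity is expressed with the negative operators `NotIn` and `DoesNotExist` inside `nodeAffinity`.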
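As a minimal sketch of node anti-affinity, the fragment below keeps a Pod away from certain nodes using the negative operators. The label `example.com/dedicated` and the value `hdd` are hypothetical and chosen only for illustration.

```yaml
# Assumed labels: disktype=hdd and example.com/dedicated are illustrative only.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # Expressions inside one term are ANDed; separate terms are ORed.
        - matchExpressions:
          - key: disktype
            operator: NotIn          # never land on nodes labeled disktype=hdd
            values: ["hdd"]
          - key: example.com/dedicated
            operator: DoesNotExist   # skip nodes that carry this label at all
```

Because both expressions sit in the same term, a node must pass both checks to be eligible.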
Pod affinity/anti-affinity — based on co-located Pods
```yaml
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: ["cache"]
        topologyKey: "kubernetes.io/hostname"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values: ["web"]
        topologyKey: "kubernetes.io/hostname"
```
This example combines both. The pod affinity rule requires this Pod to be scheduled on a node that already runs a Pod labeled `app=cache` (useful for co-locating an application with a local cache for lower latency). The pod anti-affinity rule requires it to avoid nodes that already run a Pod labeled `app=web`. When the Pod being scheduled is itself labeled `app=web`, the effect is that no two replicas of the "web" application share a node. This is a common high-availability pattern: a single node failure cannot take down multiple replicas of the same critical service at once. By default, the label selector only considers Pods in the same namespace as the Pod being scheduled.
The topologyKey — defining what "together" means
`topologyKey` determines the granularity of "together." It names a node label, and all nodes that share the same value for that label form one topology domain. `kubernetes.io/hostname` means "same node specifically"; `topology.kubernetes.io/zone` means "same availability zone" (a looser, zone-level notion of togetherness/separation). Anti-affinity keyed on zone rather than hostname is a common pattern for spreading replicas across failure domains larger than a single node, protecting against a whole zone going down, not just one machine.
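The zone-level variant can be sketched as follows. It reuses the `app: web` label from the earlier example and assumes the nodes carry the standard `topology.kubernetes.io/zone` label.

```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        # Nodes with the same value of this label count as one domain,
        # so a zone that already runs an app=web Pod is excluded entirely.
        topologyKey: "topology.kubernetes.io/zone"
```

Because this is a hard rule, the number of schedulable `app=web` replicas is capped by the number of zones; any extra replicas would stay Pending.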
Required vs. preferred — hard vs. soft constraints
Both affinity types support requiredDuringSchedulingIgnoredDuringExecution (a hard constraint — the Pod simply won't be scheduled if it can't be satisfied) and preferredDuringSchedulingIgnoredDuringExecution (a soft, weighted preference — the scheduler tries to satisfy it, but will still schedule the Pod elsewhere if it can't). The verbose naming itself is informative: "IgnoredDuringExecution" means these rules are only checked at scheduling time — if labels change after the Pod is already running such that the rule would no longer be satisfied, the already-running Pod isn't evicted retroactively.
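To illustrate how weighted preferences interact, here is a sketch with two preferred node affinity terms. The labels are the ones used earlier in this answer; the specific weights are arbitrary.

```yaml
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 80               # weights range from 1 to 100
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]
      - weight: 20
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
```

For each feasible node, the scheduler sums the weights of the terms that node satisfies: 100 for an SSD node in `us-east-1a`, 80 for a non-SSD node in that zone, 20 for an SSD node elsewhere. That sum is combined with the scheduler's other scoring factors, so a higher affinity score makes a node more likely to be chosen but does not guarantee it.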
Why anti-affinity for high availability is a very common real pattern
```yaml
podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    podAffinityTerm:
      labelSelector:
        matchLabels:
          app: web
      topologyKey: "kubernetes.io/hostname"
```
Using preferred (rather than required) anti-affinity for spreading replicas across nodes is a common, pragmatic middle ground — you get the availability benefit of spreading replicas across different nodes/zones under normal conditions, without the risk of Pods becoming entirely unschedulable during a genuine capacity crunch where a hard requirement couldn't be satisfied (e.g., a small cluster or a zone outage leaving too few eligible nodes).
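For context, a minimal sketch of how this fragment might sit inside a full Deployment is shown below. The Deployment name, replica count, and container image are placeholders, not part of the original example. The key detail is that the Pod template's own label matches the anti-affinity selector, which is what makes the replicas repel each other.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                      # hypothetical name
spec:
  replicas: 3                    # illustrative replica count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                 # the label the anti-affinity selector matches
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: web
              topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web
        image: nginx             # placeholder image
```

On a cluster with at least three schedulable nodes, the replicas would normally land on different nodes; with fewer nodes, the scheduler still places all three rather than leaving any Pending.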
Being able to distinguish node affinity (Pod vs. node labels) from pod affinity (Pod vs. other Pods) precisely, and knowing when required vs. preferred and hostname vs. zone-level topology keys are the right choice, demonstrates real scheduling design experience beyond just knowing the YAML fields exist.