What is node affinity/anti-affinity, and how does it differ from pod affinity/anti-affinity?

Detailed Answer

Node affinity — based on node labels

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
      preferredDuringSchedulingIgnoredDuringExecution:    # soft preference
        - weight: 80
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]

This says: the Pod must land on a node labeled disktype=ssd (a hard requirement; the Pod stays Pending if no such node has room), and, among the nodes that qualify, the scheduler prefers (but does not require) one in zone us-east-1a. Node affinity is a more expressive successor to the simpler nodeSelector field: it supports richer operators (In, NotIn, Exists, DoesNotExist, Gt, Lt) and the required/preferred distinction. There is no separate nodeAntiAffinity field; to keep a Pod away from certain nodes, use the NotIn or DoesNotExist operators inside nodeAffinity.
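As a rough sketch of keeping a Pod off certain nodes with the negative operators, the fragment below could be used. The label example.com/dedicated and the value hdd are assumptions made up for illustration, not labels your cluster necessarily has.

```yaml
# Sketch only: label keys and values are illustrative assumptions.
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          # Expressions inside one term are ANDed; separate terms are ORed.
          - matchExpressions:
              - key: disktype
                operator: NotIn              # avoid nodes labeled disktype=hdd
                values: ["hdd"]
              - key: example.com/dedicated   # hypothetical label
                operator: DoesNotExist       # avoid nodes that carry this label at all
```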

Pod affinity/anti-affinity — based on co-located Pods

spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values: ["cache"]
          topologyKey: "kubernetes.io/hostname"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app
                operator: In
                values: ["web"]
          topologyKey: "kubernetes.io/hostname"

This example combines both. Pod affinity requires this Pod to be scheduled on a node that already runs a Pod labeled app=cache (useful for co-locating an application with a local cache for lower latency). Pod anti-affinity requires this Pod to avoid nodes that already run a Pod labeled app=web. Assuming this Pod is itself labeled app=web, that means no two replicas of the "web" application share a node, a common high-availability pattern: a single node failure can't take down multiple replicas of the same critical service at once.

The topologyKey — defining what "together" means

topologyKey names a node label, and nodes that share the same value for that label count as one topology domain. It therefore sets the granularity of "together": kubernetes.io/hostname means "same node", while topology.kubernetes.io/zone means "same availability zone" (a looser, zone-level notion of togetherness/separation; region-level would be topology.kubernetes.io/region). Anti-affinity keyed on zone rather than hostname is a common pattern for spreading replicas across failure domains larger than a single node, protecting against a whole zone going down, not just one machine.
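A minimal sketch of zone-level separation, assuming the Pod carrying this rule is itself labeled app=web: swapping the topologyKey is the only change needed compared with the per-node version.

```yaml
# Sketch only: assumes every "web" replica carries the label app=web and this rule.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: web
          # Nodes with the same zone label value form one domain,
          # so no two app=web Pods may share a zone.
          topologyKey: "topology.kubernetes.io/zone"
```

Because this is a hard rule, any replica beyond the number of available zones would stay Pending.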

Required vs. preferred — hard vs. soft constraints

Both affinity types support requiredDuringSchedulingIgnoredDuringExecution (a hard constraint: the Pod stays Pending if it can't be satisfied) and preferredDuringSchedulingIgnoredDuringExecution (a soft, weighted preference: the scheduler tries to satisfy it, but will still schedule the Pod elsewhere if it can't). The verbose naming itself is informative: "IgnoredDuringExecution" means these rules are only checked at scheduling time. If node or Pod labels change after the Pod is already running so that the rule is no longer satisfied, the running Pod is not evicted retroactively.
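To illustrate how soft preferences combine, here is a sketch with two weighted terms (the particular weights are arbitrary choices for illustration). Each weight must be between 1 and 100. For every node that passes the hard filters, the scheduler adds up the weights of the preferred terms that node matches and factors the total into the node's score.

```yaml
# Sketch only: weights are arbitrary illustrative values.
spec:
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 80                     # strong preference for this zone
          preference:
            matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["us-east-1a"]
        - weight: 20                     # weaker preference for SSD nodes
          preference:
            matchExpressions:
              - key: disktype
                operator: In
                values: ["ssd"]
```

A node matching both terms contributes 100 to the affinity score, one matching only the zone term contributes 80, and a node matching neither remains eligible with no affinity bonus.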

Why anti-affinity for high availability is a very common real pattern

podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: web
        topologyKey: "kubernetes.io/hostname"

Using preferred (rather than required) anti-affinity to spread replicas is a common, pragmatic middle ground. Under normal conditions you get the availability benefit of replicas landing on different nodes or zones. During a genuine capacity crunch (for example, a small cluster, or a zone outage that leaves too few eligible nodes), the scheduler can still double up replicas instead of leaving Pods unschedulable, which is what a hard requirement would do.
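For context, the fragment above belongs under the Pod template of a workload. A minimal sketch of a complete Deployment follows; the name web, the replica count, and the nginx image are placeholders assumed for illustration.

```yaml
# Sketch only: name, replica count, and image are placeholder assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web                 # the label the anti-affinity rule selects on
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: web
                topologyKey: "kubernetes.io/hostname"
      containers:
        - name: web
          image: nginx:1.27      # placeholder image
```

The rule only spreads replicas because the Pod template carries the same app: web label that the labelSelector matches.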

Being able to distinguish node affinity (Pod vs. node labels) from pod affinity (Pod vs. other Pods) precisely, and knowing when required vs. preferred and hostname vs. zone-level topology keys are the right choice, demonstrates real scheduling design experience beyond just knowing the YAML fields exist.