How does kube-proxy route traffic to Service backends?

Detailed Answer

The core job: translate "Service virtual IP" into "one specific Pod IP"

A Service's ClusterIP is not an address that belongs to any Pod or physical network interface. It is a virtual IP that only means something because kube-proxy on every node has programmed the kernel to intercept traffic sent to it and redirect that traffic to one of the Service's backing Pods. kube-proxy watches the API server for Service and EndpointSlice changes and continuously updates each node's local networking configuration to reflect the current set of ready backends.
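The watch-and-update idea can be sketched as a function that rebuilds a routing table from the current Services and EndpointSlices. This is a deliberately simplified model, not kube-proxy's actual code: the Endpoint type, build_routing_table, and all addresses are hypothetical, and real EndpointSlices carry more state than a single ready flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical, simplified view of one EndpointSlice entry."""
    ip: str
    port: int
    ready: bool


def build_routing_table(services, endpoint_slices):
    """Map each (ClusterIP, port) to the list of ready (Pod IP, port) backends.

    services:        {service_name: (cluster_ip, port)}
    endpoint_slices: {service_name: [Endpoint, ...]}
    """
    table = {}
    for name, (cluster_ip, port) in services.items():
        # Only ready endpoints are eligible to receive traffic.
        ready = [(e.ip, e.port) for e in endpoint_slices.get(name, []) if e.ready]
        table[(cluster_ip, port)] = ready
    return table


if __name__ == "__main__":
    services = {"web": ("10.96.0.5", 80)}
    slices = {
        "web": [
            Endpoint("10.244.1.7", 8080, ready=True),
            Endpoint("10.244.2.3", 8080, ready=False),  # e.g. failing readiness
        ]
    }
    # Re-running this after every watch event keeps the table current.
    print(build_routing_table(services, slices))
```

In this model, every change reported by the API server triggers a rebuild, and the not-ready Pod drops out of the table without the Service's virtual IP changing.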

iptables mode (the long-standing default)

kube-proxy writes chains of iptables rules (iptables is the front end to the Linux kernel's netfilter framework) that match packets destined for a Service's ClusterIP:port and use DNAT to redirect each new connection to one of the currently ready backing Pod IP:ports, chosen at random with equal probability.

Packet destined for Service ClusterIP 10.96.0.5:80
   → iptables rule matches, picks one of the 3 backing Pod IPs (at random, equal probability)
   → DNAT rewrites the destination to the chosen Pod's actual IP:port
   → packet continues on to that Pod
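
In iptables mode the random pick is typically expressed as a sequence of rules, one per backend, where rule i matches with probability 1/(backends not yet tried) and the last rule always matches. The sketch below simulates that scheme to show that it gives each backend an equal share; pick_backend, rule_probabilities, and the Pod addresses are hypothetical names used only for illustration.

```python
import random
from collections import Counter


def rule_probabilities(n):
    """Per-rule match probabilities for n backends: 1/n, 1/(n-1), ..., 1."""
    return [1.0 / (n - i) for i in range(n)]


def pick_backend(backends, rng):
    """Walk the rules in order, as a packet would, and return the first match."""
    for backend, probability in zip(backends, rule_probabilities(len(backends))):
        # rng.random() is in [0, 1), so the final rule (probability 1.0)
        # always matches and the loop never falls through.
        if rng.random() < probability:
            return backend
    raise AssertionError("unreachable: the last rule always matches")


if __name__ == "__main__":
    pods = ["10.244.1.7:8080", "10.244.2.3:8080", "10.244.3.9:8080"]  # hypothetical
    rng = random.Random(0)
    counts = Counter(pick_backend(pods, rng) for _ in range(30_000))
    for pod in pods:
        print(pod, counts[pod])  # each count should be close to 10,000
```

The per-rule probabilities differ (1/3, 1/2, 1 for three backends), but the overall chance of landing on any one backend is the same: 1/3 for the first, 2/3 × 1/2 for the second, and 2/3 × 1/2 × 1 for the third.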

Limitation at scale: iptables evaluates rules sequentially, so the cost of matching a packet grows roughly linearly with the number of rules. With thousands of Services, the number of rules that may have to be checked for a new connection can become a measurable performance bottleneck, and rule updates (triggered whenever any Service's endpoints change) get progressively slower to apply as the ruleset grows.
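
The linear cost can be illustrated with a toy model in which the chain is a list of match rules checked in order. This is only a sketch of the scaling behavior, not of real netfilter internals; build_chain, comparisons_until_match, and the generated ClusterIPs are hypothetical.

```python
def build_chain(num_services):
    """One match rule per Service: (hypothetical ClusterIP, port 80)."""
    return [(f"10.96.{i // 256}.{i % 256}", 80) for i in range(num_services)]


def comparisons_until_match(chain, dest):
    """Count how many rules are checked before `dest` matches."""
    for checked, rule_dest in enumerate(chain, start=1):
        if rule_dest == dest:
            return checked
    return len(chain)  # no rule matched: every rule was checked


if __name__ == "__main__":
    for n in (10, 1_000, 10_000):
        chain = build_chain(n)
        # Worst case: the packet is for the Service whose rule comes last.
        print(n, "services ->", comparisons_until_match(chain, chain[-1]), "comparisons")
```

In this model the worst-case number of comparisons equals the number of Services, which is the growth pattern the paragraph above describes.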

IPVS mode — built for larger scale

IPVS (IP Virtual Server, a load balancer built into the Linux kernel) uses hash-table lookups instead of a linear rule chain, giving effectively O(1) Service lookup regardless of how many Services exist. It also supports several load-balancing algorithms (round robin, least connection, and others) rather than only the random selection used in iptables mode. Clusters with a very large number of Services (common in large multi-tenant or microservice-heavy environments) typically switch kube-proxy to IPVS mode specifically for this scaling advantage.
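
The two ideas in this section, a hash-table lookup keyed by virtual address and a pluggable scheduling algorithm, can be sketched as follows. This is a conceptual model rather than IPVS's actual data structures: VirtualService, Backend, route, and the "rr"/"lc" labels as used here are illustrative choices, and the addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    address: str
    active_connections: int = 0


@dataclass
class VirtualService:
    backends: list
    scheduler: str = "rr"  # "rr" = round robin, "lc" = least connection
    _next: int = 0         # round-robin cursor

    def pick(self):
        if self.scheduler == "rr":
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
        elif self.scheduler == "lc":
            # Ties go to the earliest backend in the list.
            backend = min(self.backends, key=lambda b: b.active_connections)
        else:
            raise ValueError(f"unknown scheduler: {self.scheduler}")
        backend.active_connections += 1
        return backend


def route(table, vip, port):
    """Single dict lookup, independent of how many Services the table holds."""
    service = table.get((vip, port))
    return service.pick().address if service else None


if __name__ == "__main__":
    table = {
        ("10.96.0.5", 80): VirtualService(
            [Backend("10.244.1.7:8080"), Backend("10.244.2.3:8080")], scheduler="rr"
        ),
    }
    for _ in range(3):
        print(route(table, "10.96.0.5", 80))
```

The dict lookup stands in for the hash table: adding more Services grows the table but not the work done per lookup, and switching the scheduler changes only how a backend is chosen once the Service has been found.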

Bypassing kube-proxy entirely: eBPF-based CNI plugins

Some CNI plugins, most notably Cilium, can replace kube-proxy entirely with eBPF programs that run directly in the kernel. They provide the same Service-routing behavior with lower latency and overhead than either iptables or IPVS, and often add richer observability as well. This is an increasingly common production configuration, though it changes the operational model somewhat: you rely on the CNI plugin, not a separate kube-proxy component, for this critical routing function.

Why understanding this mechanism matters practically

When debugging "a Service exists and has healthy endpoints, but traffic still isn't reaching Pods," it helps to remember that kube-proxy, not the Service object itself, is the component that turns the virtual IP into real routing rules. That tells you where to look. First, is kube-proxy running and healthy on the relevant nodes? Second, are its rules actually present and correct on those nodes (in iptables mode, iptables-save | grep <service-name>; in IPVS mode, ipvsadm -Ln lists the virtual servers and their backends)? Third, does the EndpointSlice actually list the expected ready Pod IPs in the first place?