What is an EndpointSlice, and why did it replace/augment Endpoints?
Quick Answer
An EndpointSlice tracks the set of network endpoints (Pod IPs and ports) backing a Service, which is the same role the older Endpoints object played. The difference is that EndpointSlices split a large backend list across multiple smaller objects (each capped at 100 endpoints by default) instead of storing it in one single, unbounded object. They were introduced to fix a scalability problem: a Service with thousands of backing Pods meant one enormous Endpoints object that had to be rewritten in full and redistributed to every watching component on every change, which became a real performance bottleneck at scale.
Detailed Answer
The role both objects play
Both Endpoints and EndpointSlices exist to answer the same question: "which Pod IP:port combinations are currently the healthy backends for this Service?" This is the data that kube-proxy (and other consumers, such as a service mesh's control plane) watches and reacts to when programming routing rules.
# The older Endpoints object -- one object per Service, unbounded size
apiVersion: v1
kind: Endpoints
metadata:
  name: web  # matches the Service name
subsets:
  - addresses:
      - ip: 10.1.2.3
      - ip: 10.1.2.4
      # ... every single backing Pod IP, in ONE list, in ONE object
    ports:
      - port: 8080
The scaling problem this caused
For a Service backed by a very large number of Pods (thousands, in large clusters), every one of those Pod IPs lived inside a single Endpoints object. Any change, such as one Pod becoming unready or one Pod being replaced during a rollout, required the API server to serialize the entire updated list (potentially thousands of IP entries) and transmit it to every component watching that object, which typically includes kube-proxy on every node. This scaled poorly: both the size of each update and the number of components that had to process it grew with cluster size, so large rollouts and node churn became noticeably more expensive for the control plane.
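The cost described above can be put into rough numbers. The following is a back-of-the-envelope sketch, not a measurement: the bytes-per-endpoint figure, the Pod count, and the watcher count are all illustrative assumptions, and `bytes_sent_per_change` is a hypothetical helper.

```python
# Rough model of watch fan-out cost. All numbers are illustrative assumptions.

def bytes_sent_per_change(endpoints_in_object: int,
                          watchers: int,
                          bytes_per_endpoint: int = 100) -> int:
    """Approximate bytes sent when one object changes.

    A watch event carries the whole updated object, so the cost scales with
    the object's size multiplied by the number of watchers.
    """
    return endpoints_in_object * bytes_per_endpoint * watchers


# Hypothetical cluster: 5,000 backing Pods, 1,000 nodes each running a watcher.
single_object = bytes_sent_per_change(endpoints_in_object=5_000, watchers=1_000)

# The same single-Pod change if the affected object held only 100 endpoints.
one_small_object = bytes_sent_per_change(endpoints_in_object=100, watchers=1_000)

print(f"one 5,000-endpoint object: {single_object / 1e6:.0f} MB per change")
print(f"one 100-endpoint object:   {one_small_object / 1e6:.0f} MB per change")
print(f"ratio:                     {single_object // one_small_object}x")
```

Under these assumed figures, a single readiness flip costs roughly 50 times more to distribute when all endpoints live in one object. The real ratio depends on actual object encoding and the number of watchers.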
EndpointSlices — the same data, sharded into smaller pieces
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: web-abc123  # one of potentially several slices for the "web" Service
  labels:
    kubernetes.io/service-name: web  # links this slice back to its Service
addressType: IPv4
endpoints:
  - addresses: ["10.1.2.3"]
    conditions:
      ready: true
  - addresses: ["10.1.2.4"]
    conditions:
      ready: true
ports:
  - port: 8080
Instead of one unbounded object, a Service's full backend list is split across multiple EndpointSlice objects, each capped at a configurable maximum (100 endpoints by default, set through the kube-controller-manager flag --max-endpoints-per-slice). When one Pod's readiness changes, only the slice containing that Pod has to be updated and redistributed, not the entire backend list for the Service. That directly targets the update-cost-at-scale problem.
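The effect of sharding can be illustrated with a minimal sketch. It chunks a flat list in order, which is a simplification: the real EndpointSlice controller decides placement with its own logic (for example, filling existing slices), so `shard` and `mark_unready` below are hypothetical helpers, not the controller's algorithm.

```python
MAX_ENDPOINTS_PER_SLICE = 100  # the default cap discussed above


def shard(ips: list[str],
          max_per_slice: int = MAX_ENDPOINTS_PER_SLICE) -> list[dict[str, bool]]:
    """Split a flat backend list into slices, each mapping ip -> ready."""
    return [
        {ip: True for ip in ips[start:start + max_per_slice]}
        for start in range(0, len(ips), max_per_slice)
    ]


def mark_unready(slices: list[dict[str, bool]], ip: str) -> int:
    """Flip one endpoint to not-ready and return the index of the one slice that changed."""
    for index, endpoints in enumerate(slices):
        if ip in endpoints:
            endpoints[ip] = False
            return index
    raise KeyError(ip)


# 250 hypothetical Pod IPs: 10.1.0.0 through 10.1.0.249
ips = [f"10.1.0.{n}" for n in range(250)]
slices = shard(ips)
changed = mark_unready(slices, "10.1.0.142")

print("slice sizes:", [len(s) for s in slices])
print("slice touched by the change:", changed)
```

Only the slice at the returned index would need to be re-sent to watchers; the other slices are untouched.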
Additional improvements EndpointSlices brought along
Beyond sharding, EndpointSlices natively support dual-stack networking: a Service with both IPv4 and IPv6 backends gets separate slices per address type. They also carry richer per-endpoint information, such as topology hints, which topology-aware routing uses to prefer keeping traffic within the same zone for latency and cost reasons. The older Endpoints object's simpler structure did not accommodate either of these well.
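Both ideas can be sketched in a few lines. This is an illustration under stated assumptions: the endpoint dictionaries use a flattened, hypothetical `hints` list of zone names rather than the real API structure, and the "fall back to every endpoint" rule in `endpoints_for_zone` is a simplification of what an actual consumer does.

```python
import ipaddress
from collections import defaultdict


def group_by_address_type(addresses: list[str]) -> dict[str, list[str]]:
    """Bucket addresses into 'IPv4' and 'IPv6', one bucket per slice addressType."""
    groups: dict[str, list[str]] = defaultdict(list)
    for addr in addresses:
        version = ipaddress.ip_address(addr).version  # 4 or 6
        groups[f"IPv{version}"].append(addr)
    return dict(groups)


def endpoints_for_zone(endpoints: list[dict], zone: str) -> list[dict]:
    """Prefer endpoints hinted for this zone; otherwise use all of them (simplified)."""
    hinted = [e for e in endpoints if zone in e.get("hints", [])]
    return hinted or endpoints


# Hypothetical dual-stack backend addresses.
groups = group_by_address_type(["10.1.2.3", "fd00::3", "10.1.2.4"])

# Hypothetical endpoints with simplified zone hints.
endpoints = [
    {"ip": "10.1.2.3", "hints": ["zone-a"]},
    {"ip": "10.1.2.4", "hints": ["zone-b"]},
]
local = endpoints_for_zone(endpoints, "zone-a")
fallback = endpoints_for_zone(endpoints, "zone-c")

print(groups)
print([e["ip"] for e in local], [e["ip"] for e in fallback])
```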
Practical relevance today
Endpoints objects still exist for backward compatibility with older tooling that reads them directly, and the control plane still maintains them alongside EndpointSlices. Note that for very large Services the Endpoints object is truncated (at 1000 endpoints in current Kubernetes versions), so it may not list every backend. EndpointSlices are what modern kube-proxy and other Service-consuming components actually watch and rely on. When debugging Service connectivity, especially at scale, prefer kubectl get endpointslices -l kubernetes.io/service-name=<service> over the older kubectl get endpoints: it shows the complete, authoritative data.
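When a Service has many slices, it can help to summarize the JSON output of that kubectl command rather than read it by eye. The sketch below assumes the standard `-o json` list shape (an `items` array of EndpointSlice objects); `summarize_slices` is a hypothetical helper, and the sample document is made up for illustration.

```python
import json


def summarize_slices(kubectl_json: str) -> dict[str, dict[str, int]]:
    """Count ready and not-ready endpoints per slice from `-o json` output."""
    summary: dict[str, dict[str, int]] = {}
    for item in json.loads(kubectl_json).get("items", []):
        counts = {"ready": 0, "not_ready": 0}
        for endpoint in item.get("endpoints") or []:
            # A missing or null `ready` condition means "unknown";
            # this sketch treats it as ready.
            ready = (endpoint.get("conditions") or {}).get("ready", True)
            counts["not_ready" if ready is False else "ready"] += 1
        summary[item["metadata"]["name"]] = counts
    return summary


# Made-up sample resembling the slice shown earlier in this answer.
sample = json.dumps({
    "items": [{
        "metadata": {"name": "web-abc123"},
        "endpoints": [
            {"addresses": ["10.1.2.3"], "conditions": {"ready": True}},
            {"addresses": ["10.1.2.4"], "conditions": {"ready": False}},
        ],
    }]
})
result = summarize_slices(sample)
print(result)
```

In practice the input would come from something like kubectl get endpointslices -l kubernetes.io/service-name=web -o json, piped into a script that reads standard input.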