What is an EndpointSlice, and why did it replace/augment Endpoints?

Detailed Answer

The role both objects play

Both Endpoints and EndpointSlices exist to answer the same question: "which Pod IP:port combinations are currently the healthy backends for this Service?" This is the data that kube-proxy and other consumers (such as a service mesh's control plane) watch and react to when programming routing rules.

# The older Endpoints object -- one object per Service, unbounded size
apiVersion: v1
kind: Endpoints
metadata:
  name: web           # matches the Service name
subsets:
  - addresses:
      - ip: 10.1.2.3
      - ip: 10.1.2.4
      # ... every single backing Pod IP, in ONE list, in ONE object
    ports:
      - port: 8080

The scaling problem this caused

For a Service backed by a very large number of Pods (thousands, in large clusters), every one of those Pod IPs lived inside a single Endpoints object. Any change, such as one Pod becoming unready or one Pod being replaced during a rollout, required the API server to serialize the entire updated list (potentially thousands of IP entries) and transmit it to every component watching that object. This scaled poorly: the size of each update grew with the number of backends, and the number of watchers receiving it grew with the number of nodes, so large rollouts and node churn became noticeably more expensive for the control plane.
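The cost described above can be put into rough numbers. The following is a back-of-the-envelope sketch only: the function names and the figures for backends and watchers are hypothetical, and it counts address entries rather than real bytes on the wire.

```python
# Rough model of the fan-out cost of updating a single Endpoints object.
# All numbers are illustrative assumptions, not measurements.

def endpoints_update_fanout(num_backends: int, num_watchers: int) -> int:
    """Address entries sent for ONE change to a single Endpoints object.

    Every watcher receives the full object, so the cost is the whole
    backend list multiplied by the number of watchers.
    """
    return num_backends * num_watchers


def rollout_fanout(num_backends: int, num_watchers: int, changes: int) -> int:
    """Total entries sent if `changes` separate updates occur during a rollout."""
    return changes * endpoints_update_fanout(num_backends, num_watchers)


if __name__ == "__main__":
    backends = 5_000  # hypothetical Pods behind one Service
    watchers = 1_000  # hypothetical nodes, each with a kube-proxy watching

    # One readiness flip: 5,000 entries x 1,000 watchers
    print(endpoints_update_fanout(backends, watchers))

    # Upper bound for a full rollout, assuming every Pod replacement
    # produces its own update (in practice changes may be batched)
    print(rollout_fanout(backends, watchers, changes=backends))
```

Under these assumed numbers a single readiness change already moves five million address entries, which is why the cost is described as growing with both Service size and cluster size.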

EndpointSlices — the same data, sharded into smaller pieces

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: web-abc123        # one of potentially several slices for the "web" Service
  labels:
    kubernetes.io/service-name: web    # links this slice back to its Service
addressType: IPv4
endpoints:
  - addresses: ["10.1.2.3"]
    conditions:
      ready: true
  - addresses: ["10.1.2.4"]
    conditions:
      ready: true
ports:
  - port: 8080

Instead of one unbounded object, a Service's full backend list is split across multiple EndpointSlice objects, each capped at a configurable maximum (100 endpoints by default, adjustable with the kube-controller-manager flag --max-endpoints-per-slice). When one Pod's readiness changes, only the slice containing that Pod needs to be updated and sent to watchers, not the backend list for the whole Service. This directly targets the update-cost-at-scale problem.
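The sharding idea can be sketched in a few lines. This is a simplified illustration, not the EndpointSlice controller's actual algorithm: it assumes backends are packed into consecutive, full slices, whereas the real controller tries to reuse existing slices and may leave some partially filled. The helper names and the IP range are hypothetical.

```python
from typing import List

MAX_ENDPOINTS_PER_SLICE = 100  # the default cap mentioned above


def shard(addresses: List[str], cap: int = MAX_ENDPOINTS_PER_SLICE) -> List[List[str]]:
    """Split a flat backend list into consecutive slices of at most `cap` entries."""
    return [addresses[i:i + cap] for i in range(0, len(addresses), cap)]


def slice_index_of(slices: List[List[str]], address: str) -> int:
    """Return the index of the slice holding `address` (the only one to rewrite)."""
    for index, members in enumerate(slices):
        if address in members:
            return index
    raise KeyError(address)


if __name__ == "__main__":
    # 5,000 hypothetical Pod IPs: 10.0.0.0 through 10.0.19.135
    backends = [f"10.0.{n // 256}.{n % 256}" for n in range(5_000)]
    slices = shard(backends)

    # A readiness change on one Pod touches exactly one slice
    touched = slice_index_of(slices, "10.0.1.44")
    print(f"{len(slices)} slices; update rewrites slice {touched} "
          f"({len(slices[touched])} of {len(backends)} entries)")
```

With these assumed numbers, a single change resends 100 entries instead of 5,000, a 50x reduction in payload per watcher.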

Additional improvements EndpointSlices brought along

Beyond sharding, EndpointSlices natively support dual-stack Services: IPv4 and IPv6 backends are published at the same time, in separate slices per address type. They also carry richer per-endpoint information that the older Endpoints object's simpler structure didn't accommodate well, such as the endpoint's zone and topology hints. Those hints drive topology-aware routing, which prefers keeping traffic within the same zone for latency and cost reasons.
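To make the topology hints concrete, here is a minimal sketch of how a consumer might use them. The field names (`hints`, `forZones`, `name`) follow the discovery.k8s.io/v1 schema; the function name, zone names, and the exact fallback rule are simplifying assumptions and do not reproduce every condition kube-proxy checks.

```python
from typing import Any, Dict, List

Endpoint = Dict[str, Any]


def hinted_zones(endpoint: Endpoint) -> List[str]:
    """Zone names this endpoint is hinted for (empty if it carries no hint)."""
    return [z["name"] for z in endpoint.get("hints", {}).get("forZones", [])]


def endpoints_for_zone(endpoints: List[Endpoint], zone: str) -> List[Endpoint]:
    """Prefer endpoints hinted for `zone`; otherwise fall back to all of them."""
    # Assumption: if any endpoint lacks a hint, the hints are treated as unusable
    if any(not hinted_zones(ep) for ep in endpoints):
        return endpoints
    local = [ep for ep in endpoints if zone in hinted_zones(ep)]
    # Never return an empty backend set just because no hint matches this zone
    return local or endpoints


if __name__ == "__main__":
    endpoints = [
        {"addresses": ["10.1.2.3"], "zone": "zone-a",
         "hints": {"forZones": [{"name": "zone-a"}]}},
        {"addresses": ["10.1.2.4"], "zone": "zone-b",
         "hints": {"forZones": [{"name": "zone-b"}]}},
    ]
    for ep in endpoints_for_zone(endpoints, "zone-a"):
        print(ep["addresses"][0])
```

The fallback to the full list is the important design choice: hints are an optimization, so a consumer should still route somewhere when they are missing or do not cover its zone.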

Practical relevance today

Endpoints objects still exist for backward compatibility with older tooling that reads them directly, and the control plane still maintains them alongside EndpointSlices. That copy is not always complete, however: an Endpoints object is truncated at 1,000 addresses and annotated with endpoints.kubernetes.io/over-capacity: truncated when a Service has more backends than that. EndpointSlices are what modern kube-proxy and other Service-consuming components actually watch and rely on. When debugging Service connectivity at scale, kubectl get endpointslices -l kubernetes.io/service-name=<service> (rather than the older kubectl get endpoints) is therefore the more complete and more idiomatic diagnostic path.
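Because a Service's backends are spread over several slices, a debugging session often needs them merged back into one view. The sketch below does that for the JSON produced by adding -o json to the kubectl get endpointslices command above. The `summarize` function, the `SAMPLE` data, and the second slice name are hypothetical; the JSON field names follow the discovery.k8s.io/v1 schema.

```python
import json
from typing import Dict, List

# Hypothetical, trimmed-down `kubectl get endpointslices ... -o json` output
SAMPLE = """
{
  "items": [
    {"metadata": {"name": "web-abc123"},
     "endpoints": [
       {"addresses": ["10.1.2.3"], "conditions": {"ready": true}},
       {"addresses": ["10.1.2.4"], "conditions": {"ready": true}}
     ]},
    {"metadata": {"name": "web-def456"},
     "endpoints": [
       {"addresses": ["10.1.2.5"], "conditions": {"ready": false}}
     ]}
  ]
}
"""


def summarize(endpointslice_list_json: str) -> Dict[str, List[str]]:
    """Group backend addresses by readiness across every slice in the list."""
    summary: Dict[str, List[str]] = {"ready": [], "not_ready": []}
    for item in json.loads(endpointslice_list_json).get("items", []):
        for endpoint in item.get("endpoints") or []:
            ready = (endpoint.get("conditions") or {}).get("ready")
            # An unset `ready` means "unknown"; it is counted as ready here
            bucket = "not_ready" if ready is False else "ready"
            summary[bucket].extend(endpoint.get("addresses", []))
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Against a live cluster, the same function could be fed the real command output instead of `SAMPLE`.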