Services and Networking

The problem: Pod IPs are not stable

Every Pod gets its own IP address when it starts — but that address is not durable. If a Pod crashes and is replaced, is rescheduled to a different node, or is part of a rolling update replacing it with a new version, the replacement gets a different IP address. Hardcoding a Pod's IP anywhere (in another application's configuration, in a load balancer) would break constantly as normal cluster operation replaced Pods.

Deployment "web" with 3 replicas might have Pod IPs:
  10.1.2.3, 10.1.2.4, 10.1.2.5   -- right now

After a node failure and rescheduling:
  10.1.3.7, 10.1.2.4, 10.1.4.1   -- two of the three addresses changed; a rolling update would replace all three

What a Service provides

apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # matches Pods with this label
  ports:
    - port: 80           # the Service's own stable port
      targetPort: 8080   # the port the Pods actually listen on

A Service gets its own stable virtual IP (a ClusterIP) and DNS name (web.default.svc.cluster.local), neither of which changes for the Service's lifetime, regardless of how many times its backing Pods are replaced. Any Pod in the same namespace can reach http://web (Pods in other namespaces use http://web.default or the full http://web.default.svc.cluster.local) and have traffic routed to one of the currently ready Pods matching the app: web label selector, without ever needing to know or track individual Pod IPs.

How the mapping stays current

The Service object itself is passive: a control-plane controller continuously watches for Pods matching the Service's selector and keeps an associated Endpoints (or EndpointSlice; see that question) object up to date with the current set of ready backing Pod IPs. kube-proxy on every node uses this list to program local networking rules (iptables or IPVS, depending on configuration; some CNI plugins such as Cilium replace kube-proxy with an eBPF data path) that route traffic sent to the Service's virtual IP to one of the currently listed ready Pod IPs.
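
The selector-to-endpoints mapping described above can be sketched in a few lines of Python. This is a simplified model, not the real controller: the Pod class, the ready flag, and the function names are hypothetical, and only equality-based selectors are handled.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    # Hypothetical, stripped-down stand-in for a real Pod object.
    name: str
    ip: str
    labels: dict[str, str] = field(default_factory=dict)
    ready: bool = True


def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Equality-based selector: every key/value pair must be present on the Pod."""
    return all(labels.get(key) == value for key, value in selector.items())


def ready_endpoints(selector: dict[str, str], pods: list[Pod]) -> list[str]:
    """Return the IPs of ready Pods whose labels satisfy the selector."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return []
    return [pod.ip for pod in pods if pod.ready and matches(selector, pod.labels)]


pods = [
    Pod("web-a", "10.1.2.3", {"app": "web"}),
    Pod("web-b", "10.1.2.4", {"app": "web"}, ready=False),  # not ready, so excluded
    Pod("db-a", "10.1.2.9", {"app": "db"}),                 # label does not match
]
print(ready_endpoints({"app": "web"}, pods))  # ['10.1.2.3']
```

Re-running ready_endpoints after any Pod change yields the new list, which is the role the Endpoints/EndpointSlice object plays for kube-proxy.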

Why this is the foundational abstraction for nearly everything else

Ingress, service meshes, StatefulSets (through headless Services), and almost every Deployment that receives traffic rely on the basic guarantee a Service provides: a stable way to address a set of Pods without caring about individual Pod identity or IP churn. "Services solve the problem of ephemeral Pod IPs by providing a stable, load-balanced front" is the conceptual anchor for the entire networking topic. The other networking objects (Ingress routing to Services, NetworkPolicies restricting traffic to and from the Pods a Service fronts, headless Services for StatefulSets) are variations on, or extensions of, this same core need.

ClusterIP — internal only (the default)

apiVersion: v1
kind: Service
metadata:
  name: backend-api
spec:
  type: ClusterIP     # default; can be omitted
  selector:
    app: backend-api
  ports:
    - port: 80
      targetPort: 8080

Gets a virtual IP reachable only from inside the cluster. This is the right choice for the overwhelming majority of Services — internal microservice-to-microservice communication (a frontend calling a backend API, an API calling a database) almost never needs to be reachable from outside the cluster directly.

NodePort — reachable via any node's IP, on a static port

spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080     # opened on EVERY node's IP, in the 30000-32767 range by default

Every node in the cluster starts listening on nodePort and forwards traffic to the Service, regardless of whether that specific node is actually running any of the backing Pods. This makes the Service reachable via <any-node-ip>:30080 from outside the cluster. It is a fairly low-level mechanism, though: you are responsible for load-balancing across nodes yourself, and the fixed port range is limited. NodePort is therefore rarely used directly in production and more commonly serves as a building block that LoadBalancer Services are implemented on top of.

LoadBalancer — provisions a real external cloud load balancer

spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080

On a supported cloud provider (AWS, GCP, Azure), creating a LoadBalancer Service triggers the cloud provider's integration to provision an actual external load balancer (an AWS ELB/NLB, a GCP Load Balancer) that gets a real, internet-routable IP address and forwards traffic into the cluster (typically via NodePort under the hood). This is the standard way to expose a single Service directly to the internet — but provisioning one external load balancer per Service gets expensive and unwieldy at any real scale, which is exactly the problem Ingress (see that question) solves.

ExternalName — a pure DNS alias, no proxying

# Fragment of a Service named my-service in the default namespace
spec:
  type: ExternalName
  externalName: my-database.us-east-1.rds.amazonaws.com

Creates no virtual IP and does no traffic proxying at all — it's purely a DNS-level CNAME-like redirect. Any Pod resolving my-service.default.svc.cluster.local gets redirected, at the DNS layer, straight to my-database.us-east-1.rds.amazonaws.com. Useful for giving an external dependency (a managed cloud database, a third-party API) a consistent in-cluster name, so application configuration can refer to a stable internal name even if the actual external address changes later.

Choosing between them

Need | Service type
Internal service-to-service communication only | ClusterIP
Low-level external access via node IPs (rare in practice) | NodePort
A single Service exposed directly to the internet | LoadBalancer
Many services need to be exposed under one external IP/domain, with path/host-based routing | Ingress (fronting ClusterIP Services)
A stable internal name for an external dependency | ExternalName

In practice, most production clusters use ClusterIP for internal services and a small number of LoadBalancer Services (or, more commonly, a single one fronting an Ingress controller) rather than exposing many individual Services externally.

The problem with one LoadBalancer Service per application

A LoadBalancer Service provisions a dedicated external cloud load balancer for that one Service — fine for a single application, but a cluster hosting dozens of services, each needing external HTTP access, would need dozens of expensive cloud load balancers, each with its own IP, and no shared logic for routing by hostname or path.

What Ingress solves: one entry point, many backends, routed by rules

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: main-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: backend-api
                port:
                  number: 80
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80

One Ingress can route api.example.com to the backend-api Service and app.example.com to the frontend Service. The Ingress controller behind it is typically exposed through exactly one LoadBalancer Service. Besides host-based and path-based routing, Ingress covers TLS termination and, depending on the controller, other HTTP-layer features (URL rewriting, request/response header manipulation). A plain L4 LoadBalancer Service knows nothing about any of these, since it just forwards raw TCP/UDP traffic without any awareness of HTTP semantics.
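
To make the routing rules concrete, here is a minimal Python sketch of how a controller might pick a backend for a request. It is an illustration under stated assumptions, not any real controller's code: the RULES table mirrors main-ingress above, route and prefix_matches are hypothetical names, hosts are matched exactly (no wildcards), and only pathType: Prefix is modeled, with the longest matching prefix winning.

```python
from typing import Optional

# (host, path prefix, backend Service name, Service port), mirroring main-ingress
RULES = [
    ("api.example.com", "/", "backend-api", 80),
    ("app.example.com", "/", "frontend", 80),
]


def path_elements(path: str) -> list[str]:
    """Split a URL path into its non-empty elements: '/a/b/' -> ['a', 'b']."""
    return [part for part in path.split("/") if part]


def prefix_matches(rule_path: str, request_path: str) -> bool:
    """pathType: Prefix compares whole path elements, so /cart does not match /cartoon."""
    rule_parts = path_elements(rule_path)
    return path_elements(request_path)[: len(rule_parts)] == rule_parts


def route(host: str, path: str, rules=RULES) -> Optional[tuple[str, int]]:
    """Return (service, port) for the most specific matching rule, or None."""
    candidates = [r for r in rules if r[0] == host and prefix_matches(r[1], path)]
    if not candidates:
        return None  # a real controller would serve a default backend or a 404
    best = max(candidates, key=lambda r: len(path_elements(r[1])))
    return best[2], best[3]


print(route("api.example.com", "/v1/users"))  # ('backend-api', 80)
print(route("app.example.com", "/"))          # ('frontend', 80)
print(route("other.example.com", "/"))        # None
```

Adding a more specific rule for the same host (for example a /cart prefix pointing at a different Service) would take precedence over the catch-all / rule for matching paths.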

The Ingress controller — the piece that actually does the work

Critically, creating an Ingress object by itself does nothing unless an Ingress controller is running in the cluster to watch for Ingress objects and implement their rules — much like how a Deployment spec sitting in etcd does nothing until the Deployment controller acts on it. Common Ingress controllers: NGINX Ingress Controller, Traefik, cloud-specific ones (AWS Load Balancer Controller, GCE Ingress). Different controllers support different annotations/features beyond the Ingress spec's baseline (rate limiting, custom load-balancing algorithms, WAF integration), so the choice of controller genuinely matters, not just the Ingress YAML itself.

LoadBalancer Service vs. Ingress — where each fits

Aspect | LoadBalancer Service | Ingress
Layer | L4 (TCP/UDP) | L7 (HTTP/HTTPS)
Routing granularity | Whole Service, one external LB each | Host/path-based routing to many Services through one entry point
Cost at scale (cloud) | One cloud LB per Service; expensive with many services | Typically one cloud LB total, fronting the Ingress controller
TLS termination | Possible, but more manual | Commonly built in via tls config referencing a Secret
Protocols beyond plain HTTP (raw TCP, UDP, gRPC over HTTP/2, etc.) | Works fine; it is protocol-agnostic at L4 | HTTP-focused; some controllers support gRPC or TCP passthrough with extensions, but it is not the core design target

The typical real-world setup

Most production clusters run one (or a small number of) LoadBalancer Services pointed at an Ingress controller Deployment, and expose every other application Service purely as ClusterIP, reachable from outside only through Ingress rules. This minimizes external cloud load balancer cost and centralizes TLS and routing configuration in one place rather than scattering it across many individually exposed Services.

Gateway API — the newer alternative

Kubernetes's newer Gateway API is a more expressive, role-oriented successor to Ingress (separating cluster-operator-owned infrastructure config from application-team-owned routing rules, and supporting more protocols/features natively) — increasingly adopted alongside or instead of Ingress, though Ingress remains extremely widely deployed and is not being removed.

The DNS naming pattern

Every Service automatically gets a DNS record following a predictable pattern:

<service-name>.<namespace>.svc.cluster.local
# A Service named "backend-api" in namespace "production"
# is resolvable at:
backend-api.production.svc.cluster.local

From within the same namespace, the short name alone resolves correctly (http://backend-api), because a Pod's DNS search domains include its own namespace — this is why application configuration inside a cluster almost never needs a fully-qualified name, just the plain Service name, as long as the caller and the Service are in the same namespace. Calling across namespaces requires at least backend-api.production (namespace included), or the fully-qualified form.

CoreDNS — the component that answers these queries

Modern Kubernetes clusters run CoreDNS (the successor to the older kube-dns) as a cluster add-on, typically itself a Deployment with a few replicas for availability, exposed via its own Service (usually named kube-dns for historical compatibility, even though it's running CoreDNS). Every Pod's /etc/resolv.conf is automatically configured (by the kubelet) to send DNS queries to CoreDNS's ClusterIP, with the appropriate search domains appended.

# Inside a Pod, this is what gets auto-configured:
cat /etc/resolv.conf
# nameserver 10.96.0.10          <- CoreDNS's Service ClusterIP
# search default.svc.cluster.local svc.cluster.local cluster.local
# options ndots:5                <- names with fewer than 5 dots try the search domains first
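
The effect of the search line can be sketched as follows. This Python snippet lists, in order, the fully qualified names a typical stub resolver would try for a given input. It assumes a Pod in the default namespace, the common ndots value of 5, and glibc-style ordering; other resolvers may differ, and candidate_names is a hypothetical helper, not part of any library.

```python
SEARCH_DOMAINS = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
NDOTS = 5  # assumption: the usual value in a Pod's resolv.conf


def candidate_names(name: str, search=SEARCH_DOMAINS, ndots: int = NDOTS) -> list[str]:
    """Return the absolute names tried, in order, when resolving `name`."""
    if name.endswith("."):
        return [name]  # already absolute: the search list is skipped
    expanded = [f"{name}.{domain}." for domain in search]
    as_is = f"{name}."
    if name.count(".") >= ndots:
        return [as_is] + expanded  # "enough" dots: try the literal name first
    return expanded + [as_is]      # otherwise walk the search list first


for candidate in candidate_names("backend-api.production"):
    print(candidate)
# backend-api.production.default.svc.cluster.local.   (no such Service)
# backend-api.production.svc.cluster.local.           (the cross-namespace hit)
# backend-api.production.cluster.local.
# backend-api.production.
```

A short name such as backend-api matches on the first candidate when the Service lives in the Pod's own namespace, while an external name like api.example.com is tried against all three search domains before the literal name, which is one source of extra lookup latency.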

What gets a DNS record, and what doesn't

  • Every Service gets a DNS A/AAAA record resolving to its ClusterIP (or, for a headless Service, resolving directly to its backing Pods' individual IPs — see that question).
  • Pods themselves can optionally get individual DNS records too (if hostname and subdomain are set and a headless Service with the matching name exists; mainly relevant for StatefulSets, which set these automatically and where individually addressing web-0 vs web-1 matters).
  • Ordinary Pods (not part of a headless-Service-backed StatefulSet) don't get a stable, name-based DNS record by default; you address them collectively, through their Service.

Why this is the "built-in service discovery" story

Rather than requiring applications to register themselves with, and query, a separate external service registry (like Consul, or a hand-rolled database of "which host runs which service"), Kubernetes uses its own control plane's already-authoritative knowledge of every Service and its endpoints to answer DNS queries directly and automatically. An application only needs to know one thing at deploy time — the Service's name — and DNS plus the Service abstraction together handle everything about which actual Pod IPs currently back it, with zero application-level service-registry code required.

A common practical gotcha

DNS resolution inside a Pod has a real (if usually small) latency cost, and some language runtimes' default DNS resolvers cache results in ways that don't always respect TTLs correctly, or don't retry properly against multiple nameserver entries — this occasionally causes subtle connectivity issues after a Service's backing Pods change, and is worth knowing as a troubleshooting angle when "everything looks fine in Kubernetes but the app still can't reach its dependency" comes up.

The default: flat, fully-open networking

Out of the box, Kubernetes's networking model guarantees every Pod can reach every other Pod's IP directly, cluster-wide, with no NAT and no default restriction — this "flat network" model is a deliberate simplicity choice in the base Kubernetes networking design, but it means a compromised or misbehaving Pod in one namespace can, by default, reach any Pod in any other namespace, including ones it has no legitimate business talking to.

Restricting traffic with a NetworkPolicy

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-allow-from-frontend-only
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend-api          # this policy applies to Pods labeled app=backend-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend       # only allow traffic FROM pods labeled app=frontend
      ports:
        - protocol: TCP
          port: 8080

This says: Pods labeled app: backend-api in the production namespace only accept inbound traffic on TCP port 8080, and only from Pods labeled app: frontend in that same namespace (a bare podSelector under from never matches Pods in other namespaces). Traffic from any other Pod, including other Pods in production not labeled frontend and every Pod in every other namespace, is denied unless another NetworkPolicy allows it.

The default-deny pattern

An empty podSelector: {} matches all Pods in the namespace, and with no ingress/egress rules specified, the policy denies all traffic of every type listed under policyTypes. This is a common, deliberate first step for hardening a namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

Applied alone, this blocks all traffic in/out of every Pod in the namespace — then additional, more permissive NetworkPolicies are layered on top to explicitly allow only the specific traffic patterns actually needed (frontend → backend, backend → database, everything → DNS). This "default deny, then explicitly allow" approach is the standard, security-recommended pattern rather than trying to enumerate every disallowed path from an otherwise-open default.
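
The "default deny, then explicitly allow" layering works because NetworkPolicies are additive: a Pod selected by at least one policy is isolated, and traffic is then allowed if any selecting policy has a matching rule. The Python sketch below models only that decision for ingress within a single namespace. The Policy and IngressRule classes and the ingress_allowed function are hypothetical simplifications (no namespaceSelector, ipBlock, or egress), not a CNI plugin's actual logic.

```python
from dataclasses import dataclass, field


@dataclass
class IngressRule:
    from_labels: dict[str, str]  # stands in for a `from` podSelector
    port: int


@dataclass
class Policy:
    name: str
    pod_selector: dict[str, str]  # {} selects every Pod in the namespace
    ingress: list[IngressRule] = field(default_factory=list)


def selects(selector: dict[str, str], labels: dict[str, str]) -> bool:
    return all(labels.get(key) == value for key, value in selector.items())


def ingress_allowed(dst: dict[str, str], src: dict[str, str], port: int,
                    policies: list[Policy]) -> bool:
    applicable = [p for p in policies if selects(p.pod_selector, dst)]
    if not applicable:
        return True  # no policy selects the Pod: it is not isolated
    # Isolated: allowed only if some rule in some applicable policy matches.
    return any(selects(rule.from_labels, src) and rule.port == port
               for policy in applicable for rule in policy.ingress)


default_deny = Policy("default-deny-all", {})
allow_frontend = Policy("backend-allow-from-frontend-only", {"app": "backend-api"},
                        [IngressRule({"app": "frontend"}, 8080)])
policies = [default_deny, allow_frontend]

backend, frontend, batch = {"app": "backend-api"}, {"app": "frontend"}, {"app": "batch"}
print(ingress_allowed(backend, frontend, 8080, policies))  # True
print(ingress_allowed(backend, batch, 8080, policies))     # False
print(ingress_allowed(frontend, backend, 8080, policies))  # False
```

Note that the deny policy never needs to be removed or overridden: the allow policy simply contributes one more permitted path for the Pods it selects.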

The critical caveat: NetworkPolicy needs CNI support

A NetworkPolicy object is just declarative configuration — like any Kubernetes object, it does nothing on its own unless something in the cluster actually enforces it. NetworkPolicy enforcement depends entirely on the cluster's CNI plugin (see that question) supporting it — Calico, Cilium, and several others implement NetworkPolicy enforcement; some simpler CNI plugins (certain Flannel configurations, in particular) do not, meaning NetworkPolicy objects you create are silently accepted by the API server but have zero actual effect on traffic. Verifying your specific CNI plugin actually enforces NetworkPolicies is an essential, easy-to-overlook step — a false sense of security from an unenforced NetworkPolicy is worse than no policy at all, since it looks secured but isn't.

For any cluster running genuinely multi-tenant workloads, or handling sensitive data, treat NetworkPolicies (backed by a CNI plugin that actually enforces them) as a baseline security control, not an optional extra — combined with a default-deny starting posture per namespace and explicit allow rules for legitimate traffic paths only.