What is etcd, and why is it critical to a Kubernetes cluster?
Quick Answer
etcd is a distributed, strongly consistent key-value store, built on the Raft consensus algorithm, that holds the complete state of a Kubernetes cluster: every object's spec and status. It is the only stateful component in the control plane and the single source of truth the API server reads from and writes to. Losing etcd's data without a backup means losing the cluster's entire configuration and state, which is why etcd backups and a tested restore procedure are non-negotiable for any production cluster.
Detailed Answer
What etcd actually stores
Every Kubernetes object you create (every Deployment, Service, ConfigMap, and Secret, along with status such as each Pod's) is ultimately stored as a key-value pair in etcd. The API server is the only component that talks to etcd directly; everything else (kubectl, the scheduler, controllers, kubelets) goes through the API server, which reads from and writes to etcd on their behalf.
kubectl apply -f deployment.yaml
→ API server validates & authorizes
→ API server writes the Deployment object to etcd
→ Controller manager's Deployment controller, watching the API server, notices the new or changed object and creates a matching ReplicaSet, whose own controller then creates the Pods
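To make "stored as a key in etcd" concrete, here is a minimal sketch of how the key for a namespaced object is laid out, assuming the API server's default /registry prefix. The function names are hypothetical, the endpoint and certificate paths are kubeadm defaults and may differ on your cluster, and the key layout is an internal detail of Kubernetes rather than a stable API.

```shell
# Build the etcd key for a namespaced object.
# Assumes the API server's default --etcd-prefix of /registry.
registry_key() {
  # $1 = resource (plural, lowercase), $2 = namespace, $3 = object name
  printf '/registry/%s/%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: list the keys stored for one resource type.
# Endpoint and certificate paths are kubeadm defaults (an assumption).
# Defined here for illustration only; it is not called by this sketch.
list_keys() {
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    get "/registry/$1" --prefix --keys-only
}

# Print the key for a Deployment named "web" in the "default" namespace.
registry_key deployments default web
```

On a control plane node, running list_keys deployments would list one key per Deployment. The values are typically stored in a binary protobuf encoding, so they are not meant to be read or edited by hand.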
Why Raft consensus matters
etcd is typically run as a cluster of an odd number of nodes (commonly 3 or 5) that use the Raft consensus algorithm to agree on writes: a write is considered committed only once a majority (quorum) of etcd nodes have durably persisted it. This gives etcd strong consistency (by default, every read reflects the most recently committed write) and tolerance of node failures (a 5-node etcd cluster can lose 2 nodes and keep operating, since the remaining 3 still form a majority). The cost is that etcd write latency is bounded by the slowest node needed to reach quorum, so etcd performance is quite sensitive to disk I/O latency and to network latency between its nodes.
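The quorum arithmetic above can be sketched in a few lines of shell. The function names are illustrative and not part of any etcd tooling.

```shell
# Majority of members required to commit a write in an n-member cluster.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a quorum still remains.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare common cluster sizes.
for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerates=$(fault_tolerance "$n")"
done
```

The loop shows why odd sizes are preferred: a 4-member cluster needs 3 members for quorum and so tolerates only 1 failure, the same as a 3-member cluster, while adding one more node that every write may have to wait on.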
Why losing etcd is catastrophic
Every other control plane component is effectively stateless or easily reconstructible: the API server holds no state of its own, and the scheduler and controllers can be restarted, after which they simply re-read current state from etcd (via the API server) and resume operating. But if etcd's data is lost or corrupted without a backup, there is no other copy of the cluster's state anywhere. Every Deployment, Service, and Secret, along with its current status, is simply gone, and the cluster must effectively be rebuilt from whatever configuration (YAML manifests, Helm charts, GitOps repositories) exists outside the cluster.
Backup and disaster recovery
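# Take a point-in-time snapshot of etcd's data
# (a secured cluster also needs --endpoints, --cacert, --cert and --key)
etcdctl snapshot save backup.db
# Restore from a snapshot into a fresh data directory (typically as part of rebuilding a control plane node).
# The directory below is an example path; on etcd 3.5 and later, "etcdutl snapshot restore" is the preferred form.
etcdctl snapshot restore backup.db --data-dir /var/lib/etcd-restored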
Regular, automated etcd snapshots, stored somewhere other than the etcd nodes themselves and combined with periodic restore testing (an untested backup isn't a real backup), are standard practice for any self-managed production cluster. Managed Kubernetes services (EKS, GKE, AKS) handle etcd backup and the entire control plane's resilience for you, which is one of the most significant operational burdens they take off a team's plate compared to self-hosting.
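An automated snapshot job might look like the sketch below. It is an illustration under assumptions, not a production script: the function names are hypothetical, the endpoint and certificate paths are kubeadm defaults, and etcdutl is assumed to be installed (it ships with etcd 3.5 and later; on older releases the same status subcommand lives under etcdctl).

```shell
# Hypothetical helper: build a timestamped snapshot path so that
# successive runs never overwrite each other.
snapshot_path() {
  # $1 = backup directory, $2 = timestamp string
  printf '%s/etcd-snapshot-%s.db\n' "$1" "$2"
}

# Take a snapshot into directory $1 and confirm the file is readable.
# Endpoint and certificate paths are kubeadm defaults (an assumption).
# Defined here for illustration only; it is not called by this sketch.
take_snapshot() {
  out=$(snapshot_path "$1" "$(date -u +%Y%m%dT%H%M%SZ)")
  ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save "$out" || return 1
  # Print the snapshot's hash, revision, key count and size to stderr;
  # a non-zero exit here means the file could not be read back.
  etcdutl snapshot status "$out" --write-out=table >&2 || return 1
  # Emit the path so the caller can copy it to storage off the etcd nodes.
  echo "$out"
}

# Show the file name a run at a given timestamp would produce.
snapshot_path /var/backups/etcd 20240101T000000Z
```

A scheduler such as cron or a systemd timer could call take_snapshot and then copy the resulting file off the node. A status check only shows the file is readable, so it complements periodic restore testing rather than replacing it.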
Security note
Because etcd holds every Secret's data, and by default stores it unencrypted unless encryption at rest is explicitly configured (see the security topic), direct network access to etcd must be tightly restricted to the control plane components that need it. Encryption at rest should be enabled for any cluster storing genuinely sensitive Secret data.
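One way to check whether Secrets are actually encrypted at rest is to look at the first bytes of a stored value: values written through a Kubernetes encryption provider begin with a k8s:enc:<provider>: prefix (for example k8s:enc:aescbc:v1:). The sketch below classifies a value on that basis. The function name, the Secret name, and the sample strings are hypothetical.

```shell
# Classify a raw etcd value by its leading bytes.
# Values written through an encryption provider start with "k8s:enc:".
storage_state() {
  case $1 in
    k8s:enc:*) echo encrypted ;;
    *)         echo plaintext ;;
  esac
}

# Hypothetical check against a live cluster; add the --endpoints and TLS
# flags your cluster requires. The Secret name is an assumption.
#   raw=$(ETCDCTL_API=3 etcdctl get /registry/secrets/default/my-secret \
#           --print-value-only | head -c 32 | tr -d '\0')
#   storage_state "$raw"

# Sample inputs standing in for the first bytes of a stored value.
storage_state 'k8s:enc:aescbc:v1:key1:ciphertext'
storage_state 'k8s-unencrypted-sample'
```

A result of plaintext for a real Secret suggests that encryption at rest is not configured, or that the Secret was written before it was enabled and has not been rewritten since.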