What is the Kubernetes API server, and how does kubectl interact with it?
Quick Answer
The API server is a RESTful HTTP API that's the single entry point for all cluster operations — every read or write, from `kubectl`, a controller, or any other client, is an HTTP request to the API server, which authenticates and authorizes the request, validates it, and reads from or writes to etcd on the caller's behalf. `kubectl` is essentially a thin client that translates commands and YAML manifests into API server requests and formats the JSON responses for human-readable output.
Detailed Answer
The API server as the single front door
Every interaction with a Kubernetes cluster, whether a human running `kubectl get pods`, a controller watching for changes, or the scheduler assigning a Pod to a node, happens through the API server's REST endpoints. In a standard cluster, the API server is the only component that talks directly to etcd.
kubectl get pods -n default
  → kubectl sends: GET https://<api-server>/api/v1/namespaces/default/pods
  → API server: authenticates the request, checks RBAC authorization,
    reads matching Pod objects from etcd, returns JSON
  → kubectl formats the JSON response as the human-readable table you see
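The URL in the trace above follows a regular pattern, which a short Python sketch can make concrete. The function name `resource_path` and its parameters are illustrative assumptions, not part of kubectl or any client library. The layout itself is the one the API server exposes: `/api/<version>` for the core group and `/apis/<group>/<version>` for named groups.

```python
def resource_path(resource, version="v1", group="", namespace=None, name=None):
    """Build the REST path for a Kubernetes resource (illustrative helper)."""
    # The core group lives under /api; every named group lives under /apis.
    prefix = f"/api/{version}" if group == "" else f"/apis/{group}/{version}"
    parts = [prefix]
    if namespace is not None:
        # Namespaced resources are nested under their namespace.
        parts.append(f"namespaces/{namespace}")
    parts.append(resource)
    if name is not None:
        # A trailing name addresses one object instead of the collection.
        parts.append(name)
    return "/".join(parts)


if __name__ == "__main__":
    print(resource_path("pods", namespace="default"))
    print(resource_path("deployments", group="apps", namespace="default"))
    print(resource_path("nodes"))  # cluster-scoped: no namespace segment
```

The first call reproduces the path that `kubectl get pods -n default` requests. The second shows that a Deployment, which belongs to the `apps` group, differs only in its prefix.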
The request pipeline
Every request passes through several stages. Authentication establishes who you are (client certificate, bearer token, etc.). Authorization decides whether you are allowed to do this, typically via RBAC. Requests that create, modify, or delete objects then go through admission control: built-in admission plugins plus mutating and validating webhooks that can modify or reject the request (see the security topic). Read requests skip admission. Finally, the API server performs the actual read or write against etcd. Any stage can reject the request, which is why a well-formed `kubectl apply` can still fail with a permissions error or an admission webhook rejection even though the YAML itself is syntactically valid.
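The staged rejection described above can be modeled in a few lines of Python. This is a toy sketch, not how the API server is implemented: `ApiRequest`, `PERMISSIONS`, the "owner label" policy, and the in-memory `store` list are all hypothetical stand-ins for credentials, RBAC rules, an admission webhook, and etcd.

```python
from dataclasses import dataclass, field
from typing import Optional


class Rejected(Exception):
    """Raised by any stage that refuses the request."""


@dataclass
class ApiRequest:
    user: Optional[str]  # None models missing or invalid credentials
    verb: str
    resource: str
    body: dict = field(default_factory=dict)


# Hypothetical stand-in for RBAC: user -> allowed (verb, resource) pairs.
PERMISSIONS = {
    "alice": {("get", "pods"), ("create", "deployments")},
    "bob": {("get", "pods")},
}
WRITE_VERBS = {"create", "update", "patch", "delete"}


def authenticate(req):
    if req.user not in PERMISSIONS:
        raise Rejected("401 Unauthorized: unknown identity")


def authorize(req):
    if (req.verb, req.resource) not in PERMISSIONS[req.user]:
        raise Rejected(f"403 Forbidden: {req.user} cannot {req.verb} {req.resource}")


def admit(req):
    # Toy validating policy: every written object must carry an "owner" label.
    labels = req.body.get("metadata", {}).get("labels", {})
    if "owner" not in labels:
        raise Rejected("admission denied: missing 'owner' label")


def handle(req, store):
    """Run the stages in order; the first one that objects ends the request."""
    authenticate(req)
    authorize(req)
    if req.verb in WRITE_VERBS:
        admit(req)  # admission applies to writes only
        store.append(req.body)  # stands in for the write to etcd
        return "stored"
    return "read"


if __name__ == "__main__":
    store = []
    labeled = {"metadata": {"name": "web", "labels": {"owner": "team-a"}}}
    unlabeled = {"metadata": {"name": "web"}}
    print(handle(ApiRequest("alice", "create", "deployments", labeled), store))
    for req in (
        ApiRequest("bob", "create", "deployments", labeled),      # fails authorization
        ApiRequest("alice", "create", "deployments", unlabeled),  # fails admission
    ):
        try:
            handle(req, store)
        except Rejected as err:
            print(err)
```

Both rejected requests carry a perfectly well-formed body. They fail because of who sent them or what policy requires, which mirrors the permissions and webhook errors mentioned above.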
What kubectl actually is
`kubectl` is a client binary with no special privileged access of its own. It authenticates using whatever credentials are configured in your kubeconfig file, and everything it does is one or more calls to the same public API server endpoints that any other client (a CI pipeline, a custom controller, a monitoring tool) could call directly. This is why `kubectl apply -f deployment.yaml` and a Python script that uses the Kubernetes client library to send the same requests are indistinguishable from the API server's point of view. Note that `kubectl apply` creates the object with a POST if it does not exist yet and otherwise updates it with a PATCH.
# When the Deployment does not exist yet, these achieve the same result via different means
# (if it already exists, the POST is rejected with 409 Conflict, while apply sends a PATCH):
kubectl apply -f deployment.yaml
curl -X POST https://<api-server>/apis/apps/v1/namespaces/default/deployments \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/yaml" \
  --data-binary @deployment.yaml
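The curl call above is an ordinary HTTP request, so any HTTP client can build it. The sketch below constructs the same request with Python's standard library and inspects it without sending anything. The helper name `build_create_request`, the server address `https://kube.example.invalid:6443`, the token value, and the truncated manifest are placeholders chosen for illustration.

```python
import urllib.request


def build_create_request(server, namespace, manifest, token):
    """Return an unsent POST that would create a Deployment from YAML bytes."""
    url = f"{server}/apis/apps/v1/namespaces/{namespace}/deployments"
    return urllib.request.Request(
        url,
        data=manifest,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/yaml",
        },
    )


if __name__ == "__main__":
    # Truncated placeholder manifest; a real Deployment also needs a spec.
    manifest = b"apiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: web\n"
    req = build_create_request(
        "https://kube.example.invalid:6443", "default", manifest, "PLACEHOLDER"
    )
    print(req.get_method(), req.full_url)
    # Sending it would be urllib.request.urlopen(req, context=ctx), where ctx is
    # an ssl.SSLContext that trusts the cluster's CA certificate.
```

Nothing here is specific to kubectl: the method, path, headers, and body fully describe the operation, and the API server treats the caller like any other authenticated client.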
Watches, not polling
A key API server feature that many clients rely on (including `kubectl get pods --watch` and every controller) is the watch mechanism: instead of repeatedly polling for changes, a client opens a long-lived connection and receives a stream of change events as they happen. This is the foundation of the reconciliation model. Controllers watch the objects they care about and react as soon as something changes, rather than polling on a fixed interval.
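To make the event stream concrete: a watch is a GET with `watch=true` in the query string, and the JSON response arrives as one event per line, each with a `type` (such as `ADDED`, `MODIFIED`, or `DELETED`) and the affected `object`. The sketch below folds such a stream into a local cache, which is roughly what a controller does before reconciling. The function name `apply_watch_events` and the sample events are made up for illustration, and the sketch leaves out resource versions, reconnects, and error handling that real client libraries take care of.

```python
import json


def apply_watch_events(lines, cache=None):
    """Fold a stream of watch events into a {name: object} cache."""
    cache = {} if cache is None else cache
    for line in lines:
        if not line.strip():
            continue  # skip blank keep-alive lines
        event = json.loads(line)
        kind = event["type"]
        if kind not in ("ADDED", "MODIFIED", "DELETED"):
            continue  # other types (e.g. BOOKMARK, ERROR) are ignored here
        obj = event["object"]
        name = obj["metadata"]["name"]
        if kind == "DELETED":
            cache.pop(name, None)
        else:
            cache[name] = obj  # ADDED and MODIFIED both carry the full object
    return cache


# Hand-written sample events standing in for a live response stream.
SAMPLE_STREAM = [
    '{"type": "ADDED", "object": {"metadata": {"name": "web-1"}, "status": {"phase": "Pending"}}}',
    '{"type": "ADDED", "object": {"metadata": {"name": "web-2"}, "status": {"phase": "Pending"}}}',
    '{"type": "MODIFIED", "object": {"metadata": {"name": "web-1"}, "status": {"phase": "Running"}}}',
    '{"type": "DELETED", "object": {"metadata": {"name": "web-2"}, "status": {"phase": "Running"}}}',
]

if __name__ == "__main__":
    pods = apply_watch_events(SAMPLE_STREAM)
    for name, pod in pods.items():
        print(name, pod["status"]["phase"])
```

After the four sample events, the cache holds only `web-1` in phase `Running`. The client never asked "what changed?"; it was told.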
Why this design matters
Because everything goes through one well-defined API, the entire Kubernetes ecosystem of tools (Helm, Argo CD, custom controllers, monitoring dashboards) can interact with a cluster in the same consistent way, and the API itself can be extended (via Custom Resource Definitions; see that topic) without changing how any existing client talks to the cluster.