What are Quality of Service (QoS) classes in Kubernetes?
Quick Answer
Kubernetes assigns every Pod one of three QoS classes based on how its containers' resource requests and limits relate to each other: **Guaranteed** (every container has requests equal to limits, for both CPU and memory; the strongest protection), **Burstable** (at least one container has a CPU or memory request or limit set, but the Pod does not meet the Guaranteed criteria; some protection), or **BestEffort** (no requests or limits set at all; no protection). Because the class follows from the same requests that the kubelet compares against actual usage, it is a reliable predictor of which Pods are evicted first when a node comes under memory pressure.
Detailed Answer
The three classes, defined by requests vs. limits
Guaranteed — every container in the Pod has both CPU and memory requests and limits specified, and for every container, request equals limit:
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "500m"      # identical to request
        memory: "512Mi"  # identical to request
Burstable: at least one container has a CPU or memory request or limit set, but the Pod doesn't meet the strict "every container, request equals limit for both resources" bar required for Guaranteed:
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"      # higher than request -- allows bursting
        memory: "512Mi"  # higher than request
BestEffort — no requests or limits specified at all, for any container in the Pod:
    resources: {}  # nothing set
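The three definitions above can be expressed as a small classifier. This is a minimal Python sketch, not the kubelet's implementation: the `qos_class` function and its dict-shaped input are hypothetical, quantities are compared as plain strings (so `"500m"` and `"0.5"` would wrongly count as different), and init containers are ignored. It does apply one documented rule: when a limit is set without a request, the request defaults to the limit.

```python
from typing import Any

RESOURCES = ("cpu", "memory")


def qos_class(containers: list[dict[str, Any]]) -> str:
    """Classify a Pod from its containers' `resources` stanzas (illustrative only)."""
    any_set = False         # does any container set any CPU/memory request or limit?
    all_guaranteed = True   # does every container have request == limit for both?

    for container in containers:
        res = container.get("resources") or {}
        requests = dict(res.get("requests") or {})
        limits = res.get("limits") or {}

        # A limit without a request makes the request default to the limit.
        for r in RESOURCES:
            if r in limits and r not in requests:
                requests[r] = limits[r]

        if any(r in requests or r in limits for r in RESOURCES):
            any_set = True

        # Guaranteed needs a limit for both resources, each equal to its request.
        for r in RESOURCES:
            if r not in limits or requests.get(r) != limits[r]:
                all_guaranteed = False

    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_guaranteed else "Burstable"


guaranteed = [{"resources": {
    "requests": {"cpu": "500m", "memory": "512Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}}]
burstable = [{"resources": {
    "requests": {"cpu": "250m", "memory": "256Mi"},
    "limits": {"cpu": "500m", "memory": "512Mi"},
}}]
best_effort = [{"resources": {}}]

for pod in (guaranteed, burstable, best_effort):
    print(qos_class(pod))
```

On a live cluster there is no need to compute this by hand: the class Kubernetes assigned is recorded in the Pod's `status.qosClass` field.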
Why this classification exists: eviction priority under memory pressure
When a node runs low on memory, the kubelet proactively evicts Pods to reclaim resources before the node becomes unstable, and it does not pick victims at random. It ranks Pods by whether their usage exceeds their requests, then by Pod priority, then by how far usage exceeds requests. The kubelet does not read the QoS class itself, but because the class is derived from the same requests and limits, it predicts the order well: BestEffort Pods (any usage exceeds a request of zero) and Burstable Pods running over their requests go first, while Guaranteed Pods and Burstable Pods running within their requests go last. A Guaranteed Pod can never exceed its memory request, because its limit equals its request.
Node under memory pressure:
1. Evict BestEffort Pods and Burstable Pods exceeding their requests first
   (lower-priority Pods first, then worst offenders first)
2. Evict Guaranteed Pods and Burstable Pods within their requests only if the
   node is still under pressure after 1 -- they never consumed more than
   they requested
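The ranking can be sketched as a sort key. The Python below is a simplified illustration, not kubelet code: the `PodUsage` type, its fields, and the sample numbers are hypothetical, and it assumes the documented ordering for memory pressure (usage exceeds requests first, then lower Pod priority, then largest overage).

```python
from dataclasses import dataclass


@dataclass
class PodUsage:
    name: str
    request_mib: int   # memory request; 0 for a BestEffort Pod
    usage_mib: int     # current memory usage
    priority: int = 0  # Pod priority; lower values are evicted earlier


def eviction_order(pods: list[PodUsage]) -> list[PodUsage]:
    """Return pods sorted from first-evicted to last-evicted (illustrative)."""
    def key(pod: PodUsage) -> tuple[bool, int, int]:
        over = pod.usage_mib - pod.request_mib
        # 1) Pods over their request sort first (False < True),
        # 2) then lower priority, 3) then larger overage.
        return (over <= 0, pod.priority, -over)

    return sorted(pods, key=key)


pods = [
    PodUsage("guaranteed-db", request_mib=512, usage_mib=400),
    PodUsage("burstable-ok", request_mib=512, usage_mib=300),
    PodUsage("burstable-hog", request_mib=256, usage_mib=900),
    PodUsage("best-effort-job", request_mib=0, usage_mib=100),
]

order = [p.name for p in eviction_order(pods)]
print(order)
# ['burstable-hog', 'best-effort-job', 'guaranteed-db', 'burstable-ok']
```

With equal priorities, the two Pods running over their requests are evicted before the two running within them, regardless of how much memory each uses in absolute terms.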
What this means practically for workload design
- Critical, latency-sensitive workloads (a production database, a payment-processing service) should be Guaranteed — setting request equal to limit trades away burst flexibility for the strongest protection against being evicted when the node is under pressure.
- Typical application workloads with somewhat variable but bounded resource needs are usually Burstable — a reasonable middle ground, getting some scheduling guarantee while still allowing headroom for occasional spikes.
- BestEffort should essentially never be used deliberately in production: it's what you get by forgetting to set requests and limits, not a class to target intentionally. It offers no protection and is among the first to be evicted under memory pressure.
A common mistake
Teams sometimes assume that a generous memory limit is protective on its own. But if the memory request is set well below that limit, the Pod lands in Burstable, not Guaranteed, and once its usage climbs above the request it becomes an eviction candidate ahead of properly configured Guaranteed Pods, even though it is "only" using memory within its stated limit. Leaving the request unset does not help by itself either: Kubernetes then defaults the request to the limit, but the Pod is Guaranteed only if requests equal limits for both CPU and memory in every container. QoS class is determined by the relationship between requests and limits, not by the limit's absolute value.
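A quick calculation makes the gap visible. This is a small Python sketch with made-up numbers; the `memory_standing` helper is hypothetical and only compares usage against the request and the limit.

```python
def memory_standing(request_mib: int, limit_mib: int, usage_mib: int) -> dict[str, object]:
    """Summarize where a Pod's memory usage sits relative to its request and limit."""
    return {
        "within_limit": usage_mib <= limit_mib,                # not over the hard cap
        "over_request_mib": max(0, usage_mib - request_mib),   # what eviction ranking looks at
        "early_eviction_candidate": usage_mib > request_mib,
    }


# Low request, generous limit: comfortably within the limit, far over the request.
low_request = memory_standing(request_mib=256, limit_mib=4096, usage_mib=2048)
print(low_request)

# Same usage with the request raised to match it: no longer over the request.
sized_request = memory_standing(request_mib=2048, limit_mib=4096, usage_mib=2048)
print(sized_request)
```

The first Pod is 1792 MiB over its request despite using half its limit, which is what puts it in the early eviction group; sizing the request to real usage removes that exposure.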
Being able to state precisely which combination of requests/limits produces each QoS class — not just the three names — and connecting that directly to eviction order under memory pressure demonstrates real operational understanding of why this classification exists, not just textbook recall.