What are Quality of Service (QoS) classes in Kubernetes?

Detailed Answer

The three classes, defined by requests vs. limits

Guaranteed: every container in the Pod has both a CPU and a memory request and limit, and in each container the request equals the limit for both resources:

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"       # identical to request
    memory: "512Mi"   # identical to request

Burstable: at least one container has a CPU or memory request or limit set, but the Pod doesn't meet the strict "every container, request equals limit for both resources" bar required for Guaranteed:

resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"        # higher than request -- allows bursting
    memory: "512Mi"    # higher than request

BestEffort: no container in the Pod sets any CPU or memory request or limit:

resources: {}     # nothing set
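
The three rules above can be sketched as a small classifier. This is an illustrative approximation, not the kubelet's actual code: the function name `qos_class` and the dict layout are hypothetical, quantities are compared as plain strings (so "500m" and "0.5" would wrongly count as different), and only `cpu` and `memory` are considered.

```python
RESOURCES = ("cpu", "memory")  # the two resources considered in this sketch


def qos_class(containers):
    """Return the QoS class for a list of container resource dicts.

    Each item looks like {"requests": {...}, "limits": {...}}; both keys
    are optional. Hypothetical helper for illustration only.
    """
    any_set = False
    all_equal = True
    for c in containers:
        limits = {k: v for k, v in c.get("limits", {}).items() if k in RESOURCES}
        requests = {k: v for k, v in c.get("requests", {}).items() if k in RESOURCES}
        # A request that is left unset defaults to the limit for that resource.
        requests = {**limits, **requests}
        if requests:
            any_set = True
        for r in RESOURCES:
            # Guaranteed needs a limit for both resources, equal to the request.
            if r not in limits or requests[r] != limits[r]:
                all_equal = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if all_equal else "Burstable"
```

For example, a Pod with one fully pinned container and one container with `resources: {}` comes out as Burstable, because the Guaranteed rule has to hold for every container.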

Why this classification exists: eviction priority under memory pressure

When a node runs low on memory, the kubelet proactively evicts Pods to reclaim resources before the node becomes unstable, and it doesn't evict randomly. BestEffort Pods go first, then Burstable Pods whose actual usage exceeds their requests (ordered by how far over their request they are). Guaranteed Pods go last, together with Burstable Pods that are still within their requests. Guaranteed Pods are evicted only as a last resort because their limits equal their requests, so they can never use more than they requested.

Node under memory pressure:
  1. Evict BestEffort Pods first (no guarantees were ever made to them)
  2. Evict Burstable Pods exceeding their requests, worst offenders first
  3. Guaranteed Pods are evicted only if the situation is still critical
     after 1 and 2 -- they were never over-consuming relative to their promise
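
The ordering above can be modelled as a sort key. This is a simplified sketch under stated assumptions: the `Pod` dataclass and `eviction_order` function are hypothetical names, all Pods are assumed to have the same priority, and only memory is considered. The real kubelet also takes Pod priority into account.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    qos: str                 # "BestEffort", "Burstable" or "Guaranteed"
    memory_request_mib: int  # 0 for BestEffort Pods
    memory_usage_mib: int


def eviction_order(pods):
    """Return Pods sorted from first evicted to last evicted."""
    def key(p):
        over = p.memory_usage_mib - p.memory_request_mib
        if p.qos == "BestEffort":
            return (0, -p.memory_usage_mib)  # step 1: no guarantees at all
        if over > 0:
            return (1, -over)                # step 2: worst offenders first
        return (2, 0)                        # step 3: within their request
    return sorted(pods, key=key)


pods = [
    Pod("db", "Guaranteed", 512, 500),
    Pod("api", "Burstable", 256, 400),
    Pod("worker", "Burstable", 256, 300),
    Pod("batch", "BestEffort", 0, 100),
    Pod("cache", "Burstable", 256, 200),
]
print([p.name for p in eviction_order(pods)])
```

In this model the Burstable Pod `cache` lands in the last tier next to the Guaranteed Pod `db`, because its usage is still below its request.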

What this means practically for workload design

Critical, latency-sensitive workloads (a production database, a payment-processing service) should be Guaranteed — setting request equal to limit trades away burst flexibility for the strongest protection against being evicted when the node is under pressure.
Typical application workloads with somewhat variable but bounded resource needs are usually Burstable — a reasonable middle ground, getting some scheduling guarantee while still allowing headroom for occasional spikes.
BestEffort should essentially never be used deliberately in production — it's what you get by forgetting to set requests/limits, not a class you should intentionally target; it offers zero protection and is the first thing sacrificed under any resource pressure.

A common mistake

Teams sometimes assume that a generous memory limit is protective on its own. If the request is set lower than the limit, the Pod lands in Burstable, not Guaranteed. Once its usage exceeds its request, it becomes an eviction candidate ahead of properly configured Guaranteed Pods, even if it's "only" using resources within its stated limit. (If the request is omitted entirely while a limit is set, Kubernetes defaults the request to the limit, so the risky case is a request explicitly set below the limit.) QoS class is determined by the relationship between requests and limits, not by the limit's absolute value alone.
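
One way to catch this in review is a small check over the container specs. The sketch below is a hypothetical lint helper, not part of any Kubernetes tooling: the names `to_bytes` and `containers_below_limit` are made up for illustration, and the quantity parser handles only plain integers and the `Ki`, `Mi` and `Gi` suffixes.

```python
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def to_bytes(quantity):
    """Convert a quantity such as "512Mi" to bytes (Ki/Mi/Gi or plain integer only)."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-len(suffix)]) * factor
    return int(quantity)


def containers_below_limit(containers):
    """Return names of containers whose memory request is lower than their memory limit."""
    flagged = []
    for c in containers:
        res = c.get("resources", {})
        limit = res.get("limits", {}).get("memory")
        if limit is None:
            continue  # no memory limit set, nothing to compare against
        # An unset request defaults to the limit, so it is never flagged.
        request = res.get("requests", {}).get("memory", limit)
        if to_bytes(request) < to_bytes(limit):
            flagged.append(c["name"])
    return flagged


containers = [
    {"name": "app", "resources": {"requests": {"memory": "128Mi"}, "limits": {"memory": "2Gi"}}},
    {"name": "sidecar", "resources": {"limits": {"memory": "256Mi"}}},
]
print(containers_below_limit(containers))
```

Here only `app` is flagged: its limit is generous, but its request is far lower, so the Pod is Burstable and exposed as soon as usage passes 128Mi.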

Being able to state precisely which combination of requests/limits produces each QoS class — not just the three names — and connecting that directly to eviction order under memory pressure demonstrates real operational understanding of why this classification exists, not just textbook recall.
