What are ResourceQuotas and LimitRanges, and how do they differ from pod-level requests/limits?

6 min · advanced · resourcequota · limitrange · resource-management

Quick Answer

A ResourceQuota caps the aggregate resource consumption (and/or object counts) allowed within an entire namespace: the total across every Pod, not any single one. A LimitRange sets default, minimum, and maximum resource request/limit values *per container* within a namespace, filling in defaults for containers that don't specify their own and rejecting Pods whose containers fall outside the allowed bounds. Pod-level requests/limits (set in each Pod's own spec) are the values these two namespace-level mechanisms default and constrain. They work at a different scope and do not replace per-Pod configuration.

Detailed Answer

ResourceQuota — a namespace-wide ceiling

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "50"
    persistentvolumeclaims: "10"

This caps the sum across every Pod in the team-a namespace: combined, they can request at most 20 CPU cores and 40Gi of memory, their limits can sum to at most 40 CPU cores and 80Gi of memory, and the namespace can hold at most 50 Pods and 10 PVCs. Once a quota covers a compute resource, every new Pod in the namespace must declare a value for it (for example, a CPU request on each container when requests.cpu is in the quota). Otherwise the API server rejects the Pod, because there would be no declared consumption to count against the quota. A LimitRange in the same namespace can satisfy this requirement by filling in defaults.
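As a minimal sketch of what that requirement looks like in practice, the Pod below declares all four values that team-a-quota tracks. The Pod name and image are hypothetical placeholders, and the specific amounts are chosen only for illustration.

```yaml
# Hypothetical Pod for illustration; the name and image are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: api-server
  namespace: team-a
spec:
  containers:
    - name: app
      image: registry.example.com/team-a/api:1.0   # placeholder image
      resources:
        requests:
          cpu: "500m"      # counted against requests.cpu (20 total)
          memory: 1Gi      # counted against requests.memory (40Gi total)
        limits:
          cpu: "1"         # counted against limits.cpu (40 total)
          memory: 2Gi      # counted against limits.memory (80Gi total)
```

Under these assumed values, forty such Pods would use up the CPU and memory quota (40 × 500m = 20 CPU requested), so a forty-first would be rejected even though the pods: "50" count still leaves room for ten more.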

LimitRange — per-container defaults and bounds

apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-limits
  namespace: team-a
spec:
  limits:
    - type: Container
      default:
        cpu: "500m"          # applied if a container doesn't specify a limit
        memory: "512Mi"
      defaultRequest:
        cpu: "250m"          # applied if a container doesn't specify a request
        memory: "256Mi"
      min:
        cpu: "100m"          # reject any container whose CPU request or limit is below this
      max:
        cpu: "2"             # reject any container whose CPU request or limit is above this

A LimitRange operates on individual containers within a namespace. It fills in default requests and limits for any container that omits them, so a developer who forgets to set them doesn't end up with an unbounded container that the scheduler cannot place predictably (see the requests/limits question on why this matters). It also enforces min/max bounds, so no single container can be created with an unreasonably small or unreasonably large request or limit, independent of what the namespace's aggregate ResourceQuota allows overall.
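To make the defaulting concrete, here is a sketch of a Pod submitted with no resources block at all. The Pod name and image are hypothetical; the commented-out values are what the team-a-limits LimitRange is expected to inject at admission.

```yaml
# Hypothetical Pod submitted with no resources block; name and image are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: worker
  namespace: team-a
spec:
  containers:
    - name: app
      image: registry.example.com/team-a/worker:1.0   # placeholder image
      # No resources declared. With team-a-limits in place, the stored Pod
      # is expected to carry the equivalent of:
      #
      # resources:
      #   requests:
      #     cpu: 250m        # from defaultRequest
      #     memory: 256Mi    # from defaultRequest
      #   limits:
      #     cpu: 500m        # from default
      #     memory: 512Mi    # from default
```

Defaults are applied only when the Pod is created, so Pods that already existed before the LimitRange was added keep whatever they had.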

How the three levels relate

Pod's own spec.resources.requests/limits   <- what an individual container declares (or, if
                                                omitted, what LimitRange's default fills in)
        ↓ constrained by
LimitRange (per-namespace)                  <- bounds and defaults for each individual container
        ↓ their sum constrained by
ResourceQuota (per-namespace)               <- an aggregate ceiling across the WHOLE namespace

A LimitRange operates on each container individually ("no single container may request more than 2 CPU"); a ResourceQuota operates on the namespace in aggregate ("all containers in this namespace, combined, may request at most 20 CPU total"). Pod-level requests/limits are essential for scheduling and per-container runtime enforcement (see the requests/limits question), but on their own they cannot stop a single misconfigured container from requesting an unreasonable amount, nor stop many individually reasonable Pods from collectively consuming a shared cluster's entire capacity. The two namespace-level mechanisms exist to close those two gaps.
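The per-container bound applies regardless of how much quota is left. As an illustrative sketch (the Pod name and image are hypothetical), the Pod below should be rejected by the team-a-limits LimitRange at admission, even in an otherwise empty namespace with the full 20 CPU of quota available.

```yaml
# Hypothetical Pod; name and image are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: batch-job
  namespace: team-a
spec:
  containers:
    - name: app
      image: registry.example.com/team-a/batch:1.0   # placeholder image
      resources:
        requests:
          cpu: "2"         # within the 100m..2 range on its own
          memory: 1Gi
        limits:
          cpu: "4"         # exceeds the LimitRange max of 2 CPU per container
          memory: 2Gi
```

Lowering the CPU limit to "2" would bring the container back inside the bounds, at which point only the aggregate ResourceQuota check remains.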

Why both matter for multi-tenancy specifically

Without a ResourceQuota, one tenant namespace could accumulate enough Pods (each individually reasonable) to starve every other tenant sharing the cluster of capacity: a "noisy neighbor" problem at the aggregate level. Without a LimitRange, a single Pod spec that omits requests/limits, or sets them absurdly high "just in case", causes trouble on its own. In a namespace with no compute quota it can run unbounded or reserve far more than it needs; in a namespace with one, it is either rejected for the missing values or takes a disproportionate share of the quota. Both mechanisms are typically deployed together as part of a real multi-tenancy strategy (see the multi-tenancy question): quotas cap the aggregate, and LimitRanges keep individual containers within sane, well-defaulted bounds.