How do you limit a container's CPU and memory usage?

Detailed Answer

Setting a memory limit

docker run --memory="512m" myapp

This directly configures a cgroup memory limit (see the fundamentals topic's namespaces/cgroups question). If the container's processes try to use more than 512 MiB in total and the kernel cannot reclaim enough memory to stay under the limit, the kernel's OOM killer terminates a process inside the container, typically the one using the most memory. This is the same underlying mechanism, with the same tradeoffs, that Kubernetes uses for memory limits and OOMKilled behavior. Memory cannot be gracefully throttled the way CPU can: there is no way to make an over-budget allocation simply run more slowly. So exceeding this limit means termination, not degradation.
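
To make the unit concrete, the sketch below converts a Docker-style size string into bytes. It assumes integer values with an optional b, k, m, or g suffix, which Docker treats as binary multiples. The function name to_bytes is a hypothetical helper for illustration, not part of the Docker CLI.

```shell
# Convert a Docker-style size string (e.g. 512m, 1g) to bytes.
# Assumption: integer value with an optional b/k/m/g suffix, binary multiples.
# to_bytes is an illustrative helper, not a Docker command.
to_bytes() {
  value=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
  number=${value%[bkmg]}        # strip one trailing unit letter, if any
  suffix=${value#"$number"}     # whatever was stripped is the unit
  case "$suffix" in
    k) echo $((number * 1024)) ;;
    m) echo $((number * 1024 * 1024)) ;;
    g) echo $((number * 1024 * 1024 * 1024)) ;;
    *) echo "$number" ;;        # plain bytes
  esac
}

to_bytes 512m   # prints 536870912
```

On a running container, docker inspect --format '{{.HostConfig.Memory}}' followed by the container name reports the configured limit in bytes, so the two numbers can be compared directly.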

docker run --memory="512m" --memory-swap="1g" myapp

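--memory-swap sets the combined memory-plus-swap limit, so the example above allows 512 MiB of memory plus 512 MiB of swap. If you don't set it, it defaults to double the --memory value, so the container may use as much swap as its memory limit, provided the host has swap configured. Setting --memory-swap equal to --memory disables swap for the container entirely.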
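
Because --memory-swap is a combined total rather than a swap size, the swap a container may actually use is the difference between the two values. A minimal sketch of that arithmetic, with values given in MiB and swap_allowance_mib as a hypothetical helper name:

```shell
# swap_allowance_mib MEMORY_MIB [MEMORY_SWAP_MIB]
# Prints how much swap (in MiB) a container may use.
# Illustrative helper only; it models the rules described above.
swap_allowance_mib() {
  memory=$1
  if [ -n "${2:-}" ]; then
    memory_swap=$2
  else
    memory_swap=$((memory * 2))   # unset: total defaults to double --memory
  fi
  echo $((memory_swap - memory))
}

swap_allowance_mib 512 1024   # --memory=512m --memory-swap=1g: prints 512
swap_allowance_mib 512 512    # equal values, swap disabled: prints 0
swap_allowance_mib 512        # --memory-swap unset: prints 512
```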

Setting a CPU limit

docker run --cpus="1.5" myapp

--cpus caps the container at that many CPU cores' worth of processing time. Fractional values are allowed: 1.5 means one and a half cores. The kernel enforces this cap using its CFS (Completely Fair Scheduler) quota mechanism. Unlike memory, exceeding a CPU limit doesn't kill the container. The kernel simply throttles it: the container gets less CPU time than it's trying to use, so it keeps running, just more slowly.
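
The CFS quota is expressed as a number of microseconds of CPU time allowed per scheduling period. Assuming the default period of 100000 microseconds, --cpus="1.5" corresponds to --cpu-period="100000" --cpu-quota="150000". The sketch below reproduces that arithmetic; cfs_quota_us is a hypothetical helper name.

```shell
# cfs_quota_us CPUS [PERIOD_US]
# Prints the CFS quota (microseconds of CPU time per period) for a --cpus value.
# Assumption: the period defaults to 100000 microseconds.
cfs_quota_us() {
  awk -v cpus="$1" -v period="${2:-100000}" \
    'BEGIN { printf "%.0f\n", cpus * period }'
}

cfs_quota_us 1.5   # prints 150000
cfs_quota_us 0.5   # prints 50000
```

Once the container has used its quota within a period, its processes wait until the next period begins, which is what throttling looks like from inside the container.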

docker run --cpu-shares=512 myapp

--cpu-shares is an older mechanism. Instead of setting an absolute cap, it sets a relative weight (the default is 1024) that only matters when the host's CPU is under contention. In that case, a container with --cpu-shares=1024 gets twice the CPU time of one with --cpu-shares=512. If the host isn't contended, either container can use up to 100% of available CPU. This is a different model from the hard, absolute cap that --cpus sets. The two are sometimes used together: a relative priority under contention, plus an absolute ceiling.
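
As a rough illustration of the relative-weight model, the sketch below computes the fraction of CPU time a container would receive when every container on the host is competing for CPU at once. It is a simplification that ignores --cpus caps and idle containers, and cpu_share_percent is a hypothetical helper name.

```shell
# cpu_share_percent OWN_SHARES TOTAL_SHARES
# Prints the percentage of CPU time a container gets under full contention,
# where TOTAL_SHARES is the sum of shares across all competing containers.
cpu_share_percent() {
  awk -v own="$1" -v total="$2" \
    'BEGIN { printf "%.1f\n", 100 * own / total }'
}

# Two competing containers with 1024 and 512 shares (total 1536):
cpu_share_percent 1024 1536   # prints 66.7
cpu_share_percent 512 1536    # prints 33.3
```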

Why exceeding these limits behaves so differently

This asymmetry (memory limits cause termination, CPU limits cause throttling) comes from the nature of each resource. Memory is incompressible: a process either has an allocation or it doesn't. CPU time is compressible: a process can simply be scheduled less often and run more slowly without crashing. The Kubernetes stack's question on OOMKilled versus CPU throttling covers this same asymmetry, because both ultimately rely on the same underlying Linux cgroup mechanisms. Kubernetes's resources.limits are, at the implementation level, just a more structured, orchestrator-managed way of setting these same --memory/--cpus-equivalent cgroup constraints.

Inspecting current resource usage

docker stats
# CONTAINER   CPU %    MEM USAGE / LIMIT   MEM %
# my-api      12.34%   340MiB / 512MiB     66.4%

docker stats gives a live view of actual resource consumption against configured limits. Use it to confirm whether a configured limit is realistic (comfortably above typical usage, with headroom) or already dangerously close to being hit under normal load. Check this before a real incident forces the question.
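
One way to automate the headroom check is to filter docker stats output for containers whose memory percentage is above a chosen threshold. The sketch below assumes input lines of the form "name percent%"; the function name flag_high_memory, the 80% threshold, and the worker container in the sample data are all hypothetical.

```shell
# flag_high_memory THRESHOLD_PERCENT
# Reads "name percent%" lines on stdin and prints the names of containers
# whose memory usage is at or above the threshold.
flag_high_memory() {
  awk -v limit="$1" '{
    pct = $2
    sub(/%$/, "", pct)              # drop the trailing percent sign
    if (pct + 0 >= limit) print $1
  }'
}

# Sample data standing in for live output:
printf 'my-api 66.4%%\nworker 91.2%%\n' | flag_high_memory 80   # prints worker
```

Against a live host, the same filter can be fed from docker stats --no-stream --format '{{.Name}} {{.MemPerc}}', which emits one line per running container in the format this function expects.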

Why setting these limits matters, even on a single host

Without explicit limits, a single misbehaving or leaking container can consume all of a host's available CPU or memory. This starves every other container sharing that machine. It is the same "noisy neighbor" problem covered in the Kubernetes stack's resource-management and multi-tenancy topics, just at the scale of one Docker host instead of a whole cluster. Setting sensible --memory and --cpus limits on every container is a basic operational hygiene practice. This applies even in a simple single-host Docker Compose setup, not just in orchestrated, multi-node environments.
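
A quick audit for containers running without limits can be sketched as below. It assumes input lines of the form "name memory_bytes nano_cpus", where 0 means no limit was set; list_unlimited is a hypothetical helper name and the sample containers are illustrative.

```shell
# list_unlimited
# Reads "name memory_bytes nano_cpus" lines on stdin and prints the names of
# containers missing a memory limit, a --cpus limit, or both (value 0).
list_unlimited() {
  awk '$2 == 0 || $3 == 0 { print $1 }'
}

# Sample data: my-api has --memory=512m --cpus=1.5, worker has no limits.
printf '/my-api 536870912 1500000000\n/worker 0 0\n' | list_unlimited   # prints /worker
```

On a real host, the input can come from docker ps -q | xargs docker inspect --format '{{.Name}} {{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'. Note that this only detects limits set through --memory and --cpus; a container capped through other flags such as --cpu-quota would still show 0 for NanoCpus.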