How do you monitor a running container's resource usage?

6 min · intermediate · docker-stats · monitoring · resource-usage

Quick Answer

docker stats gives a live, continuously updating view of every running container's CPU, memory, network I/O, and block I/O usage directly from the CLI. It is the quickest way to check current resource consumption without any additional tooling. For historical trends, alerting, and monitoring across many containers and hosts at production scale, the standard approach is a dedicated monitoring stack (Prometheus with cAdvisor, or a commercial APM/monitoring tool), the same pattern covered in the Kubernetes stack's observability topic.

Detailed Answer

docker stats — a quick, live snapshot

docker stats
# CONTAINER   CPU %   MEM USAGE / LIMIT   MEM %   NET I/O          BLOCK I/O
# api          12.34%   340MiB / 512MiB      66.4%   1.2MB / 3.4MB     0B / 12MB
# db            8.21%   890MiB / 1GiB        86.9%   890KB / 1.1MB     45MB / 12MB
docker stats --no-stream               # a single snapshot instead of continuously updating
docker stats api db                     # limit to specific containers
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"   # customize output columns

This is the fastest way to get an immediate read on resource usage without setting up any additional tooling. It is useful for quick, ad-hoc checks — for example, "is this container actually using anywhere near its configured limit right now" — during local development or a quick production investigation.
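The ad-hoc check described above can also be scripted. What follows is a minimal sketch, assuming a POSIX shell with awk available; the function name high_mem and the 80% threshold are illustrative choices, not part of Docker. It reads "name mem%" pairs on stdin so that it can be tried without a running daemon.

```shell
# Print the names of containers whose MEM % is at or above a threshold.
# Expects space-separated "name mem%" lines on stdin, for example from:
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}'
# high_mem is a hypothetical helper name used only for this sketch.
high_mem() {
  awk -v limit="$1" '{
    pct = $2
    sub(/%/, "", pct)                 # strip the trailing percent sign
    if (pct + 0 >= limit + 0) print $1
  }'
}
```

With a Docker daemon available, it could be used as: docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | high_mem 80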

Why docker stats alone isn't sufficient for real production monitoring

  • No history: docker stats only shows the current, live moment. It has no record of usage 10 minutes ago, let alone last week. This makes it useless for spotting trends, such as a slow memory leak climbing gradually over hours, or for correlating a past incident with resource usage at the time it occurred.
  • No alerting: nothing notifies anyone if a container's memory usage climbs dangerously close to its limit; you have to be actively watching the output yourself.
  • Doesn't scale across many hosts: checking docker stats one host at a time doesn't work once you're running containers across more than a small handful of machines.
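To make the "no history" gap concrete: any record of past usage has to be captured by something outside docker stats. The sketch below is a crude stopgap, assuming a POSIX shell; stamp_lines and stats.log are hypothetical names. It only timestamps snapshots for later reading and does nothing about alerting or multiple hosts.

```shell
# Prefix each line read on stdin with the current UTC time, so that
# successive docker stats snapshots can be appended to one log file.
# stamp_lines is a hypothetical helper name used only for this sketch.
stamp_lines() {
  ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  while IFS= read -r line; do
    printf '%s %s\n' "$ts" "$line"
  done
}
```

Run periodically (for example from cron), it might be used as: docker stats --no-stream --format '{{.Name}} {{.CPUPerc}} {{.MemPerc}}' | stamp_lines >> stats.log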

cAdvisor — container-level metrics collection

# A cAdvisor container, commonly run as a sidecar/daemon on every host,
# exposing detailed per-container resource metrics for a scraper (like Prometheus)

cAdvisor (Container Advisor, originally built by Google) runs alongside Docker and exposes detailed, per-container resource metrics, in a format that monitoring systems like Prometheus can scrape and store historically. This is conceptually the same role that Kubernetes's kubelet-embedded cAdvisor plays for metrics-server and Prometheus in that stack (see that stack's metrics-server/Prometheus question). It is the same underlying technology, just deployed standalone for plain Docker rather than as part of a Kubernetes node.
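As a rough illustration of the standalone deployment, the sketch below wraps a docker run invocation in a function. The image name and read-only mounts follow the cAdvisor project's README as best recalled, so verify them against the current cAdvisor documentation; the README also lists further flags (such as --privileged and --device=/dev/kmsg) that some hosts need. The function name start_cadvisor is hypothetical, and the image tag is deliberately left as a required argument rather than guessed.

```shell
# Start cAdvisor on this host. $1 is the image tag to run (required).
# start_cadvisor is a hypothetical wrapper name used only for this sketch.
start_cadvisor() {
  version="${1:?usage: start_cadvisor <image-tag>}"
  docker run \
    --detach \
    --name=cadvisor \
    --publish=8080:8080 \
    --volume=/:/rootfs:ro \
    --volume=/var/run:/var/run:ro \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:ro \
    "gcr.io/cadvisor/cadvisor:${version}"
}
```

Once the container is running, cAdvisor serves Prometheus-format metrics at http://localhost:8080/metrics, which is the endpoint a scraper would be pointed at.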

Building a real monitoring stack

cAdvisor (per host) ──scraped by──▶ Prometheus (stores historical time series, evaluates alert rules)
                                            │
                                            ├──queried by──▶ Grafana (dashboards)
                                            │
                                            └──sends alerts to──▶ Alertmanager (routes notifications)

This mirrors exactly the monitoring architecture covered in the Kubernetes stack's observability topic. The same core idea — a metrics-exposing agent, a time-series database, a dashboarding tool, an alerting layer — applies whether the underlying workloads run on plain Docker hosts or a Kubernetes cluster. Only the specific agent differs (cAdvisor standalone vs. kubelet-embedded).
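The scraping link in this architecture comes down to a few lines of Prometheus configuration. The sketch below writes a minimal prometheus.yml from the shell; the target cadvisor-host-1:8080 is a placeholder hostname and the 15s interval is an arbitrary choice, both assumptions for illustration. A real setup would list one target per host or use service discovery.

```shell
# Write a minimal Prometheus config that scrapes a single cAdvisor instance.
# "cadvisor-host-1:8080" is a placeholder target, not a real address.
cat > prometheus.yml <<'EOF'
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: cadvisor
    static_configs:
      - targets: ['cadvisor-host-1:8080']
EOF
```

Dashboards and alert rules are then built on top of the stored series, so adding a host means adding a target rather than opening another terminal.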

Diagnosing a specific resource problem

docker stats --no-stream api
# if MEM % is consistently near 100%, the container is a likely candidate for OOMKilled -- see that concern
# if CPU % sits at or near its --cpus limit (100% per CPU, so about 150% for --cpus=1.5), the container is likely being throttled

Correlating docker stats output (or, better, historical Prometheus data) against configured --memory/--cpus limits (see the lifecycle topic) is the standard first step. Use it to diagnose whether a container's observed slowness or instability is actually a resource-constraint issue, before assuming the problem lies elsewhere — in the application code, a dependency, or networking.
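To compare observed usage against configured limits, the limits themselves can be read with docker inspect. The sketch below assumes limits were set with --memory and --cpus, which Docker records as HostConfig.Memory (bytes) and HostConfig.NanoCpus (billionths of a CPU); both read 0 when no limit is set, and limits set through other flags such as --cpu-quota appear in different fields. The helper names show_limits and nano_to_cpus are hypothetical.

```shell
# Print the configured memory/CPU limits and the OOM-kill flag for one container.
# show_limits and nano_to_cpus are hypothetical helper names for this sketch.
show_limits() {
  docker inspect --format \
    'memory_bytes={{.HostConfig.Memory}} nano_cpus={{.HostConfig.NanoCpus}} oom_killed={{.State.OOMKilled}}' \
    "$1"
}

# Convert a NanoCpus value into the number passed to --cpus.
nano_to_cpus() {
  awk -v n="$1" 'BEGIN { printf "%.2f\n", n / 1e9 }'
}
```

For example, show_limits api prints the raw values for the api container, and nano_to_cpus 1500000000 converts a NanoCpus value back to 1.50 CPUs, which corresponds to roughly 150% in the docker stats CPU % column.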

Related Resources