What strategies reduce Docker image size?

7 min · intermediate · image-size · optimization

Quick Answer

Use a minimal base image (slim, Alpine, or distroless; see the images topic's base-image question), use multi-stage builds to keep build-only tooling out of the final image, put installation and cleanup in the same RUN instruction so that files deleted later do not remain stored in an earlier layer, and use .dockerignore to keep unnecessary files out of the build context in the first place. Smaller images pull faster, so containers start sooner on hosts that do not already have the image cached, and they present a smaller attack surface. These benefits compound across every host that pulls the image.

Detailed Answer

This question ties together several individually-covered techniques from earlier topics into one consolidated view — a strong answer references each with a concrete "why."

1. Choose a minimal base image

# Use the slim variant instead of the full "node:20"
FROM node:20-slim
# Or, for an even smaller image (pick one FROM line, not both):
# FROM node:20-alpine

Recall from the base-image question: full images include a complete OS userland with many tools you likely never use. Slim and Alpine variants strip this down significantly, often by hundreds of megabytes, before your own application code is even added.

2. Multi-stage builds — exclude build-only tooling entirely

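FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
# CGO_ENABLED=0 yields a statically linked binary; without it, a binary built
# on the glibc-based golang image may fail to start on musl-based Alpine.
RUN CGO_ENABLED=0 go build -o server .

# Final stage: only the compiled binary is copied in, not the Go toolchain
FROM alpine:3.19
COPY --from=builder /app/server /usr/local/bin/server
CMD ["server"]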

As covered in the dedicated multi-stage build question, this is often the single most effective size reduction for compiled languages and for any language with a build step. The final image contains only the build output, not the compiler or toolchain that produced it.
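For a statically linked binary, the final stage can go smaller than Alpine. The sketch below is a variation on the example above. It assumes the program needs no C libraries at runtime (hence CGO_ENABLED=0) and uses gcr.io/distroless/static-debian12 as the final base, one of the distroless images mentioned in the quick answer. Treat the image tag and the /out and /server paths as illustrative choices.

```dockerfile
FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
# Assumption: the program builds without cgo, so the binary is fully static
RUN CGO_ENABLED=0 go build -o /out/server .

# Distroless "static" base: no shell and no package manager
FROM gcr.io/distroless/static-debian12
COPY --from=builder /out/server /server
# Exec form is required here because the image has no shell
ENTRYPOINT ["/server"]
```

The trade-off is debuggability: with no shell in the image, you cannot open an interactive shell inside the running container.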

3. Combine RUN instructions to avoid leaving bloat in an earlier layer

# BAD: cleanup in a LATER layer doesn't shrink the earlier layer that added the bloat
RUN apt-get update
RUN apt-get install -y build-essential
RUN apt-get purge -y build-essential && rm -rf /var/lib/apt/lists/*

# GOOD: install-and-cleanup happens within ONE layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends some-runtime-only-package && \
    rm -rf /var/lib/apt/lists/*

Recall from the layer-ordering question: since layers are immutable once committed, "deleting" something in a later layer only hides it from the merged view. It doesn't actually shrink the earlier layer's stored size. Combining install-then-cleanup into a single RUN ensures the cleanup genuinely reduces that one resulting layer's footprint.
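When a build tool is needed only briefly, the same rule applies to installing, using, and removing that tool. A minimal sketch, assuming a Debian-based image and a hypothetical source tree at /src whose Makefile provides an install target:

```dockerfile
# Install the toolchain, use it, and remove it in ONE layer,
# so the toolchain is never stored in any committed layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    make -C /src install && \
    apt-get purge -y --auto-remove build-essential && \
    rm -rf /var/lib/apt/lists/*
```

The /src path and the make target are placeholders. When the build step is substantial, a multi-stage build is usually the cleaner way to get the same result.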

4. Use .dockerignore to avoid unnecessary context

.git
node_modules
*.md
test/

Recall from the .dockerignore question: excluded files are never sent to the builder as part of the build context. Besides speeding up the build, this prevents a broad instruction such as COPY . . from pulling large, unnecessary directories (an entire .git history, local build artifacts, or test fixtures) into the image.

5. Avoid installing unnecessary packages/recommendations

RUN apt-get update && apt-get install -y --no-install-recommends curl && rm -rf /var/lib/apt/lists/*

Package managers often install a broader set of "recommended" packages by default, beyond the one you explicitly asked for. Flags like --no-install-recommends (apt), or equivalent minimal-install options in other package managers, avoid this extra, usually-unneeded bloat.

6. Minimize the number of layers where it doesn't hurt caching

Each meaningfully distinct step generally deserves its own layer, for cache-granularity reasons (see the layer-caching question). However, needlessly splitting many tiny, related operations into separate RUN instructions, when they will always change together anyway, just adds layer overhead without any caching benefit. This is a judgment call that balances cache granularity against layer-count overhead.
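As a rough illustration of that judgment call, the sketch below keeps dependency installation in its own layer (it changes rarely, so caching pays off) and merges two small filesystem steps that would always change together. The file names (package.json, package-lock.json, server.js) and the /app/logs directory are assumptions made for the example.

```dockerfile
FROM node:20-slim
WORKDIR /app

# Own layer: reused from cache until the dependency manifests change
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Own layer: application source changes often
COPY . .

# One layer, not two: these steps always change together
RUN mkdir -p /app/logs && chown node:node /app/logs

USER node
CMD ["node", "server.js"]
```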

Why this matters beyond just "the image is smaller"

  • Faster pulls, faster deployments: every machine that runs the image has to pull it first, so a smaller image shortens deployments and scale-out events, and the savings multiply across a large fleet.
  • Reduced attack surface: fewer installed packages and tools means fewer vulnerabilities to patch and fewer utilities an attacker can use to escalate after compromising a container (see the security topic).
  • Lower storage costs: across many images, tags, and registry replicas, size differences add up to real infrastructure cost.

docker images and docker history <image> (per-layer size breakdown) are the basic tools for identifying exactly where an image's size is actually coming from. This is the necessary first step before applying any of the techniques above, rather than guessing at where the bloat lives.