How do you order Dockerfile instructions to maximize cache reuse?

6 min · intermediate · dockerfile · build-cache · optimization

Quick Answer

Place instructions that change rarely (installing system packages, installing dependencies from a lockfile) before instructions that change frequently (copying in your application source code). A typical day-to-day code change then invalidates only the cheap, fast final layers, not the expensive dependency-installation step. The general principle: order by "least likely to change" first, "most likely to change" last.

Detailed Answer

The anti-pattern: copying everything before installing dependencies

# BAD ORDERING
FROM node:20-slim
WORKDIR /app
# Copies EVERYTHING, including source code that changes constantly
COPY . .
# This layer's cache key now depends on the ENTIRE copied tree
RUN npm install
CMD ["node", "server.js"]

With this ordering, changing any file in the project invalidates the COPY . . layer. This is true even for a single comment in an unrelated source file that has nothing to do with dependencies. Invalidating the COPY . . layer in turn invalidates the npm install layer right after it, since once one layer's cache is invalidated, every layer after it must be rebuilt. Every build that follows a file change then re-runs the full dependency installation from scratch, even though the actual dependency list (package.json) hasn't changed at all. This is a slow, entirely avoidable rebuild on every code change.

The fix: copy the dependency manifest first, install, then copy the rest

# GOOD ORDERING
FROM node:20-slim
WORKDIR /app
# Only the dependency manifests, which change rarely
COPY package.json package-lock.json ./
# Cached as long as the manifests haven't changed
RUN npm ci
# Application code changes constantly, but this is now the LAST filesystem-changing step
COPY . .
CMD ["node", "server.js"]

Now, changing application code only invalidates the final COPY . . layer. The npm ci layer, which is often much slower, stays cached as long as package.json/package-lock.json haven't changed. This is the common case for most day-to-day commits.

The general principle, stated once

Order instructions from least-likely-to-change to most-likely-to-change. System package installation and dependency installation (driven by a lockfile that changes relatively rarely) belong early; application source code (which changes on nearly every commit) belongs as late as possible.

FROM python:3.12-slim
WORKDIR /app
# Rarely changes
RUN apt-get update && apt-get install -y libpq-dev
# Changes occasionally
COPY requirements.txt .
# Cached unless requirements.txt changes
RUN pip install -r requirements.txt
# Changes on every commit, so it goes last
COPY . .
CMD ["python", "app.py"]
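The same ordering carries over to other ecosystems. As a minimal sketch, here is how it might look for a Go service. The layout is an assumption made for illustration: go.mod and go.sum at the repository root, a main package in that same directory, and a hypothetical output binary named app.

```dockerfile
# Sketch only: assumes go.mod and go.sum sit at the repository root and the
# main package lives in that same directory (hypothetical project layout).
FROM golang:1.22
WORKDIR /src
# Dependency manifests only: these change rarely
COPY go.mod go.sum ./
# Cached until go.mod or go.sum changes
RUN go mod download
# Application source changes on most commits, so it is copied last
COPY . .
# "app" is a hypothetical binary name chosen for this example
RUN go build -o /usr/local/bin/app .
CMD ["/usr/local/bin/app"]
```

A source-only change should invalidate the final COPY and the go build step, while the go mod download layer stays cached.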

Combining related RUN instructions to control layer granularity

# Creates two separate layers, and (more importantly) leaves package-manager
# cache/lists behind in the FIRST layer even after the second layer "removes" them,
# since removal in a later layer doesn't shrink an earlier, already-committed layer
RUN apt-get update
RUN apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# Better: combine into ONE layer so cleanup actually reduces that layer's size
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

Since each layer is immutable once committed, "deleting" a file in a later layer doesn't reclaim the space that file used in an earlier layer. It just hides that file from the merged view (recall the union filesystem question). Combining install-then-cleanup into a single RUN instruction ensures the cleanup actually shrinks that one resulting layer, rather than leaving bloat in an earlier layer that a later layer merely masks.
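Putting the two ideas together, the earlier Python example could be written so that the system-package step also cleans up after itself within its own layer. This is a sketch under the same assumptions as that example (a requirements.txt and an app.py at the project root). The --no-install-recommends and --no-cache-dir flags are additions not present in the original example; both keep files that are not needed at runtime out of the layer.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Rarely changes. Update, install, and clean up in ONE RUN so the apt lists
# never persist in any committed layer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends libpq-dev \
    && rm -rf /var/lib/apt/lists/*
# Changes occasionally
COPY requirements.txt .
# Cached unless requirements.txt changes; --no-cache-dir keeps pip's
# download cache out of this layer
RUN pip install --no-cache-dir -r requirements.txt
# Changes on every commit, so it goes last
COPY . .
CMD ["python", "app.py"]
```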

A quick self-check for any existing Dockerfile: change one line of application code, rebuild, and ask what the minimum set of layers should have needed to re-execute. If the actual rebuild touches an expensive dependency-installation step that has nothing to do with that change, the ordering has room to improve. This is often the difference between a multi-minute rebuild and one that takes a couple of seconds.