Images, Dockerfile, and Builds

A representative Dockerfile

FROM node:20-slim
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
USER node
CMD ["node", "server.js"]

Instruction by instruction

  • FROM — every Dockerfile starts with a base image to build on top of; this determines the starting filesystem layers and OS/runtime foundation everything else adds to.
  • RUN — executes a command at build time, and commits its filesystem changes as a new image layer (see the layer caching question) — used for installing packages, compiling code, or any other build-time setup.
  • COPY — copies files/directories from the build context (the directory you run docker build from) into the image's filesystem.
  • WORKDIR — sets the working directory for all subsequent RUN, CMD, COPY, etc. instructions — functionally similar to running cd, but persists across instructions and creates the directory if it doesn't exist.
  • ENV — sets an environment variable that persists into the running container. It is visible to the application at runtime, not just during the build. This is distinct from ARG, which only exists during the build (see that question).
  • EXPOSE — documentation/metadata only. It tells anyone reading the Dockerfile (and tooling such as docker run -P, which publishes every exposed port) which port(s) the containerized application listens on, but on its own it doesn't publish or open that port to the host. You still need -p (or -P) on docker run to map it (see the networking topic).
  • USER — sets which user subsequent instructions run as, and which user the final container's main process runs as by default — critical for the security practice of not running as root (see the security topic).
  • CMD and ENTRYPOINT — both define what actually executes when a container starts, with an important behavioral distinction covered in the next question.

Additional instructions worth knowing

# build-time-only variable (see the ARG vs ENV question)
ARG BUILD_VERSION=dev
# arbitrary metadata attached to the image
LABEL maintainer="team@example.com"
# documents/declares a mount point (see the storage topic)
VOLUME /data
# see the container lifecycle topic
HEALTHCHECK CMD curl -f http://localhost/health || exit 1

Why instruction order matters beyond just readability

Each instruction that touches the filesystem (RUN, COPY, ADD) creates a new cached layer. Docker's build cache invalidates from the point of the first changed instruction onward: every instruction after that point must be re-executed, even if its own inputs didn't change. This ordering sensitivity is why a Dockerfile, which is really executable documentation of how to build and run the application, deserves the same deliberate structure as any other piece of code, not whatever order felt natural while developing.

COPY — simple, explicit, predictable

COPY package.json package-lock.json ./
COPY src/ ./src/

Copies files or directories from the build context straight into the image's filesystem, with no additional behavior — what you see is exactly what happens.

ADD — COPY, plus automatic extraction and remote URL fetching

# ADD automatically extracts a LOCAL tar archive into the destination
ADD myapp.tar.gz /app/

# ADD can fetch directly from a URL
ADD https://example.com/config.json /app/config.json

The first example is the behavior that most often surprises people. If the source is a recognized local archive format (.tar, .tar.gz, .tar.bz2, etc.), ADD automatically unpacks it into the destination directory. COPY would instead just copy the compressed archive file itself, unextracted. This implicit "maybe it extracts, maybe it doesn't, depending on file type" behavior is exactly what Docker's own documentation calls out as a source of confusion.
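
The difference is easy to see in a toy model. The sketch below is not how Docker implements either instruction; copy_like and add_like are hypothetical helper names, and only the local-archive case is modeled (no URLs, no ownership or permission handling).

```python
import io
import shutil
import tarfile
import tempfile
from pathlib import Path


def copy_like(src: Path, dest_dir: Path) -> None:
    """Model of COPY: place the file in dest_dir unchanged."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(src, dest_dir / src.name)


def add_like(src: Path, dest_dir: Path) -> None:
    """Model of ADD for a local source: unpack tar archives, copy anything else."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    if tarfile.is_tarfile(src):
        with tarfile.open(src) as archive:
            archive.extractall(dest_dir)
    else:
        shutil.copy(src, dest_dir / src.name)


def demo() -> tuple[list[str], list[str]]:
    """Build a tiny myapp.tar.gz, then apply both models to it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        archive_path = root / "myapp.tar.gz"
        payload = b"console.log('hi');\n"
        with tarfile.open(archive_path, "w:gz") as archive:
            info = tarfile.TarInfo(name="server.js")
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))

        copy_like(archive_path, root / "copied")
        add_like(archive_path, root / "added")
        copied = sorted(p.name for p in (root / "copied").iterdir())
        added = sorted(p.name for p in (root / "added").iterdir())
        return copied, added


copied_files, added_files = demo()
print(copied_files)  # ['myapp.tar.gz']  (the archive itself)
print(added_files)   # ['server.js']     (the archive's contents)
```

The same source file produces two different destination trees, and nothing in the instruction text signals which one you will get.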

Why COPY is the recommended default

  • Predictability: COPY's behavior is a single, simple operation with no hidden conditional logic based on file type.
  • ADD's remote-URL fetching is generally discouraged — fetching a remote file directly in a Dockerfile instruction means the build result depends on the state of a URL outside your control at build time, which is worse for reproducibility. The fetched file also isn't automatically cleaned up if it's only needed transiently. A RUN curl ... && ... in the same layer, or a multi-stage build, gives more explicit control over this.
  • ADD's auto-extraction is only useful in one specific scenario: unpacking a local tarball as part of assembling the image. This is legitimate, but narrow enough that it's worth reaching for ADD deliberately for that one purpose rather than defaulting to it out of habit. For remote files, prefer an explicit RUN curl/wget, or a multi-stage build step that verifies the artifact.
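
One form that explicit control over a remote fetch can take is pinning a checksum and refusing anything that doesn't match. The sketch below shows only the verification step, with the download itself left out; verify_sha256, the artifact bytes, and the pinned digest are all hypothetical stand-ins.

```python
import hashlib


def verify_sha256(data: bytes, expected_hex: str) -> bytes:
    """Return data unchanged if its SHA-256 digest matches, else raise."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hex.lower():
        raise ValueError(f"checksum mismatch: expected {expected_hex}, got {actual}")
    return data


# Hypothetical artifact, standing in for a downloaded config.json.
artifact = b'{"port": 3000}\n'
# In practice the digest is recorded ahead of time, not computed from the download.
pinned_digest = hashlib.sha256(artifact).hexdigest()

verify_sha256(artifact, pinned_digest)  # matches, so nothing is raised

try:
    verify_sha256(artifact + b"tampered", pinned_digest)
except ValueError as err:
    print("rejected:", err)
```

With a pinned digest, a changed file at the URL fails the build loudly instead of silently changing the image.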

CMD alone — a default command, easily overridden

CMD ["node", "server.js"]
docker run myapp                  # runs: node server.js
docker run myapp node debug.js     # OVERRIDES the entire CMD -- runs: node debug.js instead

Any arguments given after the image name on docker run completely replace the CMD. This makes CMD alone appropriate when the image is meant to be flexible about what it runs — a general-purpose base image, or a development image where you might want to run a shell or a different script for debugging.

ENTRYPOINT alone — a fixed command that always runs

ENTRYPOINT ["node", "server.js"]
docker run myapp                    # runs: node server.js
docker run myapp --port=9000         # runs: node server.js --port=9000 (appended as ARGS, not a replacement)

Arguments given at docker run are appended to the ENTRYPOINT, not used to replace it. This makes ENTRYPOINT appropriate when the image should always run one specific thing no matter what. It essentially makes the container behave like a fixed, dedicated executable.

Combining both — the standard, recommended pattern

ENTRYPOINT ["node"]
CMD ["server.js"]
docker run myapp                # runs: node server.js       (CMD's default argument used)
docker run myapp debug.js        # runs: node debug.js        (CMD's default OVERRIDDEN, but still passed to ENTRYPOINT)

This gives you the best of both: ENTRYPOINT fixes what program runs (node, always), while CMD provides a sensible default argument to it. That default argument is still easy to override for a one-off different invocation, without needing to override the entire command.
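
The three cases above follow one rule, which a few lines of Python can capture. This is a toy model of the exec-form behavior described here, not Docker's code; resolve_argv is a hypothetical name, and it ignores shell form and the --entrypoint flag.

```python
from typing import Optional, Sequence


def resolve_argv(
    entrypoint: Optional[Sequence[str]],
    cmd: Optional[Sequence[str]],
    run_args: Sequence[str] = (),
) -> list[str]:
    """Arguments after the image name replace CMD; the result is appended to ENTRYPOINT."""
    tail = list(run_args) if run_args else list(cmd or [])
    return list(entrypoint or []) + tail


# CMD alone
print(resolve_argv(None, ["node", "server.js"]))                        # ['node', 'server.js']
print(resolve_argv(None, ["node", "server.js"], ["node", "debug.js"]))  # ['node', 'debug.js']
# ENTRYPOINT alone
print(resolve_argv(["node", "server.js"], None, ["--port=9000"]))       # ['node', 'server.js', '--port=9000']
# ENTRYPOINT + CMD
print(resolve_argv(["node"], ["server.js"]))                            # ['node', 'server.js']
print(resolve_argv(["node"], ["server.js"], ["debug.js"]))              # ['node', 'debug.js']
```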

Exec form vs. shell form — a critical, easy-to-miss distinction

# Exec form (recommended): runs the command DIRECTLY, no shell involved
CMD ["node", "server.js"]

# Shell form: runs the command wrapped in "/bin/sh -c ..."
CMD node server.js

The exec form (JSON array syntax) runs the specified program directly as PID 1 inside the container. Signals like SIGTERM (sent by docker stop) go straight to it, allowing graceful shutdown handling. The shell form instead runs /bin/sh -c "node server.js". The shell itself becomes PID 1, and it's the shell's responsibility to forward signals to the actual application process underneath it — a responsibility it doesn't always fulfill correctly. This is a common, subtle cause of containers that don't shut down gracefully: they ignore SIGTERM and only stop after docker stop's timeout forces a SIGKILL.
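
As a rough illustration of which process ends up as PID 1 in each form, here is a small model. pid1_argv is a hypothetical helper: a Python list stands in for the exec form and a plain string for the shell form.

```python
from typing import Sequence, Union


def pid1_argv(command: Union[str, Sequence[str]]) -> list[str]:
    """Toy model: the argv that becomes PID 1 for each CMD form."""
    if isinstance(command, str):
        # Shell form: the shell is PID 1 and the application runs underneath it.
        return ["/bin/sh", "-c", command]
    # Exec form: the program itself is PID 1 and receives signals directly.
    return list(command)


print(pid1_argv(["node", "server.js"]))  # ['node', 'server.js']
print(pid1_argv("node server.js"))       # ['/bin/sh', '-c', 'node server.js']
```

In the second case, the SIGTERM from docker stop is delivered to /bin/sh, not to node.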

Scenario | Recommended setup
Fixed, purpose-built application container | ENTRYPOINT + CMD (overridable default args)
General-purpose or dev image, command often replaced entirely | CMD alone
Either form, always | Exec (JSON array) syntax, for correct signal handling

How the cache decides whether to reuse a layer

# Layer A
FROM node:20-slim
# Layer B
WORKDIR /app
# Layer C: cache key includes package.json's actual content
COPY package.json ./
# Layer D: cache key includes the PRECEDING layer + this instruction's text
RUN npm install
# Layer E: cache key includes the content of every copied file
COPY . .

For each instruction, Docker computes a cache key based on the preceding layer plus that instruction's own inputs. For RUN, that's the literal command text. For COPY/ADD, it's the actual file contents being copied, not just their names. So even a single-character change inside package.json invalidates Layer C and, since caching is sequential, everything after it too.
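
The chaining can be sketched as a hash over "parent key + instruction text + copied file contents". This is a simplified model, not Docker's actual key format; cache_keys, changed, and the file contents are hypothetical.

```python
import hashlib
from typing import Mapping, Sequence, Tuple

# (instruction text, names of the build-context files it copies)
Step = Tuple[str, Sequence[str]]


def cache_keys(steps: Sequence[Step], context: Mapping[str, bytes]) -> list[str]:
    """Chain one key per step: parent key + instruction text + copied file contents."""
    keys: list[str] = []
    parent = ""
    for text, copied in steps:
        digest = hashlib.sha256()
        digest.update(parent.encode())
        digest.update(text.encode())
        for name in sorted(copied):
            digest.update(name.encode())
            digest.update(context[name])
        parent = digest.hexdigest()
        keys.append(parent)
    return keys


STEPS: list[Step] = [
    ("FROM node:20-slim", []),                     # Layer A
    ("WORKDIR /app", []),                          # Layer B
    ("COPY package.json ./", ["package.json"]),    # Layer C
    ("RUN npm install", []),                       # Layer D
    ("COPY . .", ["package.json", "server.js"]),   # Layer E
]


def changed(old_ctx: Mapping[str, bytes], new_ctx: Mapping[str, bytes]) -> str:
    """Labels of the layers whose cache key differs between two build contexts."""
    old, new = cache_keys(STEPS, old_ctx), cache_keys(STEPS, new_ctx)
    return "".join(label for label, a, b in zip("ABCDE", old, new) if a != b)


base = {"package.json": b'{"name": "app"}', "server.js": b"v1"}
code_edit = {**base, "server.js": b"v2"}
manifest_edit = {**base, "package.json": b'{"name": "app2"}'}

print(changed(base, code_edit))      # E
print(changed(base, manifest_edit))  # CDE
```

Layer D's key changes in the second case even though the text "RUN npm install" is identical, because its parent key changed.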

Why this makes rebuilds fast — when structured well

# First build: everything builds from scratch
docker build -t myapp .
# ... (30 seconds, say, mostly spent on `npm install`)

# Change only application code (not package.json), rebuild:
docker build -t myapp .
# Layer A, B, C, D all CACHE HIT (package.json unchanged, so npm install's inputs are identical)
# Only Layer E (COPY . .) and anything after it actually re-executes
# ... (2 seconds)

Because npm install (often one of the slowest steps) sits before the COPY . . that brings in frequently-changing application code, changing application code alone doesn't invalidate the expensive dependency-installation layer at all. This is the single most impactful Dockerfile optimization technique, covered in more depth in the cache-ordering question.

Why cache invalidation cascades forward, never backward

FROM node:20-slim
# Layer C
COPY package.json ./
# Layer D
RUN npm install
# Layer E: if THIS changes, only E (and anything after) rebuilds;
# C and D are unaffected, since their own inputs didn't change
COPY . .

If instead package.json changes, Layer C invalidates, and every layer from C onward (D, E) must rebuild too. This happens even though Layer E's own inputs (the application code) might not have changed at all. This "cascades forward from the first change" rule is why placing rarely-changing, expensive instructions (dependency installation) before frequently-changing ones (application code) is so consistently valuable. It maximizes how often the expensive early layers get to reuse the cache.

Sharing cache and layers across images, not just across builds of the same image

Layers are content-addressed and stored once on a given host. So two entirely different images that happen to share an identical layer (e.g., both FROM node:20-slim, with no differences up to some point) genuinely share that stored layer on disk — not just conceptually, but as literally the same data. This saves both disk space and pull time when a machine already has one image with a shared base layer and pulls another.
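
A minimal sketch of content addressing, assuming a store keyed by SHA-256 digest. LayerStore and the layer contents are hypothetical; real layer storage involves manifests, compression, and more.

```python
import hashlib


class LayerStore:
    """Toy content-addressed store: each distinct layer blob is kept once."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def add_image(self, layers: list[bytes]) -> list[str]:
        """Store an image's layers and return their digests (a stand-in for a manifest)."""
        digests = []
        for blob in layers:
            digest = hashlib.sha256(blob).hexdigest()
            self.blobs.setdefault(digest, blob)  # already present: nothing new is stored
            digests.append(digest)
        return digests


store = LayerStore()
base_layers = [b"debian rootfs", b"node runtime"]  # hypothetical shared base
api = store.add_image(base_layers + [b"api code"])
worker = store.add_image(base_layers + [b"worker code"])

print(len(api) + len(worker))  # 6 layer references across both images
print(len(store.blobs))        # 4 blobs actually stored
print(api[:2] == worker[:2])   # True: the base layers resolve to the same digests
```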

Cache-busting techniques when you deliberately want to skip the cache

docker build --no-cache -t myapp .        # ignore the cache entirely for this build

This is occasionally necessary when a RUN instruction's effects depend on something outside its literal text or copied files. For example, in RUN apt-get update && apt-get install -y curl, the actual packages fetched can change over time even though the instruction's text never does. This is a common, subtle source of "why did my rebuild not pick up the latest security patches" confusion. The cache has no way to know that an identical-looking instruction might now behave differently against a changed remote package repository.

The anti-pattern: copying everything before installing dependencies

# BAD ORDERING
FROM node:20-slim
WORKDIR /app
# copies EVERYTHING, including source code that changes constantly
COPY . .
# this layer's cache key now depends on the ENTIRE copied tree
RUN npm install
CMD ["node", "server.js"]

With this ordering, changing any file in the project invalidates the COPY . . layer. This is true even for a single comment in an unrelated source file that has nothing to do with dependencies. Invalidating the COPY . . layer in turn invalidates the npm install layer right after it, since its cache key depends on the preceding layer. Every single build then re-runs the full dependency installation from scratch, even though the actual dependency list (package.json) hasn't changed at all. This is a slow, entirely avoidable rebuild on every code change.

The fix: copy the dependency manifest first, install, then copy the rest

# GOOD ORDERING
FROM node:20-slim
WORKDIR /app
# only the dependency manifest: changes rarely
COPY package.json package-lock.json ./
# cached, as long as the manifest hasn't changed
RUN npm ci
# application code: changes constantly, but this is now the LAST filesystem-changing step
COPY . .
CMD ["node", "server.js"]

Now, changing application code only invalidates the final COPY . . layer. The npm ci layer, which is often much slower, stays cached as long as package.json/package-lock.json haven't changed. This is the common case for most day-to-day commits.
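
To compare the two orderings side by side, the sketch below simulates a first build, a one-line code edit, and a rebuild against a warm cache. It is a simplified model of the cache, not Docker itself; build, rebuild_after_code_edit, and the file contents are hypothetical.

```python
import hashlib
from typing import Mapping, Sequence, Tuple

Step = Tuple[str, Sequence[str]]  # (instruction text, build-context files it copies)

ALL_FILES = ["package.json", "package-lock.json", "server.js"]
BAD: list[Step] = [
    ("FROM node:20-slim", []),
    ("WORKDIR /app", []),
    ("COPY . .", ALL_FILES),
    ("RUN npm install", []),
]
GOOD: list[Step] = [
    ("FROM node:20-slim", []),
    ("WORKDIR /app", []),
    ("COPY package.json package-lock.json ./", ["package.json", "package-lock.json"]),
    ("RUN npm ci", []),
    ("COPY . .", ALL_FILES),
]


def build(steps: Sequence[Step], context: Mapping[str, bytes], cache: set[str]) -> list[str]:
    """Return the instructions that had to re-execute, adding their keys to the cache."""
    executed = []
    parent = ""
    for text, copied in steps:
        digest = hashlib.sha256((parent + text).encode())
        for name in sorted(copied):
            digest.update(name.encode() + context[name])
        parent = digest.hexdigest()
        if parent not in cache:
            cache.add(parent)
            executed.append(text)
    return executed


def rebuild_after_code_edit(steps: Sequence[Step]) -> list[str]:
    """First build from an empty cache, then edit only server.js and rebuild."""
    cache: set[str] = set()
    context = {"package.json": b"{}", "package-lock.json": b"{}", "server.js": b"v1"}
    build(steps, context, cache)
    context["server.js"] = b"v2"
    return build(steps, context, cache)


print(rebuild_after_code_edit(BAD))   # ['COPY . .', 'RUN npm install']
print(rebuild_after_code_edit(GOOD))  # ['COPY . .']
```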

The general principle, stated once

Order instructions from least-likely-to-change to most-likely-to-change. System package installation and dependency installation (driven by a lockfile that changes relatively rarely) belong early; application source code (which changes on nearly every commit) belongs as late as possible.

FROM python:3.12-slim
# rarely changes
RUN apt-get update && apt-get install -y libpq-dev
# changes occasionally
COPY requirements.txt .
# cached unless requirements.txt changes
RUN pip install -r requirements.txt
# changes on every commit, so it goes last
COPY . .
CMD ["python", "app.py"]

Combining related RUN instructions to control layer granularity

# Creates two separate layers, and (more importantly) leaves package-manager
# cache/lists behind in the FIRST layer even after the second layer "removes" them,
# since removal in a later layer doesn't shrink an earlier, already-committed layer
RUN apt-get update
RUN apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# Better: combine into ONE layer so cleanup actually reduces that layer's size
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

Since each layer is immutable once committed, "deleting" a file in a later layer doesn't reclaim the space that file used in an earlier layer. It just hides that file from the merged view (recall the union filesystem question). Combining install-then-cleanup into a single RUN instruction ensures the cleanup actually shrinks that one resulting layer, rather than leaving bloat in an earlier layer that a later layer merely masks.
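
The gap between "hidden from the merged view" and "removed from the image" can be modeled with layers as dictionaries. This is a toy model; the paths and sizes are made up, and None stands in for a deletion marker in a later layer.

```python
from typing import Optional

# Each layer maps path -> size in bytes; None marks a deletion made in that layer.
Layer = dict[str, Optional[int]]


def merged_view(layers: list[Layer]) -> dict[str, int]:
    """Files visible in the final container filesystem."""
    view: dict[str, int] = {}
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                view.pop(path, None)  # hidden, but still stored in the earlier layer
            else:
                view[path] = size
    return view


def stored_bytes(layers: list[Layer]) -> int:
    """Bytes the image occupies: every layer's files count, hidden or not."""
    return sum(size for layer in layers for size in layer.values() if size is not None)


APT_LISTS = "/var/lib/apt/lists/index"  # hypothetical path and sizes
CURL = "/usr/bin/curl"

split = [{APT_LISTS: 40}, {CURL: 5, APT_LISTS: None}]  # update, then install + rm
combined = [{CURL: 5}]                                 # one RUN: lists never committed

print(merged_view(split) == merged_view(combined))  # True: same visible files
print(stored_bytes(split))                          # 45
print(stored_bytes(combined))                       # 5
```

Both variants give the container the same visible filesystem, but only the combined one avoids storing the package lists.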

A quick self-check for any existing Dockerfile: change one line of application code, rebuild, and ask what the minimum set of layers should have needed to re-execute. If the actual rebuild touches an expensive dependency-installation step that has nothing to do with that change, the ordering has room to improve. This is often the difference between a multi-minute rebuild and one that takes a couple of seconds.