What is a multi-stage build, and what problem does it solve?
Quick Answer
A multi-stage build uses multiple FROM instructions in a single Dockerfile, each starting a new, independent build stage. An early stage can use a full-featured image with compilers and build tools, and a separate, minimal final stage copies in only the compiled artifacts. This keeps the production image from being bloated with a toolchain (compilers, build dependencies, source code) that is needed to produce the application but not to run it.
Detailed Answer
The problem: build tools bloat the final image
# Single-stage build -- the final image includes EVERYTHING used to build it
FROM golang:1.22
WORKDIR /app
COPY . .
RUN go build -o server .
CMD ["./server"]
This works, but the resulting image includes the entire Go toolchain: the compiler, standard library source, and build caches. That adds up to hundreds of megabytes, even though the compiled application is a single binary that is usually a small fraction of that size. None of the build tooling is needed at runtime, and it is unwanted there because it also increases the attack surface.
The multi-stage solution
# Stage 1: "builder" -- has the full toolchain, produces the compiled binary
FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
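# CGO_ENABLED=0 yields a statically linked binary, so it runs on Alpine (musl)
# even though it was built on the glibc-based golang image
RUN CGO_ENABLED=0 go build -o server .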
# Stage 2: the FINAL image -- minimal, only what's needed to RUN the binary
FROM alpine:3.19
COPY --from=builder /app/server /usr/local/bin/server
CMD ["server"]
The COPY --from=builder instruction reaches back into the first stage's filesystem and copies out just the compiled server binary. None of the Go compiler, source code, or build-time dependencies from the builder stage make it into the final image at all. The final image can be a tiny base — even scratch, an entirely empty base image, for a fully static binary with no runtime dependencies. This often shrinks the final image from hundreds of megabytes down to tens of megabytes or less.
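The scratch variant mentioned above might look like the following. This is a minimal sketch, assuming the program is pure Go and can be built with CGO disabled so that the binary has no dynamic library dependencies. The /server path is an arbitrary choice for illustration.

```dockerfile
# Stage 1: build a static binary (CGO disabled, so no libc is needed at runtime)
FROM golang:1.22 AS builder
WORKDIR /app
COPY . .
RUN CGO_ENABLED=0 go build -o server .

# Stage 2: an empty base image that will contain only the binary
FROM scratch
COPY --from=builder /app/server /server
# scratch has no shell, so use the exec form with an absolute path
ENTRYPOINT ["/server"]
```

Because scratch contains nothing at all, anything the binary expects to find on disk, such as CA certificates for outbound TLS connections, would also have to be copied in explicitly.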
Multiple intermediate stages
FROM node:20 AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
FROM node:20 AS build
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build
FROM nginx:alpine AS final
COPY --from=build /app/dist /usr/share/nginx/html
Stages can be named (AS deps, AS build) and referenced by name in later COPY --from= instructions. This is useful for separating concerns — installing dependencies vs. building vs. the final runtime image — even when the language or runtime doesn't produce a single standalone compiled binary the way Go does. Note that this example's final stage even uses a completely different base image (nginx:alpine) than the build stages (node:20). The final stage just needs to serve the already-built static files, with no Node.js runtime required at all.
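For comparison, stages that are not named can still be referenced by their zero-based position in the Dockerfile. The sketch below collapses the dependency and build steps into one unnamed stage and assumes, as in the example above, that npm run build writes its output to /app/dist.

```dockerfile
# The first stage has no name, so later stages refer to it as stage 0
FROM node:20
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

FROM nginx:alpine
# --from=0 means "the first FROM in this file"
COPY --from=0 /app/dist /usr/share/nginx/html
```

Numeric references break silently if a stage is later inserted or reordered, which is one reason the named form is usually easier to maintain.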
Why this matters beyond just image size
- Reduced attack surface: a smaller final image with no compilers, build tools, or source code gives an attacker who compromises the container fewer things to exploit or exfiltrate (see the security topic).
- Faster pulls and deployments: a smaller image transfers faster across the network to every node that needs to run it, which speeds up deployments and autoscaling events, especially across many nodes.
- A single Dockerfile, still: before multi-stage builds existed, achieving this same "build in one environment, run in a minimal one" pattern required a different approach. One option was two separate Dockerfiles, with manual artifact copying between them via a shared volume or a script. Another option was building outside Docker entirely and then COPYing a pre-built artifact in. Both approaches are more awkward and error-prone than expressing the whole pipeline declaratively in one file.