What is Docker, and what problem does it solve?

Docker is a platform for packaging an application together with everything it needs to run — code, runtime, libraries, system tools — into a single, portable **image**, and running that image as an isolated, lightweight **container**. It solves "works on my machine": the exact same image runs identically on a developer's laptop, a CI runner, and a production server, because the image bundles the application's whole runtime environment rather than depending on whatever happens to already be installed on the host.

What's the difference between a Docker image and a container?

An image is a read-only, immutable template — a packaged set of filesystem layers plus metadata (what command to run, what ports to expose) — that never changes once built. A container is a running (or stopped) instance created *from* an image, with its own thin writable layer on top for any runtime changes, plus its own isolated process, network, and filesystem view. One image can be used to start many independent containers, each isolated from the others, the same way a class and its instances relate to each other in object-oriented programming.

How does Docker differ from a virtual machine?

A virtual machine virtualizes hardware and runs a complete, separate guest operating system (its own kernel) on top of a hypervisor — heavyweight, but with a very strong isolation boundary. A Docker container shares the host machine's single kernel, using Linux namespaces and cgroups (see that question) to isolate processes from each other — much lighter-weight (starts in milliseconds, far less overhead per instance) but with a comparatively weaker isolation boundary, since a kernel-level vulnerability can, in principle, be exploited across containers sharing that kernel in a way it can't across separate VMs.

What is the Docker Engine architecture — client, daemon, containerd, and runc?

The **Docker CLI** (`docker` command) is a thin client that sends REST API requests to the **Docker daemon** (`dockerd`), which manages images, networks, and volumes, and delegates actual container execution to **containerd** (a separate container lifecycle manager), which in turn uses **runc** (a low-level OCI-compliant runtime) to create the isolated process using Linux namespaces and cgroups. Each layer exists to separate concerns and allow components to be swapped or reused independently: the boundary between containerd and runc follows the OCI Runtime Spec, and containerd is the same component that Kubernetes commonly drives through its own interface, the CRI.

How do Linux namespaces and cgroups provide container isolation?

**Namespaces** control what a process can *see* — each namespace type (PID, network, mount, UTS, IPC, user) gives a container its own isolated view of processes, network interfaces, filesystem mounts, hostname, and more, so it appears to have the machine to itself even though it's really sharing the host's single kernel with other containers. **cgroups (control groups)** control what a process can *use* — limiting and accounting for CPU, memory, disk I/O, and other resources, so one container can't starve others sharing the same host. Together, namespaces provide isolation and cgroups provide resource control — the two complementary kernel mechanisms that make a container a container.

What is a union/layered filesystem, and how does Docker use it?

A union filesystem lets multiple separate filesystem layers be stacked and presented as a single merged view, where each layer only stores the differences (added, modified, or deleted files) from the layer beneath it. Docker images are built from a stack of these read-only layers (one per Dockerfile instruction that changes the filesystem), and a running container adds one more thin, writable layer on top — this is what makes layers cacheable and shareable across images, and keeps a container's actual runtime footprint small.

What's the difference between the Docker CLI and the Docker daemon (dockerd)?

The Docker CLI (`docker`) is a lightweight client program that translates commands you type into REST API calls — it holds no state itself and does none of the actual work. `dockerd` (the Docker daemon) is the long-running background process that actually does the work: managing images, containers, networks, and volumes, and maintaining all of Docker's state. This split means the CLI can connect to and manage a daemon running on a completely different, remote machine, not just the one the CLI itself is running on.

What is the Open Container Initiative (OCI), and why does it matter?

The OCI is a set of open, vendor-neutral specifications defining what a container image and a container runtime must look like — the **Image Spec** (how an image's layers and metadata are structured) and the **Runtime Spec** (how a compliant runtime should create and run a container from an unpacked image bundle). This standardization is what allows different tools built by different vendors (Docker, Podman, containerd, runc, CRI-O, Kubernetes) to interoperate — an image built by one tool can be run by any OCI-compliant runtime, without vendor lock-in to any single company's proprietary format.

What happens when you run `docker run hello-world` — describe the flow end to end?

The CLI sends a request to the daemon; the daemon checks whether the `hello-world` image exists locally, and if not, pulls it from Docker Hub (layer by layer); the daemon then asks containerd to create and start a container from that image; containerd hands off to runc, which sets up Linux namespaces and cgroups and executes the image's default command inside that isolated environment; the command runs (printing its message and exiting), and the container transitions to the `Exited` state, with its output streamed back up through containerd and the daemon to the CLI, which prints it to your terminal.

Docker Fundamentals and Architecture

What containers actually are, how Docker's components fit together, and the Linux kernel primitives underneath.

The problem before containers

Deploying an application traditionally meant hoping the target machine had the right language runtime version, the right system libraries, and no conflicting versions of anything else already installed. The phrase "it works on my machine" became common for exactly this reason: a developer's laptop, a QA server, and production rarely had identical environments. Subtle mismatches, such as a different OpenSSL version or a missing system package, caused failures that were maddening to reproduce and debug.

What Docker packages together

A Docker image bundles:

The application's own code
The specific language runtime/interpreter version it needs
Every library and system dependency it depends on
Configuration and environment setup

FROM node:20-slim
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
CMD ["node", "server.js"]

Building this once produces a single artifact that contains everything needed to run the application — no more "make sure Node 20 and these specific npm packages are installed on the server first."

Images vs. containers, briefly (covered fully in the next question)

The image is the packaged, immutable artifact. A container is a running instance of that image, isolated from other processes on the host via Linux kernel features (namespaces and cgroups — see that question). It shares the host's kernel rather than virtualizing an entire separate operating system.

Why this matters practically

docker run myapp:1.0

Running this exact command, with this exact image, produces the same userspace environment (files, libraries, runtime versions) whether it's executed on a developer's laptop, a CI server, or production. What the image does not carry is the kernel and the CPU architecture, which still come from the host. The image is the single source of truth for "what the application needs to run," eliminating an entire class of environment-mismatch bugs. This also makes deployment portable across infrastructure: the same image can run on a bare-metal server, a cloud VM, or be scheduled by an orchestrator like Kubernetes (see that stack), without rebuilding anything for each target.

Beyond consistency: additional benefits

Isolation — a container's process, filesystem, and network namespace are separated from the host and from other containers, so one application's dependencies can't silently conflict with another's. For example, two apps needing different, incompatible versions of the same library can run side by side, each in its own container, with no conflict.
Efficiency relative to virtual machines — containers share the host's kernel rather than each running a full separate OS, making them dramatically lighter-weight to start and to run many of them side by side (see the VM comparison question for the full contrast).
A standard packaging and distribution format — images can be pushed to and pulled from a registry (see that topic), giving teams a consistent way to share, version, and deploy applications.

The core mental model

Docker essentially answers: "how do I package an application so it carries its own environment with it, and run that package in a way that's isolated from everything else on the machine, without the overhead of a full virtual machine per application?" Every other Docker concept — images, layers, volumes, networks — exists in service of that core idea.

Related Resources

Docker: What is a Container?

The image: a read-only template

docker build -t myapp:1.0 .
docker images
# REPOSITORY   TAG    IMAGE ID       SIZE
# myapp        1.0    a1b2c3d4e5f6   180MB

An image is the packaged, immutable result of a build — a stack of filesystem layers (see the layer caching question) plus metadata describing how a container from it should run (its default command, exposed ports, environment defaults). An image itself is never "running" — it's inert, stored data, the same way a class definition or a compiled binary on disk isn't itself "executing."

The container: a running instance, with its own writable layer

docker run -d --name web1 myapp:1.0
docker run -d --name web2 myapp:1.0
docker ps
# CONTAINER ID   IMAGE       NAMES
# 7f8e9d0c1b2a   myapp:1.0   web1
# 3c4d5e6f7a8b   myapp:1.0   web2

Each docker run from the same image creates a genuinely separate, independent container — its own process namespace, its own network namespace and IP, and its own thin writable layer stacked on top of the image's read-only layers. Any file changes a container makes at runtime (writing a log file, a temp file) go into that container's own writable layer. This layer is completely invisible to, and independent of, any other container started from the same image. It is lost when that specific container is removed.

The class/instance analogy

Image  ≈ a class definition        (myapp:1.0 — describes what to run and how)
Container ≈ an instance of that class  (web1, web2 — each independently running,
                                          independently stateful, independently
                                          destroyable, without affecting the others
                                          or the original image)

Starting web2 doesn't consume or modify myapp:1.0 in any way — the image stays exactly as it was, ready to spawn any number of further independent containers.
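
The analogy can be made literal with a small model. This is only an illustrative sketch, not how Docker is implemented: the `Image` and `Container` classes and their fields are hypothetical names invented here, and the "filesystem" is just a mapping from path to content.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Immutable template: a name plus read-only file contents."""
    name: str
    files: tuple  # (path, content) pairs; a tuple so it cannot be mutated


@dataclass
class Container:
    """An instance of an image with its own private writable layer."""
    name: str
    image: Image
    writable_layer: dict = field(default_factory=dict)

    def write(self, path: str, content: str) -> None:
        # Runtime writes land in this container's layer, never in the image.
        self.writable_layer[path] = content

    def read(self, path: str) -> str:
        # The writable layer wins; otherwise fall back to the image.
        if path in self.writable_layer:
            return self.writable_layer[path]
        return dict(self.image.files)[path]


image = Image("myapp:1.0", (("/app/server.js", "v1"),))
web1 = Container("web1", image)
web2 = Container("web2", image)

web1.write("/tmp/log.txt", "hello from web1")
```

After the write, `web2.writable_layer` is still empty and `image.files` is unchanged, which mirrors the independence described above.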

What happens to a container's writable-layer data

docker stop web1
docker start web1   # any files written to web1's writable layer are still there
docker rm web1      # NOW that writable layer, and everything in it, is gone permanently

Stopping and restarting a container preserves its writable layer's contents. Removing a container discards it entirely. This is exactly why anything meant to persist beyond a single container's lifetime, such as real application data, belongs in a volume rather than the container's own writable layer (see the storage topic).

Why this distinction is foundational to everything else in Docker

Nearly every other Docker concept builds directly on this image/container split. Layer caching and multi-stage builds are about how images are constructed efficiently. Volumes and bind mounts exist specifically because a container's own writable layer is ephemeral and tied to that one container. Registries exist to distribute images (not containers), so that any number of independent containers can be started from the same shared, versioned artifact across many different machines.

Related Resources

Docker: Images and Containers

The architectural difference

Virtual Machines                          Containers
┌─────────┐ ┌─────────┐                  ┌─────────┐ ┌─────────┐
│  App A   │ │  App B   │                  │  App A   │ │  App B   │
│ Bins/Libs│ │ Bins/Libs│                  │ Bins/Libs│ │ Bins/Libs│
│ Guest OS │ │ Guest OS │                  └─────────┘ └─────────┘
│ (own     │ │ (own     │                  ┌───────────────────────┐
│  kernel) │ │  kernel) │                  │   Docker Engine         │
└─────────┘ └─────────┘                  ├───────────────────────┤
┌───────────────────────┐                  │   Host OS (ONE kernel,  │
│      Hypervisor          │                  │   shared by all         │
├───────────────────────┤                  │   containers)            │
│      Host OS               │                  └───────────────────────┘
└───────────────────────┘

Virtual machines: virtualize hardware, run a full guest OS each

Each VM includes its own complete guest operating system with its own kernel, running atop a hypervisor (which itself virtualizes CPU, memory, disk, and network for each guest). This gives very strong isolation: a compromise inside one VM's guest kernel doesn't directly threaten another VM's kernel. But this comes at a real cost. Each VM's guest OS consumes its own chunk of memory and disk just to boot and run, and starting a VM typically takes tens of seconds to minutes (booting an entire operating system).

Containers: share the host's kernel, isolate at the process level

A container is, at its core, just a regular process on the host — isolated from other processes using Linux kernel features (namespaces for what it can see, cgroups for what resources it can use — see that question) rather than running its own separate kernel at all. This means containers start in milliseconds. There's no OS to boot: the kernel is already running, and a container is just a newly isolated process within it. Containers also have far lower memory/disk overhead per instance, since there's no duplicated guest-OS footprint for every single container.

The isolation tradeoff, stated plainly

| Aspect | Virtual Machines | Containers |
| --- | --- | --- |
| Isolation boundary | Separate kernel per VM; very strong | Shared host kernel; weaker, process-level isolation |
| Startup time | Seconds to minutes (booting an OS) | Milliseconds (starting a process) |
| Resource overhead per instance | High (a full guest OS each) | Low (just the process and its isolated view) |
| Density (instances per host) | Lower | Much higher |
| Kernel vulnerabilities | Isolated to that VM's own kernel | Can, in principle, be exploited to escape container isolation and affect the shared host kernel/other containers |

Why this tradeoff matters in practice

Containers are the better fit when you need to run many instances of many different applications efficiently, with fast startup, and where the isolation the shared-kernel model provides is sufficient for your trust boundary (see the multi-tenancy discussion in the Kubernetes stack for when it isn't). VMs remain the right choice when you genuinely need the strongest possible isolation between workloads (e.g., running truly untrusted, mutually adversarial code, or needing entirely different kernels/operating systems side by side on the same hardware). The extra overhead is the price paid for a meaningfully stronger security boundary.

They aren't mutually exclusive

In practice, most container workloads run inside VMs anyway. A cloud provider's "bare metal" host running Docker is unusual. More commonly, Docker runs inside a cloud VM instance, which itself runs on a hypervisor shared with other tenants' VMs. This layered approach combines the VM's strong isolation between different customers/tenants at the infrastructure level with the container's lightweight, fast-starting isolation for individual applications within one tenant's own workloads.

Related Resources

Docker: Containers vs Virtual Machines

The layered architecture

docker CLI  ──(REST API)──▶  dockerd (Docker daemon)
                                    │
                                    ▼
                             containerd (container lifecycle manager)
                                    │
                                    ▼
                             runc (OCI runtime — creates the actual isolated process)
                                    │
                                    ▼
                       Linux kernel (namespaces + cgroups)

Docker CLI — a thin client

docker run -d -p 8080:80 nginx

The docker command itself does almost no work directly — it constructs an HTTP request describing what you asked for and sends it to the Docker daemon's REST API (typically over a Unix socket, /var/run/docker.sock, or a TCP socket if configured for remote access). This is why remote Docker management tools, and Docker's own CLI running against a remote daemon, both work — the CLI is just one possible client of a well-defined API.

dockerd — the daemon, managing the bigger picture

The Docker daemon handles the higher-level concerns: building images, managing networks and volumes, handling the REST API, and enforcing Docker-level configuration. But it delegates the actual work of running a container to containerd, rather than doing it directly itself.

containerd — container lifecycle management

containerd is a separate, standalone component (donated to and now governed by the CNCF, the same foundation that hosts Kubernetes) responsible for the full container lifecycle: pulling images, managing storage, and supervising running containers. Notably, containerd is not aimed at end users the way docker is. It exposes a gRPC API and ships only a minimal debugging client (ctr). It's designed to be driven by a higher-level tool, such as dockerd, or directly by a Kubernetes node's kubelet via the CRI (see the Kubernetes stack's CRI question).

runc — the low-level OCI runtime

runc is the component that does the actual, final work of creating an isolated container process — setting up Linux namespaces (see that question), configuring cgroups, and then executing the container's process within that isolated environment. runc implements the OCI (Open Container Initiative) Runtime Specification (see that question). This is exactly why alternative low-level runtimes, like Kata Containers or gVisor's runtime, can be swapped in for stronger isolation without containerd or dockerd needing runtime-specific code for each one.

Why this many layers, rather than one monolithic tool

Each layer standardizes a different concern, allowing components above and below it to be swapped independently. containerd can be used directly by Kubernetes without needing dockerd at all (bypassing Docker entirely, which is exactly what happened when Kubernetes deprecated dockershim — see that stack's question). runc can be swapped for a stronger-isolation OCI-compliant runtime without containerd needing to change. This layered, standardized design is precisely why the broader container ecosystem (Docker, Kubernetes, Podman, and others) can share and interoperate around common lower-level components rather than each reimplementing container execution from scratch.
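
The "swap a layer without touching its neighbours" idea can be sketched as a toy model. Everything here is hypothetical and for illustration only: `OCIRuntime`, `RuncLike`, `SandboxedLike`, and `LifecycleManager` are invented names, and the strings they return merely stand in for real work done by runc, a sandboxing runtime, and containerd.

```python
from typing import Protocol


class OCIRuntime(Protocol):
    """The narrow contract the lifecycle manager depends on."""

    def create(self, container_id: str, command: list[str]) -> str: ...


class RuncLike:
    """Stand-in for a namespace/cgroup based runtime."""

    def create(self, container_id: str, command: list[str]) -> str:
        return f"{container_id}: namespaces+cgroups -> exec {' '.join(command)}"


class SandboxedLike:
    """Stand-in for a stronger-isolation runtime behind the same contract."""

    def create(self, container_id: str, command: list[str]) -> str:
        return f"{container_id}: lightweight VM -> exec {' '.join(command)}"


class LifecycleManager:
    """Stand-in for containerd: tracks containers, delegates process creation."""

    def __init__(self, runtime: OCIRuntime):
        self.runtime = runtime
        self.containers: dict[str, str] = {}

    def start(self, container_id: str, command: list[str]) -> str:
        self.containers[container_id] = self.runtime.create(container_id, command)
        return self.containers[container_id]


default = LifecycleManager(RuncLike())
sandboxed = LifecycleManager(SandboxedLike())
```

`LifecycleManager` contains no runtime-specific code, so either runtime can be passed in unchanged. That is the property the standardized interface is meant to buy.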

Practical relevance

When troubleshooting a Docker issue, understanding this chain tells you where to look. docker CLI errors about connecting to the daemon point at dockerd's availability. A container failing to actually start (versus the image failing to build) often points further down the stack toward containerd or runc-level issues, such as kernel feature availability or cgroup configuration. Understanding that containerd is an independent component that works with or without the Docker CLI on top also explains why the same container images and runtime concepts apply, whether you're using plain Docker or a Kubernetes cluster built on the same underlying containerd.

Related Resources

Docker: Docker Architecture

Namespaces — controlling what a process can see

A Linux namespace wraps a global system resource so that processes inside the namespace see their own isolated instance of it, while processes outside see the normal, unwrapped resource (or a different namespace's instance entirely).

| Namespace | What it isolates |
| --- | --- |
| PID | Process IDs: a container's main process sees itself as PID 1, unaware of any other processes running on the host or in other containers |
| NET | Network interfaces, IP addresses, routing tables, ports: a container gets its own virtual network stack, distinct from the host's |
| MNT | Filesystem mount points: a container sees only its own filesystem view (its image's layers plus any mounted volumes), not the host's real filesystem |
| UTS | Hostname and domain name: a container can have its own hostname, independent of the host machine's |
| IPC | Inter-process communication resources (shared memory, semaphores): prevents one container's IPC objects from being visible to or colliding with another's |
| USER | User and group IDs: lets a process be root inside the container's namespace while mapping to an unprivileged, non-root user on the actual host, reducing the impact of a container escape. Docker does not enable this remapping by default; it is opt-in through the daemon's userns-remap setting or rootless mode |

# Inside a container, the container's own main process appears as PID 1
docker exec my-container ps aux
# PID   USER   COMMAND
# 1     root   node server.js      <- this process's REAL host PID might be, say, 48213

This is why a container "sees" only its own processes, its own network configuration, and its own filesystem. This gives the strong illusion of running on a dedicated machine, even though it's really just an ordinarily-scheduled process on the shared host, viewed through a namespace-restricted lens.
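
On a Linux host, the kernel exposes each process's namespace membership as symbolic links under /proc/<pid>/ns. The following is a minimal sketch that reads them using only the standard library. The helper names `read_namespaces` and `share_namespace` are hypothetical, and on a non-Linux system the first function simply returns an empty mapping.

```python
import os


def read_namespaces(pid: str = "self", proc_root: str = "/proc") -> dict:
    """Map namespace type -> identifier string for one process (Linux only)."""
    ns_dir = os.path.join(proc_root, pid, "ns")
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, no such process, or no permission
    for entry in entries:
        try:
            # Each entry is a symlink whose target names the namespace instance.
            result[entry] = os.readlink(os.path.join(ns_dir, entry))
        except OSError:
            continue
    return result


def share_namespace(a: dict, b: dict, ns_type: str) -> bool:
    """True when both processes are in the same namespace of the given type."""
    return ns_type in a and a.get(ns_type) == b.get(ns_type)


if __name__ == "__main__":
    for ns_type, ident in read_namespaces().items():
        print(ns_type, ident)
```

Comparing the output for a shell on the host with the output for a container's process (looked up by its host PID, which typically needs root) would show different identifiers for the namespace types the container has been given its own instance of.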

cgroups — controlling what a process can use

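Namespaces control visibility. cgroups control resource consumption: how much CPU, memory, and disk I/O a process (or group of processes) is allowed to use, and how many processes it may create. Network bandwidth is not capped by cgroups directly; that is normally left to traffic-control tooling. cgroups also provide accounting and metrics for actual usage.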

docker run --memory="512m" --cpus="1.5" myapp:1.0

This translates directly into cgroup configuration. The kernel's cgroup subsystem enforces that this container's processes can never allocate more than 512MB of memory (triggering an OOM kill if exceeded — the same underlying mechanism covered in the Kubernetes stack's OOMKilled question). It also caps the container at 1.5 CPU cores' worth of scheduling time, regardless of how much the host machine actually has available.
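
To make "translates into cgroup configuration" concrete, here is a hedged sketch of the arithmetic for the two flags above, assuming a cgroup v2 host where the limits live in files named memory.max and cpu.max and the CPU period is the common 100000 microseconds. The function names are hypothetical. Docker does not run this code; it passes equivalent values down to the runtime, and on cgroup v1 hosts the file names differ.

```python
def parse_memory(value: str) -> int:
    """Convert a size such as '512m' to bytes, using binary units."""
    units = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}
    value = value.strip().lower()
    if value and value[-1] in units:
        return int(float(value[:-1]) * units[value[-1]])
    return int(value)


def cgroup_v2_settings(memory: str, cpus: float, period_us: int = 100_000) -> dict:
    """Return the values that would be written to the cgroup v2 control files."""
    quota_us = int(cpus * period_us)  # CPU time allowed per period
    return {
        "memory.max": str(parse_memory(memory)),
        "cpu.max": f"{quota_us} {period_us}",
    }


settings = cgroup_v2_settings("512m", 1.5)
print(settings)
```

Here `settings` comes out as memory.max = 536870912 and cpu.max = "150000 100000": the container's processes may use 150 ms of CPU time in every 100 ms window, which is what "1.5 cores" means to the scheduler.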

Why both are needed together

Namespaces alone would let a container see only itself. But without cgroups, nothing would stop that container from consuming all of the host's CPU or memory, starving every other container sharing the machine. Isolation of view without control of consumption isn't enough for a genuinely multi-tenant host. Conversely, cgroups alone (limiting resource usage) without namespaces would still let one container's processes see and potentially interfere with every other process on the host. Together, namespaces provide the illusion of a dedicated machine, and cgroups provide the guarantee that one tenant can't monopolize the shared machine's real resources.

Why this matters beyond trivia

Understanding that "a container" is really just an ordinary Linux process — made to look isolated via namespaces and made resource-bounded via cgroups, not some fundamentally different kind of virtualized entity — explains a lot of otherwise-surprising container behavior. It explains why docker top can show container processes' real host PIDs. It explains why a container "escape" vulnerability is fundamentally about breaking out of namespace/cgroup confinement rather than "hacking a virtual machine." And it explains why Kubernetes's resource requests/limits (see that stack) map directly onto these exact same underlying cgroup mechanisms.

Related Resources

Linux man-pages: namespaces(7)

The layered structure of an image

# Layer 1: the base image itself (many layers internally)
FROM node:20-slim
# Layer 2: creates /app if it does not exist and makes it the working directory
WORKDIR /app
# Layer 3: adds package.json
COPY package.json .
# Layer 4: adds node_modules (often the largest layer)
RUN npm install
# Layer 5: adds the rest of the application code
COPY . .

Each instruction that changes the filesystem (COPY, RUN, ADD) produces a new, read-only layer representing just the diff from the layer beneath it — not a full copy of the entire filesystem at that point. docker history myapp:1.0 shows exactly these layers and their individual sizes.

OverlayFS — the modern default union filesystem

On most Linux installations Docker's default storage driver is overlay2, which uses the kernel's OverlayFS to merge these stacked layers into what looks like a single, normal filesystem to the running container. Reading a file transparently checks the topmost layer first, falling back through lower layers until the file is found. Every write the container makes at runtime goes into a fresh, thin writable layer that is added on top of the image's read-only layers when the container is created.

Container's view (merged):        Actual storage (layered):
/app/server.js                    Writable layer:   (container's own runtime writes)
/app/node_modules/...              Layer 5 (read-only): application code
/app/package.json                  Layer 4 (read-only): node_modules
                                    Layer 3 (read-only): package.json
                                    Layer 2 (read-only): WORKDIR metadata
                                    Layer 1 (read-only): base image (node:20-slim)

Copy-on-write — why modifying a file doesn't touch the underlying image layer

If a running container "modifies" a file that exists in a read-only lower layer, OverlayFS doesn't actually change that lower layer at all (it can't — it's read-only and potentially shared with other containers/images). Instead, it copies the file up into the container's own writable layer first, and the modification happens there. This copy-on-write behavior is exactly what allows many containers to be started from the identical same image simultaneously, each with its own independent, isolated writable layer. All of them share the same underlying read-only image layers on disk, with no risk of one container's changes affecting another's, or the original image.
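
The lookup order and the copy-up step can be imitated with plain dictionaries. This is a toy model under stated assumptions, not OverlayFS itself: the `OverlayView` class and the `WHITEOUT` marker are hypothetical names, layers are dicts of path to content, and a deletion is recorded as a marker in the upper layer in the same spirit as an overlay whiteout.

```python
WHITEOUT = object()  # marker meaning "this path is deleted at this layer"


class OverlayView:
    """Merged view over shared read-only layers plus one private upper layer."""

    def __init__(self, lower_layers: list):
        self.lower = lower_layers  # ordered bottom -> top, never modified
        self.upper = {}            # this container's writable layer

    def read(self, path: str) -> str:
        # Search the writable layer first, then the image layers top-down.
        for layer in [self.upper, *reversed(self.lower)]:
            if path in layer:
                if layer[path] is WHITEOUT:
                    raise FileNotFoundError(path)
                return layer[path]
        raise FileNotFoundError(path)

    def exists(self, path: str) -> bool:
        try:
            self.read(path)
            return True
        except FileNotFoundError:
            return False

    def append(self, path: str, extra: str) -> None:
        # Copy-on-write: take the current content, store the result in upper.
        self.upper[path] = self.read(path) + extra

    def delete(self, path: str) -> None:
        if not self.exists(path):
            raise FileNotFoundError(path)
        self.upper[path] = WHITEOUT  # lower layers keep their copy


base_layer = {"/etc/motd": "hello"}
app_layer = {"/app/server.js": "v1"}
c1 = OverlayView([base_layer, app_layer])
c2 = OverlayView([base_layer, app_layer])

c1.append("/etc/motd", " world")
c1.delete("/app/server.js")
```

After these two operations `c1` sees the modified motd and no server.js, while `c2`, `base_layer`, and `app_layer` are exactly as they started. Note that the delete did not free anything in `app_layer`; it only hid the file from `c1`.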

Why layers are cacheable and shareable

Because each layer is content-addressed (identified by a hash of its contents) and immutable once built, Docker can reuse an identical layer across many different images that happen to share it. For example, two different application images both built FROM node:20-slim share every one of that base image's layers on disk, storing that shared content only once, not duplicated per image. This is also the exact mechanism behind Docker's build cache (see the layer caching question): if a layer's inputs haven't changed, Docker reuses the previously-built layer instead of rebuilding it.
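
Content addressing is easy to demonstrate in miniature. The sketch below is illustrative only: `LayerStore` is a hypothetical class, and the byte strings stand in for real layer archives. Real digests are computed over the layer's archive bytes, and the build cache uses its own keys derived from the instruction and its inputs.

```python
import hashlib


class LayerStore:
    """Stores each distinct blob once, keyed by the SHA-256 of its content."""

    def __init__(self):
        self.blobs = {}

    def add(self, content: bytes) -> str:
        digest = "sha256:" + hashlib.sha256(content).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self.blobs.setdefault(digest, content)
        return digest


store = LayerStore()
base = [b"base OS files", b"node runtime files"]  # stand-ins for base layers

image_a = [store.add(layer) for layer in base + [b"app A code"]]
image_b = [store.add(layer) for layer in base + [b"app B code"]]

print(len(image_a) + len(image_b), "layer references,", len(store.blobs), "blobs stored")
```

The two images reference six layers between them, but the store holds four blobs, because the two base layers are shared.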

Why this matters practically

Understanding the layered/copy-on-write model explains several things. It explains why deleting a large file in a later RUN instruction does not shrink the image: the file still exists in the earlier layer, and the later layer only records that it is hidden. This is why a download and its cleanup are usually placed in the same RUN instruction. It explains why multi-stage builds (see that question) can discard entire heavyweight build-only layers from the final image. And it explains why the disk space several images use on one host is usually less than the sum of their individually reported sizes, since layers they share are stored only once.

Related Resources

Docker: Storage Drivers

The CLI is a thin, stateless client

docker ps
docker run -d nginx
docker images

Every one of these commands does the same basic thing internally: the docker binary formats an HTTP request describing the action, sends it to the daemon's REST API, and formats whatever JSON response comes back into the human-readable output you see. The CLI itself holds no persistent state about running containers, images, or anything else — ask it something without a daemon to talk to, and it has nothing to report.

docker ps
# Cannot connect to the Docker daemon at unix:///var/run/docker.sock.
# Is the docker daemon running?

This exact error — extremely common for anyone new to Docker — makes the client/daemon split concrete: the CLI genuinely can't do anything on its own without a running daemon to actually talk to.

dockerd holds and manages all real state

The daemon is a persistent background process (typically started as a system service) that:

Manages the actual container lifecycle (working with containerd/runc underneath — see that question)
Builds and stores images, and manages their layers on disk
Manages networks and volumes
Exposes the REST API that the CLI (and any other client) communicates with
Persists all of this state across CLI invocations — the daemon keeps running and tracking everything even when no docker command is currently executing

The API socket

ls -la /var/run/docker.sock

By default, the CLI and daemon communicate over a Unix socket on the local machine (/var/run/docker.sock). This is also exactly why membership in the docker group, which grants read/write access to that socket, is functionally equivalent to root access on the host — a detail covered more fully in the security topic's Docker-socket question.
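
What "speaking HTTP over the socket" looks like can be sketched with Python's standard library alone. This is a minimal illustration assuming a Linux host. `UnixHTTPConnection` and `engine_get` are hypothetical helper names, while /version and /containers/json are Engine API paths. The function returns None rather than raising when the daemon cannot be reached.

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that connects to a Unix socket instead of a TCP port."""

    def __init__(self, socket_path: str, timeout: float = 5.0):
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def engine_get(path: str, socket_path: str = "/var/run/docker.sock"):
    """Return decoded JSON from the Engine API, or None if unreachable."""
    conn = UnixHTTPConnection(socket_path)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return json.loads(response.read())
    except (OSError, http.client.HTTPException, ValueError):
        # Daemon not running, no permission on the socket, or a non-JSON reply.
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    print(engine_get("/version"))
```

Calling `engine_get("/containers/json")` asks for roughly the same data `docker ps` displays. Anything able to open the socket can issue such requests, which is the reason socket access is treated as equivalent to root.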

Why this split matters: remote Docker management

docker -H tcp://remote-server:2376 ps
# or via the DOCKER_HOST environment variable
export DOCKER_HOST=tcp://remote-server:2376
docker ps

Because the CLI is just a client speaking a well-defined REST API, it can connect to a daemon running on an entirely different machine (over TCP, ideally secured with TLS client certificates) just as easily as it connects to a local Unix socket. This is what lets tools and workflows manage remote Docker hosts, and what lets a CI runner build images against a Docker daemon running on a separate build server.
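
The endpoint selection itself is simple enough to sketch. The following is an assumption-laden simplification, not the CLI's actual logic: `resolve_endpoint` is a hypothetical function, it handles only the unix:// and tcp:// forms shown above, and it ignores Docker contexts and other schemes such as ssh://.

```python
import os
from urllib.parse import urlsplit

DEFAULT_HOST = "unix:///var/run/docker.sock"


def resolve_endpoint(flag_host=None, environ=None):
    """Pick the daemon endpoint: the -H value, then DOCKER_HOST, then the default."""
    environ = os.environ if environ is None else environ
    raw = flag_host or environ.get("DOCKER_HOST") or DEFAULT_HOST
    parts = urlsplit(raw)
    if parts.scheme == "unix":
        return ("unix", parts.path)
    if parts.scheme == "tcp":
        return ("tcp", (parts.hostname, parts.port))
    raise ValueError(f"unsupported scheme in {raw!r}")


local = resolve_endpoint(environ={})
remote = resolve_endpoint(environ={"DOCKER_HOST": "tcp://remote-server:2376"})
print(local, remote)
```

Whichever endpoint is chosen, the requests sent over it are the same. Only the transport underneath changes.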

The same pattern shows up elsewhere in the ecosystem

This client/daemon separation is a recurring architectural pattern well beyond Docker. The Kubernetes API server plays a directly analogous role (kubectl is a thin client, exactly the way docker is). Recognizing this pattern — a thin CLI translating commands into API calls against a stateful backend service — transfers directly to understanding how most modern infrastructure tooling is built.

Practical troubleshooting relevance

"Cannot connect to the Docker daemon" errors point specifically at the daemon not running, not being reachable at the expected socket/address, or a permissions issue accessing the socket. They do not point at anything wrong with the CLI itself, since the CLI has essentially nothing of its own that can fail independently of its ability to reach the daemon.

Related Resources

Docker: Docker Architecture

Why standardization became necessary

In Docker's early years, "a Docker image" and "how a container runs" were both effectively defined by Docker's own implementation; there was no independent, vendor-neutral specification. As container adoption grew, other tools (including rkt, and later Kubernetes itself) wanted to work with container images and run containers without depending entirely on that implementation. In response, the industry, with Docker itself as a founding contributor, created the OCI in 2015 to formally standardize the image format and runtime behavior. The OCI is hosted under the Linux Foundation.

The two core specifications

The OCI Image Specification defines exactly how a container image is structured — its layered filesystem format, its configuration (default command, environment variables, exposed ports), and how these are packaged and content-addressed. Any tool that produces an OCI-compliant image (Docker's docker build, Podman's podman build, buildah, Google's ko, and others) produces something any OCI-compliant runtime can consume, regardless of which tool built it.
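
As a rough picture of what "structured and content-addressed" means, the sketch below assembles an abbreviated image manifest. The media type strings and the mediaType/digest/size descriptor fields come from the Image Spec. The `descriptor` helper, the placeholder byte strings, and the trimmed-down config are assumptions for illustration, and a real config carries more fields.

```python
import hashlib
import json


def descriptor(media_type: str, blob: bytes) -> dict:
    """Describe a blob by type, SHA-256 digest, and size."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


# Abbreviated config: default command only (placeholder values).
config_blob = json.dumps(
    {"architecture": "amd64", "os": "linux", "config": {"Cmd": ["node", "server.js"]}}
).encode()

# Placeholders standing in for compressed layer archives.
layer_blobs = [b"<layer 1 archive bytes>", b"<layer 2 archive bytes>"]

manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": descriptor("application/vnd.oci.image.config.v1+json", config_blob),
    "layers": [
        descriptor("application/vnd.oci.image.layer.v1.tar+gzip", blob)
        for blob in layer_blobs
    ],
}

print(json.dumps(manifest, indent=2))
```

The manifest never embeds the layers; it only points at them by digest, which is why any compliant tool can fetch, verify, and unpack an image another tool built.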

The OCI Runtime Specification defines what a compliant runtime must do given an unpacked filesystem bundle and a configuration file: how to actually create the isolated, running container (setting up the namespaces and cgroups covered earlier) and manage its lifecycle. runc (see the architecture question) is the reference implementation of this spec, but it's not the only one. gVisor and Kata Containers are both OCI Runtime Spec-compliant runtimes that provide stronger isolation than runc's standard approach (gVisor via a user-space kernel intercepting syscalls, Kata via lightweight per-container VMs). Both can be swapped in for runc in an OCI-compliant system without the layers above (containerd, Docker, Kubernetes) needing to change.
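The "configuration file" in a bundle is a JSON document, conventionally named `config.json`, that sits next to the unpacked root filesystem. The sketch below builds an illustrative subset of it. The field names follow the runtime spec, but this is not a complete or validated configuration, and the function name `minimal_runtime_config` and the `/hello` command are assumptions for the example.

```python
import json


def minimal_runtime_config(args: list[str], rootfs: str = "rootfs") -> dict:
    """Build an illustrative subset of an OCI runtime config (not exhaustive)."""
    return {
        "ociVersion": "1.0.2",
        # What to execute inside the container, and as which user.
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "cwd": "/",
            "args": args,
        },
        # Where the unpacked image filesystem lives, relative to the bundle.
        "root": {"path": rootfs, "readonly": True},
        # Which kinds of isolation the runtime should set up.
        "linux": {
            "namespaces": [
                {"type": "pid"},
                {"type": "network"},
                {"type": "mount"},
                {"type": "ipc"},
                {"type": "uts"},
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(minimal_runtime_config(["/hello"]), indent=2))
```

Any compliant runtime reads the same document, which is why an alternative runtime can be substituted without changes to the layers above it.
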

Why this matters practically

Image built with: docker build, podman build, buildah, or any OCI-compliant builder
        ↓ (all produce the same standardized image format)
Runnable by: Docker, Podman, containerd (directly, or via Kubernetes), CRI-O,
             or any other OCI-compliant runtime

This is what makes an image built and tested locally with Docker deployable, unchanged, onto a Kubernetes cluster that uses containerd or CRI-O directly. Recall from the Kubernetes stack that dockershim was removed (in Kubernetes 1.24): Kubernetes talks to containerd or CRI-O via the CRI, not to Docker itself. Docker-built images still run there without modification because both sides honor the same OCI Image Spec. It also means an organization isn't locked into any single vendor's tooling: switching from Docker to Podman for local development, or from one CRI-compliant runtime to another in a cluster, doesn't require rebuilding or reformatting existing images.

The broader pattern

The OCI is one instance of a recurring theme across the container ecosystem: standardized interfaces exist specifically so that different vendors' implementations of each layer can be mixed and matched, rather than requiring a single vendor's full, proprietary stack top to bottom. These interfaces include OCI for images/runtimes, CRI for how Kubernetes talks to runtimes, CNI for networking, and CSI for storage — several of these are covered in the Kubernetes stack. Recognizing this pattern — and being able to name OCI specifically as the image/runtime layer of it — signals a genuine understanding of why the container ecosystem looks the way it does today. It is more than just familiarity with Docker's specific commands.

Related Resources

Open Container Initiative


Tracing this single, simple command through every layer of the architecture ties together everything covered elsewhere in this topic.

docker run hello-world

Step 1: The CLI sends a request to the daemon

The docker CLI has no logic of its own for running containers — it constructs an HTTP request (POST /containers/create, followed by POST /containers/{id}/start) and sends it to dockerd over the local Unix socket (see the CLI/daemon question).
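The two requests can be reproduced with nothing but the Python standard library, which shows how little the client needs to do. This is a rough sketch, not the CLI's actual code: the socket path `/var/run/docker.sock` is an assumed default, `UnixHTTPConnection` and the helper functions are hypothetical names, and the sketch omits what the real CLI also does (pulling the image when create reports it missing, and attaching to the output streams).

```python
import http.client
import json
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that dials a Unix socket instead of a TCP address."""

    def __init__(self, socket_path: str):
        super().__init__("localhost")  # only used to fill in the Host header
        self.socket_path = socket_path

    def connect(self) -> None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.connect(self.socket_path)
        self.sock = sock


def create_request(image: str) -> tuple[str, str, bytes]:
    """Method, path, and JSON body for creating a container from `image`."""
    return ("POST", "/containers/create", json.dumps({"Image": image}).encode())


def start_request(container_id: str) -> tuple[str, str, bytes]:
    """Method, path, and (empty) body for starting an existing container."""
    return ("POST", f"/containers/{container_id}/start", b"")


def create_and_start(image: str, socket_path: str = "/var/run/docker.sock") -> str:
    """Send create then start; return the new container's ID."""
    conn = UnixHTTPConnection(socket_path)
    try:
        method, path, body = create_request(image)
        conn.request(method, path, body=body,
                     headers={"Content-Type": "application/json"})
        resp = conn.getresponse()
        payload = resp.read()
        if resp.status != 201:  # e.g. 404 when the image is not present locally
            raise RuntimeError(f"create failed: {resp.status} {payload!r}")
        container_id = json.loads(payload)["Id"]

        method, path, body = start_request(container_id)
        conn.request(method, path, body=body)
        resp = conn.getresponse()
        resp.read()
        if resp.status not in (204, 304):  # 304 means it was already started
            raise RuntimeError(f"start failed: {resp.status}")
        return container_id
    finally:
        conn.close()
```

With a daemon running and the image already present, `create_and_start("hello-world")` would return the new container's ID. All of the actual work happens on the daemon's side of the socket.
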

Step 2: The daemon checks for the image locally

Unable to find image 'hello-world:latest' locally

If you've never pulled hello-world before, the daemon doesn't have it in its local image store — it needs to fetch it.

Step 3: Pulling the image from a registry

latest: Pulling from library/hello-world
719385e32844: Pull complete
Digest: sha256:...
Status: Downloaded newer image for hello-world:latest

The daemon contacts Docker Hub (the default registry; see that topic), downloads the image's manifest and configuration along with its layers (each identified by content hash), and stores them in its local layer cache.
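The output above shows `hello-world` becoming `hello-world:latest` and then `library/hello-world`. A simplified sketch of that name expansion follows, assuming the commonly documented defaults (tag `latest`, registry `docker.io`, and the `library/` namespace for official images). The function `normalize_reference` is a hypothetical helper, and it deliberately ignores digest references of the form `name@sha256:...`.

```python
def normalize_reference(ref: str) -> str:
    """Expand a short image name to registry/repository:tag (simplified)."""
    # A ':' is a tag separator only if it comes after the last '/', so a
    # registry port such as 'localhost:5000/app' is not mistaken for a tag.
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        name, tag = ref[:colon], ref[colon + 1:]
    else:
        name, tag = ref, "latest"

    # Treat the first path component as a registry host only if it looks like one.
    first, _, rest = name.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        registry, repository = first, rest
    else:
        registry, repository = "docker.io", name

    # Official images on Docker Hub live under the implicit 'library/' namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = "library/" + repository

    return f"{registry}/{repository}:{tag}"


if __name__ == "__main__":
    for short in ("hello-world", "nginx:1.25", "localhost:5000/app"):
        print(short, "->", normalize_reference(short))
```
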

Step 4: The daemon delegates actual container creation to containerd

dockerd doesn't create the container's isolated process itself. It calls into containerd (see the architecture question), which manages the container's lifecycle: unpacking the image's layers into a filesystem bundle, and preparing everything a runtime needs to actually start the container.

Step 5: containerd hands off to runc

containerd invokes runc (or whichever OCI-compliant runtime is configured), passing it the prepared filesystem bundle and an OCI runtime configuration. runc does the actual low-level work: creating new Linux namespaces (PID, network, mount, etc. — see that question) and configuring cgroups for the new container. It then executes the image's configured entrypoint/command inside that newly isolated environment.
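On Linux, the namespaces a process belongs to are visible as symbolic links under `/proc/<pid>/ns`. The sketch below reads them for the current process. It is a small inspection aid under that assumption, not part of runc, and the function name `current_namespaces` is hypothetical. On systems without `/proc` it simply returns an empty mapping.

```python
import os


def current_namespaces(pid: str = "self") -> dict[str, str]:
    """Map namespace kind (pid, net, mnt, ...) to its identifier for a process."""
    ns_dir = f"/proc/{pid}/ns"
    result: dict[str, str] = {}
    try:
        names = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, no such process, or /proc is not readable
    for name in names:
        try:
            # Each entry is a symlink whose target names the namespace instance.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue
    return result


if __name__ == "__main__":
    for kind, ident in current_namespaces().items():
        print(f"{kind:10} {ident}")
```

Running this on the host and again inside a container would typically show different identifiers for the kinds that were isolated, which is the visible effect of the step described above.
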

Step 6: The container's process runs

Inside its isolated namespace, the hello-world image's program executes. In this specific image's case, it prints an explanatory message describing this exact flow and then exits immediately (hello-world is deliberately a minimal, self-documenting image with no long-running server process).

Step 7: Container exit and output streaming

Once the process inside exits, the container's state transitions to Exited: there's no long-running process left inside its PID namespace, so the container itself stops. Throughout this whole process, standard output from inside the container is streamed back up through containerd (via its per-container shim process, since runc itself exits once the container has started) → dockerd → the CLI, which is why you see the message printed directly in your terminal.

docker ps -a
# CONTAINER ID   IMAGE          STATUS                     NAMES
# a1b2c3d4e5f6   hello-world    Exited (0) 2 seconds ago   happy_euler
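The `Exited (0)` status is the exit code of the container's main process. As a loose analogy only (an ordinary child process, with no namespaces or cgroups involved), the sketch below runs a short-lived program, relays its output, and reports its exit code the way the status column does. The helper name `run_to_completion` is hypothetical.

```python
import subprocess
import sys


def run_to_completion(argv: list[str]) -> tuple[int, str]:
    """Run a program until it exits; return its exit code and captured stdout."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.returncode, done.stdout


if __name__ == "__main__":
    # A stand-in for the image's program: print a message, then exit at once.
    code, out = run_to_completion([sys.executable, "-c", "print('Hello from Docker!')"])
    print(out, end="")
    print(f"Exited ({code})")
```

A non-zero code from the main process would show up the same way, for example `Exited (1)`, which is often the first clue when a container stops unexpectedly.
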

Why walking through this matters for an interview

Being able to narrate this full chain demonstrates that you understand Docker as a layered system built on standardized, swappable components, rather than treating docker run as an unexplained black box. The chain runs from the client request, through image resolution and pulling, the daemon delegating to containerd, containerd delegating to runc, and runc creating the actual isolated process via namespaces and cgroups. This is exactly the kind of question that requires tracing a request across every layer of the stack, and it distinguishes surface-level Docker familiarity from a deeper systems-level understanding.

Related Resources

Docker: Getting Started