What is a read-only root filesystem, and how do you configure one?
Quick Answer
A read-only root filesystem makes a container's root filesystem immutable at runtime: any write outside an explicitly mounted writable volume or tmpfs fails. This limits what a compromised process inside the container can do (it can't modify application binaries, install new tools, or write a persistent backdoor to the container's own filesystem). The cost is that you must identify every directory the application genuinely needs to write to (temp files, caches, logs that aren't sent to stdout) and mount writable space there.
Detailed Answer
Enabling a read-only root filesystem
docker run --read-only myapp:1.0
With this flag, the container's entire root filesystem (everything from its image layers) becomes immutable at runtime — any attempt by the application to write a file anywhere outside of an explicitly writable mount fails.
docker run --read-only myapp:1.0 touch /app/test.txt
# touch: /app/test.txt: Read-only file system
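To confirm that a container was actually started with the flag, one option is to query its configuration with docker inspect. The following is a minimal sketch; the function name rootfs_is_read_only and the container name myapp-test are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# rootfs_is_read_only CONTAINER
# Succeeds only if the container was started with --read-only.
rootfs_is_read_only() {
    # HostConfig.ReadonlyRootfs is the field docker inspect reports for
    # the --read-only flag; the template prints "true" or "false".
    [ "$(docker inspect --format '{{.HostConfig.ReadonlyRootfs}}' "$1")" = "true" ]
}

# Usage (requires a Docker daemon; "myapp-test" is a hypothetical name):
#   docker run -d --name myapp-test --read-only myapp:1.0
#   rootfs_is_read_only myapp-test && echo "root filesystem is read-only"
```

A check like this could run in CI against a deployed container, so that a dropped flag is noticed before it reaches production.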
Why this meaningfully limits a compromised container's blast radius
If an attacker achieves code execution inside a container with a read-only root filesystem, they cannot modify the application's own binaries or configuration files to plant a persistent backdoor, install additional tools via a package manager (which needs to write to the filesystem), or drop and execute an arbitrary downloaded payload anywhere in the container's root filesystem. This is a strong hardening measure that is cheap to enable. It doesn't prevent an attacker from doing damage entirely: they can still act within memory, make network requests, read existing files, and write to any tmpfs or volume you have mounted as writable. But it closes off an entire category of "make the compromise persistent or install further tooling" techniques.
The practical challenge: identifying what genuinely needs to be writable
docker run --read-only \
  --tmpfs /tmp \
  --tmpfs /app/cache \
  -v app-uploads:/app/uploads \
  myapp:1.0
Most real applications need some writable space — temporary files, an in-memory or on-disk cache, actual persistent data (uploaded content, database files). The pattern is to keep the root filesystem read-only overall, while explicitly providing writable space exactly where it's genuinely needed:
- --tmpfs for ephemeral, memory-backed writable space (temp files, caches that don't need to survive a restart); see the storage topic's tmpfs question.
- Named volumes for anything that genuinely needs to persist (see that topic).
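One way to catch a missing writable mount early is a small startup check that probes each directory the application expects to write to. This is a sketch under assumptions: the function name check_writable is hypothetical, and the directories passed to it (/tmp and /app/cache, matching the docker run example above) would need to reflect your application's real write paths.

```shell
#!/bin/sh
# check_writable DIR...
# Returns non-zero if any DIR cannot be written to, naming each failure
# on stderr.
check_writable() {
    status=0
    for dir in "$@"; do
        probe="$dir/.write-probe.$$"
        # The subshell keeps a failed redirection from aborting the
        # calling script in strict POSIX shells such as dash.
        if ( : > "$probe" ) 2>/dev/null; then
            rm -f "$probe"
        else
            echo "not writable: $dir" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage, for example at the top of an entrypoint script:
#   check_writable /tmp /app/cache || exit 1
```

Failing fast at startup tends to be easier to diagnose than a "Read-only file system" error that only appears deep inside a request handler.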
Application-level considerations this surfaces
Enabling a read-only root filesystem often surfaces assumptions baked into an application or its dependencies that weren't previously visible: a logging library that defaults to writing to a local file instead of stdout, a language runtime that writes temporary compiled artifacts to a directory within the application's own tree, or a package that expects to write a lock file or cache somewhere under its own installation directory. Identifying every one of these legitimate write paths and accommodating it explicitly (via --tmpfs or a volume) is usually the real work of adopting a read-only root filesystem for an existing application, and often takes more effort than the Docker configuration itself.
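One way to find these write paths is to run the application without --read-only, exercise it under a realistic workload, and then ask Docker which files were created in the container's writable layer. The sketch below wraps docker diff for that purpose; the function name written_dirs and the container name probe are hypothetical. Note the assumption that the probe container has no volumes or tmpfs mounts over the paths of interest, since docker diff only reports changes to the container's own layer.

```shell
#!/bin/sh
# written_dirs CONTAINER
# Prints the unique parent directories of files ADDED inside the container,
# as candidates for --tmpfs or volume mounts.
written_dirs() {
    # docker diff prints one change per line: "A /path" (added),
    # "C /path" (changed) or "D /path" (deleted). Only "A" lines are
    # used here; "C" lines cover both parent directories and modified
    # existing files, so those are worth reviewing by hand.
    docker diff "$1" |
        sed -n 's/^A //p' |
        sed 's|/[^/]*$||; s|^$|/|' |
        sort -u
}

# Usage (requires a Docker daemon; "probe" is a hypothetical name):
#   docker run -d --name probe myapp:1.0
#   ... exercise the application ...
#   written_dirs probe
```

The output is only a starting point: code paths that were not exercised during the probe run will not show up, so rarely used features deserve a separate look.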
In Kubernetes
securityContext:
  readOnlyRootFilesystem: true
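This maps onto the same underlying mechanism, configured per container in the container's securityContext within the Pod spec (see that stack's question). Note that it is not among the controls enforced by the Baseline or Restricted Pod Security Standards, so it typically has to be required through a separate admission policy. It is nonetheless widely recommended in Kubernetes hardening guidance, which reflects how significant a hardening measure it is considered industry-wide, not just a Docker-specific nicety.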
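In Kubernetes, writable space is supplied through volumes. An emptyDir volume (optionally with medium: Memory, which backs it with tmpfs) plays roughly the role that --tmpfs plays in the Docker example. The following is a sketch only: the function name render_pod_manifest and the Pod name myapp are hypothetical, and the image and mount paths are simply reused from the Docker example.

```shell
#!/bin/sh
# render_pod_manifest
# Prints a minimal Pod manifest with a read-only root filesystem and
# writable emptyDir mounts for /tmp and /app/cache.
render_pod_manifest() {
    cat <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: myapp
spec:
  containers:
    - name: myapp
      image: myapp:1.0
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory   # memory-backed, comparable to --tmpfs
    - name: cache
      emptyDir: {}       # node-disk-backed, removed with the Pod
EOF
}

# Usage (requires kubectl and cluster access):
#   render_pod_manifest | kubectl apply -f -
```

Data that must outlive the Pod would use a PersistentVolumeClaim instead of emptyDir, in the same way the Docker example uses a named volume for uploads.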