What's the difference between a volume, a bind mount, and tmpfs?

A **volume** is storage fully managed by Docker itself, living in Docker's own managed area on disk, independent of any specific host directory structure — the standard, recommended choice for persisting real application data. A **bind mount** maps an arbitrary, specific path on the host's filesystem directly into the container — full control over exactly where the data lives on the host, but tightly coupled to that host's specific directory layout. A **tmpfs mount** stores data only in the host's memory (RAM), never touching disk at all — fast, but the data is gone the moment the container stops.

Why are named volumes generally preferred over bind mounts in production?

Named volumes are portable (referenced by name, not tied to a specific host's directory layout), managed consistently by Docker's own tooling (backup, inspection, and driver-based extensibility all work uniformly), and don't depend on a specific host path existing with the right permissions already set up — all of which make them a better fit for production deployments meant to run identically across different, possibly-changing infrastructure. Bind mounts remain valuable for specific, deliberate use cases (mostly local development) where you genuinely need direct, transparent access to a specific host path.

How do you back up and restore data in a Docker volume?

The standard technique is running a temporary, throwaway container that mounts the target volume alongside a bind-mounted host directory, and uses a simple archiving tool (like `tar`) to copy the volume's contents out to (for backup) or in from (for restore) a compressed archive file on the host. This works because a volume can be mounted by any container, regardless of which "real" container normally uses it — the backup/restore process doesn't need to touch or understand the application itself at all.

What happens to a container's writable-layer data when the container is removed?

Any data written to a container's own writable layer — files created or modified that aren't stored in a mounted volume or bind mount — is permanently and irrecoverably deleted the moment that container is removed (`docker rm`), since the writable layer belongs exclusively to that one specific container and has no independent existence beyond it. This is precisely why any data meant to outlive a single container's removal must live in a separate, explicitly mounted volume, not in the container's own filesystem.

How do volume drivers extend Docker's storage capabilities?

A volume driver is a pluggable backend that determines *where and how* a named volume's actual data is stored — the default `local` driver simply uses a directory on the host's own disk, but third-party and cloud-provider volume drivers can back a volume with NFS, cloud block/object storage, or other distributed storage systems, all while the container itself still just references the volume by name with the same `-v`/`--mount` syntax. This lets the same container configuration work with dramatically different underlying storage infrastructure, without the application or its Docker configuration needing to know or care which one is actually in use.

How do you share data between multiple containers?

Mount the same named volume (or the same bind-mounted host path) into more than one container — since a volume isn't exclusively owned by any single container, any number of containers can mount and read/write the same underlying data simultaneously, as long as the application-level access pattern is safe for concurrent access. This is the standard way to share files between, for example, a main application container and a sidecar-style helper container (mirroring the same pattern covered in the Kubernetes stack's sidecar question).

Storage and Volumes

Persisting data beyond a container's lifetime with volumes, bind mounts, and tmpfs.

Volumes — Docker-managed, the recommended default

docker volume create my-data
docker run -d -v my-data:/app/data myapp:1.0

Docker creates and manages the actual storage location on the host's filesystem (typically under /var/lib/docker/volumes/), so you never deal with that location directly. You reference the volume by name (my-data), not by a specific host path. Volumes are the recommended mechanism for persisting any real application data (a database's files, uploaded content), because Docker manages their lifecycle, inspection, and driver options consistently, independent of the host's own directory structure.

Bind mounts — an arbitrary host path, mapped directly in

docker run -d -v /home/user/my-config:/app/config myapp:1.0
# or, using the more explicit --mount syntax:
docker run -d --mount type=bind,source=/home/user/my-config,target=/app/config myapp:1.0

Maps a specific, existing path on the host directly into the container. This gives full transparency and control over exactly where the data lives on the host's own filesystem, which is genuinely useful for specific scenarios: mounting your local source code directory into a container for live-reload development, or sharing a specific existing host directory. The tradeoff is that the container's behavior now depends on that exact host path existing and having the right permissions and content. This tightly couples the container to that specific host's directory layout, in a way that hurts portability across different machines and environments.

tmpfs mounts — memory-only, never touches disk

docker run -d --tmpfs /app/cache myapp:1.0

Data written here lives entirely in the host's RAM. That makes it extremely fast, but the data is lost the moment the container stops, not just when it is removed: even a docker stop/docker start cycle discards it, unlike a volume or bind mount. This suits data that is genuinely temporary or too sensitive to persist: a cache that's fine to lose, or sensitive temporary data (like a decrypted secret) that should never be written to disk, even transiently.
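Because a tmpfs mount consumes host memory and is unlimited in size by default, it is often worth capping it. The sketch below is one way to do that with the --mount syntax, whose tmpfs-size option takes a value in bytes. The helper name tmpfs_mount_arg and the 64 MiB figure are illustrative assumptions, not part of Docker.

```shell
# Build a --mount argument for a size-capped tmpfs mount.
# $1 = target path inside the container, $2 = size cap in MiB.
# tmpfs-size is expressed in bytes, so convert from MiB first.
tmpfs_mount_arg() {
  target=$1
  size_mib=$2
  size_bytes=$((size_mib * 1024 * 1024))
  printf 'type=tmpfs,destination=%s,tmpfs-size=%s\n' "$target" "$size_bytes"
}

# Example (myapp:1.0 is the same placeholder image used above):
#   docker run -d --mount "$(tmpfs_mount_arg /app/cache 64)" myapp:1.0
```

Writes beyond the cap fail inside the container with an out-of-space error, which is usually preferable to a cache quietly exhausting the host's memory.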

Side-by-side comparison

| | Volume | Bind mount | tmpfs |
|---|---|---|---|
| Managed by | Docker | You (an arbitrary host path) | Docker (in-memory only) |
| Survives container removal | Yes | Yes (it's just a host directory) | No (gone even on container stop) |
| Portable across different hosts | Yes (referenced by name, not host path) | No (tied to that host's specific path) | N/A (never persists anywhere) |
| Typical use | Real application/database data | Local development (mounting source code), sharing a specific existing host resource | Temporary, sensitive, or performance-critical scratch data |

Why volumes are generally preferred over bind mounts in production

Recall that a bind mount ties the container's correct behavior to the specific host's directory structure and permissions. This is exactly the kind of environment-dependent coupling containers are meant to eliminate (see the fundamentals topic's "what problem does Docker solve" question). A volume, referenced purely by name, works identically regardless of which host it's actually running on, or how that host's filesystem happens to be laid out. This is a meaningfully better fit for production deployments, where you want the same container configuration to behave identically across different machines.

Related Resources

Docker: Storage

Portability across hosts and environments

# Bind mount: hardcodes a specific host path -- this exact path must exist,
# with correct permissions, on EVERY machine this container might ever run on
docker run -v /opt/myapp/data:/app/data myapp

# Named volume: portable -- Docker manages where it actually lives,
# and the SAME command works identically regardless of host layout
docker run -v app-data:/app/data myapp

A bind mount's correctness depends on the assumption that /opt/myapp/data exists, with the right permissions, on whatever host this container happens to run on. That assumption breaks the moment you deploy to a different server, a different developer's laptop, or a freshly provisioned machine without that exact directory structure already set up. A named volume has no such dependency. Docker creates and manages it consistently regardless of the host's own directory layout. This is precisely the same portability guarantee containers are meant to provide for application code in the first place (see the fundamentals topic).

Consistent tooling and lifecycle management

docker volume ls
docker volume inspect app-data
docker volume prune            # clean up unused volumes (on Docker 23.0+, only anonymous ones unless --all is given)

Docker's own CLI and API provide first-class commands for listing, inspecting, and cleaning up volumes. A bind mount, being just an arbitrary host directory, is not a Docker-managed object: it never appears in docker volume ls, so you must work out for yourself which host directories are used by which containers (for example, by inspecting each container's mounts). Cleaning them up requires ordinary filesystem tools rather than Docker's own volume-management commands.

Volume drivers extend capability without changing application configuration

docker volume create --driver local --opt type=nfs --opt device=:/exported/path --opt o=addr=nfs-server.example.com my-nfs-volume

Named volumes support pluggable volume drivers (see that question). This lets the same -v my-nfs-volume:/app/data reference in a container's configuration be backed by local disk, NFS, a cloud storage service, or another storage backend entirely. The underlying storage implementation can be swapped without touching the container's own configuration at all. A bind mount, by definition, is always tied to whatever's actually at that literal host path. There is no equivalent abstraction layer to swap the backing storage transparently.

Permission and ownership complications specific to bind mounts

Bind mounts frequently run into UID/GID mismatch issues. A container process running as a specific user ID needs permission to read and write the bound host directory, and host-side and container-side user IDs don't always align cleanly, especially across different host operating systems, or when the container's internal user doesn't correspond to any real user on the host. Named volumes avoid much of this complexity: Docker owns the underlying directory, and when a new, empty volume is first mounted over a path that already has content in the image, Docker copies that content into the volume, ownership and permissions included. Nothing has to line up with an arbitrary host directory's existing ownership.
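For the local-development case, one common way to sidestep the mismatch is to run the container as the invoking host user, so files written into the bind mount end up owned by that user rather than by root. This is a sketch that assumes the image tolerates running as an arbitrary UID (many do not, if they expect a specific user to exist); the helper name host_user is hypothetical.

```shell
# Print the invoking host user's numeric identity as UID:GID,
# which is the form that `docker run --user` accepts.
host_user() {
  printf '%s:%s\n' "$(id -u)" "$(id -g)"
}

# Example (myapp:1.0 is the placeholder image used above):
#   docker run --rm --user "$(host_user)" \
#     -v "$(pwd)":/app/config myapp:1.0
```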

When bind mounts are still the right, deliberate choice

Local development — live-mounting your actual source code directory into a container so code changes are immediately reflected without rebuilding the image, a very common and appropriate development workflow.
Deliberately sharing a specific, known host resource — e.g., mounting /etc/localtime read-only to sync a container's timezone with the host's, or mounting a Unix socket like the Docker socket itself (with the security caveats covered in that topic's question).

Reaching for a bind mount purely out of habit, rather than for one of these deliberate reasons, is usually a sign the default should have been a named volume instead.

Related Resources

Docker: Volumes

Backing up a volume

docker run --rm \
  -v my-app-data:/source:ro \
  -v "$(pwd)":/backup \
  alpine \
  tar czf /backup/my-app-data-backup.tar.gz -C /source .

Breaking this down:

-v my-app-data:/source:ro — mounts the volume you want to back up, read-only (:ro), into a temporary container, so the backup process can't accidentally modify the live data while reading it.
-v $(pwd):/backup — a bind mount, giving the temporary container access to your current host directory, so the resulting archive ends up somewhere you can actually access it afterward (outside the ephemeral container).
alpine — a minimal, throwaway image, chosen purely because it includes tar and almost nothing else needed for this one-off task.
tar czf /backup/my-app-data-backup.tar.gz -C /source . — the actual backup command, compressing everything in /source (the mounted volume) into a single archive file, written into /backup (the bind-mounted host directory).
--rm — automatically removes this temporary container once the command finishes, since it has no ongoing purpose beyond this one backup operation.

Restoring from a backup

docker volume create my-app-data-restored

docker run --rm \
  -v my-app-data-restored:/target \
  -v "$(pwd)":/backup \
  alpine \
  tar xzf /backup/my-app-data-backup.tar.gz -C /target

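The reverse process: create a fresh volume (or reuse the existing one, if you are genuinely restoring in place), mount it as the extraction target, and unpack the previously created archive into it via the same kind of temporary, throwaway container. When restoring in place, stop any container that uses the volume first, so the application never sees a half-extracted state.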
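The two tar invocations can be rehearsed on ordinary directories before pointing them at real volumes. The sketch below uses temporary directories as stand-ins for the volume, the backup directory, and the restore target (all paths are illustrative), and checks that a backup followed by a restore reproduces the original tree.

```shell
# Stand-ins for the volume contents, the backup directory, and the restore target.
work=$(mktemp -d)
mkdir -p "$work/source/uploads" "$work/backup" "$work/target"
printf 'hello\n' > "$work/source/uploads/a.txt"

# Same flags as the backup step: -C changes into the source directory first,
# so the archive stores relative paths (./uploads/a.txt), not absolute ones.
tar czf "$work/backup/data.tar.gz" -C "$work/source" .

# Same flags as the restore step: unpack relative to the target directory.
tar xzf "$work/backup/data.tar.gz" -C "$work/target"

# diff -r exits non-zero if the restored tree differs from the original.
if diff -r "$work/source" "$work/target" >/dev/null; then
  roundtrip=ok
else
  roundtrip=mismatch
fi
echo "roundtrip: $roundtrip"

# rm -rf "$work"   # clean up when finished
```

Storing relative paths is what lets the archive be extracted into a volume mounted at a different path (/target here, /source at backup time).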

Why this pattern works: volumes aren't tied to any specific "owning" container

The key insight making this whole technique possible is that a named volume isn't permanently bound to whichever container originally used it. Any container can mount it, including a completely unrelated, temporary one whose only job is to run a backup/restore command. This is exactly the same underlying property that makes volumes useful for migrating data between different application versions, or even entirely different applications. It works as long as both sides agree on the expected data format inside the volume.

Database-specific backup tools are usually still the better choice for real databases

docker exec my-postgres pg_dump -U postgres mydb > backup.sql

For an actual running database, the database's own native backup tooling (pg_dump, mysqldump, and equivalents) is generally a better approach than a raw filesystem-level tar of the volume. A live database's on-disk files can be in an inconsistent, mid-write state if archived directly while the database is running, unless it's stopped first, or the tool specifically supports safe hot-backup snapshotting. A proper database dump tool, by contrast, guarantees a consistent, valid backup by working through the database's own transactional guarantees, rather than copying raw files.

Automating this as a scheduled task

# A cron job, or a scheduled CI/CD pipeline step, running the backup command
# above on a regular schedule, pushing the resulting archive to durable,
# off-host storage (cloud object storage, a dedicated backup server) --
# never leaving backups only on the SAME host as the live data.

This mirrors the same principle covered in the SQL/Databases and Kubernetes stacks' backup questions. Backups must be automated on a regular schedule, stored somewhere genuinely separate from the live data (so a single host failure can't destroy both simultaneously), and periodically tested by actually performing a restore. An untested backup isn't a real backup, regardless of which specific technology or command produced it.
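A scheduled job along these lines might look like the sketch below. It is a minimal outline under stated assumptions: the function names, the timestamp format, and the retention count are all hypothetical choices, and the off-host upload step is left as a comment because it depends entirely on the storage service in use.

```shell
# Archive file name with a sortable UTC timestamp,
# e.g. my-app-data-20240101T000000Z.tar.gz
backup_archive_name() {
  printf '%s-%s.tar.gz\n' "$1" "$2"
}

# Back up volume $1 into host directory $2.
# $2 must be an absolute path, or Docker treats it as a volume name.
backup_volume() {
  volume=$1
  dest=$2
  archive=$(backup_archive_name "$volume" "$(date -u +%Y%m%dT%H%M%SZ)")
  docker run --rm \
    -v "$volume":/source:ro \
    -v "$dest":/backup \
    alpine \
    tar czf "/backup/$archive" -C /source .
}

# Keep only the newest $3 archives of volume $2 in directory $1.
# Relies on the timestamp format sorting oldest-first in glob order.
prune_old_backups() {
  dir=$1
  volume=$2
  keep=$3
  set -- "$dir"/"$volume"-????????T??????Z.tar.gz
  [ -e "$1" ] || return 0   # the glob matched nothing
  while [ "$#" -gt "$keep" ]; do
    rm -- "$1"
    shift
  done
}

# Example body of a nightly cron script (paths are illustrative):
#   backup_volume my-app-data /var/backups/docker
#   ... copy the new archive to off-host storage here ...
#   prune_old_backups /var/backups/docker my-app-data 7
```

Local retention pruning only limits disk usage on the host; it is not a substitute for the off-host copy.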

Related Resources

Docker: Back up, restore, or migrate data volumes

The demonstration

# POSTGRES_PASSWORD is required by the official image; "example" is a placeholder
docker run -d --name my-db -e POSTGRES_PASSWORD=example postgres:16    # no volume mounted explicitly
# (give the server a few seconds to finish starting before running psql)
docker exec my-db psql -U postgres -c "CREATE TABLE important_data (...);"
# ... insert critical data ...

docker rm -f my-db
docker run -d --name my-db -e POSTGRES_PASSWORD=example postgres:16     # a FRESH container, from the same image
docker exec my-db psql -U postgres -c "SELECT * FROM important_data;"
# ERROR: relation "important_data" does not exist

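The second container is entirely new. It starts from the image's original, unmodified layers, with a fresh, empty writable layer. Every change made to the first container (the new table, its data) lived in storage private to that specific container, so the second container sees none of it, even though both were started from the identical image. One nuance specific to this image: the official postgres image declares its data directory as a VOLUME, so Docker gives each container its own anonymous volume for that directory rather than using the writable layer proper. The effect here is the same, because the fresh container gets a fresh, empty anonymous volume, and the old one is deleted by docker rm -v or a later docker volume prune. A file written anywhere outside a declared or mounted volume is deleted permanently along with the container's writable layer.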

Why this is expected, correct behavior — not a bug

Recall from the fundamentals topic: an image is an immutable, read-only template, and each container gets its own independent writable layer on top of it via copy-on-write (see that question). This is precisely what allows many containers to be started from the same image simultaneously, each with fully independent state. But it also means a container's writable layer is fundamentally tied to that one container's lifetime. It is not tied to the image, and it is not shared with any other container.

The fix: mount a volume for anything that needs to survive

docker volume create db-data
docker run -d --name my-db -e POSTGRES_PASSWORD=example -v db-data:/var/lib/postgresql/data postgres:16

Now the database's actual data files live in the named volume db-data, not in the container's own writable layer. Removing this container and starting a fresh one picks up exactly where the previous container left off, as long as it mounts the same volume. This works since the volume's data is independent of any specific container's lifecycle:

docker rm -f my-db
docker run -d --name my-db -e POSTGRES_PASSWORD=example -v db-data:/var/lib/postgresql/data postgres:16
docker exec my-db psql -U postgres -c "SELECT * FROM important_data;"
# the data is still there -- it was never IN the removed container's writable layer at all

The mental model this reinforces

Think of the writable layer as entirely disposable, scratch space specific to one container instance, and volumes as the only place genuinely persistent data should live. Any file written outside a mounted volume path should be treated as something you're comfortable losing the instant that specific container is removed. Logs (which should generally go to stdout/stderr and be captured by Docker's logging driver instead; see the lifecycle topic), temporary caches, and anything else genuinely ephemeral are fine to leave in the writable layer. Real application data, database files, and uploaded content are not.

A common real-world mistake this explains

A surprisingly common incident pattern looks like this: a database or application was run without a mounted volume during initial setup, perhaps for a "quick test" that then quietly became the actual production deployment. Months of accumulated data are then permanently lost the first time that one specific container happens to be removed or replaced, whether during a routine update, a host migration, or simple operator error. This happens precisely because nothing was ever actually persisted outside that one container's own writable layer. Verifying that every stateful container mounts an appropriate volume for its real data, rather than just assuming it does, is a basic, essential production readiness check.
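That readiness check can be scripted from docker inspect output. The sketch below lists each mount of a container and classifies it, treating a volume whose name is a 64-character hexadecimal ID as anonymous, since that is the form of name Docker generates when none is given. The function names and the output layout are hypothetical.

```shell
# Print one "type|name|destination" line per mount of container $1.
list_mounts() {
  docker inspect \
    --format '{{range .Mounts}}{{.Type}}|{{.Name}}|{{.Destination}}{{"\n"}}{{end}}' "$1"
}

# Classify a mount from its type ($1) and name ($2).
classify_mount() {
  case "$1" in
    volume)
      if printf '%s' "$2" | grep -Eq '^[0-9a-f]{64}$'; then
        echo anonymous-volume
      else
        echo named-volume
      fi
      ;;
    *)
      echo "$1"
      ;;
  esac
}

# Report every mount of container $1 as "<classification><TAB><destination>".
# No output at all means the container has no mounts.
audit_container() {
  list_mounts "$1" | while IFS='|' read -r type name dest; do
    [ -n "$type" ] || continue
    printf '%s\t%s\n' "$(classify_mount "$type" "$name")" "$dest"
  done
}

# Example: audit_container my-db
```

A stateful container that reports no mounts, or only an anonymous volume for its data directory, is the situation described above and deserves a closer look.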

Related Resources

Docker: Storage drivers

The default: the local driver

docker volume create my-data
docker volume inspect my-data
# "Driver": "local"
# "Mountpoint": "/var/lib/docker/volumes/my-data/_data"

Without specifying a driver, Docker uses the built-in local driver, which simply creates and manages a directory on the host machine's own disk. This is fine for single-host setups, but it means the volume's data is physically tied to that one specific host — if the container needs to move to a different machine, the volume (and its data) doesn't automatically come along.

Using an alternative volume driver

docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw \
  --opt device=:/exported/path \
  my-nfs-volume

docker run -d -v my-nfs-volume:/app/data myapp:1.0

This example uses the local driver's own built-in NFS mount option support, backing the "volume" with a remote NFS share instead of purely local disk. The container's own configuration (-v my-nfs-volume:/app/data) looks identical to using a plain local volume. Only the volume's own creation-time definition differs.

Third-party volume driver plugins extend this further, supporting cloud block storage services, distributed storage systems (Ceph, GlusterFS), and other backends. Each implements Docker's volume plugin API, so that, from the container's perspective, using them requires no different syntax than using any other named volume.

Why this abstraction matters

Container's perspective:  -v my-data:/app/data   (identical, regardless of backend)

Actual backend, depending on the driver used:
  - local disk (default "local" driver)
  - NFS share
  - Cloud block storage (via a cloud-specific driver)
  - A distributed storage system

This mirrors exactly the same abstraction philosophy behind Kubernetes's StorageClasses and the CSI (Container Storage Interface; see that stack's question). Application/container configuration references storage in an abstract, backend-agnostic way, while a pluggable driver layer handles the actual, potentially very different, underlying implementation. The benefit is the same in both ecosystems: you can change or upgrade the underlying storage infrastructure without needing to rewrite every container's configuration that references it.

When you'd actually reach for a non-default driver

Multi-host setups without a full orchestrator — if you're running plain Docker (not Swarm or Kubernetes) across multiple hosts, and need a container's data to be accessible regardless of which specific host it happens to run on, a network-backed volume driver (NFS, a distributed storage plugin) solves this. The default local driver's host-tied storage would not.
Cloud-native storage integration — using a cloud provider's own volume driver plugin to back Docker volumes directly with that provider's managed block/file storage service, gaining that service's own durability, snapshotting, and replication features.

For simple, single-host Docker deployments, the default local driver remains entirely sufficient and requires no extra configuration at all.

Related Resources

Docker: Volume drivers

Mounting the same volume into multiple containers

docker volume create shared-logs

docker run -d --name app -v shared-logs:/app/logs myapp:1.0
docker run -d --name log-shipper -v shared-logs:/logs:ro fluent/fluent-bit

Both containers mount the same named volume (shared-logs). app writes its log files there. log-shipper reads and forwards them elsewhere; it is mounted read-only via :ro, since it should never need to modify the application's own logs. Neither container needs any special awareness of the other. They are simply both pointed at the same underlying storage.

Why this is the standard technique for sidecar-style patterns

This is conceptually identical to the sidecar container pattern covered in the Kubernetes stack. A helper container (log shipping, a cache-warming process, a file-processing pipeline) shares data with a main application container purely through a commonly mounted volume. Neither container needs direct knowledge of the other's internals — just an agreed-upon shared directory structure and file format.

# Example: an init-container-like pattern using plain Docker,
# where one container prepares data that another then serves
docker volume create shared-content
docker run --rm -v shared-content:/output content-fetcher:1.0    # populates the volume, then exits
docker run -d -v shared-content:/usr/share/nginx/html nginx        # serves what was fetched

Read-only mounts for safety

docker run -v shared-data:/data:ro myapp

When a container only needs to read shared data, not modify it, mounting the volume read-only (:ro) is good practice. It prevents that container from accidentally (or maliciously, if compromised) corrupting data another container depends on. This is a straightforward application of least privilege at the storage layer.

The concurrency caveat: Docker doesn't handle file-level locking for you

Container A writes to shared-file.json
Container B reads shared-file.json AT THE SAME MOMENT

Mounting the same volume into multiple containers gives them shared access to the same underlying files. But Docker itself provides no automatic coordination, locking, or consistency guarantees beyond whatever the underlying filesystem itself provides. If multiple containers might concurrently write to the exact same file, the application(s) involved are responsible for handling that safely — via file locking, an actual database instead of raw files, or a design where only one container ever writes to any given file at a time. This is exactly the same concern that would apply to multiple processes on a single non-containerized machine sharing a filesystem.
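For the single-writer design, one widely used way to keep readers from ever seeing a half-written file is to write to a temporary file in the same directory and then rename it over the target. The sketch below assumes the volume is backed by a local filesystem where rename is atomic (network filesystems may behave differently); the helper name atomic_write is hypothetical.

```shell
# Write stdin to file $1 so that a concurrent reader sees either the old
# content or the new content, never a partially written file.
atomic_write() {
  target=$1
  # The temp file lives next to the target so the final mv is a rename
  # within one filesystem rather than a copy across filesystems.
  tmp=$(mktemp "$target.tmp.XXXXXX") || return 1
  if cat > "$tmp"; then
    # mktemp creates the file with mode 0600; widen it so a reader
    # running as a different user in another container can open it.
    chmod 0644 "$tmp"
    mv -f "$tmp" "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Example, run inside the writing container:
#   generate-report | atomic_write /app/data/shared-file.json
```

This covers one writer and many readers. Several containers writing the same file still need locking or a database, as described above.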

Comparison to bind mounts for the same purpose

docker run -v /host/shared/path:/app/data container-a
docker run -v /host/shared/path:/app/data container-b

The same sharing pattern also works with a bind mount instead of a named volume — both containers reference the identical host path. But this reintroduces the portability tradeoffs covered in the volumes-vs-bind-mounts question, since it's tied to a specific host path existing with correct permissions. Named volumes remain the more portable default even for this multi-container-sharing use case, for the same reasons they're generally preferred for single-container persistent storage.

Shared volumes are the right tool specifically for genuinely related containers cooperating on the same data. They are not a general-purpose way to pass data between unrelated services, which should communicate over the network (see the networking topic) rather than through a shared filesystem.

Related Resources

Docker: Share data between containers