On a Linux system, Docker stores data for Images, Containers, Local Volumes, and the Build Cache.
We can check how much disk space each of these uses with the docker system df command, as shown below.
# docker system df
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 11 5 4.574GB 3.436GB (75%)
Containers 7 1 17.7MB 1.546kB (0%)
Local Volumes 39 3 1.737GB 1.195GB (68%)
Build Cache 0 0 0B 0B
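To total the RECLAIMABLE column, the sizes first have to be converted to a common unit. The Python sketch below does this for the sample output above. The helper names to_bytes and total_reclaimable are made up for this illustration. The sketch assumes decimal units (1 kB = 1000 bytes) and that RECLAIMABLE is the last size on each row, as in the output shown here.

```python
import re

# Assumption: sizes use decimal units (1 kB = 1000 bytes).
UNITS = {"B": 1, "kB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)(kB|MB|GB|TB|B)")

# The sample output from above, copied verbatim.
SAMPLE = """\
TYPE TOTAL ACTIVE SIZE RECLAIMABLE
Images 11 5 4.574GB 3.436GB (75%)
Containers 7 1 17.7MB 1.546kB (0%)
Local Volumes 39 3 1.737GB 1.195GB (68%)
Build Cache 0 0 0B 0B
"""


def to_bytes(text):
    """Convert a size such as '3.436GB' to a number of bytes."""
    match = SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    return float(number) * UNITS[unit]


def total_reclaimable(df_output):
    """Sum the last size on every data row (the RECLAIMABLE column)."""
    total = 0.0
    for line in df_output.strip().splitlines()[1:]:  # skip the header row
        sizes = [m.group(0) for m in SIZE_RE.finditer(line)]
        if sizes:
            total += to_bytes(sizes[-1])
    return total


print(f"{total_reclaimable(SAMPLE) / 1000**3:.3f} GB reclaimable")
```

For this sample, almost all of the reclaimable space comes from unused images and unused local volumes.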
A Docker image is built up from a series of layers. Each layer represents an instruction in the image’s Dockerfile. The layers of an image are read-only; a thin writable layer is added on top only when a container is created from the image. Each layer is only a set of differences from the layer before it. Note that both adding and removing files result in a new layer. Even if a later layer deletes files or directories, they are still present in the earlier layer and still count toward the image’s total size. The following example shows how an image's total size is calculated.
This example starts from the debian image and adds layers on top of it. Each layer adds or deletes a file, and after each build we check the image's size.
# docker images
debian latest 04fbdaf87a6a 5 weeks ago 124MB
# cat Dockerfile
FROM debian
CMD ["/bin/ls", "/"]
# docker build -t image_layer_test:v1 .
# docker images
image_layer_test v1 901e2dd23866 5 minutes ago 124MB
debian latest 04fbdaf87a6a 5 weeks ago 124MB
# Add aa.zip to image
# ls -lhrt
total 86M
-rw-r--r-- 1 root root 86M Mar 4 10:29 aa.zip
-rw-r--r-- 1 root root 152 Mar 4 10:31 Dockerfile
# cat Dockerfile
FROM debian
ADD aa.zip .
CMD ["/bin/ls", "/"]
# docker build -t image_layer_test:v2 .
# docker images
image_layer_test v2 2e0cf77e0921 About a minute ago 214MB
image_layer_test v1 901e2dd23866 5 minutes ago 124MB
debian latest 04fbdaf87a6a 5 weeks ago 124MB
# cat Dockerfile
FROM debian
ADD aa.zip .
ADD aa.zip /app
CMD ["/bin/ls", "/"]
# docker build -t image_layer_test:v3 .
# docker images
image_layer_test v3 060063a59a06 54 seconds ago 304MB
image_layer_test v2 2e0cf77e0921 About a minute ago 214MB
image_layer_test v1 901e2dd23866 5 minutes ago 124MB
debian latest 04fbdaf87a6a 5 weeks ago 124MB
# cat Dockerfile
FROM debian
ADD aa.zip .
ADD aa.zip /app
RUN rm -f aa.zip
CMD ["/bin/ls", "/"]
# Even though /aa.zip is deleted in the top layer, the image size is still 304MB
# docker build -t image_layer_test:v4 .
# docker images
image_layer_test v4 76712176d6c0 12 seconds ago 304MB
image_layer_test v3 060063a59a06 54 seconds ago 304MB
image_layer_test v2 2e0cf77e0921 About a minute ago 214MB
image_layer_test v1 901e2dd23866 5 minutes ago 124MB
debian latest 04fbdaf87a6a 5 weeks ago 124MB
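The arithmetic behind these four builds can be written down as a short Python sketch. It is only a model of the sizes reported above, not how Docker itself computes them. The 90MB figure for aa.zip is inferred from the jump from 124MB to 214MB, and the layer added by the rm step is treated as 0MB for simplicity.

```python
# Sizes in MB, read off the docker images output above.
BASE_MB = 124    # the debian base image
AA_ZIP_MB = 90   # inferred: 214MB (v2) - 124MB (v1)


def image_size(layer_sizes_mb):
    """An image's size is the sum of its layers; nothing is ever subtracted."""
    return sum(layer_sizes_mb)


v1 = [BASE_MB]           # FROM debian (CMD only adds metadata)
v2 = v1 + [AA_ZIP_MB]    # ADD aa.zip .
v3 = v2 + [AA_ZIP_MB]    # ADD aa.zip /app
v4 = v3 + [0]            # RUN rm -f aa.zip: a new layer that only hides the file

for tag, layers in [("v1", v1), ("v2", v2), ("v3", v3), ("v4", v4)]:
    print(tag, image_size(layers), "MB")
```

The v4 image is no smaller than v3 because the layer created by the first ADD is still part of it. The file is only hidden from the final filesystem view.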
The major difference between a container and an image is the top writable layer. All writes to the container that add new or modify existing data are stored in this writable layer. When the container is deleted, the writable layer is also deleted. The underlying image remains unchanged.
Because each container has its own writable container layer, and all changes are stored in this container layer, multiple containers can share access to the same underlying image and yet have their own data state. The diagram below shows multiple containers sharing the same Ubuntu 15.04 image.
To view the approximate size of a running container, you can use the docker ps -s command. Two different columns relate to size.
size: the amount of data (on disk) that is used for the writable layer of each container.
virtual size: the amount of data used for the read-only image data used by the container plus the container’s writable layer size. Multiple containers may share some or all read-only image data. Two containers started from the same image share 100% of the read-only data, while two containers with different images which have layers in common share those common layers. Therefore, you can’t just total the virtual sizes. This over-estimates the total disk usage by a potentially non-trivial amount.
The example below shows how a container's size is calculated:
# docker images
nginx latest c316d5a335a5 5 weeks ago 142MB
# docker run -idt --name container_size nginx
# docker ps -s
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES SIZE
b4739b175580 nginx "/docker-entrypoint.…" 4 seconds ago Up 3 seconds 80/tcp container_size 1.09kB (virtual 142MB)
# Open a shell in container_size and build a large file aa.txt
# docker exec -it container_size /bin/bash
root@b4739b175580:/# for i in `seq 1 10000`;do cat /var/log/dpkg.log >> aa.txt;done
root@b4739b175580:/# ls -lhrt
-rw-r--r-- 1 root root 326M Mar 4 22:21 aa.txt
# The container's virtual size rises to about 482MB (341MB writable layer + 142MB image)
# docker ps -s
b4739b175580 nginx "/docker-entrypoint.…" 4 minutes ago Up 4 minutes 80/tcp container_size 341MB (virtual 482MB)
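The over-estimate from adding up virtual sizes can be illustrated with a small Python sketch. The Container class and both helper functions are hypothetical names used only here. The second container (second_nginx) is invented for the comparison. The model also simplifies by treating an image as either fully shared or not shared at all, ignoring partial layer sharing between different images.

```python
from dataclasses import dataclass


@dataclass
class Container:
    name: str
    image_id: str       # containers started from the same image share its layers
    image_mb: int       # read-only image data
    writable_mb: int    # the "size" column of docker ps -s

    @property
    def virtual_mb(self):
        # The "virtual" column: image data plus the writable layer.
        return self.image_mb + self.writable_mb


def naive_total(containers):
    """Add up the virtual sizes (counts a shared image once per container)."""
    return sum(c.virtual_mb for c in containers)


def shared_total(containers):
    """Count each distinct image once, then add every writable layer."""
    images = {c.image_id: c.image_mb for c in containers}
    return sum(images.values()) + sum(c.writable_mb for c in containers)


containers = [
    Container("container_size", "c316d5a335a5", 142, 341),
    Container("second_nginx", "c316d5a335a5", 142, 0),  # hypothetical container
]

print("sum of virtual sizes:", naive_total(containers), "MB")
print("with shared image:   ", shared_total(containers), "MB")
```

In this model the naive sum counts the 142MB nginx image twice, so it is 142MB larger than the disk space the two containers would actually use.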
Docker has two options for containers to store files on the host machine, so that the files are persisted even after the container stops: volumes and bind mounts.
Docker also supports containers storing files in memory on the host machine. Such files are not persisted. If you’re running Docker on Linux, a tmpfs mount is used to store files in the host’s system memory. If you’re running Docker on Windows, a named pipe is used to store files in the host’s system memory.
Let's take a deep dive into these three storage types on Linux.
Volumes are created and managed by Docker (under /var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
We can create a volume explicitly using the docker volume create command, or Docker can create a volume during container or service creation. Use docker volume ls to list the volumes you have, and docker volume rm or docker volume prune to delete specific volumes or all unused local volumes. Some example operations are shown below:
# Usage syntax: docker volume create [OPTIONS] [VOLUME]
Options:
-d, --driver string Specify volume driver name (default "local")
--label list Set metadata for a volume
-o, --opt map Set driver specific options (default map[])
# Create an unnamed volume; the name is generated by Docker
# docker volume create
5efb3046925cde162a2ffcc1b1d1b56a2c527b6763137a62ddee7eb9912c45df
# Create a volume with a name you choose
# docker volume create my_new_volume
my_new_volume
# Usage syntax: docker volume ls [OPTIONS]
Options:
-f, --filter filter Provide filter values (e.g. 'dangling=true')
--format string Pretty-print volumes using a Go template
-q, --quiet Only display volume names
# List all volumes
# docker volume ls
DRIVER VOLUME NAME
local 0e5a2e7a8e07715035e7ce765faf7c25e09818811bb039a7651ccbb94bb39009
local 0f04870987068d8b5114db594f2ba2af6e736f61ce01160d934e6d708688634f
...
local 9f9a4edf4027d9404183eeff50e019ad914dd384f985f5512441b8f2ac6ba1c1
local 31cf42ffa01dd93fe1be48e3cfb13d69f54be18566fb65040c7a3042eb1c1f9e
# Check the details of a volume
# docker volume inspect c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641
[
{
"CreatedAt": "2022-02-11T17:00:20-08:00",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641/_data",
"Name": "c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641",
"Options": null,
"Scope": "local"
}
]
# Usage syntax: docker volume rm [OPTIONS] VOLUME [VOLUME...]
Options:
-f, --force Force the removal of one or more volumes
# Delete one or more specified volumes
# docker volume rm c283b903dd5c66479442a56c9e245a1e586b3e6fe49d9d78045195030f60ff5e 188cb41715bed5e30ccae2a2740814ab052bcf44cd096c4dddde243a771d4a53 dc493799f909cd96590c8d1c9a2294523deae62d1b9e95942db252f4f08db4e2
c283b903dd5c66479442a56c9e245a1e586b3e6fe49d9d78045195030f60ff5e
188cb41715bed5e30ccae2a2740814ab052bcf44cd096c4dddde243a771d4a53
dc493799f909cd96590c8d1c9a2294523deae62d1b9e95942db252f4f08db4e2
# Delete all unused volumes
# docker volume prune
WARNING! This will remove all local volumes not used by at least one container.
Are you sure you want to continue? [y/N] y
Deleted Volumes:
7672e892218087df956a1c0e41c84b4d63f587dec20db4db26d0d7e9edbfd26f
ed58df87c296e9d74b59b8a7909fc5fc2fae87e34fddcf6aac5f239a58a160a8
my_new_volume
...
b8eb7a62e5db6b3ec1c1f635136c8c29b718ae3a42e7a30cdeef1d10cf79551e
fe226d20443149f054d4afa686c9b974122bae836a510f83554856600e72b40d
Total reclaimed space: 1.195GB
When you create a volume, it is stored within a directory (/var/lib/docker/volumes) on the Docker host. When you mount the volume into a container, this directory is what is mounted into the container. This is similar to the way that bind mounts work, except that volumes are managed by Docker and are isolated from the core functionality of the host machine.
A given volume can be mounted into multiple containers simultaneously. When no running container is using a volume, the volume is still available to Docker and is not removed automatically. Volumes also support the use of volume drivers, which allow you to store your data on remote hosts or cloud providers, among other possibilities. We will explain volume drivers later.
If you start a container with a volume that does not yet exist, Docker creates the volume for you. The following example mounts the volume myvol2 into /app in the container.
The first session below uses --mount to start a container with a volume, and the second uses -v.
# Check the current volumes
# docker volume ls
DRIVER VOLUME NAME
local 3bdbc2505acde4689851bcd82f8045840a9501aafe1c35f855ab202caba486a4
local c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641
local e91737fb85e712c3d237326fc5c8f90f34890dbc963859246b09f5e704771425
# Start a new container with a new volume
# docker run -d --name devtest --mount src=myvol2,dst=/app nginx
afd93967d83215e19557ce14d2ba0098ccd595c469b3922d44d72720141719c7
# Docker created the new volume myvol2 automatically; check its details
# docker volume ls
DRIVER VOLUME NAME
local 3bdbc2505acde4689851bcd82f8045840a9501aafe1c35f855ab202caba486a4
local c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641
local e91737fb85e712c3d237326fc5c8f90f34890dbc963859246b09f5e704771425
local myvol2
# docker volume inspect myvol2
[
{
"CreatedAt": "2022-03-04T01:11:32Z",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/myvol2/_data",
"Name": "myvol2",
"Options": null,
"Scope": "local"
}
]
# Check the Mounts information of the devtest container
# docker inspect devtest
...
"Mounts": [
{
"Type": "volume",
"Name": "myvol2",
"Source": "/var/lib/docker/volumes/myvol2/_data",
"Destination": "/app",
"Driver": "local",
"Mode": "z",
"RW": true,
"Propagation": ""
}
],
...
# Same steps using -v: check the current volumes
# docker volume ls
DRIVER VOLUME NAME
local 3bdbc2505acde4689851bcd82f8045840a9501aafe1c35f855ab202caba486a4
local c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641
local e91737fb85e712c3d237326fc5c8f90f34890dbc963859246b09f5e704771425
# Start a new container with a new volume
# docker run -d --name devtest -v myvol2:/app nginx
f36ceaacfe383833456be94b849983a85480df28e1c5a4c258b9181b3520eef1
# Docker created the new volume myvol2 automatically; check its details
# docker volume ls
DRIVER VOLUME NAME
local 3bdbc2505acde4689851bcd82f8045840a9501aafe1c35f855ab202caba486a4
local c1fd3cf6d559ced49e9a520772cc46cf642a54c8136fd45bcec1b845293c6641
local e91737fb85e712c3d237326fc5c8f90f34890dbc963859246b09f5e704771425
local myvol2
# docker volume inspect myvol2
[
{
"CreatedAt": "2022-03-04T01:18:53Z",
"Driver": "local",
"Labels": null,
"Mountpoint": "/var/lib/docker/volumes/myvol2/_data",
"Name": "myvol2",
"Options": null,
"Scope": "local"
}
]
# Check the Mounts information of the devtest container
# docker inspect devtest
...
"Mounts": [
{
"Type": "volume",
"Name": "myvol2",
"Source": "/var/lib/docker/volumes/myvol2/_data",
"Destination": "/app",
"Driver": "local",
"Mode": "z",
"RW": true,
"Propagation": ""
}
],
...
How to choose between the -v and --mount flags
In general, --mount is more explicit and verbose. The biggest difference is that the -v syntax combines all the options together in one field, while the --mount syntax separates them. Here is a comparison of the syntax for each flag. If you need to specify volume driver options, you must use --mount.
-v or --volume: Consists of three fields, separated by colon characters (:). The fields must be in the correct order, and the meaning of each field is not immediately obvious.
--mount: Consists of multiple key-value pairs, separated by commas and each consisting of a <key>=<value> tuple. The --mount syntax is more verbose than -v or --volume, but the order of the keys is not significant, and the value of the flag is easier to understand.
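To make the relationship between the two syntaxes concrete, here is a minimal Python sketch that rewrites a -v value as a --mount value. The function name volume_flag_to_mount is hypothetical, and the sketch rests on several assumptions: it handles Linux-style paths only, recognises only the ro option in the third field, and treats a source that starts with / as a bind mount. The two flags do not behave identically in every situation, so the result approximates the -v value rather than guaranteeing the same behaviour.

```python
def volume_flag_to_mount(spec):
    """Rewrite a -v value such as 'myvol2:/app:ro' in --mount syntax."""
    fields = spec.split(":")
    if len(fields) == 1:
        # Only a container path: an anonymous volume with no source.
        source, target, options = "", fields[0], ""
    elif len(fields) == 2:
        source, target = fields
        options = ""
    elif len(fields) == 3:
        source, target, options = fields
    else:
        raise ValueError(f"expected at most three fields: {spec!r}")

    # Assumption: an absolute host path means a bind mount, a bare name a volume.
    mount_type = "bind" if source.startswith("/") else "volume"

    pairs = [f"type={mount_type}"]
    if source:
        pairs.append(f"source={source}")
    pairs.append(f"target={target}")
    if "ro" in options.split(","):
        pairs.append("readonly")
    return ",".join(pairs)


print(volume_flag_to_mount("myvol2:/app"))
print(volume_flag_to_mount("myvol2:/app:ro"))
print(volume_flag_to_mount("/root/dockfile_practice/:/app"))
```

The positional fields of -v become named keys in --mount, which is why their order no longer matters.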
The type key of --mount can be bind, volume, or tmpfs. This topic discusses volumes, so the type is always volume.
Good use cases for volumes
Volumes are the preferred way to persist data in Docker containers and services. Some use cases for volumes include:
Bind mounts have been around since the early days of Docker. Bind mounts have limited functionality compared to volumes. When you use a bind mount, a file or directory on the host machine is mounted into a container. The file or directory is referenced by its absolute path on the host machine. By contrast, when you use a volume, a new directory is created within Docker’s storage directory on the host machine, and Docker manages that directory’s contents.
With -v, the file or directory does not need to exist on the Docker host already: it is created on demand, always as a directory, if it does not yet exist. With --mount, Docker reports an error instead. Bind mounts are very performant, but they rely on the host machine’s filesystem having a specific directory structure available. If you are developing new Docker applications, consider using named volumes instead. You can’t use Docker CLI commands to directly manage bind mounts.
The first session below uses --mount to start a container with a bind mount, and the second uses -v.
# List the files in the dockfile_practice directory
# ls -lhrt dockfile_practice/
total 396K
-rw-r--r-- 1 root root 6 Feb 25 10:51 add_file
-rw-r--r-- 1 root root 384K Feb 25 10:51 add_tar.tar.gz
drwxr-xr-x 2 root root 4.0K Feb 25 10:58 add_dir
-rw-r--r-- 1 root root 109 Feb 25 11:30 Dockerfile
# docker run -d --rm -it --name bind_mount_nginx --mount type=bind,src=/root/dockfile_practice,target=/app nginx
fe969991cf807076708b45e4ae574b4213baacb236a51ecf3eeaa9df3a35f775
# docker exec bind_mount_nginx ls -lhrt /app
total 396K
-rw-r--r-- 1 root root 6 Feb 25 18:51 add_file
-rw-r--r-- 1 root root 384K Feb 25 18:51 add_tar.tar.gz
drwxr-xr-x 2 root root 4.0K Feb 25 18:58 add_dir
-rw-r--r-- 1 root root 109 Feb 25 19:30 Dockerfile
# docker inspect bind_mount_nginx
...
"Mounts": [
{
"Type": "bind",
"Source": "/root/dockfile_practice",
"Destination": "/app",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
],
...
# docker run -itd --name bind_mount_nginx -v /root/dockfile_practice/:/app nginx
ab356d08136aaa9d82bc90fd96da585e296c05898db357c51aaf772131625413
# List the files in the dockfile_practice directory
# ls -lhrt dockfile_practice/
total 396K
-rw-r--r-- 1 root root 6 Feb 25 10:51 add_file
-rw-r--r-- 1 root root 384K Feb 25 10:51 add_tar.tar.gz
drwxr-xr-x 2 root root 4.0K Feb 25 10:58 add_dir
-rw-r--r-- 1 root root 109 Feb 25 11:30 Dockerfile
# docker exec bind_mount_nginx ls -lhrt /app
total 396K
-rw-r--r-- 1 root root 6 Feb 25 18:51 add_file
-rw-r--r-- 1 root root 384K Feb 25 18:51 add_tar.tar.gz
drwxr-xr-x 2 root root 4.0K Feb 25 18:58 add_dir
-rw-r--r-- 1 root root 109 Feb 25 19:30 Dockerfile
# docker inspect bind_mount_nginx
...
"Mounts": [
{
"Type": "bind",
"Source": "/root/dockfile_practice",
"Destination": "/app",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
],
...
Good use cases for bind mounts
In general, you should use volumes where possible. Bind mounts are appropriate for the following types of use case:
Volumes and bind mounts let you share files between the host machine and container so that you can persist data even after the container is stopped. As opposed to volumes and bind mounts, a tmpfs mount is temporary and is held only in the host's memory. When the container stops, the tmpfs mount is removed, and files written there won’t be persisted.
Limitations of tmpfs mounts
Differences between --tmpfs and --mount behavior
The --tmpfs flag does not allow you to specify any configurable options.
The --tmpfs flag cannot be used with swarm services. You must use --mount.
Start a container with a tmpfs mount
To use a tmpfs mount in a container, use the --tmpfs flag, or use the --mount flag with the type=tmpfs and destination options. ==There is no source for tmpfs mounts==. The following example creates a tmpfs mount at /app in an Nginx container. The first example uses the --mount flag and the second uses the --tmpfs flag.
# docker run -idt --name tmpfs_nginx --mount type=tmpfs,dst=/app nginx
1fbd4fa2f6c8d0b9cf593aabf2872b967743b2817aba36331652e9071a7dc495
# docker exec tmpfs_nginx df -Th
Filesystem Type Size Used Avail Use% Mounted on
overlay overlay 9.2G 6.5G 2.2G 75% /
tmpfs tmpfs 64M 0 64M 0% /dev
tmpfs tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
shm tmpfs 64M 0 64M 0% /dev/shm
tmpfs tmpfs 2.0G 0 2.0G 0% /app
/dev/vda1 ext4 9.2G 6.5G 2.2G 75% /etc/hosts
tmpfs tmpfs 2.0G 0 2.0G 0% /proc/asound
tmpfs tmpfs 2.0G 0 2.0G 0% /proc/acpi
tmpfs tmpfs 2.0G 0 2.0G 0% /sys/firmware
# docker inspect tmpfs_nginx
...
"Mounts": [
{
"Type": "tmpfs",
"Source": "",
"Destination": "/app",
"Mode": "",
"RW": true,
"Propagation": ""
}
],
...
# docker run -idt --name tmpfs_nginx --tmpfs /app nginx
5b1a015384b395a288b443965f1b86e5f028cc4329c463d83cc8aba77627516f
# docker exec tmpfs_nginx df -Th
Filesystem Type Size Used Avail Use% Mounted on
overlay overlay 9.2G 6.5G 2.2G 75% /
tmpfs tmpfs 64M 0 64M 0% /dev
tmpfs tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup
shm tmpfs 64M 0 64M 0% /dev/shm
tmpfs tmpfs 2.0G 0 2.0G 0% /app
/dev/vda1 ext4 9.2G 6.5G 2.2G 75% /etc/hosts
tmpfs tmpfs 2.0G 0 2.0G 0% /proc/asound
tmpfs tmpfs 2.0G 0 2.0G 0% /proc/acpi
tmpfs tmpfs 2.0G 0 2.0G 0% /sys/firmware
# docker inspect tmpfs_nginx
...
"Mounts": [
{
"Type": "tmpfs",
"Source": "",
"Destination": "/app",
"Mode": "",
"RW": true,
"Propagation": ""
}
],
...
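The --mount form of a tmpfs mount can also carry a size limit and a file mode through the tmpfs-size (in bytes) and tmpfs-mode (octal) keys. The Python sketch below assembles such a value. The function name tmpfs_mount and the 64 MiB limit are illustrative choices, not values taken from the sessions above.

```python
def tmpfs_mount(destination, size_bytes=None, mode=None):
    """Build a --mount value for a tmpfs mount (a tmpfs has no source)."""
    pairs = ["type=tmpfs", f"destination={destination}"]
    if size_bytes is not None:
        pairs.append(f"tmpfs-size={size_bytes}")   # size limit in bytes
    if mode is not None:
        pairs.append(f"tmpfs-mode={mode:o}")       # file mode, written in octal
    return ",".join(pairs)


# Same mount as in the sessions above: no options at all.
print(tmpfs_mount("/app"))

# Hypothetical variant: cap the mount at 64 MiB and set mode 1770.
print(tmpfs_mount("/app", size_bytes=64 * 1024 * 1024, mode=0o1770))
```

The resulting string is what would follow --mount on the docker run command line.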
Good use cases for tmpfs mounts
tmpfs mounts are best used for cases when you do not want the data to persist either on the host machine or within the container. This may be for security reasons or to protect the performance of the container when your application needs to write a large volume of non-persistent state data.
Docker supports several storage drivers, using a pluggable architecture. The storage driver controls how images and containers are stored and managed on a Docker host. The Docker Engine provides the following storage drivers on Linux:
Storage driver | Description | Supported backing filesystems |
---|---|---|
overlay2 (Recommended) | overlay2 is the preferred storage driver for all currently supported Linux distributions, and requires no extra configuration. | xfs with ftype=1, ext4 |
fuse-overlayfs | fuse-overlayfs is preferred only for running Rootless Docker on a host that does not provide support for rootless overlay2. On Ubuntu and Debian 10, the fuse-overlayfs driver does not need to be used, and overlay2 works even in rootless mode. Refer to the rootless mode documentation for details. | any filesystem |
btrfs and zfs | The btrfs and zfs storage drivers allow for advanced options, such as creating “snapshots”, but require more maintenance and setup. Each of these relies on the backing filesystem being configured correctly. | btrfs/zfs |
vfs | The vfs storage driver is intended for testing purposes, and for situations where no copy-on-write filesystem can be used. Performance of this storage driver is poor, and is not generally recommended for production use. | any filesystem |
aufs | The aufs storage driver was the preferred storage driver for Docker 18.06 and older, when running on Ubuntu 14.04 on kernel 3.13, which had no support for overlay2. However, current versions of Ubuntu and Debian now have support for overlay2, which is now the recommended driver. | xfs, ext4 |
devicemapper | The devicemapper storage driver requires direct-lvm for production environments, because loopback-lvm, while zero-configuration, has very poor performance. devicemapper was the recommended storage driver for CentOS and RHEL, as their kernel version did not support overlay2. However, current versions of CentOS and RHEL now have support for overlay2, which is now the recommended driver. | direct-lvm |
overlay | The legacy overlay driver was used for kernels that did not support the "multiple-lowerdir" feature required for overlay2. All currently supported Linux distributions now provide support for this, and it is therefore deprecated. | xfs with ftype=1, ext4 |