Today, I wanted to understand how Docker uses cgroups to set resource limits. In this short post, I will share what I learned.
I will assume that you have a machine on which Docker is installed.
Docker lets you pass resource limits as command-line options. Suppose you want to limit a container's read rate from a device to 1 MB per second. You can start a new container with the --device-read-bps option, as shown below:
$ docker run -it --device-read-bps /dev/sda:1mb centos
The above command starts a new centos container. The --device-read-bps option limits the read rate from the /dev/sda device to 1 MB per second.
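To make the unit concrete, here is a minimal sketch of how a size such as 1mb maps to a byte count. The to_bytes helper is hypothetical (it is not part of Docker), and it assumes that kb, mb, and gb are binary multiples, which agrees with the 1048576 value that appears later in blkio.throttle.read_bps_device.

```shell
# Convert a size such as "1mb" or "512kb" to bytes.
# Assumption: kb/mb/gb are binary multiples (1mb = 1024 * 1024 bytes).
to_bytes() {
    value=$(printf '%s' "$1" | tr 'A-Z' 'a-z')
    number=${value%%[!0-9]*}     # leading digits
    unit=${value#"$number"}      # whatever follows the digits
    [ -n "$number" ] || { echo "no number in: $1" >&2; return 1; }
    case "$unit" in
        ''|b)  factor=1 ;;
        k|kb)  factor=1024 ;;
        m|mb)  factor=$((1024 * 1024)) ;;
        g|gb)  factor=$((1024 * 1024 * 1024)) ;;
        *)     echo "unknown unit: $unit" >&2; return 1 ;;
    esac
    echo $((number * factor))
}

to_bytes 1mb    # prints 1048576
```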
The command starts the container and drops you into a shell inside it.
We will create a new file inside the container and then try to read it. To create a file filled with zeros, we will use the dd utility as shown below.
[root@container-id ~]# dd if=/dev/zero of=afile bs=1M count=100
The above creates a 100 MB file.
Now, let's read afile. Before doing so, start the iotop utility on the Docker host to watch the disk read rate.
$ iotop -o
To read the file, we will again use the dd utility inside the container, as shown below.
[root@container-id ~]# dd if=/root/afile of=/dev/null
In iotop, the disk read speed was close to 1 MB per second.
If you do the same in an unconstrained container, you will find that the read speed is much higher.
Let's start a new container without the limit:
$ docker run -it centos
Now create a file again as we did above. This time we will create a file of about 5 GB.
[root@container-id ~]# dd if=/dev/zero of=afile bs=1M count=5000
Next, we will read the file using the dd command as we did previously. This time, iotop shows a disk read speed of 591.89 MB per second.
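As a rough sanity check on the two measurements, the sketch below estimates how long each read should take at the observed rates. The read_seconds helper is hypothetical, and the estimate ignores caching and startup overhead.

```shell
# Estimate the duration of a sequential read.
# $1: file size in MB, $2: read rate in MB per second.
read_seconds() {
    awk -v size="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", size / rate }'
}

read_seconds 100 1          # throttled container: prints 100.0
read_seconds 5000 591.89    # unconstrained container: prints 8.4
```

So the 100 MB file takes well over a minute to read under the limit, while the 5000 MB file is read in under ten seconds without it.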
How does Docker use cgroups?
Cgroups (control groups) are a Linux kernel feature to limit, police, and account for the resource usage of a set of processes. They provide a mechanism to limit and monitor system resources such as CPU time, system memory, disk bandwidth, and network bandwidth.
Cgroups work by dividing resources into groups and then assigning tasks to those groups.
Docker uses cgroups to limit the system resources available to a container.
When you install Docker on a Linux box such as Ubuntu, it installs the cgroup-related packages and creates the subsystem directories. You can list all the subsystems that you can manage using cgroups via the lscgroup command.
$ lscgroup
cpuset:/
cpu:/
cpuacct:/
memory:/
devices:/
freezer:/
blkio:/
perf_event:/
hugetlb:/
If lscgroup is not installed, you can install it with the following command:
$ sudo apt-get install cgroup-bin
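If you would rather not install an extra package, the kernel also exposes the list of subsystems in /proc/cgroups. The sketch below is one possible way to read it; the list_subsystems helper is hypothetical, and it assumes the usual four-column layout (name, hierarchy, number of cgroups, enabled flag).

```shell
# Print the names of the enabled cgroup subsystems.
# $1: file in /proc/cgroups format (defaults to /proc/cgroups).
list_subsystems() {
    # Skip the header line starting with '#'; keep rows whose 4th column is 1.
    awk '!/^#/ && $4 == 1 { print $1 }' "${1:-/proc/cgroups}"
}

# Only run against the live file when it is present and readable.
if [ -r /proc/cgroups ]; then
    list_subsystems
fi
```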
On Ubuntu, these correspond to directories inside the /sys/fs/cgroup directory.
$ cd /sys/fs/cgroup/
Once inside the cgroup directory, you can list its contents.
$ ls -l
total 0
drwxr-xr-x 2 root root 0 Jan 3 14:50 blkio
drwxr-xr-x 2 root root 0 Jan 3 14:50 cpu
drwxr-xr-x 2 root root 0 Jan 3 14:50 cpuacct
drwxr-xr-x 2 root root 0 Jan 3 14:50 cpuset
drwxr-xr-x 2 root root 0 Jan 3 14:50 devices
drwxr-xr-x 2 root root 0 Jan 3 14:50 freezer
drwxr-xr-x 2 root root 0 Jan 3 14:50 hugetlb
drwxr-xr-x 2 root root 0 Jan 3 14:50 memory
drwxr-xr-x 2 root root 0 Jan 3 14:50 perf_event
drwxr-xr-x 3 root root 0 Jan 3 14:45 systemd
The blkio directory is used to manage block devices. Similarly, the other directories are used to manage other system resources.
Let's look at the contents of the blkio directory.
/sys/fs/cgroup/blkio$ ls -l
total 0
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_merged
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_merged_recursive
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_queued
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_queued_recursive
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_service_bytes
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_service_bytes_recursive
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_service_time
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_service_time_recursive
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_serviced
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_serviced_recursive
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_wait_time
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.io_wait_time_recursive
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.leaf_weight
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.leaf_weight_device
--w------- 1 root root 0 Jan 3 14:50 blkio.reset_stats
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.sectors
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.sectors_recursive
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.throttle.io_service_bytes
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.throttle.io_serviced
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.throttle.read_bps_device
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.throttle.read_iops_device
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.throttle.write_bps_device
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.throttle.write_iops_device
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.time
-r--r--r-- 1 root root 0 Jan 3 14:50 blkio.time_recursive
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.weight
-rw-r--r-- 1 root root 0 Jan 3 14:50 blkio.weight_device
-rw-r--r-- 1 root root 0 Jan 3 14:50 cgroup.clone_children
--w--w--w- 1 root root 0 Jan 3 14:50 cgroup.event_control
-rw-r--r-- 1 root root 0 Jan 3 14:50 cgroup.procs
-r--r--r-- 1 root root 0 Jan 3 14:50 cgroup.sane_behavior
-rw-r--r-- 1 root root 0 Jan 3 14:50 notify_on_release
-rw-r--r-- 1 root root 0 Jan 3 14:50 release_agent
-rw-r--r-- 1 root root 0 Jan 3 14:50 tasks
The three important files from the above are:
- tasks: This contains the IDs of the tasks (threads) attached to this control group.
- cgroup.procs: This file contains the thread group IDs (that is, process IDs) of the processes in the group, which is useful if you have a multi-threaded application.
- cgroup.event_control: This file is used to hook into the notification API.
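As an illustration of how cgroup.procs can be used, the sketch below prints each process ID in a group along with its command name. The cgroup_processes helper is hypothetical; it assumes a cgroup v1 directory such as /sys/fs/cgroup/blkio and looks up names through /proc/<pid>/comm.

```shell
# Print "<pid> <command name>" for every process ID in a cgroup's cgroup.procs file.
# $1: cgroup directory, for example /sys/fs/cgroup/blkio
cgroup_processes() {
    dir=$1
    while read -r pid; do
        # The process may have exited in the meantime; print '?' in that case.
        name=$(cat "/proc/$pid/comm" 2>/dev/null || echo '?')
        echo "$pid $name"
    done < "$dir/cgroup.procs"
}
```

On a host laid out like the one in this post, it could be called as cgroup_processes /sys/fs/cgroup/blkio.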
When you run a new container using the docker run command, Docker creates a new child group under each of the subsystems. The name of the child group is docker/container_id.
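The naming scheme can be sketched as a small helper that builds the directory for a given subsystem and container. The container_cgroup_dir function is hypothetical, and it assumes the docker/container_id layout described above with the hierarchy mounted at /sys/fs/cgroup; other setups may place the groups elsewhere.

```shell
# Build the cgroup directory of a container under one subsystem.
# $1: subsystem (for example blkio), $2: full container ID,
# $3: optional cgroup root (defaults to /sys/fs/cgroup).
container_cgroup_dir() {
    subsystem=$1
    container_id=$2
    root=${3:-/sys/fs/cgroup}
    echo "$root/$subsystem/docker/$container_id"
}

# Example, where my_container is a placeholder container name:
#   container_cgroup_dir blkio "$(docker inspect --format '{{.Id}}' my_container)"
```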
So, when you run a new container using the command shown below
$ docker run -it --device-read-bps /dev/sda:1mb centos
Then, directories will be created for the container. If you list the contents of the container's directory under blkio, you will see the following:
$ ls -l blkio/docker/26dc49635757074a2119039dc74634f72e9eddff41bee9dd8f761d73d3780a5c/
total 0
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_merged
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_merged_recursive
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_queued
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_queued_recursive
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_service_bytes
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_service_bytes_recursive
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_service_time
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_service_time_recursive
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_serviced
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_serviced_recursive
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_wait_time
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.io_wait_time_recursive
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.leaf_weight
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.leaf_weight_device
--w------- 1 root root 0 Jan 3 15:10 blkio.reset_stats
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.sectors
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.sectors_recursive
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.throttle.io_service_bytes
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.throttle.io_serviced
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.throttle.read_bps_device
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.throttle.read_iops_device
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.throttle.write_bps_device
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.throttle.write_iops_device
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.time
-r--r--r-- 1 root root 0 Jan 3 15:10 blkio.time_recursive
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.weight
-rw-r--r-- 1 root root 0 Jan 3 15:10 blkio.weight_device
-rw-r--r-- 1 root root 0 Jan 3 15:10 cgroup.clone_children
--w--w--w- 1 root root 0 Jan 3 15:10 cgroup.event_control
-rw-r--r-- 1 root root 0 Jan 3 15:10 cgroup.procs
-rw-r--r-- 1 root root 0 Jan 3 15:10 notify_on_release
-rw-r--r-- 1 root root 0 Jan 3 15:10 tasks
This has the same file structure as the blkio directory.
The two important things to note are:
- If you cat the contents of the tasks file, you will notice that it contains the process ID of the container:
/sys/fs/cgroup/blkio/docker/26dc49635757074a2119039dc74634f72e9eddff41bee9dd8f761d73d3780a5c$ cat tasks
6347
This is the process ID of the bash process running inside the container.
vagrant@vagrant-ubuntu-trusty-64:/sys/fs/cgroup/blkio/docker/26dc49635757074a2119039dc74634f72e9eddff41bee9dd8f761d73d3780a5c$ ps -ef|grep bash
root 6347 6328 0 15:10 pts/0 00:00:00 /bin/bash
- There is an entry in the blkio.throttle.read_bps_device file with the read limit on the device:
$ cat blkio.throttle.read_bps_device
8:0 1048576
Here 8:0 is the major:minor device number of /dev/sda, and 1048576 is the 1mb limit expressed in bytes per second.
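To read such entries more comfortably, the sketch below prints every rule in a throttle file with the limit converted to MiB per second. The show_read_limits helper is hypothetical; it only assumes the "major:minor bytes" line format shown in the output above.

```shell
# Print each throttle rule as "<major:minor> <limit> MiB/s".
# $1: path to a blkio.throttle.read_bps_device file.
show_read_limits() {
    awk '{ printf "%s %.2f MiB/s\n", $1, $2 / 1048576 }' "$1"
}

# To see which disk a major:minor pair refers to, compare it with:
#   lsblk -o NAME,MAJ:MIN
```

For the entry shown above, the helper would print 8:0 1.00 MiB/s.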
The above shows how Docker uses cgroups to define limits on different resources. The same happens for other resources such as CPU and memory.
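Since the limit is just a line in a file, the same kind of entry can be written by hand. The sketch below is an illustration only, not how Docker implements it internally; the set_read_bps helper is hypothetical, and writing to a real cgroup directory requires root.

```shell
# Write a read-rate limit into a cgroup's blkio throttle file.
# $1: cgroup directory, for example /sys/fs/cgroup/blkio/docker/<container_id>
# $2: device as major:minor, for example 8:0
# $3: limit in bytes per second
set_read_bps() {
    cgroup_dir=$1
    device=$2
    bytes=$3
    echo "$device $bytes" > "$cgroup_dir/blkio.throttle.read_bps_device"
}

# Example matching the limit used in this post (run as root):
#   set_read_bps /sys/fs/cgroup/blkio/docker/<container_id> 8:0 1048576
```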
Conclusion
In this post, we learned how Docker uses cgroups to set resource constraints. Docker provides the plumbing and tooling that make it easy for developers to use advanced Linux features.
Hi.
Which kernel and distribution are you using?
With kernel-4.19.102 and 5.4.41 I get these warnings:
level=warning msg="Your kernel does not support cgroup blkio weight"
level=warning msg="Your kernel does not support cgroup blkio weight_device"