
While investigating sharing the PID namespace with containers, I noticed something interesting that I don't understand. When a container shares the PID namespace with the host, some processes have their environment variables protected while others do not.

Let's take mysql as an example. I'll start a container with an environment variable set:

    ubuntu@sandbox:~$ docker container run -it -d --env MYSQL_ROOT_PASSWORD=SuperSecret mysql
    551b309513926caa9d5eab5748dbee2f562311241f72c4ed5d193c81148729a6

I'll start another container which shares the host PID namespace and try to access the environ file:

    ubuntu@sandbox:~$ docker container run -it --rm --pid host ubuntu /bin/bash
    root@1c670d9d7138:/# ps aux | grep mysql
    999 18212 5.0 9.6 2006556 386428 pts/0 Ssl+ 17:55 0:00 mysqld
    root 18573 0.0 0.0 2884 1288 pts/0 R+ 17:55 0:00 grep --color=auto mysql
    root@1c670d9d7138:/# cat /proc/18212/environ
    cat: /proc/18212/environ: Permission denied

Something is blocking my access to the environment variables. I found that I need CAP_SYS_PTRACE to read them from a container:

    ubuntu@sandbox:~$ docker container run -it --rm --pid host --cap-add SYS_PTRACE ubuntu /bin/bash
    root@079d4c1d66d8:/# cat /proc/18212/environ
    MYSQL_PASSWORD=HOSTNAME=551b30951392MYSQL_DATABASE=MYSQL_ROOT_PASSWORD=SuperSecretPWD=/HOME=/var/lib/mysqlMYSQL_MAJOR=8.0GOSU_VERSION=1.14MYSQL_USER=MYSQL_VERSION=8.0.30-1.el8TERM=xtermSHLVL=0MYSQL_ROOT_HOST=%PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binMYSQL_SHELL_VERSION=8.0.30-1.el8

However, not all processes are protected in this way.

For example, I'll start another ubuntu container with an environment variable set and run the tail command:

    ubuntu@sandbox:~$ docker container run --rm --env SUPERSECRET=helloworld -d ubuntu tail -f /dev/null
    42023615a4415cd4064392e890622530adee1f42a8a2c9027f4921a522d5e1f2

Now when I run a container with the shared PID namespace, I can read the environment variables:

    ubuntu@sandbox:~$ docker container run -it --rm --pid host ubuntu /bin/bash
    root@3a774156a364:/# ps aux | grep tail
    root 19056 0.0 0.0 2236 804 ? Ss 17:57 0:00 tail -f /dev/null
    root 19176 0.0 0.0 2884 1284 pts/0 S+ 17:58 0:00 grep --color=auto tail
    root@3a774156a364:/# cat /proc/19056/environ
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binHOSTNAME=42023615a441SUPERSECRET=helloworldHOME=/root

What mechanism prevents me from reading the environment variables of mysqld but not those of the tail -f process?


1 Answer


What mechanism prevents me from reading the environment variables of mysqld but not those of the tail -f process?

The fact that you're running with a different user ID in the first case. If we start up your two examples:

    docker run --name mysql -it -d --env MYSQL_ROOT_PASSWORD=SuperSecret mysql:latest
    docker run --name tail -it -d --env MYSQL_ROOT_PASSWORD=SuperSecret ubuntu:latest tail -f /dev/null

And then look at the resulting processes:

    $ ps -fe n | grep -E 'tail|mysqld' | grep -v grep
    999 422026 422005 2 22:50 pts/0 Ssl+ 0:00 mysqld
    0 422170 422144 0 22:50 pts/0 Ss+ 0:00 tail -f /dev/null

We see that mysqld is running as UID 999, while the tail command is running as UID 0. When we start up a new container in the host pid namespace, we can only read the environ for processes that are owned by the same UID and GID. So this works, because by default a container runs with UID 0:

    $ docker run --rm --pid host ubuntu:latest cat /proc/422170/environ | tr '\0' '\n'
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    HOSTNAME=e89c069d4674
    TERM=xterm
    MYSQL_ROOT_PASSWORD=SuperSecret
    HOME=/root

And this fails:

    $ docker run --rm --pid host ubuntu:latest cat /proc/422026/environ | tr '\0' '\n'
    cat: /proc/422026/environ: Permission denied
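ps reports a single UID per process, while the kernel tracks real, effective, and saved user and group IDs. As a rough sketch, a small helper can print those IDs from /proc/<pid>/status so two processes can be compared side by side. The function name proc_ids is made up for this illustration, and it assumes a Linux /proc where the Uid: and Gid: lines list real, effective, saved, and filesystem IDs in that order.

```shell
# Hypothetical helper (not part of Docker or the kernel tooling):
# print the real, effective, and saved UID/GID of a process.
# Defaults to the current shell when no PID is given.
proc_ids() {
    pid="${1:-$$}"
    awk '/^Uid:/ {print "uid real=" $2 " effective=" $3 " saved=" $4}
         /^Gid:/ {print "gid real=" $2 " effective=" $3 " saved=" $4}' \
        "/proc/$pid/status"
}

# Show the IDs of the current shell.
proc_ids "$$"
```

Running proc_ids against the mysqld and tail PIDs from a --pid host container should show which of the two matches the IDs of the inspecting shell.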

To read the environ file of a process running under a different UID or GID, we need the CAP_SYS_PTRACE capability. The logic for this check lives in the kernel's ptrace access check (ptrace_may_access, implemented by __ptrace_may_access in kernel/ptrace.c):

    if (uid_eq(caller_uid, tcred->euid) &&
        uid_eq(caller_uid, tcred->suid) &&
        uid_eq(caller_uid, tcred->uid)  &&
        gid_eq(caller_gid, tcred->egid) &&
        gid_eq(caller_gid, tcred->sgid) &&
        gid_eq(caller_gid, tcred->gid))
            goto ok;
    if (ptrace_has_cap(tcred->user_ns, mode))
            goto ok;
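The second branch depends on whether the caller holds CAP_SYS_PTRACE. One way to check that from inside a container, sketched here under some assumptions, is to read the CapEff bitmask from /proc/<pid>/status and test bit 19, which is the number assigned to CAP_SYS_PTRACE in linux/capability.h. The helper names cap_bit_set and has_sys_ptrace are invented for this example. Note that this only inspects the effective set as reported in /proc; the kernel check above is evaluated against the target's user namespace, so treat the result as a hint rather than a guarantee.

```shell
# Hypothetical helpers for illustration only.

# Succeeds if bit number $2 is set in the hexadecimal mask $1.
cap_bit_set() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# Succeeds if the effective capability set of a process contains
# CAP_SYS_PTRACE (capability number 19). Defaults to the current shell.
# Returns 2 if the CapEff line cannot be read.
has_sys_ptrace() {
    pid="${1:-$$}"
    capeff=$(awk '/^CapEff:/ {print $2}' "/proc/$pid/status") || return 2
    [ -n "$capeff" ] || return 2
    cap_bit_set "$capeff" 19
}

if has_sys_ptrace; then
    echo "CAP_SYS_PTRACE is in the effective set"
else
    echo "CAP_SYS_PTRACE is not in the effective set"
fi
```

Comparing the output in a container started with and without --cap-add SYS_PTRACE should show the difference that makes the mysqld environ file readable.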

We can make that failing example work by having the container run with the same UID and GID as the mysql process:

    $ docker run -u 999:999 --rm --pid host ubuntu:latest cat /proc/422026/environ | tr '\0' '\n'
    MYSQL_PASSWORD=
    HOSTNAME=bde980104dcd
    MYSQL_DATABASE=
    MYSQL_ROOT_PASSWORD=SuperSecret
    PWD=/
    HOME=/var/lib/mysql
    MYSQL_MAJOR=8.0
    GOSU_VERSION=1.14
    MYSQL_USER=
    MYSQL_VERSION=8.0.31-1.el8
    TERM=xterm
    SHLVL=0
    MYSQL_ROOT_HOST=%
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
    MYSQL_SHELL_VERSION=8.0.31-1.el8
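The commands above pipe through tr because entries in /proc/<pid>/environ are separated by NUL bytes, which is why plain cat output runs together. A minimal sketch of a wrapper that does the translation and reports a denied read could look like this. The name show_environ is hypothetical, and the error message only guesses at the cause, since a failed read can also mean the process no longer exists.

```shell
# Hypothetical helper: print the environment of a process, one entry per line.
# Defaults to the current shell when no PID is given.
show_environ() {
    pid="${1:-$$}"
    # Probe the read first; a failure here usually means the process is gone
    # or the kernel's ptrace access check denied the request.
    if ! cat "/proc/$pid/environ" >/dev/null 2>&1; then
        echo "cannot read /proc/$pid/environ (different UID/GID without CAP_SYS_PTRACE, or no such process)" >&2
        return 1
    fi
    # Entries are NUL-separated, so translate NUL bytes to newlines.
    cat "/proc/$pid/environ" | tr '\0' '\n'
}

# Example (PID taken from the ps output above, adjust as needed):
#   show_environ 422026
```

The file reflects the environment the process was started with, so variables exported later inside the process will not appear.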
