How to make /dev inside linux namespaces

lynx

As far as I understand container terminology, what I'm essentially trying to accomplish is to write my own "container runtime".

What I'm doing:

user@host:~$ mkdir test
user@host:~$ cd test
user@host:~/test$ mkdir dev
user@host:~/test$ mkdir proc
user@host:~/test$ echo 1 |sudo tee /proc/sys/kernel/unprivileged_userns_clone
user@host:~/test$ unshare --ipc --mount --net --pid --uts --cgroup --user \
  --map-root-user --fork bash
root@host:~/test# mount none -t tmpfs dev/
root@host:~/test# touch dev/zero
root@host:~/test# mount /dev/zero -o bind dev/zero
root@host:~/test# echo 1 > dev/zero
bash: dev/zero: Permission denied
root@host:~/test# ls -lah dev
total 4.0K
drwxrwxrwt 2 root   root      60 Sep  1 15:12 .
drwxr-xr-x 3 root   root    4.0K Sep  1 13:47 ..
crw-rw-rw- 1 nobody nogroup 1, 5 Sep  1 13:55 zero
root@host:~/test# mount # we are still looking at the host's /proc
<...>
none on /home/user/test/dev type tmpfs (rw,relatime,uid=1000,gid=1000)
udev on /home/user/test/dev/zero type devtmpfs (rw,nosuid,relatime,size=3921088k,nr_inodes=980272,mode=755)
root@host:~/test# mount none -t proc proc/
root@host:~/test# cat proc/mounts
<...>
none /home/user/test/dev tmpfs rw,relatime,uid=1000,gid=1000 0 0
udev /home/user/test/dev/zero devtmpfs rw,nosuid,relatime,size=3921088k,nr_inodes=980272,mode=755 0 0
none /home/user/test/proc proc rw,relatime 0 0

Echoing to dev/zero fails with "Permission denied". Can anybody enlighten me as to what I'm doing wrong?

I took the idea from Docker's runc (libcontainer): https://github.com/docker/runc/blob/ae2948042b08ad3d6d13cd09f40a50ffff4fc688/libcontainer/rootfs_linux.go#L463

This question might be relevant: -bash: /dev/null: Permission denied

OS: Debian buster, kernel: 4.19.37

mosvy

Quick fix for your problem: mount the tmpfs with a mode= option (which sets the permissions of the root dir of the filesystem), and use a mode that lacks either the sticky bit (01000) or the write permission for others (0002); both are on by default:

root@host:~/test# mount -o mode=0755 -t tmpfs none dev/

You're not losing anything; having the /dev directory inside your container sticky and writable by everybody wasn't a good idea in the first place (it would be interesting to know if Docker is doing that, too ;-))
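
To check whether a directory has the problematic combination before bind-mounting device nodes into it, something like the following might help. This is only a sketch: the function name is made up for illustration, and it assumes a stat that understands -c '%a' (GNU coreutils or busybox).

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the directory has BOTH the
# sticky bit (01000) and write permission for others (0002) set.
# Returns 2 if stat fails (e.g. the path does not exist).
is_sticky_world_writable() {
    mode=$(stat -c '%a' "$1") || return 2
    # stat prints the mode in octal without a leading 0, so add one
    # to make the shell parse it as an octal constant.
    [ $(( 0$mode & 01002 )) -eq $(( 01002 )) ]
}
```

In the session from the question, `is_sticky_world_writable dev && echo "remount with mode=0755"` should print the hint after the plain `mount none -t tmpfs dev/`, and stay silent once the tmpfs is mounted with mode=0755.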


This happens because of the tmpfs mount inside the user namespace: the kernel sets the owner of the root dir of the tmpfs filesystem from the untranslated credentials of the process doing the mount (hence the uid=1000,gid=1000 in the mount(1) output), and sets its permissions to the default 01777 (rwx for everybody + the sticky bit).

The combination of a non-root owner, write permissions for others (S_IWOTH) and a sticky bit (S_ISVTX) on a directory triggers a curious behaviour in newer Linux kernels: trying to open(2) an existing character device inside such a directory will fail with EACCES if the O_CREAT flag is used, even if done by the owner of the directory:

$ mkdir doo; chmod 1777 doo 
$ su -c 'mknod -m 666 doo/null c 1 3'  # or touch doo/null; mount -B /dev/null doo/null 
$ echo > doo/null
bash: doo/null: Permission denied

$ perl -MFcntl -e 'sysopen(NULL, "doo/null", O_WRONLY|O_CREAT, 0666) or die $!'
Permission denied at -e line 1.
$ perl -MFcntl -e 'sysopen(NULL, "doo/null", O_WRONLY, 0666) or die $!'
$ # ok!

$ chmod -t doo
$ echo > doo/null
$ # ok!
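
Since only opens that pass O_CREAT are affected, a shell script that cannot change the directory mode could write to the device without O_CREAT. The > redirection always adds O_CREAT, but GNU dd with conv=nocreat opens its output without it. A minimal sketch, assuming GNU coreutils dd (write_nocreat is a hypothetical name; this follows from the flags dd passes to open(2) and is not verified on every kernel):

```shell
#!/bin/sh
# Hypothetical helper: copy stdin to an EXISTING file or device node
# without passing O_CREAT (conv=nocreat) or O_TRUNC (conv=notrunc).
# Fails if the target does not exist, instead of creating it.
write_nocreat() {
    dd of="$1" conv=nocreat,notrunc status=none
}
```

With the example above, `echo | write_nocreat doo/null` should then behave like the second perl call (the one without O_CREAT).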

This is reproducible on Debian Buster / 4.19.0-5 and on more recent kernels, but not on Debian Stretch / 4.9.0-4.

The behavior was introduced as a side effect of this commit:

commit 30aba6656f61ed44cba445a3c0d38b296fa9e8f5
Author: Salvatore Mesoraca <[email protected]>
Date:   Thu Aug 23 17:00:35 2018 -0700

    namei: allow restricted O_CREAT of FIFOs and regular files
...
diff --git a/fs/namei.c b/fs/namei.c
...
+static int may_create_in_sticky(struct dentry * const dir,
+                               struct inode * const inode)
+{
+       if ((!sysctl_protected_fifos && S_ISFIFO(inode->i_mode)) ||
+           (!sysctl_protected_regular && S_ISREG(inode->i_mode)) ||
+           likely(!(dir->d_inode->i_mode & S_ISVTX)) ||
+           uid_eq(inode->i_uid, dir->d_inode->i_uid) ||
+           uid_eq(current_fsuid(), inode->i_uid))
+               return 0;
+
+       if (likely(dir->d_inode->i_mode & 0002) ||
+           (dir->d_inode->i_mode & 0020 &&
+            ((sysctl_protected_fifos >= 2 && S_ISFIFO(inode->i_mode)) ||
+             (sysctl_protected_regular >= 2 && S_ISREG(inode->i_mode))))) {
+               return -EACCES;
+       }
+       return 0;
+}

doo/null does not satisfy any condition of the first if: it is neither a FIFO nor a regular file, the containing dir really has its sticky bit set, and the owner of the file differs both from the owner of the dir and from the fsuid of the process trying to open it.

It then immediately matches the second if, because the containing dir has the 0002 (write-for-others) bit set, so the open fails with EACCES.
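
To make the two steps easier to follow, here is a userspace model of the check written as a shell function. It only mirrors the logic of the quoted diff and does not touch the filesystem; the argument order and the two variable names standing in for the sysctls are my own choices for illustration.

```shell
#!/bin/sh
# Userspace model of may_create_in_sticky() from the diff above.
# Arguments (all supplied by the caller):
#   $1 file type: fifo, reg, or anything else (chr, blk, ...)
#   $2 mode of the containing directory in octal, e.g. 1777
#   $3 uid owning the directory
#   $4 uid owning the existing file
#   $5 fsuid of the process calling open(2) with O_CREAT
# protected_fifos / protected_regular stand in for the sysctls
# fs.protected_fifos and fs.protected_regular (assumed 0 if unset).
: "${protected_fifos:=0}" "${protected_regular:=0}"

# Returns 0 if the open is allowed, 1 where the kernel returns -EACCES.
may_create_in_sticky() {
    ftype=$1; dmode=$(( 0$2 )); dir_uid=$3; file_uid=$4; fsuid=$5

    # First "if": any single one of these conditions allows the open.
    if [ "$ftype" = fifo ] && [ "$protected_fifos" -eq 0 ]; then return 0; fi
    if [ "$ftype" = reg ] && [ "$protected_regular" -eq 0 ]; then return 0; fi
    if [ $(( $dmode & 01000 )) -eq 0 ]; then return 0; fi  # dir not sticky
    if [ "$file_uid" -eq "$dir_uid" ]; then return 0; fi
    if [ "$fsuid" -eq "$file_uid" ]; then return 0; fi

    # Second "if": a world-writable dir always denies; a group-writable
    # dir denies only FIFOs/regular files when the sysctl is >= 2.
    if [ $(( $dmode & 0002 )) -ne 0 ]; then return 1; fi
    if [ $(( $dmode & 0020 )) -ne 0 ]; then
        if [ "$ftype" = fifo ] && [ "$protected_fifos" -ge 2 ]; then return 1; fi
        if [ "$ftype" = reg ] && [ "$protected_regular" -ge 2 ]; then return 1; fi
    fi
    return 0
}
```

The doo/null case corresponds to `may_create_in_sticky chr 1777 1000 0 1000` (dir owned by uid 1000, device node owned by root, opened by uid 1000), which is denied. Changing the mode to 0755, as in the quick fix, makes the same call succeed.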

I couldn't tell from the LKML discussions where this was introduced (1, 2, 3) whether this effect was even considered. Anyway, putting character devices inside world-writable sticky directories is unlikely to be intentional.
