[lxc-users] unprivileged Debian Buster container on Debian Buster host fail to start: no cgroups, no controllers

Lukas Pirl lxc-users at lukas-pirl.de
Tue May 28 22:11:29 UTC 2019


On Tue, 2019-05-28 21:50 +0200, Xavier Gendre wrote as excerpted:
> Hello Lukas,
> 
> unprivileged buster containers on a buster host run like a charm. Your
> config includes a lot of stuff that is not suited for an unprivileged
> container (apparmor, ...). First, you should try a simpler
> configuration file such as the following one.
> 
> ---%<------%<------%<---
> lxc.idmap = u 0 165536 65536
> lxc.idmap = g 0 165536 65536
> lxc.net.0.type = empty
> --->%------>%------>%---
> 
> Then,
> lxc-create -n test -f config.file -t download -- --dist debian --release \
>   buster --arch amd64
> lxc-start -n test

Thanks for your reply, Xavier. No luck. This is what I have/see now:

$ egrep -v '^#' test.config 
lxc.net.0.type = empty
lxc.idmap = u 0 165536 65536
lxc.idmap = g 0 165536 65536
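
As a sanity check, the host ranges in the ``lxc.idmap`` lines should fall inside what ``/etc/subuid`` and ``/etc/subgid`` delegate to the user. A minimal sketch, assuming a POSIX shell with awk and config lines written with spaces around ``=``; the function name ``idmap_covered`` is hypothetical and not part of LXC:

```shell
#!/bin/sh
# idmap_covered USER KIND CONFIG SUBIDFILE
# Succeeds if every "lxc.idmap = KIND ..." line in CONFIG maps into a range
# that SUBIDFILE (format "name:start:count") delegates to USER.
# Assumption: the config uses "key = value" with spaces around "=".
idmap_covered() {
    awk -v user="$1" -v kind="$2" '
        # First file: subordinate ID ranges, "name:start:count".
        FNR == NR {
            split($0, f, ":")
            if (f[1] == user) { n++; lo[n] = f[2]; hi[n] = f[2] + f[3] }
            next
        }
        # Second file: "lxc.idmap = <kind> <container start> <host start> <count>".
        $1 == "lxc.idmap" && $3 == kind {
            ok = 0
            for (i = 1; i <= n; i++)
                if ($5 >= lo[i] && $5 + $6 <= hi[i]) ok = 1
            if (!ok) bad = 1
            seen = 1
        }
        # Fail if no matching idmap line was seen or any line was uncovered.
        END { exit (seen && !bad) ? 0 : 1 }
    ' "$4" "$3"
}
```

It would be called as ``idmap_covered lxc u test.config /etc/subuid`` and ``idmap_covered lxc g test.config /etc/subgid``; a non-zero exit status means the mapping is not covered by the delegated range.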

$ lxc-create -n test -f test.config -t download -- --dist debian \
  --release buster --arch amd64
Permission denied - Failed to open tty
Permission denied - Failed to open tty
Permission denied - Failed to open tty
cat: /proc/1/uid_map: No such file or directory
Using image from local cache
Unpacking the rootfs

---
You just created a Debian buster amd64 (20190509_05:24) container.

To enable SSH, run: apt install openssh-server
No default root or user password are set by LXC.

$ ll /dev/tty /dev/tty[0-9]*
crw-rw-rw- 1 root tty 5, 0 2019-05-28 23:41:32 /dev/tty
crw--w---- 1 root tty 4, 0 2019-05-28 11:15:54 /dev/tty0
… # all tty<n> have the same permissions

$

Should the user ``lxc`` be in the group ``tty``?

Apparently, ``lxc-create`` reads ``/proc/1/{u,g}id_map``, which it is not
allowed to do here (``/proc`` is mounted with ``hidepid=2``), instead of
``/proc/self/{u,g}id_map``, no?

$ cat /proc/self/{u,g}id_map
         0          0 4294967295
         0          0 4294967295
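
For reference, whether ``hidepid`` is in effect can be read from the mount table. A minimal sketch, assuming a POSIX shell with awk; the function name ``proc_hidepid`` is made up for illustration and is not part of LXC:

```shell
#!/bin/sh
# Print the hidepid value of the /proc mount described by mountinfo-style
# input on stdin; print 0 if the option is absent (the kernel default).
# Exits non-zero if no /proc mount is found in the input.
proc_hidepid() {
    awk '
        # Field 5 of a mountinfo line is the mount point.
        $5 == "/proc" {
            # The super-block options are the last field of the line.
            n = split($NF, opts, ",")
            val = 0
            for (i = 1; i <= n; i++)
                if (opts[i] ~ /^hidepid=/) {
                    sub(/^hidepid=/, "", opts[i])
                    val = opts[i]
                }
            print val
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

On the host this would be called as ``proc_hidepid < /proc/self/mountinfo``. A quick ``[ -r /proc/1/uid_map ]`` as the unprivileged user shows whether PID 1's map is readable at all.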

$ egrep -v '^#' test/config
lxc.include = /usr/share/lxc/config/common.conf
lxc.include = /usr/share/lxc/config/userns.conf
lxc.arch = linux64
lxc.idmap = u 0 165536 65536
lxc.idmap = g 0 165536 65536
lxc.rootfs.path = dir:/home/lxc/test/rootfs
lxc.uts.name = test
lxc.net.0.type = empty

$ lxc-start -n test -F
Failed to mount cgroup at /sys/fs/cgroup/systemd: Permission denied
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...
$
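
Since systemd inside the container fails on ``/sys/fs/cgroup/systemd``, it may help to see which cgroup directories the calling session actually sits in and whether they are writable for the user. A rough sketch, assuming the hybrid cgroup layout that ``lxc-checkconfig`` reports on this host (v1 hierarchies under ``/sys/fs/cgroup/<controllers>``, the named systemd hierarchy under ``/sys/fs/cgroup/systemd``, v2 under ``/sys/fs/cgroup/unified``); ``cgroup_dirs`` and ``report_writable`` are hypothetical helper names:

```shell
#!/bin/sh
# Translate /proc/<pid>/cgroup lines ("id:controllers:path", read from stdin)
# into the directories they refer to. The optional first argument overrides
# the cgroup root (default: /sys/fs/cgroup).
# Assumption: hybrid layout, with v2 mounted at <root>/unified.
cgroup_dirs() {
    root=${1:-/sys/fs/cgroup}
    while IFS=: read -r _id controllers path; do
        case $controllers in
            '')     name=unified ;;                 # cgroup v2 entry "0::/..."
            name=*) name=${controllers#name=} ;;    # named hierarchy, e.g. systemd
            *)      name=$controllers ;;            # v1 controller list
        esac
        printf '%s/%s%s\n' "$root" "$name" "$path"
    done
}

# Report, for each directory read from stdin, whether the caller may write to it.
report_writable() {
    while IFS= read -r dir; do
        if [ -w "$dir" ]; then
            echo "writable     $dir"
        else
            echo "not writable $dir"
        fi
    done
}
```

Run as the unprivileged user, ``cgroup_dirs < /proc/self/cgroup | report_writable`` lists one line per hierarchy; any "not writable" entry among the controllers named in the ``pam_cgfs.so`` line would be worth a closer look.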

Okay, let's first try to make ``/dev/tty`` readable and writable for the user
``lxc`` (just for testing, regardless of whether that is appropriate):

$ chmod a+rw /dev/tty /dev/tty[0-9]*
$ ll /dev/tty /dev/tty[0-9]*
crw-rw-rw- 1 root tty 5,  0 2019-05-28 23:41:32 /dev/tty
crw-rw-rw- 1 root tty 4,  0 2019-05-28 11:15:54 /dev/tty0
…

$ lxc-destroy -n test
$ lxc-create -n test -f test.config -t download -- --dist debian \
  --release buster --arch amd64
Permission denied - Failed to open tty
Permission denied - Failed to open tty
Permission denied - Failed to open tty
cat: /proc/1/uid_map: No such file or directory
Using image from local cache
Unpacking the rootfs

---
You just created a Debian buster amd64 (20190509_05:24) container.

To enable SSH, run: apt install openssh-server
No default root or user password are set by LXC.
$

The error when starting persists as above.

Hm, okay let's mount ``/proc`` with ``hidepid=0``. Destroy and create: The tty
error persists, the cat error is gone. Start: errors persist as above.

Alright, last thing, let's try to remove the includes from the config file
(the changes made so far are still in place):

$ egrep -v '^#' test/config
lxc.arch = linux64
lxc.idmap = u 0 165536 65536
lxc.idmap = g 0 165536 65536
lxc.rootfs.path = dir:/home/lxc/test/rootfs
lxc.uts.name = test
lxc.net.0.type = empty

$ lxc-start -n test -F
Failed to lookup module alias 'autofs4': Function not implemented
Failed to lookup module alias 'unix': Function not implemented
Failed to mount sysfs at /sys: Operation not permitted
Failed to mount proc at /proc: Operation not permitted
Failed to mount cgroup at /sys/fs/cgroup/systemd: Permission denied
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...
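
With the includes removed, the container config no longer pulls in the ``lxc.mount.auto`` lines from ``common.conf`` and ``userns.conf`` (quoted further down), which might explain why systemd now also fails on ``/sys`` and ``/proc``. To see which files contribute a given key, something like the following could be used. It is only a sketch: ``lxc_config_lines`` is a hypothetical helper, and it assumes that an included directory contributes its ``*.conf`` files:

```shell
#!/bin/sh
# lxc_config_lines KEY FILE
# Print "file: key = value" for every occurrence of KEY in FILE, following
# lxc.include directives (regular files, or *.conf files in a directory).
# The body runs in a subshell so that recursion does not clobber variables.
lxc_config_lines() (
    key=$1
    file=$2
    [ -r "$file" ] || exit 0
    # Strip leading whitespace, comments, and blank lines.
    sed -e 's/^[[:space:]]*//' -e '/^#/d' -e '/^$/d' "$file" |
    while IFS= read -r line; do
        k=$(printf '%s\n' "$line" | sed 's/[[:space:]]*=.*$//')
        v=$(printf '%s\n' "$line" | sed 's/^[^=]*=[[:space:]]*//')
        if [ "$k" = lxc.include ]; then
            if [ -d "$v" ]; then
                for f in "$v"/*.conf; do
                    if [ -e "$f" ]; then lxc_config_lines "$key" "$f"; fi
                done
            else
                lxc_config_lines "$key" "$v"
            fi
        elif [ "$k" = "$key" ]; then
            printf '%s: %s = %s\n' "$file" "$k" "$v"
        fi
    done
)
```

For example, ``lxc_config_lines lxc.mount.auto test/config`` prints nothing for the stripped-down config above, whereas the generated config with both includes should list the entries from ``common.conf`` and ``userns.conf``.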

My lxc.conf does not include anything exotic either:

$ egrep -v '^#' .config/lxc/lxc.conf 
lxc.lxcpath = /home/lxc

Any ideas?

Cheers,

Lukas

> Le 28/05/2019 à 15:54, Lukas Pirl a écrit :
> > Dear all,
> > 
> > first, thanks for the friendly and supportive help you all provide in
> > issue trackers, on mailing lists, etc. – it is very helpful to find all
> > this online.
> > 
> > However, I struggle to run unprivileged (Debian Buster) containers (on a
> > Debian Buster host). LXC does not seem to mount the cgroup mount points
> > for the container, thus the container's systemd tries to mount those and
> > fails due to insufficient permissions.
> > 
> > The log reports no writable cgroup hierarchies and no available
> > controllers – could there be a common cause?
> > 
> > I decided not to open an issue so far, since I am not sure if it is just
> > me being incompetent here or if there is an actual issue. If we find an
> > actual issue, I'll of course move this to the issue tracker.
> > 
> > Please find all the configuration dumps and logs below.
> > 
> > IIRC, I tried to run the script as provided in
> >    https://github.com/lxc/lxc/issues/1998#issuecomment-353241255
> > without success and various other things. However, I am unsure how the
> > available information can be applied since a few things changed in LXC 3,
> > no? And systemd seems to be a moving target as well.
> > 
> > Also, I work on an automation using Ansible to set up a host which can run
> > unprivileged containers. This will be publicly available once everything
> > works.
> > Cheers,
> > 
> > Lukas
> > 
> > ========================================================================
> > 
> > symptom
> > -------
> > 
> > ``lxc-start -n rproxy -l TRACE -o lxc.log -F``::
> > 
> >    Failed to mount cgroup at /sys/fs/cgroup/systemd: Permission denied
> >    [!!!!!!] Failed to mount API filesystems.
> >    Exiting PID 1...
> > 
> > ``lxc.log``:
> > https://bin.privacytools.io/?28c8377e545ce6a9#9I2a28JuaYf7yHDNIxtxCQxox6LvTrxT4l4scUDgQNc=
> > 
> > host details
> > ------------
> > 
> > * ``cat /etc/debian_version``: 10.0
> > * ``lxc-start --version``: 3.0.3
> > * ``lxc-checkconfig``::
> > 
> >      Kernel configuration not found at /proc/config.gz; searching...
> >      Kernel configuration found at /boot/config-4.19.0-5-amd64
> >      --- Namespaces ---
> >      Namespaces: enabled
> >      Utsname namespace: enabled
> >      Ipc namespace: enabled
> >      Pid namespace: enabled
> >      User namespace: enabled
> >      Network namespace: enabled
> > 
> >      --- Control groups ---
> >      Cgroups: enabled
> > 
> >      Cgroup v1 mount points:
> >      /sys/fs/cgroup/systemd
> >      /sys/fs/cgroup/memory
> >      /sys/fs/cgroup/cpuset
> >      /sys/fs/cgroup/cpu,cpuacct
> >      /sys/fs/cgroup/blkio
> >      /sys/fs/cgroup/net_cls,net_prio
> >      /sys/fs/cgroup/perf_event
> >      /sys/fs/cgroup/rdma
> >      /sys/fs/cgroup/freezer
> >      /sys/fs/cgroup/devices
> >      /sys/fs/cgroup/pids
> > 
> >      Cgroup v2 mount points:
> >      /sys/fs/cgroup/unified
> > 
> >      Cgroup v1 clone_children flag: enabled
> >      Cgroup device: enabled
> >      Cgroup sched: enabled
> >      Cgroup cpu account: enabled
> >      Cgroup memory controller: enabled
> >      Cgroup cpuset: enabled
> > 
> >      --- Misc ---
> >      Veth pair device: enabled, not loaded
> >      Macvlan: enabled, not loaded
> >      Vlan: enabled, not loaded
> >      Bridges: enabled, loaded
> >      Advanced netfilter: enabled, loaded
> >      CONFIG_NF_NAT_IPV4: enabled, loaded
> >      CONFIG_NF_NAT_IPV6: enabled, loaded
> >      CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
> >      CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
> >      CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, not loaded
> >      CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, loaded
> >      FUSE (for use with lxcfs): enabled, loaded
> > 
> >      --- Checkpoint/Restore ---
> >      checkpoint restore: enabled
> >      CONFIG_FHANDLE: enabled
> >      CONFIG_EVENTFD: enabled
> >      CONFIG_EPOLL: enabled
> >      CONFIG_UNIX_DIAG: enabled
> >      CONFIG_INET_DIAG: enabled
> >      CONFIG_PACKET_DIAG: enabled
> >      CONFIG_NETLINK_DIAG: enabled
> >      File capabilities:
> > 
> >      Note : Before booting a new kernel, you can check its configuration
> >      usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig
> > 
> > * ``uname -a``: Linux hive 4.19.0-5-amd64 #1 SMP Debian 4.19.37-3
> >    (2019-05-15) x86_64 GNU/Linux
> > 
> > * ``cat /proc/self/cgroup``::
> > 
> >      11:pids:/user.slice/user-1000.slice/session-4.scope
> >      10:devices:/user.slice
> >      9:freezer:/user/lxc/0
> >      8:rdma:/
> >      7:perf_event:/
> >      6:net_cls,net_prio:/
> >      5:blkio:/user.slice
> >      4:cpu,cpuacct:/user/lxc/0
> >      3:cpuset:/user/lxc/0
> >      2:memory:/user/lxc/0
> >      1:name=systemd:/user/lxc/0
> >      0::/user.slice/user-1000.slice/session-4.scope/user/lxc/0
> > 
> > * ``cat /proc/self/mountinfo``::
> > 
> >    20 25 0:19 / /sys rw,nosuid,nodev,noexec,relatime shared:7 - sysfs sysfs rw
> >    21 25 0:4 / /proc rw,relatime shared:14 - proc proc rw,hidepid=2
> >    22 25 0:6 / /dev rw,nosuid,relatime shared:2 - devtmpfs udev rw,size=6134028k,nr_inodes=1533507,mode=755
> >    23 22 0:20 / /dev/pts rw,nosuid,noexec,relatime shared:3 - devpts devpts rw,gid=5,mode=620,ptmxmode=000
> >    24 25 0:21 / /run rw,nosuid,noexec,relatime shared:5 - tmpfs tmpfs rw,size=1229916k,mode=755
> >    25 0 0:22 / / rw,relatime shared:1 - btrfs /dev/sda4 rw,compress=lzo,space_cache,user_subvol_rm_allowed,subvolid=5,subvol=/
> >    26 20 0:7 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:8 - securityfs securityfs rw
> >    27 22 0:24 / /dev/shm rw,nosuid,nodev shared:4 - tmpfs tmpfs rw
> >    28 24 0:25 / /run/lock rw,nosuid,nodev,noexec,relatime shared:6 - tmpfs tmpfs rw,size=5120k
> >    29 20 0:26 / /sys/fs/cgroup ro,nosuid,nodev,noexec shared:9 - tmpfs tmpfs ro,mode=755
> >    30 29 0:27 / /sys/fs/cgroup/unified rw,nosuid,nodev,noexec,relatime shared:10 - cgroup2 cgroup2 rw
> >    31 29 0:28 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,xattr,name=systemd
> >    32 20 0:29 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:12 - pstore pstore rw
> >    33 20 0:30 / /sys/fs/bpf rw,nosuid,nodev,noexec,relatime shared:13 - bpf bpf rw,mode=700
> >    34 29 0:31 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,memory
> >    35 29 0:32 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,cpuset,clone_children
> >    36 29 0:33 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,cpu,cpuacct
> >    37 29 0:34 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,blkio
> >    38 29 0:35 / /sys/fs/cgroup/net_cls,net_prio rw,nosuid,nodev,noexec,relatime shared:19 - cgroup cgroup rw,net_cls,net_prio
> >    39 29 0:36 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:20 - cgroup cgroup rw,perf_event
> >    40 29 0:37 / /sys/fs/cgroup/rdma rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,rdma
> >    41 29 0:38 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:22 - cgroup cgroup rw,freezer
> >    42 29 0:39 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:23 - cgroup cgroup rw,devices
> >    43 29 0:40 / /sys/fs/cgroup/pids rw,nosuid,nodev,noexec,relatime shared:24 - cgroup cgroup rw,pids
> >    45 22 0:18 / /dev/mqueue rw,relatime shared:25 - mqueue mqueue rw
> >    44 22 0:41 / /dev/hugepages rw,relatime shared:26 - hugetlbfs hugetlbfs rw,pagesize=2M
> >    46 20 0:8 / /sys/kernel/debug rw,relatime shared:27 - debugfs debugfs rw
> >    47 21 0:42 / /proc/sys/fs/binfmt_misc rw,relatime shared:28 - autofs systemd-1 rw,fd=41,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1678
> >    230 25 0:49 / /var/lib/lxcfs rw,nosuid,nodev,relatime shared:122 - fuse.lxcfs lxcfs rw,user_id=0,group_id=0,allow_other
> >    245 20 0:50 / /sys/fs/fuse/connections rw,relatime shared:161 - fusectl fusectl rw
> >    266 24 0:51 / /run/user/1000 rw,nosuid,nodev,relatime shared:169 - tmpfs tmpfs rw,size=1229912k,mode=700,uid=1000,gid=1000
> > 
> > 
> > * ``grep cgfs /etc/pam.d/common-session*``::
> > 
> > 	session optional pam_cgfs.so -c freezer,memory,cpu,cpuset,cpuacct,unified,name=systemd
> > 	session optional pam_cgfs.so -c freezer,memory,cpu,cpuset,cpuacct,unified,name=systemd
> > 
> > container config
> > ----------------
> > 
> > * ``cat rproxy/config``::
> > 
> >      lxc.include = /home/lxc/.config/lxc/common.conf
> >      lxc.uts.name = rproxy
> >      lxc.rootfs.path = btrfs:/home/lxc/rproxy/rootfs
> >      lxc.net.0.link = lxc-br-rproxy
> >      lxc.net.0.ipv6.address = fd00::2/16
> > 
> > * ``cat /home/lxc/.config/lxc/common.conf``::
> > 
> >      lxc.include = /usr/share/lxc/config/common.conf
> >      lxc.include = /usr/share/lxc/config/userns.conf
> >      lxc.include = /etc/lxc/default.conf
> > 
> >      lxc.apparmor.profile = unconfined
> > 
> >      lxc.arch = x86_64
> >      lxc.start.auto = 1
> >      lxc.start.delay = 20
> > 
> >      lxc.net.0.type = veth
> >      lxc.net.0.name = eth0
> >      lxc.net.0.flags = up
> >      lxc.net.0.ipv6.gateway = auto
> > 
> >      lxc.idmap = u 0 165536 65536
> >      lxc.idmap = g 0 165536 65536
> > 
> > * ``/etc/lxc/default.conf``::
> > 
> >      lxc.net.0.type = empty
> >      lxc.apparmor.profile = generated
> >      lxc.apparmor.allow_nesting = 1
> > 
> > * ``cat /usr/share/lxc/config/userns.conf``::
> > 
> >      lxc.cgroup.devices.deny =
> >      lxc.cgroup.devices.allow =
> > 
> >      lxc.cap.drop =
> >      lxc.cap.keep =
> > 
> >      lxc.tty.dir =
> > 
> >      lxc.mount.auto = sys:rw
> > 
> > * ``cat /usr/share/lxc/config/common.conf``::
> > 
> >      # Setup the LXC devices in /dev/lxc/
> >      lxc.tty.dir = lxc
> > 
> >      # Allow for 1024 pseudo terminals
> >      lxc.pty.max = 1024
> > 
> >      # Setup 4 tty devices
> >      lxc.tty.max = 4
> > 
> >      # Drop some harmful capabilities
> >      lxc.cap.drop = mac_admin mac_override sys_time sys_module sys_rawio
> > 
> >      # Ensure hostname is changed on clone
> >      lxc.hook.clone = /usr/share/lxc/hooks/clonehostname
> > 
> >      # CGroup whitelist
> >      lxc.cgroup.devices.deny = a
> >      ## Allow any mknod (but not reading/writing the node)
> >      lxc.cgroup.devices.allow = c *:* m
> >      lxc.cgroup.devices.allow = b *:* m
> >      ## Allow specific devices
> >      ### /dev/null
> >      lxc.cgroup.devices.allow = c 1:3 rwm
> >      ### /dev/zero
> >      lxc.cgroup.devices.allow = c 1:5 rwm
> >      ### /dev/full
> >      lxc.cgroup.devices.allow = c 1:7 rwm
> >      ### /dev/tty
> >      lxc.cgroup.devices.allow = c 5:0 rwm
> >      ### /dev/console
> >      lxc.cgroup.devices.allow = c 5:1 rwm
> >      ### /dev/ptmx
> >      lxc.cgroup.devices.allow = c 5:2 rwm
> >      ### /dev/random
> >      lxc.cgroup.devices.allow = c 1:8 rwm
> >      ### /dev/urandom
> >      lxc.cgroup.devices.allow = c 1:9 rwm
> >      ### /dev/pts/*
> >      lxc.cgroup.devices.allow = c 136:* rwm
> >      ### fuse
> >      lxc.cgroup.devices.allow = c 10:229 rwm
> > 
> >      lxc.mount.auto = cgroup:mixed proc:mixed sys:mixed
> >      lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none bind,optional 0 0
> > 
> >      lxc.seccomp.profile = /usr/share/lxc/config/common.seccomp
> > 
> >      lxc.include = /usr/share/lxc/config/common.conf.d/
> > 
> > * ``grep lxc /etc/sub{g,u}id``::
> > 
> >      /etc/subgid:lxc:165536:65536
> >      /etc/subuid:lxc:165536:65536
> > 
> > * ``umask``: 077
> > 
> > * I also tried this (overkill approach) to make the cgroups writable
> >    (I guess?) without success::
> > 
> >      for x in `find /sys/fs/cgroup -name lxc`; do
> >        echo; echo $x; chgrp -R lxc $x; chmod g+rw $x;
> >      done
> > 
> > 
> > _______________________________________________
> > lxc-users mailing list
> > lxc-users at lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-users
> > 
> 
