[lxc-users] Minor issues with unprivileged systemd containers on Debian 8

Mathias Gibbens mathias at calenhad.com
Thu May 28 22:32:04 UTC 2015


Hello all,

  I've been working on moving some of my unprivileged lxc containers to
a new Debian 8 ("jessie") host. For the most part, they are working
really well. Thanks to everyone who has helped make this possible!
However, I've found a few minor issues that I'm hoping I could get some
feedback on and figure out how to fix them.

  Here are the versions of the software I'm running: cgmanager 0.37,
lxcfs 0.9, lxc 1.1.2, systemd 215-17 (Debian package), Linux 4.0.4
x86_64 with the grsec patch (with the various knobs tuned to allow
unprivileged containers; if anyone's interested I can share the
details).

  All the containers are unprivileged Debian 8 images created through
the download template that ships with lxc.

  Throughout this email, the container name is "test", the host name is
"narya", and the user account creating unprivileged containers is "lxc"
accessed via SSH. The cgroups.sh attached script is run by the lxc user
to setup cgroups before running lxc-* commands.


Issue 1: When starting a container, systemd consumes 100% CPU for ~25
seconds

  When I start an unprivileged container, it hangs for ~25 seconds
immediately after the output "Set hostname to <test>." appears. On the
host, I can see via `top` that the container's systemd process is
consuming 100% of the CPU. After 25 seconds, the container continues to
boot normally. I have attached the output of `journalctl` obtained from
the container immediately after boot (attachment journalctl.txt).

  I don't think this is the systemd-journald 100% CPU issue, since my
understanding is that this has been fixed. Also, the only time I've seen
systemd consume 100% in a container is right at boot.

Issue 2: When shutting down a container, `lxc-stop` hangs for ~60
seconds

  When I stop an unprivileged container, it shuts down promptly and I
see the final line of output "[  OK  ] Reached target Shutdown."
However, the command `lxc-stop` then hangs for about 60 seconds before
finally returning to the command line. I'm not sure why this is
happening. It's not really a big issue, unless I'm trying to do a
`lxc-stop && lxc-start` which causes an extra 60 second delay in
rebooting the container.

Issue 3: Within container, seeing "Failed at step CGROUP spawning" for
services

  Occasionally I'm seeing the following sorts of errors in my
containers' logs:

    systemd[27337]: Failed at step CGROUP spawning /etc/init.d/apache2:
No such file or directory
    systemd[27339]: Failed at step CGROUP spawning /etc/init.d/exim4: No
such file or directory

  I'm not sure what's causing this, other than it seems (from my limited
testing) to be occurring more frequently/only when the lxc user's
environment/cgroup mapping goes away when the lxc user logs out. If I
leave the lxc user logged in after starting my containers the above
errors don't seem to occur. However, once lxc logs out they begin to
appear.

  Maybe this is intended behavior? I'm still learning about systemd and
cgroups so maybe someone else will be able to better understand the
error.

Issue 4: On host, seeing cgmanager and lxcfs errors

  Finally, I sometimes see cgmanager and lxcfs errors in the host's logs
(attachment cgmanager-lxcfs.txt). In this example, I restarted the sshd
service running in a container. ssh continues to run just fine in the
container, so I'm not sure if these errors are just supposed to be
informative or if there's really some error occurring.



  Thanks for any help,
Mathias
-------------- next part --------------
root at test:/# journalctl 
Failed to read /proc/cmdline. Ignoring: No such file or directory
-- Logs begin at Thu 2015-05-28 21:38:05 UTC, end at Thu 2015-05-28 21:38:09 UTC. --
May 28 21:38:05 test systemd-journal[100]: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.6G available → current limit 1.5G).
May 28 21:38:05 test systemd-journal[100]: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.6G available → current limit 1.5G).
May 28 21:38:05 test systemd-journal[100]: Journal started
May 28 21:38:05 test systemd[1]: Starting Slices.
May 28 21:38:05 test systemd[1]: Reached target Slices.
May 28 21:38:05 test systemd[1]: Starting Remount Root and Kernel File Systems...
May 28 21:38:05 test mount[72]: mount: permission denied
May 28 21:38:05 test mount[82]: mount: permission denied
May 28 21:38:05 test systemd[1]: Mounted POSIX Message Queue File System.
May 28 21:38:05 test systemd[1]: Started Copy rules generated while the root was ro.
May 28 21:38:05 test systemd[1]: dev-hugepages.mount mount process exited, code=exited status=32
May 28 21:38:05 test systemd[1]: Failed to mount Huge Pages File System.
May 28 21:38:05 test systemd[1]: Unit dev-hugepages.mount entered failed state.
May 28 21:38:05 test systemd[1]: sys-kernel-debug.mount mount process exited, code=exited status=32
May 28 21:38:05 test systemd[1]: Failed to mount Debug File System.
May 28 21:38:05 test systemd[1]: Unit sys-kernel-debug.mount entered failed state.
May 28 21:38:05 test systemd[1]: Started Remount Root and Kernel File Systems.
May 28 21:38:05 test systemd[1]: Started Various fixups to make systemd work better on Debian.
May 28 21:38:05 test systemd[1]: Starting Load/Save Random Seed...
May 28 21:38:05 test systemd[1]: Starting Local File Systems (Pre).
May 28 21:38:05 test systemd[1]: Reached target Local File Systems (Pre).
May 28 21:38:05 test systemd[1]: Starting Local File Systems.
May 28 21:38:05 test systemd[1]: Reached target Local File Systems.
May 28 21:38:05 test systemd[1]: Starting Remote File Systems.
May 28 21:38:05 test systemd[1]: Reached target Remote File Systems.
May 28 21:38:05 test systemd[1]: Starting Trigger Flushing of Journal to Persistent Storage...
May 28 21:38:06 test systemd[1]: Starting Create Volatile Files and Directories...
May 28 21:38:06 test systemd[1]: Started Load/Save Random Seed.
May 28 21:38:06 test systemd[1]: Started Create Volatile Files and Directories.
May 28 21:38:06 test systemd[1]: Starting Update UTMP about System Boot/Shutdown...
May 28 21:38:06 test systemd[1]: Starting LSB: Raise network interfaces....
May 28 21:38:06 test systemd-journal[100]: Runtime journal is using 8.0M (max allowed 1.5G, trying to leave 2.3G free of 15.6G available → current limit 1.5G).
May 28 21:38:06 test dhclient[266]: Internet Systems Consortium DHCP Client 4.3.1
May 28 21:38:06 test dhclient[266]: Copyright 2004-2014 Internet Systems Consortium.
May 28 21:38:06 test dhclient[266]: All rights reserved.
May 28 21:38:06 test dhclient[266]: For info, please visit https://www.isc.org/software/dhcp/
May 28 21:38:06 test dhclient[266]: 
May 28 21:38:06 test networking[228]: Configuring network interfaces...Internet Systems Consortium DHCP Client 4.3.1
May 28 21:38:06 test networking[228]: Copyright 2004-2014 Internet Systems Consortium.
May 28 21:38:06 test networking[228]: All rights reserved.
May 28 21:38:06 test networking[228]: For info, please visit https://www.isc.org/software/dhcp/
May 28 21:38:06 test dhclient[266]: Listening on LPF/eth0/00:16:3e:0f:38:6e
May 28 21:38:06 test dhclient[266]: Sending on   LPF/eth0/00:16:3e:0f:38:6e
May 28 21:38:06 test dhclient[266]: Sending on   Socket/fallback
May 28 21:38:06 test dhclient[266]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
May 28 21:38:06 test networking[228]: Listening on LPF/eth0/00:16:3e:0f:38:6e
May 28 21:38:06 test networking[228]: Sending on   LPF/eth0/00:16:3e:0f:38:6e
May 28 21:38:06 test networking[228]: Sending on   Socket/fallback
May 28 21:38:06 test networking[228]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
May 28 21:38:06 test systemd[1]: Started Trigger Flushing of Journal to Persistent Storage.
May 28 21:38:06 test systemd[1]: Started Update UTMP about System Boot/Shutdown.
May 28 21:38:07 test dhclient[266]: DHCPREQUEST on eth0 to 255.255.255.255 port 67
May 28 21:38:07 test dhclient[266]: DHCPOFFER from 172.20.0.1
May 28 21:38:07 test dhclient[266]: DHCPACK from 172.20.0.1
May 28 21:38:07 test networking[228]: DHCPREQUEST on eth0 to 255.255.255.255 port 67
May 28 21:38:07 test networking[228]: DHCPOFFER from 172.20.0.1
May 28 21:38:07 test networking[228]: DHCPACK from 172.20.0.1
May 28 21:38:07 test dhclient[266]: bound to 172.20.0.223 -- renewal in 3153 seconds.
May 28 21:38:07 test networking[228]: bound to 172.20.0.223 -- renewal in 3153 seconds.
May 28 21:38:07 test networking[228]: done.
May 28 21:38:07 test systemd[1]: Started LSB: Raise network interfaces..
May 28 21:38:07 test systemd[1]: Starting Network.
May 28 21:38:07 test systemd[1]: Reached target Network.
May 28 21:38:07 test systemd[1]: Starting System Initialization.
May 28 21:38:07 test systemd[1]: Reached target System Initialization.
May 28 21:38:07 test systemd[1]: Starting Daily Cleanup of Temporary Directories.
May 28 21:38:07 test systemd[1]: Started Daily Cleanup of Temporary Directories.
May 28 21:38:07 test systemd[1]: Starting Timers.
May 28 21:38:07 test systemd[1]: Reached target Timers.
May 28 21:38:07 test systemd[1]: Starting Basic System.
May 28 21:38:07 test systemd[1]: Reached target Basic System.
May 28 21:38:07 test systemd[1]: Starting /etc/rc.local Compatibility...
May 28 21:38:07 test systemd[1]: Started Login Service.
May 28 21:38:07 test systemd[1]: Starting getty on tty2-tty4 if dbus and logind are not available...
May 28 21:38:07 test systemd[1]: Starting Permit User Sessions...
May 28 21:38:08 test systemd[1]: Started /etc/rc.local Compatibility.
May 28 21:38:08 test systemd[1]: Started Permit User Sessions.
May 28 21:38:08 test systemd[1]: Starting Cleanup of Temporary Directories...
May 28 21:38:08 test systemd[1]: Starting Getty on tty3...
May 28 21:38:08 test systemd[1]: Started Getty on tty3.
May 28 21:38:08 test systemd[1]: Starting Getty on tty2...
May 28 21:38:08 test systemd[1]: Started Getty on tty2.
May 28 21:38:08 test systemd[1]: Starting Getty on tty1...
May 28 21:38:08 test systemd[1]: Started Getty on tty1.
May 28 21:38:08 test systemd[1]: Starting Getty on tty4...
May 28 21:38:08 test systemd[1]: Started Getty on tty4.
May 28 21:38:08 test systemd[1]: Starting Container Getty on /dev/pts/3...
May 28 21:38:08 test systemd[1]: Started Container Getty on /dev/pts/3.
May 28 21:38:08 test systemd[1]: Starting Container Getty on /dev/pts/2...
May 28 21:38:08 test systemd[1]: Started Container Getty on /dev/pts/2.
May 28 21:38:08 test systemd[1]: Starting Container Getty on /dev/pts/1...
May 28 21:38:08 test systemd[1]: Started Container Getty on /dev/pts/1.
May 28 21:38:08 test systemd[1]: Starting Container Getty on /dev/pts/0...
May 28 21:38:08 test systemd[1]: Started Container Getty on /dev/pts/0.
May 28 21:38:08 test systemd[1]: Starting Console Getty...
May 28 21:38:08 test systemd[1]: Started Console Getty.
May 28 21:38:08 test systemd[1]: Started Cleanup of Temporary Directories.
May 28 21:38:08 test systemd[1]: Started Getty on tty2.
May 28 21:38:08 test systemd[1]: Started Getty on tty3.
May 28 21:38:08 test systemd[1]: Started Getty on tty4.
May 28 21:38:08 test systemd[1]: Started getty on tty2-tty4 if dbus and logind are not available.
May 28 21:38:08 test systemd[1]: Starting Login Prompts.
May 28 21:38:08 test systemd[1]: Reached target Login Prompts.
May 28 21:38:08 test systemd[1]: Starting Multi-User System.
May 28 21:38:08 test systemd[1]: Reached target Multi-User System.
May 28 21:38:08 test systemd[1]: Starting Update UTMP about System Runlevel Changes...
May 28 21:38:09 test systemd[1]: Started Update UTMP about System Runlevel Changes.
May 28 21:38:09 test systemd[1]: Startup finished in 29.472s.
May 28 21:38:09 test agetty[393]: /dev/pts/1: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[393]: /dev/pts/1: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[393]: /dev/pts/1: cannot set process group: Inappropriate ioctl for device
May 28 21:38:09 test agetty[399]: /dev/pts/0: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[399]: /dev/pts/0: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[399]: /dev/pts/0: cannot set process group: Inappropriate ioctl for device
May 28 21:38:09 test agetty[387]: /dev/pts/2: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[387]: /dev/pts/2: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[387]: /dev/pts/2: cannot set process group: Inappropriate ioctl for device
May 28 21:38:09 test agetty[375]: /dev/tty4: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[375]: /dev/tty4: cannot get controlling tty: Operation not permitted
May 28 21:38:09 test agetty[375]: /dev/tty4: cannot set process group: Inappropriate ioctl for device
-------------- next part --------------
May 28 20:41:48 narya cgmanager[957]: cgmanager:do_remove_main: Failed to remove /run/cgmanager/fs/none,name=systemd/lxc/lxc/test/system.slice/ssh.service: Device or resource busy
May 28 20:41:48 narya lxcfs[958]: call to remove (name=systemd:lxc/lxc/test/system.slice/ssh.service) failed: invalid request
May 28 20:43:35 narya lxcfs[958]: send_creds: failed at sendmsg: No such process
May 28 20:43:37 narya lxcfs[958]: send_creds: Error getting reply from server over socketpair
May 28 20:43:37 narya lxcfs[958]: do_read_pids: failed to ask child to exit: No such process
May 28 20:43:37 narya lxcfs[958]: Failed to select for scm_cred: Resource temporarily unavailable
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cgroups.sh
Type: application/x-shellscript
Size: 358 bytes
Desc: not available
URL: <http://lists.linuxcontainers.org/pipermail/lxc-users/attachments/20150528/5800c94c/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: This is a digitally signed message part
URL: <http://lists.linuxcontainers.org/pipermail/lxc-users/attachments/20150528/5800c94c/attachment.sig>


More information about the lxc-users mailing list