[lxc-users] Can't start container after lxd/lxc/lxcfs upgrade
Fajar A. Nugraha
list at fajar.net
Fri Mar 18 22:47:19 UTC 2016
On Sat, Mar 19, 2016 at 1:12 AM, B G <bg85305 at gmail.com> wrote:
> lxc => 2.0.0rc4
> lxd => 2.0.0rc4
> lxcfs => 2.0.0rc6
>
> After the latest upgrade to lxc/lxd tools existing and new containers fail
> to start, failing on the following stage from the container log:
>
> lxc 20160318161829.810 INFO lxc_conf - conf.c:run_script_argv:367 -
> Executing script '/usr/share/lxcfs/lxc.mount.hook' for container
> 'testcontainer-20160311-0918', config section 'lxc'
> lxc 20160318161829.856 ERROR lxc_conf - conf.c:run_buffer:347 - Script
> exited with status 1
> lxc 20160318161829.856 ERROR lxc_conf - conf.c:lxc_setup:3750 - failed to
> run mount hooks for container 'testcontainer-20160311-0918'.
>
> There don't appear to be any logs or debug output from the lxc.mount.hook
> script that I can see that will help further.
I had to add my own debugging lines to the hook to figure out what was wrong.
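A minimal sketch of the kind of debugging lines I mean, assuming the
trace is sent to a log file. The DEBUG_LOG name and the temporary paths
are hypothetical stand-ins so the sketch runs outside LXC; in the real
hook LXC_ROOTFS_MOUNT is already set.

```shell
#!/bin/sh
# Stand-in values so the sketch runs outside LXC (both are assumptions):
# in the real hook LXC_ROOTFS_MOUNT is already set, and the log path is
# whatever location you choose.
LXC_ROOTFS_MOUNT=${LXC_ROOTFS_MOUNT:-$(mktemp -d)}
DEBUG_LOG=${DEBUG_LOG:-$(mktemp)}
mkdir -p "$LXC_ROOTFS_MOUNT/sys/fs/cgroup"

(
    exec 2>>"$DEBUG_LOG"   # send the trace to a file instead of losing it
    set -x                 # print each command, prefixed with "+", before running it
    ls -la "$LXC_ROOTFS_MOUNT/sys/fs/cgroup" >&2
)

# Show what was captured.
trace=$(cat "$DEBUG_LOG")
echo "$trace"
```

The subshell keeps the redirection and tracing local, so the rest of the
script behaves as before.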
>
> LXC, LXD and LXCFS services are reported running by systemd.
>
> Any help greatly appreciated!
Somewhere close to the end of lxc.mount.hook I added a debugging line
to see what the container's cgroup directory looks like. It shows this:
+ ls -la /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup
total 0
drwxr-xr-x 12 root root 240 Mar 18 16:25 .
drwxr-xr-x 7 root root 0 Mar 18 16:15 ..
drwxr-xr-x 3 root root 60 Mar 18 16:25 blkio
drwxr-xr-x 3 root root 60 Mar 18 16:25 cpu
drwxr-xr-x 3 root root 60 Mar 18 16:25 cpuset
drwxr-xr-x 3 root root 60 Mar 18 16:25 devices
drwxr-xr-x 3 root root 60 Mar 18 16:25 freezer
drwxr-xr-x 3 root root 60 Mar 18 16:25 hugetlb
drwxr-xr-x 3 root root 60 Mar 18 16:25 memory
drwxr-xr-x 3 root root 60 Mar 18 16:25 net_cls
drwxr-xr-x 3 root root 60 Mar 18 16:25 perf_event
drwxr-xr-x 3 root root 60 Mar 18 16:25 systemd
That's probably where the bug lies. cpu and net_cls already exist as
directories of their own. However, lxc.mount.hook tries to create a
symlink named cpu that points to cpu,cpuacct (which will be created and
bind-mounted later). Since that directory already exists, it ends up
trying to create the symlink
/usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/cpu/cpu instead of
/usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/cpu, which fails.
I didn't see a relevant change in lxcfs (rc4 -> rc6) to the "create
symlink" behavior, so the bug is probably somewhere in lxc (?), in
whatever creates the "cpu" and "net_cls" cgroup directories inside the
container.
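To illustrate the general ln behavior behind this, here is a minimal
sketch that runs in a temporary directory rather than a container. The
paths are hypothetical stand-ins for the container's sys/fs/cgroup. On a
regular filesystem the command succeeds but puts the link in the wrong
place; in this sketch the misplaced link takes the name of the link
target, and the exact name inside the real hook depends on how $DEST is
expanded there, which the sketch does not try to reproduce.

```shell
#!/bin/sh
# Stand-in for ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup (hypothetical temp path).
demo=$(mktemp -d)
mkdir -p "$demo/cgroup/cpu" "$demo/cgroup/cpu,cpuacct"

# Same shape as the hook's command: intended link name "cpu", target
# "cpu,cpuacct". Because "cpu" already exists as a directory, ln treats
# it as the directory to create the link in, not as the link name.
ln -s "cpu,cpuacct" "$demo/cgroup/cpu"

# "cpu" is still a plain directory...
if [ -L "$demo/cgroup/cpu" ]; then cpu_kind=symlink; else cpu_kind=directory; fi
# ...and the new link ended up inside it.
stray=$(ls -A "$demo/cgroup/cpu")

echo "cgroup/cpu is a $cpu_kind containing: $stray"
rm -rf "$demo"
```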
My workaround:
# diff -Naru /usr/share/lxcfs/lxc.mount.hook.orig /usr/share/lxcfs/lxc.mount.hook
--- /usr/share/lxcfs/lxc.mount.hook.orig 2016-03-18 07:32:48.000000000 +0700
+++ /usr/share/lxcfs/lxc.mount.hook 2016-03-18 16:26:33.633345802 +0700
@@ -51,7 +51,13 @@
         for single in $arr
         do
             if [ ! -L ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single ]; then
-                ln -s $DEST ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
+                if [ -d ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single ]; then
+                    # a cgroup is already mounted there. Just bind-mount ours
+                    mount -n --bind $entry ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
+                else
+                    # I can simply create a symlink
+                    ln -s $DEST ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
+                fi
             fi
         done
     fi
The comments speak for themselves. That at least allows the container
to start while waiting for the devs to come up with a proper fix. The
container ended up with a cgroup directory like this:
# ls -la /sys/fs/cgroup/
total 0
drwxr-xr-x 14 root root 320 Mar 18 16:43 .
drwxr-xr-x 7 root root 0 Mar 18 16:43 ..
drwxr-xr-x 3 root root 60 Mar 18 16:43 blkio
drwxr-xr-x 2 root root 0 Mar 19 05:39 cpu
drwxr-xr-x 2 root root 0 Mar 19 05:39 cpu,cpuacct
lrwxrwxrwx 1 root root 11 Mar 18 16:43 cpuacct -> cpu,cpuacct
drwxr-xr-x 3 root root 60 Mar 18 16:43 cpuset
drwxr-xr-x 3 root root 60 Mar 18 16:43 devices
drwxr-xr-x 3 root root 60 Mar 18 16:43 freezer
drwxr-xr-x 3 root root 60 Mar 18 16:43 hugetlb
drwxr-xr-x 3 root root 60 Mar 18 16:43 memory
drwxr-xr-x 2 root root 0 Mar 19 05:39 net_cls
drwxr-xr-x 2 root root 0 Mar 19 05:39 net_cls,net_prio
lrwxrwxrwx 1 root root 16 Mar 18 16:43 net_prio -> net_cls,net_prio
drwxr-xr-x 3 root root 60 Mar 18 16:43 perf_event
drwxr-xr-x 3 root root 60 Mar 18 16:43 systemd
Note how "cpu" is a directory, but "cpuacct" is a symlink. The same
goes for net_cls and net_prio.
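The per-name decision made by the workaround can be sketched as a small
function. link_action is a hypothetical helper, not part of lxcfs, and
it only reports the choice instead of mounting anything, so it runs
without root in a temporary directory.

```shell
#!/bin/sh
# Hypothetical helper: report what the patched hook would do for one path.
link_action() {
    if [ -L "$1" ]; then
        echo skip        # already a symlink: the hook leaves it alone
    elif [ -d "$1" ]; then
        echo bind        # existing directory: bind-mount over it
    else
        echo symlink     # nothing there: a plain symlink is enough
    fi
}

demo=$(mktemp -d)                       # stand-in for the cgroup directory
mkdir "$demo/cpu" "$demo/cpu,cpuacct"   # "cpu" already exists as a directory

cpu_action=$(link_action "$demo/cpu")
cpuacct_action=$(link_action "$demo/cpuacct")

# Once the symlink exists, a second pass leaves it untouched.
ln -s "cpu,cpuacct" "$demo/cpuacct"
rerun_action=$(link_action "$demo/cpuacct")

echo "cpu: $cpu_action, cpuacct: $cpuacct_action, cpuacct on a second run: $rerun_action"
rm -rf "$demo"
```

That matches the listing: a name that already existed as a directory
stays a directory, and a name that did not exist becomes a symlink.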
--
Fajar