[lxc-users] Can't start container after lxd/lxc/lxcfs upgrade
Stéphane Graber
stgraber at ubuntu.com
Sat Mar 19 00:07:30 UTC 2016
On Sat, Mar 19, 2016 at 05:47:19AM +0700, Fajar A. Nugraha wrote:
> On Sat, Mar 19, 2016 at 1:12 AM, B G <bg85305 at gmail.com> wrote:
> > lxc => 2.0.0rc4
> > lxd => 2.0.0rc4
> > lxcfs => 2.0.0rc6
> >
> > After the latest upgrade of the lxc/lxd tools, both existing and new
> > containers fail to start. The container log shows the failure at this stage:
> >
> > lxc 20160318161829.810 INFO lxc_conf - conf.c:run_script_argv:367 -
> > Executing script '/usr/share/lxcfs/lxc.mount.hook' for container
> > 'testcontainer-20160311-0918', config section 'lxc'
> > lxc 20160318161829.856 ERROR lxc_conf - conf.c:run_buffer:347 - Script
> > exited with status 1
> > lxc 20160318161829.856 ERROR lxc_conf - conf.c:lxc_setup:3750 - failed to
> > run mount hooks for container 'testcontainer-20160311-0918'.
> >
> > I can't find any logs or debug output from the lxc.mount.hook script
> > that would help further.
>
> I had to add my own debugging lines to figure out what's wrong.
>
>
> >
> > LXC, LXD and LXCFS services are reported running by systemd.
> >
> > Any help greatly appreciated!
>
>
> Somewhere close to the end of lxc.mount.hook I set up a debugging line
> to see what the container's cgroup directory looks like. It shows this:
>
> + ls -la /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup
> total 0
> drwxr-xr-x 12 root root 240 Mar 18 16:25 .
> drwxr-xr-x 7 root root 0 Mar 18 16:15 ..
> drwxr-xr-x 3 root root 60 Mar 18 16:25 blkio
> drwxr-xr-x 3 root root 60 Mar 18 16:25 cpu
> drwxr-xr-x 3 root root 60 Mar 18 16:25 cpuset
> drwxr-xr-x 3 root root 60 Mar 18 16:25 devices
> drwxr-xr-x 3 root root 60 Mar 18 16:25 freezer
> drwxr-xr-x 3 root root 60 Mar 18 16:25 hugetlb
> drwxr-xr-x 3 root root 60 Mar 18 16:25 memory
> drwxr-xr-x 3 root root 60 Mar 18 16:25 net_cls
> drwxr-xr-x 3 root root 60 Mar 18 16:25 perf_event
> drwxr-xr-x 3 root root 60 Mar 18 16:25 systemd
Can you also extract /proc/self/mountinfo at that time please?
It does look like the change that added cgroup and cgroup-full
lxc.mount.auto support to cgfsng in rc11 is causing some trouble.
I'll need to set up a machine where I can reproduce this, since none of
my systems run into it, presumably because they all have cgns kernels.
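
One way to capture that is sketched below: a small helper that copies the
mount table seen by the hook into a file. The helper name dump_mountinfo and
the default output path are hypothetical and not part of lxcfs; the sketch
also assumes /tmp is writable from inside the hook.

```shell
#!/bin/sh
# Hypothetical debugging helper, not part of the shipped lxc.mount.hook.
# $1: file to write to (default is an assumed scratch path under /tmp)
# $2: mount table to read (default: the calling process's own mountinfo)
dump_mountinfo() {
    out="${1:-/tmp/lxc-mount-hook.mountinfo}"
    src="${2:-/proc/self/mountinfo}"
    # /proc/self/mountinfo lists every mount visible in the current
    # mount namespace, one mount per line.
    cat "$src" > "$out"
}
```

Calling dump_mountinfo right next to the existing "ls -la" debugging line
should record the mounts at the same point in the hook.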
>
>
> That's probably where the bug lies. cpu and net_cls are already their
> own directories. However, lxc.mount.hook tries to create a symlink named
> cpu that points to cpu,cpuacct (which will be created and bind-mounted later).
> Since that directory already exists, ln ended up trying to create the
> symlink inside it, at
> /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/cpu/cpu,cpuacct, instead of
> at /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/cpu, and that fails.
>
> I didn't see a relevant change to lxcfs (rc4->rc6) on the "create
> symlink" behavior, so the bug is probably somewhere in lxc (?) that
> creates "cpu" and "net_cls" cgroup inside the container.
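
The ln behaviour described above can be reproduced in a scratch directory.
This is only an illustration with made-up paths under a temporary directory;
it does not touch real cgroups. On an ordinary filesystem the second ln
succeeds and leaves a stray link inside the directory, so the failure seen in
the hook presumably depends on where that directory lives.

```shell
#!/bin/sh
# Scratch-directory demonstration; the names only mirror the cgroup layout.
demo=$(mktemp -d)

# Case 1: no "cpu" entry exists yet, so the link is created at cpu itself.
mkdir -p "$demo/fresh/cpu,cpuacct"
ln -s "cpu,cpuacct" "$demo/fresh/cpu"

# Case 2: "cpu" already exists as a directory, so ln treats it as the
# destination directory and creates the link inside it instead.
mkdir -p "$demo/existing/cpu,cpuacct" "$demo/existing/cpu"
ln -s "cpu,cpuacct" "$demo/existing/cpu"

# Show both results side by side.
ls -l "$demo/fresh" "$demo/existing/cpu"
```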
>
> My workaround:
>
> # diff -Naru /usr/share/lxcfs/lxc.mount.hook.orig /usr/share/lxcfs/lxc.mount.hook
> --- /usr/share/lxcfs/lxc.mount.hook.orig 2016-03-18 07:32:48.000000000 +0700
> +++ /usr/share/lxcfs/lxc.mount.hook 2016-03-18 16:26:33.633345802 +0700
> @@ -51,7 +51,13 @@
>  for single in $arr
>  do
>      if [ ! -L ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single ]; then
> -        ln -s $DEST ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
> +        if [ -d ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single ]; then
> +            # a cgroup is already mounted there. Just bind-mount ours
> +            mount -n --bind $entry ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
> +        else
> +            # I can simply create a symlink
> +            ln -s $DEST ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
> +        fi
>      fi
>  done
>  fi
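
The branch logic of that patch can be tried without starting a container.
The sketch below only reports which action the patched hook would take for a
given path; it does not mount or link anything. The function name
choose_action is hypothetical and does not appear in lxc.mount.hook.

```shell
#!/bin/sh
# Hypothetical dry-run helper mirroring the three branches of the patch.
# $1: the path the hook would use, e.g. .../sys/fs/cgroup/cpu
choose_action() {
    path="$1"
    if [ -L "$path" ]; then
        echo skip       # a symlink is already there; the hook leaves it alone
    elif [ -d "$path" ]; then
        echo bind       # existing directory; the patch bind-mounts over it
    else
        echo symlink    # nothing there; the original ln -s path is taken
    fi
}
```

Running it against each entry of the container's sys/fs/cgroup directory
gives a quick view of which controllers would be bind-mounted and which would
be symlinked.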
>
>
> The comments speak for themselves. That at least allows the container
> to start while waiting for the devs to come up with a proper fix. The
> container ended up with a cgroup directory like this:
>
> # ls -la /sys/fs/cgroup/
> total 0
> drwxr-xr-x 14 root root 320 Mar 18 16:43 .
> drwxr-xr-x 7 root root 0 Mar 18 16:43 ..
> drwxr-xr-x 3 root root 60 Mar 18 16:43 blkio
> drwxr-xr-x 2 root root 0 Mar 19 05:39 cpu
> drwxr-xr-x 2 root root 0 Mar 19 05:39 cpu,cpuacct
> lrwxrwxrwx 1 root root 11 Mar 18 16:43 cpuacct -> cpu,cpuacct
> drwxr-xr-x 3 root root 60 Mar 18 16:43 cpuset
> drwxr-xr-x 3 root root 60 Mar 18 16:43 devices
> drwxr-xr-x 3 root root 60 Mar 18 16:43 freezer
> drwxr-xr-x 3 root root 60 Mar 18 16:43 hugetlb
> drwxr-xr-x 3 root root 60 Mar 18 16:43 memory
> drwxr-xr-x 2 root root 0 Mar 19 05:39 net_cls
> drwxr-xr-x 2 root root 0 Mar 19 05:39 net_cls,net_prio
> lrwxrwxrwx 1 root root 16 Mar 18 16:43 net_prio -> net_cls,net_prio
> drwxr-xr-x 3 root root 60 Mar 18 16:43 perf_event
> drwxr-xr-x 3 root root 60 Mar 18 16:43 systemd
>
> Note how "cpu" is a directory, but "cpuacct" is a symlink. The same
> goes for net_cls and net_prio.
>
> --
> Fajar
> _______________________________________________
> lxc-users mailing list
> lxc-users at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-users
--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com