[lxc-users] Can't start container after lxd/lxc/lxcfs upgrade

Stéphane Graber stgraber at ubuntu.com
Sat Mar 19 00:07:30 UTC 2016


On Sat, Mar 19, 2016 at 05:47:19AM +0700, Fajar A. Nugraha wrote:
> On Sat, Mar 19, 2016 at 1:12 AM, B G <bg85305 at gmail.com> wrote:
> > lxc => 2.0.0rc4
> > lxd => 2.0.0rc4
> > lxcfs => 2.0.0rc6
> >
> > After the latest upgrade to lxc/lxd tools existing and new containers fail
> > to start, failing on the following stage from the container log:
> >
> > lxc 20160318161829.810 INFO     lxc_conf - conf.c:run_script_argv:367 - Executing script '/usr/share/lxcfs/lxc.mount.hook' for container 'testcontainer-20160311-0918', config section 'lxc'
> > lxc 20160318161829.856 ERROR    lxc_conf - conf.c:run_buffer:347 - Script exited with status 1
> > lxc 20160318161829.856 ERROR    lxc_conf - conf.c:lxc_setup:3750 - failed to run mount hooks for container 'testcontainer-20160311-0918'.
> >
> > There don't appear to be any logs or debug output from the lxc.mount.hook
> > script that I can see that will help further.
> 
> I had to add my own debugging lines to figure out what's wrong.
> 
> 
> >
> > LXC, LXD and LXCFS services are reported running by systemd.
> >
> > Any help greatly appreciated!
> 
> 
> Somewhere close to the end of lxc.mount.hook I set up a debugging line
> to see what the container's cgroup tree looks like. It shows this:
> 
> + ls -la /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup
> total 0
> drwxr-xr-x 12 root root 240 Mar 18 16:25 .
> drwxr-xr-x  7 root root   0 Mar 18 16:15 ..
> drwxr-xr-x  3 root root  60 Mar 18 16:25 blkio
> drwxr-xr-x  3 root root  60 Mar 18 16:25 cpu
> drwxr-xr-x  3 root root  60 Mar 18 16:25 cpuset
> drwxr-xr-x  3 root root  60 Mar 18 16:25 devices
> drwxr-xr-x  3 root root  60 Mar 18 16:25 freezer
> drwxr-xr-x  3 root root  60 Mar 18 16:25 hugetlb
> drwxr-xr-x  3 root root  60 Mar 18 16:25 memory
> drwxr-xr-x  3 root root  60 Mar 18 16:25 net_cls
> drwxr-xr-x  3 root root  60 Mar 18 16:25 perf_event
> drwxr-xr-x  3 root root  60 Mar 18 16:25 systemd

Can you also extract /proc/self/mountinfo at that time please?
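
If it helps, something along these lines near the end of the hook would
capture it. This is only a sketch: dump_mountinfo and the default output
path are names made up for illustration, not anything shipped in lxcfs.

```shell
#!/bin/sh
# Sketch only: dump_mountinfo and its default log path are hypothetical,
# not part of the shipped lxc.mount.hook.
dump_mountinfo() {
    # $1: file to write to
    # $2: source to read (defaults to the calling process's mountinfo)
    out=${1:-/tmp/lxc.mount.hook.mountinfo}
    src=${2:-/proc/self/mountinfo}
    cat "$src" > "$out"
}
```

Calling dump_mountinfo with no arguments from inside the hook should
record the mount table as the hook itself sees it, which is the view I
am after.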

It indeed looks like the change to add cgroup and cgroup-full
lxc.mount.auto support into cgfsng with rc11 is causing some trouble.

I'll need to set up a machine where I can reproduce this, as none of my
systems are running into it, presumably because they all have cgns
kernels.

> 
> 
> That's probably where the bug lies. cpu and net_cls are already their
> own directories. However, lxc.mount.hook tries to create a symlink named
> cpu pointing to cpu,cpuacct (which will be created and bind-mounted later).
> Since that directory already exists, it ends up trying to create the
> /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/cpu/cpu symlink instead of
> /usr/lib/x86_64-linux-gnu/lxc/sys/fs/cgroup/cpu, which fails.
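
For anyone following along, the ln behaviour described here is easy to
see in a scratch directory. A minimal sketch, assuming nothing about the
real hook beyond the ln call quoted further down; demo_ln_into_dir is a
hypothetical helper name:

```shell
#!/bin/sh
# Sketch: show that "ln -s" treats an existing directory as the link's
# parent rather than as the link name. All paths here are throwaway.
demo_ln_into_dir() {
    root=$1
    mkdir -p "$root/cpu"              # stands in for the pre-created cgroup dir
    ln -s "cpu,cpuacct" "$root/cpu"   # intended result: $root/cpu -> cpu,cpuacct
    # What exists afterwards is a link *inside* the directory instead.
    ls -A "$root/cpu"
}
```

In a scratch directory the link lands inside cpu/ and is named after its
target; per the report above, the equivalent call in the hook fails
outright.
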
> 
> I didn't see a relevant change to lxcfs (rc4->rc6) on the "create
> symlink" behavior, so the bug is probably somewhere in lxc (?) that
> creates "cpu" and "net_cls" cgroup inside the container.
> 
> My workaround:
> 
> # diff -Naru /usr/share/lxcfs/lxc.mount.hook.orig /usr/share/lxcfs/lxc.mount.hook
> --- /usr/share/lxcfs/lxc.mount.hook.orig        2016-03-18 07:32:48.000000000 +0700
> +++ /usr/share/lxcfs/lxc.mount.hook     2016-03-18 16:26:33.633345802 +0700
> @@ -51,7 +51,13 @@
>                  for single in $arr
>                  do
>                      if [ ! -L ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single ]; then
> -                        ln -s $DEST ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
> +                        if [ -d ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single ]; then
> +                            # a cgroup is already mounted there. Just bind-mount ours
> +                            mount -n --bind $entry ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
> +                        else
> +                            # I can simply create a symlink
> +                            ln -s $DEST ${LXC_ROOTFS_MOUNT}/sys/fs/cgroup/$single
> +                        fi
>                      fi
>                  done
>              fi
> 
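
To make the decision in that patch easier to test outside a container,
here is the same logic pulled out into a function. This is a sketch, not
a proposed fix: link_or_bind and the DRY_RUN switch are hypothetical
names, and the real hook inlines this inside its loop.

```shell
#!/bin/sh
# Sketch of the symlink-or-bind-mount decision from the quoted patch.
# link_or_bind and DRY_RUN are hypothetical; they exist only so the
# logic can be exercised without root.
link_or_bind() {
    entry=$1    # host-side cgroup path, e.g. /sys/fs/cgroup/cpu,cpuacct
    target=$2   # path under the container rootfs, e.g. .../sys/fs/cgroup/cpu
    dest=$(basename "$entry")

    if [ -L "$target" ]; then
        # A symlink is already in place: nothing to do.
        echo "skip"
    elif [ -d "$target" ]; then
        # The controller directory already exists: bind-mount over it.
        [ -n "${DRY_RUN:-}" ] || mount -n --bind "$entry" "$target" || return 1
        echo "bind"
    else
        # Nothing there yet: a relative symlink is enough.
        ln -s "$dest" "$target" || return 1
        echo "symlink"
    fi
}
```

With DRY_RUN set, the function only reports which branch it would take
for a pre-existing directory, so it can be run as an ordinary user
against a scratch tree.
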
> 
> The comments speak for themselves. That at least allows the container
> to start while waiting for the devs to come up with a proper fix. The
> container ended up with a cgroup directory like this:
> 
> # ls -la /sys/fs/cgroup/
> total 0
> drwxr-xr-x 14 root root 320 Mar 18 16:43 .
> drwxr-xr-x  7 root root   0 Mar 18 16:43 ..
> drwxr-xr-x  3 root root  60 Mar 18 16:43 blkio
> drwxr-xr-x  2 root root   0 Mar 19 05:39 cpu
> drwxr-xr-x  2 root root   0 Mar 19 05:39 cpu,cpuacct
> lrwxrwxrwx  1 root root  11 Mar 18 16:43 cpuacct -> cpu,cpuacct
> drwxr-xr-x  3 root root  60 Mar 18 16:43 cpuset
> drwxr-xr-x  3 root root  60 Mar 18 16:43 devices
> drwxr-xr-x  3 root root  60 Mar 18 16:43 freezer
> drwxr-xr-x  3 root root  60 Mar 18 16:43 hugetlb
> drwxr-xr-x  3 root root  60 Mar 18 16:43 memory
> drwxr-xr-x  2 root root   0 Mar 19 05:39 net_cls
> drwxr-xr-x  2 root root   0 Mar 19 05:39 net_cls,net_prio
> lrwxrwxrwx  1 root root  16 Mar 18 16:43 net_prio -> net_cls,net_prio
> drwxr-xr-x  3 root root  60 Mar 18 16:43 perf_event
> drwxr-xr-x  3 root root  60 Mar 18 16:43 systemd
> 
> Note how "cpu" is a directory, but "cpuacct" is a symlink. The same
> goes for net_cls and net_prio.
> 
> -- 
> Fajar

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com