[lxc-users] lxc-start hanging

Serge Hallyn serge.hallyn at ubuntu.com
Thu Oct 16 18:53:49 UTC 2014


Quoting Mahmood (mahmood at circleci.com):
> Recently, we have noticed a spike of lxc-start hanging in our fleet of
> short-lived containers.  After a machine has been successfully starting
> and stopping short-lived containers for a while, an `lxc-start` hangs,
> along with all subsequent lxc-start operations for new containers.  The host
> looks healthy otherwise for other applications, and lxc-stop succeeds
> for the remaining containers.
> 
> The lxc logs and strace can be found at
> https://gist.github.com/notnoopci/56cc26e3573745c65a73 .  It doesn't
> seem correlated with a change in lxc/cgroup/kernel versions, and I
> didn't notice any odd behavior of the containers running immediately
> prior to the machine getting wedged.  I also tried disabling the apparmor
> profile and using (un-)privileged containers without any luck.
> 
> Any hints on how to debug the problem?  The hosts are running
> Ubuntu 14.04 LTS with kernel 3.14.19-031419-generic and lxc version
> 1.0.5.

The syslog looks like a kernel bug, but it may just look odd because
you are running lxc-start through strace.  I've occasionally noticed
a hang caused by cgmanager having started too early, while / was still
read-only, so that a library call kept retrying the creation of a dbus
lockfile and failing with EROFS.  You could try

sudo stop cgmanager;  sudo start cgmanager

and see if that fixes it.  But that would only explain lxc-start taking
about an extra 20-30 seconds to start up, not hanging forever.
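
To tell those two cases apart, something like the following rough sketch
might help.  Both helper names (root_is_readonly, classify_start) are made
up for illustration and are not part of lxc or cgmanager.  The first checks
whether / is currently mounted read-only according to a mounts table in
/proc/mounts format; the second runs a command under coreutils `timeout`
and reports whether it was merely slow or never returned within the limit.

```shell
# Hypothetical helper: succeed (exit 0) if the last entry for "/" in a
# mounts table carries the "ro" option.
# $1: path to a table in /proc/mounts format (defaults to /proc/mounts).
root_is_readonly() {
    mounts_file="${1:-/proc/mounts}"
    awk '$2 == "/" { opts = $4 }   # later entries override earlier ones
         END {
             n = split(opts, a, ",")
             for (i = 1; i <= n; i++)
                 if (a[i] == "ro") found = 1
             exit(found ? 0 : 1)
         }' "$mounts_file"
}

# Hypothetical helper: run a command with a time limit and report whether
# it exited (and after how long) or was still running at the limit.
# $1: limit in seconds; remaining arguments: the command to run.
classify_start() {
    limit="$1"; shift
    begin=$(date +%s)
    timeout "$limit" "$@"
    status=$?
    elapsed=$(( $(date +%s) - begin ))
    # GNU timeout exits with 124 when it had to kill the command.
    if [ "$status" -eq 124 ]; then
        echo "hung: no exit within ${limit}s"
    else
        echo "exited with status $status after ${elapsed}s"
    fi
}
```

For example, `root_is_readonly && echo "/ is read-only"` checks the current
state of the root mount, and `classify_start 60 sudo lxc-start -n c1 -d`
(where c1 is a placeholder container name) would separate a start that is
20-30 seconds slow from one that is still stuck after a minute.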

