[lxc-users] lxc-start hanging
Serge Hallyn
serge.hallyn at ubuntu.com
Thu Oct 16 18:53:49 UTC 2014
Quoting Mahmood (mahmood at circleci.com):
> Recently, we have noticed a spike of lxc-start hanging in our fleet of
> short-lived containers. After a machine has been successfully starting
> and stopping short-lived containers for a while, one `lxc-start` hangs,
> along with all subsequent lxc-start operations for new containers. The host
> looks healthy otherwise for other applications, and lxc-stop succeeds
> for the remaining containers.
>
> The lxc logs and strace can be found at
> https://gist.github.com/notnoopci/56cc26e3573745c65a73 . It doesn't
> seem correlated with a change in lxc/cgroup/kernel versions, and I
> didn't notice any odd behavior from the containers running immediately
> before the machine got wedged. I also tried disabling the apparmor
> profile and using (un-)privileged containers, without any luck.
>
> Any hints on how to debug the problem? The hosts are running with
> Ubuntu LTS 14.04 with kernel 3.14.19-031419-generic and lxc version
> 1.0.5.
The syslog looks like a kernel bug, but it may just look odd because
you are running lxc-start under strace. I've occasionally seen a hang
caused by cgmanager having started too early, while / was still mounted
read-only (ro), so that a library call keeps retrying the creation of a
dbus lockfile and failing with EROFS. You could try
    sudo stop cgmanager; sudo start cgmanager
and see if that fixes it. That cause would only apply if startup takes
about 20-30 seconds longer than usual, though, not if it hangs forever.
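
To tell those two cases apart, something like the following rough sketch
might help. It is only an illustration: the function names, the 60-second
limit, and the idea of checking /proc/mounts are assumptions, not part of
lxc or cgmanager. One helper reports whether / is currently mounted
read-only, the other runs a command under timeout(1) and prints how long
it took.

```shell
#!/bin/sh
# Hypothetical helpers; names and limits are illustrative only.

# root_is_ro [FILE]: succeed if the last entry for "/" in a
# /proc/mounts-style file carries the "ro" mount option.
root_is_ro() {
    awk '$2 == "/" { opts = $4 }
         END {
             n = split(opts, a, ",")
             for (i = 1; i <= n; i++) if (a[i] == "ro") exit 0
             exit 1
         }' "${1:-/proc/mounts}"
}

# timed_run LIMIT CMD...: run CMD under timeout(1), print the elapsed
# seconds, and return CMD's exit status. A status of 124 means the limit
# was reached, which points at a real hang rather than a slow start.
timed_run() {
    limit=$1; shift
    begin=$(date +%s)
    status=0
    timeout "$limit" "$@" || status=$?
    echo "elapsed: $(( $(date +%s) - begin ))s, exit status: $status"
    return "$status"
}
```

Run as root, with test1 standing in for a real container name, usage
could look like this:

    root_is_ro && echo "/ is still mounted read-only"
    timed_run 60 lxc-start -n test1 -d
    timed_run 60 lxc-wait -n test1 -s RUNNING

A delay of roughly 20-30 seconds would fit the cgmanager explanation; a
status of 124 would suggest the problem is something else.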