[lxc-devel] Potential deadlock with lxcfs and lxc-freeze

Serge Hallyn serge.hallyn at ubuntu.com
Wed Feb 17 23:14:12 UTC 2016


Quoting Fabian Grünbichler (f.gruenbichler at proxmox.com):
> 
> > Fabian Grünbichler <f.gruenbichler at proxmox.com> wrote on 12 February 2016 at
> > 13:53:
> > 
> > Summary so far: uptime, ps and any other process accessing /proc/uptime within
> > a container using lxcfs can pretty reliably make the whole container impossible
> > to freeze and cause the uptime-accessing process itself and the associated lxcfs
> > process to wait forever. Both states persist until either the container is shut
> > down or the waiting lxcfs process is forcibly killed. The latter will also allow
> > an ongoing, hanging freeze to finish.
> 
> Finally got to the bottom of this issue. It is somewhat mitigated in recent
> lxcfs versions by the recently introduced init PID caching mechanism: a double
> fork no longer happens on every read of /proc/uptime, but only each time init's
> PID is cached (again).
> 
> The root cause is described in a glibc bug from 2013[1], which also describes
> the only possible workaround at the moment: when using setns() to change the
> PID namespace (which only applies to the caller's children), the parent must not

D'oh, yes.

> use fork(), but create the child via clone(). Otherwise, the forked child and
> the parent might have identical PIDs which fails an assertion in glibc's fork.c
> (the colliding PIDs are in different namespaces, but glibc does not care about
> PID namespaces in this code path). Using clone() avoids the issue because there
> is no such assertion in the clone code path ;)
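
A minimal sketch of that workaround, for illustration only: the helper below
optionally calls setns() on a PID namespace fd and then creates the child with
clone() instead of fork(). run_in_cloned_child(), exit_with_arg() and
CHILD_STACK_SIZE are hypothetical names, not existing lxcfs code, and an actual
patch would likely look different.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for clone() and setns() */
#endif
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stack size, assumed to be enough for a small helper. */
#define CHILD_STACK_SIZE (64 * 1024)

/*
 * Hypothetical helper, not lxcfs code. If nsfd >= 0, join the PID namespace
 * it refers to, then create the child with clone() rather than fork().
 * Returns the child's exit status, or -1 on error.
 */
static int run_in_cloned_child(int nsfd, int (*fn)(void *), void *arg)
{
	char *stack;
	pid_t pid;
	int status;

	assert(fn != NULL);

	/* For a PID namespace, setns() only affects children created later. */
	if (nsfd >= 0 && setns(nsfd, CLONE_NEWPID) < 0)
		return -1;

	stack = malloc(CHILD_STACK_SIZE);
	if (!stack)
		return -1;

	/*
	 * No CLONE_VM, so the child gets its own copy of the address space,
	 * as it would with fork(). The stack grows down on most architectures,
	 * hence the pointer to the top of the allocation.
	 */
	pid = clone(fn, stack + CHILD_STACK_SIZE, SIGCHLD, arg);
	if (pid < 0) {
		free(stack);
		return -1;
	}

	/* Reap the child, retrying if a signal interrupts the wait. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR) {
			free(stack);
			return -1;
		}
	}

	free(stack);
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example payload: the child exits with the integer that arg points to. */
static int exit_with_arg(void *arg)
{
	return *(int *)arg;
}
```

Passing nsfd = -1 skips the setns() step, so the clone() path can be tried
without privileges: with "int code = 7;", the call
run_in_cloned_child(-1, exit_with_arg, &code) should return 7. Joining a real
PID namespace would need an fd opened on /proc/<pid>/ns/pid and sufficient
privileges.
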
> 
> Limiting the range of free PIDs on the host and in the container so that both
> share the same small range might still trigger the fork assertion in current
> lxcfs (and thus the freeze bug, if lxc-freeze is called in parallel), although I
> haven't tried this so far.
> 
> There are three occurrences of this (anti-)pattern ("fork() - setns() - fork()")
> in the current lxcfs code base, which should probably be patched:
> - write_task_init_pid_exit(), which was the culprit in the described uptime bug
> - pid_to_ns_wrapper() and pid_from_ns_wrapper(), which are used for reading and
> writing the tasks and cgroup.procs files.
> 
> I'd be willing to write the patches, if desired.

Thanks, that would be great.

> Sidenote: freezing a container while running "cat
> /sys/fs/cgroup/freezer/lxc/108/tasks
> /sys/fs/cgroup/freezer/lxc/108/cgroup.procs" sometimes causes error messages in
> the container: "cat: /sys/fs/cgroup/freezer/lxc/108/cgroup.procs: Interrupted
> system call" and less frequently, error messages in lxcfs' journal output: "Feb
> 17 16:11:54 host lxcfs[15643]: send_creds: Error getting reply from server over
> socketpair". Whenever the error message in the journal appears, the lxc-freeze
> process which is running at the same time hangs for a second or two (my guess is
> that lxc-freeze and send_creds are racy). So far, I could not trigger indefinite
> hangs of lxc-freeze or the container, like with the old uptime code.
> 
> Regards,
> Fabian
> 
> 1: https://sourceware.org/bugzilla/show_bug.cgi?id=15392
> 

