[lxc-devel] [PATCH] RFC: how to fix race with fast init?
Serge Hallyn
serge.hallyn at ubuntu.com
Sun Mar 10 02:48:41 UTC 2013
Quoting Daniel Lezcano (daniel.lezcano at free.fr):
> On 03/09/2013 12:01 AM, Serge Hallyn wrote:
> > Detection of SIGCHLD from the container init by the monitor process
> > which spawned it is done during lxc_poll. If the monitor is slow
> > and the init (especially if using lxc-init to run /bin/true) exits
> > quickly, it can send its SIGCHLD before lxc_poll starts. In that
> > case lxc_poll ends up hanging forever waiting for the SIGCHLD,
> > while the init process is a zombie waiting to be reaped.
>
> This problem was already solved a couple of years ago. I suspect
> there is another bug.
That could certainly be (in which case my patch would be the only
workaround for now). I did see the early sigfd init, but figured
that the sigchlds must not be getting queued? You do epoll_wait
much later, but you're not doing any sort of inotify thing.
Even when I move the initialization of the signalfd epoll earlier (but
not the epoll_wait itself, of course) I still miss the sigchld.
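On the queuing question, here is a minimal sketch of what the kernel does with a SIGCHLD that arrives while the signal is blocked. It is Python rather than the C that lxc is written in, Linux-only, and the function name is hypothetical. The signal stays pending and can be collected later, which is the property an early signalfd setup relies on. The sketch uses sigtimedwait in place of signalfd plus epoll, because Python's standard library has no signalfd wrapper; that substitution is an assumption of the example and not what lxc_poll does.

```python
import os
import signal


def collect_early_sigchld(timeout=5.0):
    """Return (child_pid, si_pid) for a child that exits before we wait."""
    # Block SIGCHLD first, as one would before creating a signalfd.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})
    try:
        child = os.fork()
        if child == 0:
            os._exit(0)  # exit immediately, like a very fast init

        # Wait until the child is a zombie without reaping it (WNOWAIT),
        # so its SIGCHLD has been sent before we go looking for it.
        os.waitid(os.P_PID, child, os.WEXITED | os.WNOWAIT)

        # The blocked signal should still be pending; fetch its siginfo.
        info = signal.sigtimedwait({signal.SIGCHLD}, timeout)
        os.waitpid(child, 0)  # reap the zombie
        return child, (info.si_pid if info is not None else None)
    finally:
        # Restore the caller's signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

One caveat: standard signals such as SIGCHLD are not queued per instance. If several arrive while blocked, they collapse into a single pending signal carrying only one siginfo.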
> The signalfd is set *before*, so if a signal arrives while starting the
> container, we should enter the mainloop and exit it immediately.
>
> That could have been an edge/level-triggered problem, but that is not the case.
>
> The problem comes from the pid check done when the signal is received
> by the monitor.
>
> lxc-execute 1362829720.335 NOTICE lxc_execute - '/bin/true' started with pid '18591'
> lxc-execute 1362829721.335 INFO lxc_console - no rootfs, no console.
> lxc-execute 1362829721.335 WARN lxc_start - invalid pid for SIGCHLD: 18591 <> 18590
>
> So it is ignoring the signal.
But 18590 is presumably lxc-init, which becomes defunct, while 18591 is
the forked task to actually do the work. So why do we miss lxc-init's
signal?
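For reference, the kind of check behind the WARN line above can be sketched as follows. This is an illustration in Python with a hypothetical function name and parameters, not the actual lxc_start code: a SIGCHLD is only acted on when the pid in its siginfo matches the pid the monitor is waiting for.

```python
def sigchld_is_for(expected_pid, si_pid):
    """Hypothetical check: accept a SIGCHLD only if it names the awaited pid."""
    if si_pid != expected_pid:
        return False  # mismatch: the signal is ignored and the wait continues
    return True  # the awaited process exited, so the mainloop can stop
```

With the two pids from the log, 18590 and 18591, the function returns False whichever of them is taken as the expected one, so the mainloop keeps waiting.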
> This problem shouldn't appear with lxc-start.
It hasn't (that we know of), but then the init task in lxc-start takes
a lot longer. I haven't tried it. What happens when you run
lxc-start -n r1 -- lxc-init /bin/true
(my toy box isn't on right now, can't test tonight)
thanks,
-serge