[lxc-users] zombie process blocks stopping of container
Michael H. Warfield
mhw at WittsEnd.com
Tue Jun 3 15:57:06 UTC 2014
On Tue, 2014-06-03 at 15:35 +0000, Serge Hallyn wrote:
> Quoting Stéphane Graber (stgraber at ubuntu.com):
> > On Tue, Jun 03, 2014 at 04:56:03PM +0200, Tamas Papp wrote:
> > >
> > > On 06/03/2014 04:50 PM, Stéphane Graber wrote:
> > > >lxc-stop will send SIGPWR (or the equivalent signal) to the container,
> > > >wait 30s then SIGKILL init. lxc-stop -k will skip the SIGPWR step,
> > > >lxc-stop --nokill will skip the SIGKILL step.
> > > >
> > > >It's pretty odd that init after a kill -9 is still marked running... I'd
> > > >have expected it to either go away or get stuck in D state if
> > > >something's really wrong...
> > > >
> > > >Do you see anything relevant in the kernel log?
> > >
> > > Nothing. I was in a hurry, so I restarted the whole machine and
> > > cannot collect more information.
> > > Unfortunately I'm pretty sure it will be back soon, since this was
> > > not the first time.
> > > What do you suggest, what should I check, when I face it again?
> >
> > So my hope would be for the kernel to report the task as hung which
> > causes a stacktrace to be dumped in dmesg. If not, then it's going to be
> > a bit harder to figure it out...
> Both the container init and its lxc monitor were in S state. I assume
> a sigcont sent to one or both of those would have fixed it.
I interpreted that "S" to indicate the "interruptible sleep" state
rather than a stopped one. Maybe we're using a different ps, but this is
what my man page says:
--
PROCESS STATE CODES
    Here are the different values that the s, stat and state output
    specifiers (header "STAT" or "S") will display to describe the state of
    a process:

        D    uninterruptible sleep (usually IO)
        R    running or runnable (on run queue)
        S    interruptible sleep (waiting for an event to complete)
        T    stopped by job control signal
        t    stopped by debugger during the tracing
        W    paging (not valid since the 2.6.xx kernel)
        X    dead (should never be seen)
        Z    defunct ("zombie") process, terminated but not reaped by
             its parent
--
State T is "stopped by job control signal", which is the state a SIGCONT
would clear.
In this case, though, it seems like the kernel is treating this init
process as if it were in state D, uninterruptible sleep, which even a
SIGKILL will not interrupt.
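
For the next time this shows up, a rough sketch like the one below might
help confirm which state the container init and the lxc monitor are really
in, without relying on a particular ps build. It assumes a Linux /proc
filesystem and reads the state letter from /proc/<pid>/stat; the names
parse_state and describe are hypothetical helpers of my own, not part of
LXC, and the table simply mirrors the man page excerpt above.

```python
# State letters as listed in the ps man page excerpt above.
STATE_NAMES = {
    "D": "uninterruptible sleep (usually IO)",
    "R": "running or runnable (on run queue)",
    "S": "interruptible sleep (waiting for an event to complete)",
    "T": "stopped by job control signal",
    "t": "stopped by debugger during the tracing",
    "W": "paging (not valid since the 2.6.xx kernel)",
    "X": "dead (should never be seen)",
    "Z": 'defunct ("zombie") process, terminated but not reaped by its parent',
}


def parse_state(stat_line):
    """Return the one-letter state from the text of /proc/<pid>/stat."""
    # The second field (comm) is wrapped in parentheses and may itself
    # contain spaces or ')', so take what follows the LAST ')'.
    rest = stat_line[stat_line.rindex(")") + 1:]
    return rest.split()[0]


def describe(pid):
    """Return (state letter, description) for a PID on the host.

    Assumption: run on the host, with the host PID of the container
    init or of the lxc monitor.
    """
    with open("/proc/%d/stat" % pid) as f:
        state = parse_state(f.read())
    return state, STATE_NAMES.get(state, "unknown state")
```

Calling describe() with the host PID of the container init should return
"S" if the task is merely sleeping and "T" if it has really been stopped,
which is the case where a SIGCONT would be worth trying.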
> Until that
> happens, any other tasks in the container will indeed be zombies waiting
> for the container init to reap them.
>
> The question is why container init and monitor were stopped.
Regards,
Mike
--
Michael H. Warfield (AI4NB) | (770) 978-7061 | mhw at WittsEnd.com
/\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!