[Lxc-users] unstoppable container

Serge E. Hallyn serge.hallyn at canonical.com
Tue Aug 31 13:26:22 UTC 2010


Quoting Papp Tamás (tompos at martos.bme.hu):
> 
> Serge E. Hallyn wrote, On 2010. 08. 31. 4:06:
> >Quoting Daniel Lezcano (daniel.lezcano at free.fr):
> >>On 08/31/2010 12:23 AM, Serge E. Hallyn wrote:
> >>>Quoting Daniel Lezcano (daniel.lezcano at free.fr):
> >>>>On 08/30/2010 02:36 PM, Serge E. Hallyn wrote:
> >>>>>Quoting Papp Tamás (tompos at martos.bme.hu):
> >>>>>>Daniel Lezcano wrote, On 2010. 08. 30. 13:08:
> >>>>>>>Usually, there is a mechanism used in lxc to kill -9 process 1 of
> >>>>>>>the container (which wipes out all the processes of the container)
> >>>>>>>when lxc-start dies.
> >>>>>>It should wipe them out, but in my case it was unsuccessful, even when I
> >>>>>>killed the init process by hand.
> >>>>>>
> >>>>>>>So if you still have the processes running inside the container but
> >>>>>>>lxc-start is dead, then:
> >>>>>>>  * you are using a 2.6.32 kernel which is buggy (this mechanism is
> >>>>>>>broken).
> >>>>>>Ubuntu 10.04, so that's exactly the point: the kernel is 2.6.32.
> >>>>>>
> >>>>>>
> >>>>>>Could you point me (or the Ubuntu guy on the list) to a URL that
> >>>>>>describes the problem, or maybe to the kernel patch? If possible,
> >>>>>>maybe the Ubuntu kernel maintainers would fix the official Ubuntu kernel.
> >>>>>Daniel,
> >>>>>
> >>>>>which patch are you talking about?  (presumably a patch against
> >>>>>zap_pid_ns_processes()?)  If it's keeping containers from properly
> >>>>>shutting down, we may be able to SRU a small enough patch, but if
> >>>>>it involves a whole Oleg rewrite then maybe not :)
> >>>>I am referring to these ones:
> >>>>
> >>>>http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=13aa9a6b0f2371d2ce0de57c2ede62ab7a787157
> >>>>http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=dd34200adc01c5217ef09b55905b5c2312d65535
> >>>>http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=dd34200adc01c5217ef09b55905b5c2312d65535
> >>>(note, second and third are identical - did you mean to paste 2 or 3 links?)
> >>3 links; the third was this one.
> >>
> >>http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=614c517d7c00af1b26ded20646b329397d6f51a1
> >
> >Ah, thanks.
> >
> >I had a feeling the second one depended on defining si_fromuser in all
> >lowercase, but for some reason git wasn't showing that one to me easily.
> >
> >>>>Are they small enough for an SRU?
> >>>The first one looks trivial enough.  I'd be afraid the second one would be
> >>>considered to have deep and subtle regression potential.  But, we can
> >>>always try.  I'm not on the kernel team so am not likely to have any say
> >>>on it myself :)
> >>Shall we ask the kernel-team@ mailing list directly? Or do we
> >>have to do an SRU first?
> >
> >Actually, first step would be for Papp to open a bug against both
> >lxc and the kernel.  Papp, do you mind doing that?
> >
> >Without a bug, an SRU ain't gonna fly.
> 
> Sure, I can do this. What exactly should I write in the report, and
> what is the correct email address to write to?
> 
> - kernel version (2.6.32.x)
> - system (Ubuntu)

and that it's an up-to-date lucid.

> - container was unstoppable(?) even if there were no processes
> - the way I was successful
> - ...and?

A recipe to reproduce the bug.  It has to be reproducible.  Then
I'll run the recipe and when I see the failure, I'll confirm the
bug (which a separate second person needs to do).

thanks,
-serge
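
The first items of the report discussed above (kernel version, system,
and whether the kernel is in the 2.6.32 series) can be gathered
mechanically. The following is only a minimal sketch, not an official
Ubuntu or lxc tool: the function names are hypothetical, and it assumes
the distribution description can be read from /etc/lsb-release, as on
Ubuntu. It does not reproduce the bug itself; the recipe still has to be
written by hand.

```python
import platform


def read_lsb_release(path="/etc/lsb-release"):
    """Parse KEY=value lines from an lsb-release style file.

    Returns an empty dict if the file is missing (assumption: Ubuntu layout).
    """
    info = {}
    try:
        with open(path) as handle:
            for line in handle:
                key, sep, value = line.strip().partition("=")
                if sep:
                    info[key] = value.strip('"')
    except OSError:
        pass
    return info


def is_2_6_32_series(release):
    """True if the kernel release string belongs to the 2.6.32 series."""
    return release == "2.6.32" or release.startswith(("2.6.32.", "2.6.32-"))


def report_header(release=None, lsb=None):
    """Build the opening lines of a bug report (hypothetical layout)."""
    release = release or platform.release()
    lsb = read_lsb_release() if lsb is None else lsb
    in_series = "yes" if is_2_6_32_series(release) else "no"
    return "\n".join([
        "kernel version: %s" % release,
        "system: %s" % lsb.get("DISTRIB_DESCRIPTION", "unknown"),
        "kernel in 2.6.32 series: %s" % in_series,
    ])


if __name__ == "__main__":
    print(report_header())
```

Run on the affected host, this prints three lines that can be pasted at
the top of the report, above the steps to reproduce.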
