[Lxc-users] unstoppable container
Papp Tamás
tompos at martos.bme.hu
Wed Sep 1 05:43:50 UTC 2010
Serge E. Hallyn wrote, On 2010. 08. 31. 15:26:
> Quoting Papp Tamás (tompos at martos.bme.hu):
>
>> Serge E. Hallyn wrote, On 2010. 08. 31. 4:06:
>>
>>> Quoting Daniel Lezcano (daniel.lezcano at free.fr):
>>>
>>>> On 08/31/2010 12:23 AM, Serge E. Hallyn wrote:
>>>>
>>>>> Quoting Daniel Lezcano (daniel.lezcano at free.fr):
>>>>>
>>>>>> On 08/30/2010 02:36 PM, Serge E. Hallyn wrote:
>>>>>>
>>>>>>> Quoting Papp Tamás (tompos at martos.bme.hu):
>>>>>>>
>>>>>>>> Daniel Lezcano wrote, On 2010. 08. 30. 13:08:
>>>>>>>>
>>>>>>>>> Usually, there is a mechanism used in lxc to kill -9 the process 1 of
>>>>>>>>> the container (which wipes out all the processes of the containers)
>>>>>>>>> when lxc-start dies.
>>>>>>>>>
>>>>>>>> It should wipe them out, but in my case it was unsuccessful, even when I
>>>>>>>> killed the init process by hand.
>>>>>>>>
>>>>>>>>
>>>>>>>>> So if you still have the processes running inside the container but
>>>>>>>>> lxc-start is dead, then:
>>>>>>>>> * you are using a 2.6.32 kernel which is buggy (this mechanism is
>>>>>>>>> broken).
>>>>>>>>>
>>>>>>>> Ubuntu 10.04, so it's exactly the point, the kernel is 2.6.32 .
>>>>>>>>
>>>>>>>>
>>>>>>>> Could you point me (or the Ubuntu guy on the list) to a URL that
>>>>>>>> describes the problem, or maybe to the kernel patch? If possible,
>>>>>>>> maybe the Ubuntu kernel maintainers could fix the official Ubuntu kernel.
>>>>>>>>
>>>>>>> Daniel,
>>>>>>>
>>>>>>> which patch are you talking about? (presumably a patch against
>>>>>>> zap_pid_ns_processes()?) If it's keeping containers from properly
>>>>>>> shutting down, we may be able to SRU a small enough patch, but if
>>>>>>> it involves a whole Oleg rewrite then maybe not :)
>>>>>>>
>>>>>> I am referring to these ones:
>>>>>>
>>>>>> http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=13aa9a6b0f2371d2ce0de57c2ede62ab7a787157
>>>>>> http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=dd34200adc01c5217ef09b55905b5c2312d65535
>>>>>> http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=dd34200adc01c5217ef09b55905b5c2312d65535
>>>>>>
>>>>> (note, second and third are identical - did you mean to paste 2 or 3 links?)
>>>>>
>>>> 3 links; the third was this one.
>>>>
>>>> http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=614c517d7c00af1b26ded20646b329397d6f51a1
>>>>
>>> Ah, thanks.
>>>
>>> I had a feeling the second one depended on defining si_fromuser in all
>>> lowercase, but for some reason git wasn't showing that one to me easily.
>>>
>>>
>>>>>> Are they small enough for an SRU?
>>>>>>
>>>>> The first one looks trivial enough. I'd be afraid the second one would be
>>>>> considered to have deep and subtle regression potential. But, we can
>>>>> always try. I'm not on the kernel team so am not likely to have any say
>>>>> on it myself :)
>>>>>
>>>> Shall we ask the kernel-team@ mailing list directly? Or do we
>>>> have to do an SRU first?
>>>>
>>> Actually, first step would be for Papp to open a bug against both
>>> lxc and the kernel. Papp, do you mind doing that?
>>>
>>> Without a bug, an SRU ain't gonna fly.
>>>
>> Sure, I can do this. What exactly should I write in the report, and
>> which email address should I send it to?
>>
>> - kernel version (2.6.32.x)
>> - system (Ubuntu)
>>
>
> and that it's an up-to-date lucid.
>
>
>> - the container was unstoppable(?) even though there were no processes
>> - how I finally managed to stop it
>> - ...and?
>>
>
> A recipe to reproduce the bug. It has to be reproducible. Then
> I'll run the recipe and when I see the failure, I'll confirm the
> bug (which a separate second person needs to do).
>
>
I will try to reproduce it today.
tamas
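
The details requested for the report (kernel version, system, and whether the
kernel is in the 2.6.32 series) could be collected with something like the
following. This is only a rough sketch, assuming Python is available on the
host; the function name report_header and the layout of the output are
made up for illustration and are not part of lxc or of any Ubuntu bug tool.

```python
import platform


def report_header(release=None, distro="Ubuntu 10.04 (lucid), up to date"):
    """Build the first lines of a bug report (hypothetical helper).

    release: kernel release string; defaults to the running kernel.
    distro:  free-text description of the system, supplied by the reporter.
    """
    if release is None:
        release = platform.release()  # e.g. the output of `uname -r`
    # The thread identifies 2.6.32 kernels as affected.
    in_2_6_32_series = release.startswith("2.6.32")
    lines = [
        "kernel version: " + release,
        "system: " + distro,
        "kernel in the 2.6.32 series: " + ("yes" if in_2_6_32_series else "no"),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(report_header())
```

The recipe to reproduce the problem still has to be written by hand; the
sketch only fills in the environment details.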