[Lxc-users] unstoppable container

Daniel Lezcano daniel.lezcano at free.fr
Mon Aug 30 12:45:31 UTC 2010


On 08/30/2010 02:11 PM, Ferenc Wagner wrote:
> Daniel Lezcano<daniel.lezcano at free.fr>  writes:
>
>    
>> On 08/30/2010 12:40 PM, Papp Tamás wrote:
>>
>>      
>>> In the tasks file I saw three processes: udevd, init and one more, which
>>> I don't remember. I killed them all, but the cgroup still exists.
>>>        
>> The cgroup is removed by lxc-start, but this is not a problem, because
>> it will be removed (if empty), when running lxc-start again.
>>      
> I suspect a transmission error in this sentence, could you please resend it?
>    

The cgroup infrastructure does not remove a cgroup automatically when 
all its tasks die; that is how cgroups are implemented. So it is up to 
lxc-start to remove the cgroup after pid 1 of the container exits. If 
lxc-start was killed, the directory is not removed and stays there.

If you start your container again, lxc-start will try to remove this 
directory if it is present and then create a new cgroup.
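
To illustrate the remove-then-recreate step, here is a minimal sketch in C. It is not the actual lxc code: the function name recreate_cgroup and the path passed to it are hypothetical, and it only assumes that a cgroup appears as a directory that rmdir(2) can remove once no tasks are left in it.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from lxc: remove a stale cgroup
 * directory left behind by a killed lxc-start, then create a fresh one.
 * Returns 0 on success, -1 with errno set on failure.
 */
int recreate_cgroup(const char *path)
{
    /*
     * rmdir() on a cgroup directory only succeeds when the group is
     * empty (the kernel refuses with EBUSY while tasks remain).
     * ENOENT simply means there was nothing left over to clean up.
     */
    if (rmdir(path) < 0 && errno != ENOENT)
        return -1;

    /* Create the new, empty cgroup directory. */
    return mkdir(path, 0755);
}
```

If tasks are still alive in the old cgroup, rmdir fails and the helper reports the error instead of reusing the stale group.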

>> Usually, there is a mechanism used in lxc to kill -9 the process 1 of
>> the container (which wipes out all the processes of the containers) when
>> lxc-start dies.
>>      
> I guess this mechanism has no chance when lxc-start is killed by SIGKILL...
>    

Yes, but fortunately there is a Linux-specific process control 
operation where the kernel sends a signal to a child process when its 
parent dies.

"
...
        PR_SET_PDEATHSIG (since Linux 2.1.57)
               Set the parent process death signal of the calling process
               to arg2 (either a signal value in the range 1..maxsig, or 0
               to clear).  This is the signal that the calling process will
               get when its parent dies.  This value is cleared for the
               child of a fork(2).
...
"

This prctl is used in lxc as a safeguard in case lxc-start is killed 
abruptly, in order to wipe out the container's processes.
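
A small sketch of how this safeguard can be exercised, assuming a plain fork() instead of a real container. The names die_with_parent and pdeathsig_demo are hypothetical and not taken from lxc: the "supervisor" stands in for lxc-start, and the "worker" stands in for the container's process 1.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Ask the kernel to SIGKILL the calling process when its parent dies.
 * Returns -1 if prctl fails or if the parent was already gone before
 * the death signal was armed.
 */
int die_with_parent(pid_t expected_parent)
{
    if (prctl(PR_SET_PDEATHSIG, SIGKILL) < 0)
        return -1;
    if (getppid() != expected_parent)
        return -1;
    return 0;
}

/*
 * Returns 1 if the worker died after its supervisor was SIGKILLed,
 * 0 if it was still alive after timeout_ms, -1 on a setup error.
 */
int pdeathsig_demo(int timeout_ms)
{
    int fds[2];
    pid_t supervisor, worker = -1;
    int result = -1;
    char c;

    if (pipe(fds) < 0)
        return -1;

    supervisor = fork();
    if (supervisor < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (supervisor == 0) {
        pid_t self = getpid();
        pid_t child;

        close(fds[0]);
        child = fork();
        if (child < 0)
            _exit(1);
        if (child == 0) {
            pid_t me = getpid();
            /* Worker: arm the death signal, then report its pid. */
            if (die_with_parent(self) < 0)
                _exit(1);
            if (write(fds[1], &me, sizeof(me)) != (ssize_t)sizeof(me))
                _exit(1);
            for (;;)
                pause();
        }
        /* Supervisor: keep no pipe end, just wait to be killed. */
        close(fds[1]);
        for (;;)
            pause();
    }

    close(fds[1]);
    /* Wait until the worker has armed PR_SET_PDEATHSIG. */
    if (read(fds[0], &worker, sizeof(worker)) == (ssize_t)sizeof(worker)) {
        struct pollfd p = { .fd = fds[0], .events = POLLIN };

        kill(supervisor, SIGKILL);
        waitpid(supervisor, NULL, 0);
        /* The worker holds the only write end: EOF means it is dead. */
        if (poll(&p, 1, timeout_ms) > 0 && read(fds[0], &c, 1) == 0)
            result = 1;
        else {
            result = 0;
            kill(worker, SIGKILL); /* do not leak the survivor */
        }
    } else {
        kill(supervisor, SIGKILL);
        waitpid(supervisor, NULL, 0);
    }
    close(fds[0]);
    return result;
}
```

The getppid() check covers the window where the parent dies before prctl takes effect, since in that case no signal would ever be delivered. On a kernel where the mechanism works, pdeathsig_demo(2000) should return 1.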

>> So if you still have the processes running inside the container but
>> lxc-start is dead, then:
>>    * you are using a 2.6.32 kernel which is buggy (this mechanism is broken).
>>   or/and
>>    * there are processes in 'T' states within the container
>>      
> Is this a kernel mechanism to clean up all processes of a container when
> the container init exits, or is it a user-space thing implemented in
> lxc-start?

When the container init exits, the kernel sends a SIGKILL to all the 
remaining processes in the container and reaps them (i.e. waits for 
them); this happens at the kernel level (zap_pid_ns_processes). Hence, 
in userspace, when wait('init') returns, you have the guarantee that 
there are no more processes in the container.
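
The userspace side of this can be sketched as follows. The helper name wait_for_container_init is hypothetical and not the lxc implementation. The guarantee itself comes from the kernel and only holds when init_pid is the init of a pid namespace (for example, one created with clone(CLONE_NEWPID)); the sketch only shows the wait.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: block until the container's init exits.
 * Returns its exit code, 128 + signal number if it was killed by a
 * signal, or -1 on error. When init_pid is a pid-namespace init,
 * the kernel has already killed and reaped every other process of
 * that namespace by the time waitpid() returns.
 */
int wait_for_container_init(pid_t init_pid)
{
    int status;

    /* Retry if the wait is interrupted by a signal handler. */
    while (waitpid(init_pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Only after this returns is it safe to remove the container's cgroup, since no task can be left in it.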

>   If the former, in which versions of 2.6.32 is this feature
> broken?
>    

I meant that prctl(PR_SET_PDEATHSIG) is broken on 2.6.32.




