[Lxc-users] can't restart container without rebooting entire host, because can't delete cgroups files, tasks is 0

Fri Nov 5 18:26:14 UTC 2010

On 11/5/2010 1:34 PM, Serge E. Hallyn wrote:
> A few comments:
>
> 1. To remove the directories, rmdir all descendent directories.  I'd
>     think something like 'find . -type d -print0 | xargs rmdir' would
>     do.
> 2. You can prevent this from happening by using a notify-on-release
>     handler.
> 3. This should stop happening when lxc (soon) switches to using the
>     clone-child cgroup helper instead of the ns cgroup.
>
> -serge
>
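
For reference, point 2 could look roughly like the sketch below. It
assumes a cgroup v1 hierarchy mounted at /cgroup, as on this host. With
notify_on_release set to 1 in a cgroup, the kernel runs the program named
in the release_agent file at the root of the hierarchy once that cgroup
has lost its last task and its last child, and passes the cgroup's path
relative to the mount point as the only argument. The script location
/usr/local/sbin/cgroup-release, the function name release_cgroup and the
CGROUP_ROOT variable are made up for illustration.

```shell
#!/bin/sh
# Hypothetical release agent, e.g. saved as /usr/local/sbin/cgroup-release.
# The kernel calls it with one argument such as "/nj10-014/19237".

release_cgroup() {
    # CGROUP_ROOT is an assumed override; /cgroup is the mount point used here.
    root=${CGROUP_ROOT:-/cgroup}
    # rmdir only succeeds if the cgroup has no tasks and no child cgroups.
    rmdir "$root$1"
}

# Only act when called with an argument, as the kernel would do.
if [ "$#" -gt 0 ]; then
    release_cgroup "$1"
fi

# One-time setup as root (not run by this script):
#   echo /usr/local/sbin/cgroup-release > /cgroup/release_agent
#   echo 1 > /cgroup/nj10-014/notify_on_release
```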

Just to make it clear...

nj9:~ # lxc-stop -n nj10-014
nj9:~ # lxc-info -n nj10-014
'nj10-014' is STOPPED
nj9:~ # lxc-destroy -n nj10-014
'nj10-014' does not exist
nj9:~ # lxc-ps -elf |grep nj10-014
            0 S root      3037 32341  0  80   0 -   579 pipe_w 14:25 pts/4    00:00:00 grep nj10-014
nj9:~ #
nj9:~ # rm -vrf /cgroup/nj10-014
rm: cannot remove `/cgroup/nj10-014/19237/3/cpuset.memory_spread_slab': Operation not permitted
rm: cannot remove `/cgroup/nj10-014/19237/3/cpuset.memory_spread_page': Operation not permitted
[...]
rm: cannot remove `/cgroup/nj10-014/net_cls.classid': Operation not permitted
rm: cannot remove `/cgroup/nj10-014/notify_on_release': Operation not permitted
rm: cannot remove `/cgroup/nj10-014/tasks': Operation not permitted
nj9:~ #
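
For comparison, the rmdir approach from the quoted reply can be sketched
as below. The files inside a cgroup directory are created by the kernel
and cannot be unlinked, which is why rm reports "Operation not
permitted" for each of them. A cgroup only goes away when its directory
is removed with rmdir, and that only succeeds once the cgroup has no
tasks and no child cgroups. The function name remove_cgroup_tree is
hypothetical, and this is a sketch rather than a tested fix for the
problem described here.

```shell
#!/bin/sh
# Hypothetical helper: remove a cgroup directory and all of its descendants.
remove_cgroup_tree() {
    # -depth visits children before their parent, so every rmdir is attempted
    # on a directory whose child directories are already gone.
    # -exec is used instead of "-print0 | xargs", since -print0 output would
    # need "xargs -0" to be parsed correctly.
    find "$1" -depth -type d -exec rmdir {} \;
}
```

It would be called as "remove_cgroup_tree /cgroup/nj10-014". If any
cgroup in the tree still has a task attached, rmdir fails for that
directory and for its parents.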

I don't know how to find out whether some process is still part of the 
cgroup even though lxc-ps doesn't show any.
Do I have to examine every single process and verify that it belongs to 
the host or to another container until I find one I can't account for?
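
A possible way to do that check without going through the process list
by hand, sketched under assumptions: every process lists its cgroup
membership in /proc/<pid>/cgroup, one "id:subsystems:/path" line per
hierarchy, and every cgroup directory has a tasks file with the PIDs
attached to it. The function names pids_in_cgroup and tasks_under, and
the optional proc-root argument, are hypothetical.

```shell
#!/bin/sh
# Print the PIDs whose /proc/<pid>/cgroup places them in the named cgroup
# or in any cgroup below it.
# $1 = cgroup name (e.g. nj10-014), $2 = proc root (assumed default: /proc)
pids_in_cgroup() {
    name=$1
    proc=${2:-/proc}
    for f in "$proc"/[0-9]*/cgroup; do
        [ -r "$f" ] || continue
        # Each line has the form "hierarchy-id:subsystems:/path".
        while IFS=: read -r _id _subsys path; do
            case $path in
                "/$name"|"/$name"/*)
                    pid=${f%/cgroup}
                    echo "${pid##*/}"
                    break
                    ;;
            esac
        done < "$f"
    done
}

# Print every PID listed in any tasks file under a cgroup directory.
# $1 = cgroup directory (e.g. /cgroup/nj10-014)
tasks_under() {
    find "$1" -type f -name tasks -exec cat {} +
}
```

For example, "pids_in_cgroup nj10-014" and "tasks_under /cgroup/nj10-014"
should both print nothing if the cgroup really is empty.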

Since this happens to me all the time, and on different hosts (albeit 
with the same kernel version and the same software, all configured the 
same way), I can't believe it doesn't happen to many others, and I'm 
surprised I don't see more acknowledgment of the issue here. I see other 
people reporting the problem, but the responses simply say to delete the 
files, which we can't do.

So I wonder: is my configuration or usage simply wrong? I'm using very 
simple config files copied from the veth samples.

nj9:~ # find /etc/lxc/nj10-010 -type f |xargs -tn1 cat
cat /etc/lxc/nj10-010/fstab
none /lxc/nj10-010/dev/pts devpts defaults 0 0
none /lxc/nj10-010/proc    proc   defaults 0 0
none /lxc/nj10-010/sys     sysfs  defaults 0 0
none /lxc/nj10-010/dev/shm tmpfs  defaults 0 0
cat /etc/lxc/nj10-010/config
lxc.utsname = nj10-010
lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = br0
lxc.network.hwaddr = 02:00:47:bb:ce:56
lxc.network.ipv4 = 71.187.206.86/24
lxc.network.name = eth0
lxc.mount = /etc/lxc/nj10-010/fstab
lxc.rootfs = /lxc/nj10-010
nj9:~ #

How are you not having the same problem?

-- 
bkw