[Lxc-users] can't restart container without rebooting entire host, because can't delete cgroups files, tasks is 0

Daniel Lezcano daniel.lezcano at free.fr
Mon Nov 15 15:41:01 UTC 2010


On 11/15/2010 04:17 PM, Miroslav Lednicky, AVONET, s.r.o. wrote:
> On 15.11.2010 15:56, Daniel Lezcano wrote:
>> On 11/15/2010 03:26 PM, Miroslav Lednicky, AVONET, s.r.o. wrote:
>>> Hello,
>>>
>>> please see:
>>>
>>> ls -l
>>> total 0
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:00 1285
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:00 1298
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:01 1322
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:01 1325
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:02 1335
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:09 1386
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:11 1401
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:12 1408
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:12 1411
>>> drwxr-xr-x 3 root root 0 2010-11-15 15:17 1459
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 cgroup.procs
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 cpuacct.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuacct.usage
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 cpuacct.usage_percpu
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpu.rt_period_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpu.rt_runtime_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.cpu_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.cpus
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.mem_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.mem_hardwall
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.memory_migrate
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 cpuset.memory_pressure
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.memory_spread_page
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.memory_spread_slab
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.mems
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.sched_load_balance
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpuset.sched_relax_domain_level
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 cpu.shares
>>> --w------- 1 root root 0 2010-11-15 15:02 devices.allow
>>> --w------- 1 root root 0 2010-11-15 15:02 devices.deny
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 devices.list
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 freezer.state
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.failcnt
>>> --w------- 1 root root 0 2010-11-15 15:02 memory.force_empty
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.max_usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.memsw.failcnt
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.memsw.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.memsw.max_usage_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 memory.memsw.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.soft_limit_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 memory.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.swappiness
>>> -r--r--r-- 1 root root 0 2010-11-15 15:02 memory.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 memory.use_hierarchy
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 net_cls.classid
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 notify_on_release
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:02 tasks
>>> root@lnx-zl-teaspl:/cgroup/teas_www# ls -lR 1285
>>> 1285:
>>> total 0
>>> drwxr-xr-x 2 root root 0 2010-11-15 15:00 2
>>> drwxr-xr-x 2 root root 0 2010-11-15 15:00 3
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cgroup.procs
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.usage
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.usage_percpu
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.rt_period_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.rt_runtime_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.cpu_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.cpus
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mem_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mem_hardwall
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_migrate
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_pressure
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_spread_page
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_spread_slab
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mems
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.sched_load_balance
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.sched_relax_domain_level
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.shares
>>> --w------- 1 root root 0 2010-11-15 15:00 devices.allow
>>> --w------- 1 root root 0 2010-11-15 15:00 devices.deny
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 devices.list
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 freezer.state
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.failcnt
>>> --w------- 1 root root 0 2010-11-15 15:00 memory.force_empty
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.max_usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.failcnt
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.max_usage_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.soft_limit_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.swappiness
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.use_hierarchy
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 net_cls.classid
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 notify_on_release
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 tasks
>>>
>>> 1285/2:
>>> total 0
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cgroup.procs
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.usage
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.usage_percpu
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.rt_period_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.rt_runtime_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.cpu_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.cpus
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mem_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mem_hardwall
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_migrate
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_pressure
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_spread_page
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_spread_slab
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mems
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.sched_load_balance
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.sched_relax_domain_level
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.shares
>>> --w------- 1 root root 0 2010-11-15 15:00 devices.allow
>>> --w------- 1 root root 0 2010-11-15 15:00 devices.deny
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 devices.list
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 freezer.state
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.failcnt
>>> --w------- 1 root root 0 2010-11-15 15:00 memory.force_empty
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.max_usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.failcnt
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.max_usage_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.soft_limit_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.swappiness
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.use_hierarchy
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 net_cls.classid
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 notify_on_release
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 tasks
>>>
>>> 1285/3:
>>> total 0
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cgroup.procs
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.usage
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuacct.usage_percpu
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.rt_period_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.rt_runtime_us
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.cpu_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.cpus
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mem_exclusive
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mem_hardwall
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_migrate
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_pressure
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_spread_page
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.memory_spread_slab
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.mems
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.sched_load_balance
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpuset.sched_relax_domain_level
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 cpu.shares
>>> --w------- 1 root root 0 2010-11-15 15:00 devices.allow
>>> --w------- 1 root root 0 2010-11-15 15:00 devices.deny
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 devices.list
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 freezer.state
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.failcnt
>>> --w------- 1 root root 0 2010-11-15 15:00 memory.force_empty
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.max_usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.failcnt
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.limit_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.max_usage_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.memsw.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.soft_limit_in_bytes
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.stat
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.swappiness
>>> -r--r--r-- 1 root root 0 2010-11-15 15:00 memory.usage_in_bytes
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 memory.use_hierarchy
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 net_cls.classid
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 notify_on_release
>>> -rw-r--r-- 1 root root 0 2010-11-15 15:00 tasks
>>>
>>> This is the content of my cgroup directory with LXC running. There are
>>> these directories:
>>>
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:00 1285
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:00 1298
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:01 1322
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:01 1325
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:02 1335
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:09 1386
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:11 1401
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:12 1408
>>> drwxr-xr-x 4 root root 0 2010-11-15 15:12 1411
>>> drwxr-xr-x 3 root root 0 2010-11-15 15:17 1459
>>>
>>> These PIDs are not in the global proc filesystem,
>>> and their number keeps increasing.
>>>
>>> The load on my machine keeps getting higher and higher.
>>>
>>> I will have to restart the computer in the end. :-(
>>>
>>> What can I do about it?
>>
>> Ok, let's try to understand.
>>
>> Let's do it step by step:
>>
>> 1 - the topmost directory shows "freezer.state" and it shouldn't, because
>> you cannot freeze the whole system (unless there is a recent change in the
>> kernel)
>>
>> 2 - try to delete the 1285/2 directory; if you can't, check the content
>> of 1285/2/tasks and look for the process in the system: is it a process
>> running inside a container?
>
> 1285/2/tasks is empty
>
> rm -r 1285/2
> rm: cannot remove `1285/2/cpuset.memory_spread_slab': Operation not permitted
> rm: cannot remove `1285/2/cpuset.memory_spread_page': Operation not permitted
> rm: cannot remove `1285/2/cpuset.memory_pressure': Operation not permitted
> rm: cannot remove `1285/2/cpuset.memory_migrate': Operation not permitted
> rm: cannot remove `1285/2/cpuset.sched_relax_domain_level': Operation not permitted
> rm: cannot remove `1285/2/cpuset.sched_load_balance': Operation not permitted
> rm: cannot remove `1285/2/cpuset.mem_hardwall': Operation not permitted
> rm: cannot remove `1285/2/cpuset.mem_exclusive': Operation not permitted
> rm: cannot remove `1285/2/cpuset.cpu_exclusive': Operation not permitted
> rm: cannot remove `1285/2/cpuset.mems': Operation not permitted
> rm: cannot remove `1285/2/cpuset.cpus': Operation not permitted
> rm: cannot remove `1285/2/cpu.rt_period_us': Operation not permitted
> rm: cannot remove `1285/2/cpu.rt_runtime_us': Operation not permitted
> rm: cannot remove `1285/2/cpu.shares': Operation not permitted
> rm: cannot remove `1285/2/cpuacct.stat': Operation not permitted
> rm: cannot remove `1285/2/cpuacct.usage_percpu': Operation not permitted
> rm: cannot remove `1285/2/cpuacct.usage': Operation not permitted
> rm: cannot remove `1285/2/memory.memsw.failcnt': Operation not permitted
> rm: cannot remove `1285/2/memory.memsw.limit_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/memory.memsw.max_usage_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/memory.memsw.usage_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/memory.swappiness': Operation not permitted
> rm: cannot remove `1285/2/memory.use_hierarchy': Operation not permitted
> rm: cannot remove `1285/2/memory.force_empty': Operation not permitted
> rm: cannot remove `1285/2/memory.stat': Operation not permitted
> rm: cannot remove `1285/2/memory.failcnt': Operation not permitted
> rm: cannot remove `1285/2/memory.soft_limit_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/memory.limit_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/memory.max_usage_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/memory.usage_in_bytes': Operation not permitted
> rm: cannot remove `1285/2/devices.list': Operation not permitted
> rm: cannot remove `1285/2/devices.deny': Operation not permitted
> rm: cannot remove `1285/2/devices.allow': Operation not permitted
> rm: cannot remove `1285/2/freezer.state': Operation not permitted
> rm: cannot remove `1285/2/net_cls.classid': Operation not permitted
> rm: cannot remove `1285/2/notify_on_release': Operation not permitted
> rm: cannot remove `1285/2/cgroup.procs': Operation not permitted
> rm: cannot remove `1285/2/tasks': Operation not permitted
>
>> Oh, a dumb question: are you using libvirt with lxc?
>
> No, I am using the lxc package (lxc-start, lxc-stop, lxc-console, ...)

ok.

>
> I see that vsftpd running in a container sometimes generates them, for example.

Good point! The culprit is vsftpd.

From the vsftpd source code:

"int ret = syscall(__NR_clone,
                       CLONE_NEWPID | CLONE_NEWIPC | CLONE_NEWNET | SIGCHLD,
                       NULL);"

I am pretty sure you would still have the problem without lxc.

The cgroup filesystem is a pseudo-filesystem, so you don't have to remove 
the files inside a cgroup. Removing a cgroup just means removing its 
directory, even though it still lists files (which you can't do on a usual 
fs). You have to remove the cgroups from the deepest directory up to the 
topmost one. If a cgroup is active, that is, a task is attached to it (the 
'tasks' file is not empty), removing the directory is forbidden.

What I usually do is just run 'sudo rm -rf /cgroup/*' and ignore the 
errors. That should delete a lot of cgroups, at least the unused ones.
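
As a rough illustration of the bottom-up removal described above, here is a 
minimal Python sketch. It is not part of lxc; the function name 
remove_unused_cgroups is made up for this example. It assumes the cgroup 
hierarchy is mounted as a plain directory tree and that rmdir fails on any 
group that still has tasks or child groups.

```python
import os


def remove_unused_cgroups(root):
    """Try to rmdir every directory below root, deepest first.

    Returns (removed, skipped) lists of paths. root itself is left alone.
    Hypothetical helper for illustration only.
    """
    removed, skipped = [], []
    # topdown=False yields child directories before their parents,
    # so nested groups are attempted before the group that contains them.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        try:
            # Only the directory is removed; the control files inside a
            # cgroup are never unlinked individually.
            os.rmdir(dirpath)
            removed.append(dirpath)
        except OSError:
            # Still in use (or otherwise not removable): leave it alone.
            skipped.append(dirpath)
    return removed, skipped
```

Run as root against a path such as /cgroup/teas_www (the directory from 
this thread), this would try 1285/2 and 1285/3 before 1285 itself; 
anything still in use ends up in the skipped list instead of raising.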

This problem was anticipated, and the misbehavior is about to be solved 
completely in lxc and the kernel. So be patient ;)

A bit of explanation here:

http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=97978e6d1f2da0073416870410459694fbdbfd9b
http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commit;h=45531757b45cae0ce64c5aff08c2534d5a0fa3e7

I suggest you use another ftp server for the moment.

Hope that helps
   -- Daniel





