[lxc-users] Memory problem in LXC causes host to crash

Serge Hallyn serge.hallyn at ubuntu.com
Thu Aug 28 13:15:02 UTC 2014


Quoting Danijel Vargek, Continum (danijel.vargek at continum.net):
> Hi all,
> 
> We are running an LXC host with several test containers (14 at the moment).
> The host itself is on Ubuntu 14.04, with the 3.13.0-32 kernel. The containers
> are running Debian Wheezy.
> 
> From time to time the host machine crashes completely, probably because the
> containers eat up too much RAM. We have already limited every container via
> cgroup (CPU and RAM), but still see this behaviour.
> 
> Our suspicion is that Java on some of the containers isn't correctly limited, which
> leads to the host machine crashing.
> 
> Has anybody had a similar experience, or is there something missing when limiting
> containers via cgroup?
> 
> This is the syslog entry for the last crash (host machine + one of the containers):
> 
> #### HOST ####
> Aug 26 13:33:10 node04 kernel: [87282.555841] Modules linked in:<4>[87282.555841] Call Trace:
> Aug 26 13:33:10 node04 kernel: [87282.555841]  [<ffffffff811458c4>] perf_event_overflow+0x14/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841]  [<ffffffff8136e9ed>] ? __write_lock_failed+0xd/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841] ---[ end trace 71798cbdeee56afd ]---
> Aug 26 13:33:10 node04 kernel: [87304.156008] RAX: ffff881018a2b2e8 RBX: ffff880815771e28 RCX: 0000000000000006
> Aug 26 13:33:10 node04 kernel: [87304.156008] Stack:
> Aug 26 13:33:10 node04 kernel: [87304.156008]  [<ffffffff81152384>] pagefault_out_of_memory+0x14/0x80
> Aug 26 13:33:10 node04 kernel: [87304.156008]  [<ffffffff81727fda>] do_page_fault+0x1a/0x70
> Aug 26 13:33:18 node04 kernel: [87312.204006] FS:  00007f58c7c59700(0000) GS:ffff88103f940000(0000) knlGS:0000000000000000
> Aug 26 13:33:18 node04 kernel: [87312.204006] Stack:
> Aug 26 13:33:18 node04 kernel: [87312.204006]  ffff880ab78e3c70 ffffffff8160c3a8 ffff88103f914f00 ffff88103f914f00
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff810dba85>] smp_call_function_single+0xe5/0x190
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff810dbeb6>] smp_call_function_many+0x286/0x2d0
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff811814d5>] change_protection+0x65/0xb0
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff8172423c>] retint_signal+0x48/0x8c
> ##############
> 
> #### Container ####
> Aug 26 13:32:45 ff01 kernel: [87279.009427] CPU: 0 PID: 773 Comm: java Tainted: GF          O 3.13.0-32-generic #57-Ubuntu
> Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:
> Aug 26 13:32:45 ff01 kernel: [87279.009470]  [<ffffffff811b388c>] mem_cgroup_oom_synchronize+0x4fc/0x540

This (mem_cgroup_oom_synchronize) suggests to me that in fact the container
is correctly limited.  java exceeds what the container is allowed to
use, and so is killed.
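
One way to double-check this is to read the memory cgroup's own counters for
the container. The following is only a minimal sketch in Python, under
assumptions not stated in this thread: cgroup v1 with the memory controller
mounted at /sys/fs/cgroup/memory, and the container's cgroup living under
lxc/<container name>. The function names are made up for illustration.

```python
import os

# cgroup v1 memory controller files; memory.memsw.* only exists when
# swap accounting is enabled on the host kernel.
CONTROL_FILES = [
    "memory.limit_in_bytes",
    "memory.usage_in_bytes",
    "memory.max_usage_in_bytes",
    "memory.failcnt",
    "memory.memsw.limit_in_bytes",
]


def read_int(path):
    """Return the integer stored in a cgroup control file, or None if it
    cannot be read (missing file, no permission, unexpected content)."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def memory_cgroup_report(cgroup_dir):
    """Collect the memory limit and usage counters found in cgroup_dir.

    cgroup_dir is assumed to be something like
    /sys/fs/cgroup/memory/lxc/<container name>; adjust it to your layout.
    """
    return {name: read_int(os.path.join(cgroup_dir, name))
            for name in CONTROL_FILES}
```

Calling memory_cgroup_report() on the container's cgroup directory and
printing the result shows the configured limit next to memory.failcnt. A
failcnt that keeps growing means the limit is being hit, which would match
the trace above. If memory.memsw.limit_in_bytes comes back as None, swap
accounting is probably not enabled, so only RAM is limited and not RAM plus
swap.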

> Aug 26 13:32:45 ff01 kernel: [87279.009502]  [<ffffffff81724448>] page_fault+0x28/0x30
> Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109]     0 23109    32444      422      29        0             0 console-kit-dae
> Aug 26 13:32:45 ff01 kernel: [87279.009779] [  714]  1000   714     4999      182      12        0             0 wrapper-linux-x
> ###################
> 
> Please tell me if you need additional information.
> 
> Regards,
> Danijel Vargek
> 
> -- 
> Danijel Vargek
> Systemadministrator Unix
> 
> Continum AG
> Bismarckallee 7b-d
> D-79098 Freiburg i. Br.
> Tel.: +49 761 217111-77
> Fax.: +49 761 217111-99
> http://www.continum.net
> 
> Sitz der Gesellschaft: Freiburg im Breisgau
> Registergericht: Amtsgericht Freiburg, HRB 6866
> Vorstand: Volker T. Mueller
> Vorsitzender d. Aufsichtsrats: Bernd Straub

