[lxc-users] Memory problem in LXC causes host to crash
Serge Hallyn
serge.hallyn at ubuntu.com
Thu Aug 28 13:15:02 UTC 2014
Quoting Danijel Vargek, Continum (danijel.vargek at continum.net):
> Hi all,
>
> We are running an LXC host with several testing containers (14 at the moment).
> The host itself is on Ubuntu 14.04, with 3.13.0-32 Kernel. The containers
> are running Debian Wheezy.
>
> From time to time the host machine crashes completely, probably due to containers
> eating up too much RAM. We have already limited every container via cgroup (CPU and RAM),
> but we still see this behaviour.
>
> Our suspicion is that Java on some of the containers isn't correctly limited, which
> leads to the host machine crashing.
>
> Has anybody had a similar experience, or is there something missing when limiting containers
> via cgroup?
>
> This is the syslog entry for the last crash (host machine + one of the containers):
>
> #### HOST ####
> Aug 26 13:33:10 node04 kernel: [87282.555841] Modules linked in:<4>[87282.555841] Call Trace:
> Aug 26 13:33:10 node04 kernel: [87282.555841] [<ffffffff811458c4>] perf_event_overflow+0x14/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841] [<ffffffff8136e9ed>] ? __write_lock_failed+0xd/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841] ---[ end trace 71798cbdeee56afd ]---
> Aug 26 13:33:10 node04 kernel: [87304.156008] RAX: ffff881018a2b2e8 RBX: ffff880815771e28 RCX: 0000000000000006
> Aug 26 13:33:10 node04 kernel: [87304.156008] Stack:
> Aug 26 13:33:10 node04 kernel: [87304.156008] [<ffffffff81152384>] pagefault_out_of_memory+0x14/0x80
> Aug 26 13:33:10 node04 kernel: [87304.156008] [<ffffffff81727fda>] do_page_fault+0x1a/0x70
> Aug 26 13:33:18 node04 kernel: [87312.204006] FS: 00007f58c7c59700(0000) GS:ffff88103f940000(0000) knlGS:0000000000000000
> Aug 26 13:33:18 node04 kernel: [87312.204006] Stack:
> Aug 26 13:33:18 node04 kernel: [87312.204006] ffff880ab78e3c70 ffffffff8160c3a8 ffff88103f914f00 ffff88103f914f00
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff810dba85>] smp_call_function_single+0xe5/0x190
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff810dbeb6>] smp_call_function_many+0x286/0x2d0
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff811814d5>] change_protection+0x65/0xb0
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff8172423c>] retint_signal+0x48/0x8c
> ##############
>
> #### Container ####
> Aug 26 13:32:45 ff01 kernel: [87279.009427] CPU: 0 PID: 773 Comm: java Tainted: GF O 3.13.0-32-generic #57-Ubuntu
> Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:
> Aug 26 13:32:45 ff01 kernel: [87279.009470] [<ffffffff811b388c>] mem_cgroup_oom_synchronize+0x4fc/0x540
This (mem_cgroup_oom_synchronize) suggests to me that in fact the container
is correctly limited. java exceeds what the container is allowed to
use, and so is killed.
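
If you want to double-check this from the host, the cgroup v1 memory
controller exposes per-cgroup counters as plain files. Below is a minimal
sketch in Python, not a definitive tool. The file names are the kernel's
cgroup v1 memory interface; the helper names read_memcg and limit_was_hit
are made up for illustration, and the directory you pass in is an
assumption about where your containers' cgroups live (with LXC 1.0 on
Ubuntu 14.04 it is typically something like /sys/fs/cgroup/memory/lxc/NAME,
but please verify that on your system).

```python
from pathlib import Path

# File names from the cgroup v1 memory controller interface.
COUNTERS = (
    "memory.limit_in_bytes",      # configured hard limit
    "memory.usage_in_bytes",      # current usage
    "memory.max_usage_in_bytes",  # high-water mark
    "memory.failcnt",             # number of times usage hit the limit
)


def read_memcg(cgroup_dir):
    """Return the counters found in one memory cgroup directory as ints.

    cgroup_dir is whatever directory holds the container's memory cgroup;
    the exact location is an assumption the caller has to supply.
    """
    base = Path(cgroup_dir)
    return {name: int((base / name).read_text().strip()) for name in COUNTERS}


def limit_was_hit(counters):
    """True if usage has reached the configured limit at least once."""
    return counters["memory.failcnt"] > 0
```

A non-zero memory.failcnt, together with memory.max_usage_in_bytes sitting
at or near memory.limit_in_bytes, would be consistent with the limit being
enforced for that container.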
> Aug 26 13:32:45 ff01 kernel: [87279.009502] [<ffffffff81724448>] page_fault+0x28/0x30
> Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109] 0 23109 32444 422 29 0 0 console-kit-dae
> Aug 26 13:32:45 ff01 kernel: [87279.009779] [ 714] 1000 714 4999 182 12 0 0 wrapper-linux-x
> ###################
>
> Please tell me if you need additional information.
>
> Regards,
> Danijel Vargek
>
> --
> Danijel Vargek
> Systemadministrator Unix
>
> Continum AG
> Bismarckallee 7b-d
> D-79098 Freiburg i. Br.
> Tel.: +49 761 217111-77
> Fax.: +49 761 217111-99
> http://www.continum.net
>
> Sitz der Gesellschaft: Freiburg im Breisgau
> Registergericht: Amtsgericht Freiburg, HRB 6866
> Vorstand: Volker T. Mueller
> Vorsitzender d. Aufsichtsrats: Bernd Straub