[lxc-users] LXD 2.14 - Ubuntu 16.04 - kernel 4.4.0-57-generic - SWAP continuing to grow
Saint Michael
venefax at gmail.com
Sat Jul 15 09:36:14 UTC 2017
I have a lot of memory management issues using pure LXC. In my case, the box
has only one container; I use LXC to be able to move my app around, not to
squeeze performance out of the hardware. What happens is that my database gets
killed by the OOM killer, even though there are gigabytes of RAM used for
cache. The memory manager kills applications instead of reclaiming memory from
the disk cache. How can this be avoided?
My config at the host is:
vm.hugepages_treat_as_movable=0
vm.hugetlb_shm_group=27
vm.nr_hugepages=2500
vm.nr_hugepages_mempolicy=2500
vm.nr_overcommit_hugepages=0
vm.overcommit_memory=0
vm.swappiness=0
vm.vfs_cache_pressure=150
vm.dirty_ratio=10
vm.dirty_background_ratio=5
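Two things in this config look relevant next to the log below: vm.swappiness=0,
which on recent kernels tells the kernel to avoid swapping anonymous memory
almost entirely (pushing it toward the OOM killer instead), and
vm.nr_hugepages=2500, which pins 2500 x 2 MB = 5 GB that the OOM path can never
reclaim, even though the log shows 2498 of those pages sitting free. Would
loosening both be the right move? A sketch of what I mean (the sysctl.d file
name is only an example):

sysctl vm.swappiness=1        # allow some swapping of anon pages instead of none
sysctl vm.nr_hugepages=0      # release the hugepage pool if nothing uses it
printf 'vm.swappiness=1\nvm.nr_hugepages=0\n' > /etc/sysctl.d/99-oom-example.conf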
This kernel OOM log shows the issue:
[9449866.130270] Node 0 hugepages_total=1250 hugepages_free=1250 hugepages_surp=0 hugepages_size=2048kB
[9449866.130271] Node 1 hugepages_total=1250 hugepages_free=1248 hugepages_surp=0 hugepages_size=2048kB
[9449866.130271] 46181 total pagecache pages
[9449866.130273] 33203 pages in swap cache
[9449866.130274] Swap cache stats: add 248571542, delete 248538339, find 69031185/100062903
[9449866.130274] Free swap = 0kB
[9449866.130275] Total swap = 8305660kB
[9449866.130276] 20971279 pages RAM
[9449866.130276] 0 pages HighMem/MovableOnly
[9449866.130276] 348570 pages reserved
[9449866.130277] 0 pages cma reserved
[9449866.130277] 0 pages hwpoisoned
[9449866.130278] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
[9449866.130286] [ 618] 0 618 87181 135 168 3 3 0 systemd-journal
[9449866.130288] [ 825] 0 825 11343 130 25 3 0 0 systemd-logind
[9449866.130289] [ 830] 0 830 1642 31 8 3 0 0 mcelog
[9449866.130290] [ 832] 996 832 26859 51 23 3 47 0 chronyd
[9449866.130292] [ 834] 0 834 4905 100 12 3 0 0 irqbalance
[9449866.130293] [ 835] 0 835 6289 177 15 3 0 0 smartd
[9449866.130295] [ 837] 81 837 28499 258 28 3 149 -900 dbus-daemon
[9449866.130296] [ 857] 0 857 1104 16 7 3 0 0 rngd
[9449866.130298] [ 859] 0 859 192463 37114 224 4 40630 0 NetworkManager
[9449866.130300] [ 916] 0 916 25113 229 50 3 0 -1000 sshd
[9449866.130302] [ 924] 0 924 6490 50 17 3 0 0 atd
[9449866.130303] [ 929] 0 929 35327 199 20 3 284 0 agetty
[9449866.130305] [ 955] 0 955 22199 3185 43 3 312 0 dhclient
[9449866.130307] [ 1167] 0 1167 6125 88 17 3 2 0 lxc-autostart
[9449866.130309] [ 1176] 0 1176 10818 275 24 3 38 0 systemd
[9449866.130310] [ 1188] 0 1188 13303 1980 29 3 36 0 systemd-journal
[9449866.130312] [ 1372] 99 1372 3881 2 12 3 45 0 dnsmasq
[9449866.130313] [ 1375] 81 1375 6108 77 17 3 39 -900 dbus-daemon
[9449866.130315] [ 1394] 0 1394 6175 46 15 3 168 0 systemd-logind
[9449866.130316] [ 1395] 0 1395 78542 1142 69 3 4 0 rsyslogd
[9449866.130317] [ 1397] 0 1397 1614 32 8 3 0 0 agetty
[9449866.130319] [ 1398] 0 1398 1614 31 8 3 0 0 agetty
[9449866.130320] [ 1400] 0 1400 1614 31 8 3 0 0 agetty
[9449866.130321] [ 1401] 0 1401 1614 2 8 3 30 0 agetty
[9449866.130322] [ 1402] 0 1402 1614 2 8 3 29 0 agetty
[9449866.130324] [ 1403] 0 1403 1614 31 8 3 0 0 agetty
[9449866.130325] [ 1404] 0 1404 1614 32 8 3 0 0 agetty
[9449866.130327] [ 1405] 0 1405 1614 32 8 3 0 0 agetty
[9449866.130328] [ 1406] 0 1406 1614 2 8 3 29 0 agetty
[9449866.130329] [ 1408] 0 1408 1614 2 8 3 30 0 agetty
[9449866.130330] [ 1409] 0 1409 1614 30 7 3 0 0 agetty
[9449866.130332] [18224] 0 18224 26456 0 43 4 404 0 VGAuthService
[9449866.130333] [18225] 0 18225 61032 95 58 3 258 0 vmtoolsd
[9449866.130335] [28660] 0 28660 26372 44 54 4 202 -1000 sshd
[9449866.130337] [18992] 998 18992 132859 876 54 3 13 0 polkitd
[9449866.130339] [23849] 0 23849 10744 370 23 3 0 -1000 systemd-udevd
[9449866.130340] [ 3484] 0 3484 184082 265 243 4 129 0 rsyslogd
[9449866.130342] [31175] 32 31175 14328 35 30 3 102 0 rpcbind
[9449866.130344] [31205] 0 31205 111747 0 65 3 343 0 abrtd
[9449866.130345] [31248] 0 31248 187819 30 167 4 291 0 abrt-dump-journ
[9449866.130347] [31303] 0 31303 172125 32 120 4 258 0 abrt-dump-journ
[9449866.130348] [16252] 0 16252 5659 129 17 3 24 0 crond
[9449866.130350] [11626] 0 11626 33235 25 15 3 130 0 crond
[9449866.130351] [11717] 0 11717 13897 109 26 3 3 -1000 auditd
[9449866.130353] [31764] 0 31764 51162 73 36 3 50 0 gssproxy
[9449866.130355] [31372] 0 31372 11441 557 21 3 0 -1000 systemd-udevd
[9449866.130357] [12835] 27 12835 23997106 18960777 43207 119 2018937 0 mysqld
[9449866.130359] [ 8109] 0 8109 17597 220 39 3 3 0 crond
[9449866.130361] [ 8185] 0 8185 28282 57 10 3 0 0 safe_asterisk1
[9449866.130363] [ 8186] 0 8186 901036 8104 344 7 3 0 asterisk
[9449866.130364] [ 8215] 0 8215 28282 57 10 4 0 0 safe_asterisk2
[9449866.130366] [ 8216] 0 8216 917544 8133 345 6 19 0 asterisk
[9449866.130367] [ 8265] 0 8265 28282 57 10 3 0 0 safe_asterisk3
[9449866.130369] [ 8266] 0 8266 934048 8203 347 7 1 0 asterisk
[9449866.130370] [ 8351] 0 8351 28282 57 10 3 0 0 safe_asterisk4
[9449866.130372] [ 8353] 0 8353 950557 8235 349 6 15 0 asterisk
[9449866.130373] [ 8400] 0 8400 28282 57 10 3 0 0 safe_asterisk5
[9449866.130375] [ 8401] 0 8401 901033 8122 345 7 1 0 asterisk
[9449866.130376] [ 8460] 0 8460 28282 57 9 3 0 0 safe_asterisk6
[9449866.130377] [ 8461] 0 8461 1148653 8136 361 8 47 0 asterisk
[9449866.130379] [ 8537] 0 8537 28282 57 9 3 0 0 safe_asterisk7
[9449866.130403] [ 8538] 0 8538 983574 8148 349 6 2 0 asterisk
[9449866.130405] [ 8594] 0 8594 28282 58 10 3 0 0 safe_asterisk8
[9449866.130406] [ 8596] 0 8596 950555 8131 350 7 61 0 asterisk
[9449866.130408] [ 8649] 0 8649 28282 57 9 4 0 0 safe_asterisk9
[9449866.130409] [ 8651] 0 8651 901033 8139 345 6 0 0 asterisk
[9449866.130411] [ 8712] 0 8712 28282 58 10 3 0 0 safe_asterisk10
[9449866.130413] [ 8714] 0 8714 901034 8114 342 6 14 0 asterisk
[9449866.130415] [14800] 0 14800 31930 117 18 3 0 0 screen
[9449866.130416] [14801] 0 14801 28284 55 12 3 0 0 audit.sh
[9449866.130419] [17583] 0 17583 28284 55 10 3 0 0 audit.sh
[9449866.130421] [17584] 0 17584 40172 279 35 3 0 0 mysql
[9449866.130422] [17585] 0 17585 28373 22 12 3 0 0 awk
[9449866.130424] Out of memory: Kill process 12835 (mysqld) score 926 or sacrifice child
[9449866.130661] Killed process 12835 (mysqld) total-vm:95988424kB, anon-rss:75843108kB, file-rss:0kB, shmem-rss:0kB
[9449872.448957] oom_reaper: reaped process 12835 (mysqld), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
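For scale: the table reports rss in 4 kB pages, so mysqld's 18960777 pages
matches the 75843108 kB anon-rss in the kill line exactly, roughly 72 GB, with
swap already exhausted (Free swap = 0kB of 8305660kB total):

$ echo "$((18960777 * 4)) kB"
75843108 kB
$ echo "approx $((18960777 * 4 / 1024 / 1024)) GB"
approx 72 GB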
Philip
On Sat, Jul 15, 2017 at 5:11 AM, Marat Khalili <mkh at rqc.ru> wrote:
> I'm using LXC, and I frequently observe some unused containers get swapped
> out, even though the system has plenty of RAM and no RAM limits are set.
> The only bad effect I observe is a couple of seconds' delay when you first
> log into them after some time. I guess this is absolutely normal, since the
> kernel tries to maximize the amount of memory available for disk caches.
>
> If you don't like this behavior, instead of trying to fine-tune kernel
> parameters, why not disable swap altogether? Many people run this way;
> it's mostly a matter of taste these days. (But first check your software
> for leaks.)
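>
> For reference, turning it off is just this -- a sketch, assuming swap is
> configured in /etc/fstab rather than a systemd swap unit:
>
> swapoff -a                                 # stop using swap immediately
> sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab  # keep it off across reboots
>
> Double-check the fstab edit before the next reboot.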
>
> > For example, our “server-4” machine shows 8G total RAM, 500MB free, 2.5G
> > available, and 5G of buff/cache. Yet, swap is at 5.5GB and has been slowly
> > growing over the past few days. It seems something is preventing the apps
> > from using the RAM.
>
> Did you identify what processes all this virtual memory belongs to?
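>
> A quick way to check is to read VmSwap out of /proc -- a sketch; run it as
> root so every process is visible:
>
> for f in /proc/[0-9]*/status; do
>   awk '/^Name:/{n=$2} /^VmSwap:/{print $2, n}' "$f"
> done | sort -n | tail -20
>
> That prints the 20 biggest swap users, with VmSwap in kB.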
>
> > To be honest, we have been battling lots of memory/swap issues using
> > LXD. We started with no tuning, but the app stack quickly ran out of
> > memory.
>
> LXC/LXD is hardly responsible for your app stack memory usage. Either you
> underestimated it or there's a memory leak somewhere.
>
> > Given all the issues we have had with memory and swap using LXD, we are
> > seriously considering moving back to the traditional VM approach until
> > LXC/LXD is better “baked”.
>
> Did your VMs use less memory? I don't think so. Limits could be better
> enforced, but VMs don't magically give you infinite RAM.
> --
>
> With Best Regards,
> Marat Khalili
>
> On July 14, 2017 9:58:57 PM GMT+03:00, Ron Kelley <rkelleyrtp at gmail.com>
> wrote:
>>
>> Wondering if anyone else has similar issues.
>>
>> We have 5x LXD 2.12 servers running (U16.04 - kernel 4.4.0-57-generic - 8G RAM, 19G SWAP). Each server is running about 50 LXD containers - WordPress w/Nginx and PHP7. The servers have been running for about 15 days now, and swap space continues to grow. In addition, the kswapd0 process starts consuming CPU until we flush the system cache via the "/bin/echo 3 > /proc/sys/vm/drop_caches" command.
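>>
>> For anyone reproducing the flush, it is just this, as root (the sync first is
>> what the kernel's vm.txt documentation recommends before drop_caches; it
>> treats the symptom, not the cause):
>> -------------------------
>> sync
>> /bin/echo 3 > /proc/sys/vm/drop_caches
>> -------------------------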
>>
>> Our LXD profile looks like this:
>> -------------------------
>> config:
>>   limits.cpu: "2"
>>   limits.memory: 512MB
>>   limits.memory.swap: "true"
>>   limits.memory.swap.priority: "1"
>> -------------------------
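>>
>> These land on an existing profile with the usual commands -- a sketch,
>> assuming the containers use the "default" profile (substitute yours):
>> -------------------------
>> lxc profile set default limits.memory 512MB
>> lxc profile set default limits.memory.swap true
>> -------------------------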
>>
>>
>> We also have added these to /etc/sysctl.conf
>> -------------------------
>> vm.swappiness=10
>> vm.vfs_cache_pressure=50
>> -------------------------
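>>
>> Reloaded without a reboot via:
>> -------------------------
>> sysctl -p /etc/sysctl.conf
>> sysctl vm.swappiness vm.vfs_cache_pressure   # verify the live values
>> -------------------------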
>>
>> A quick “top” output shows plenty of available Memory and buff/cache. But, for some reason, the system continues to swap out the app. For example, our “server-4” machine shows 8G total RAM, 500MB free, 2.5G available, and 5G of buff/cache. Yet, swap is at 5.5GB and has been slowly growing over the past few days. It seems something is preventing the apps from using the RAM.
>>
>>
>> To be honest, we have been battling lots of memory/swap issues using LXD. We started with no tuning, but the app stack quickly ran out of memory. After editing the profile to allow 512MB RAM per container (and restarting the container), the kswapd0 issue happens. Given all the issues we have had with memory and swap using LXD, we are seriously considering moving back to the traditional VM approach until LXC/LXD is better “baked”.
>>
>>
>> -Ron