<html><body>

<p>Serge E. Hallyn wrote:</p>

<blockquote><p>On Tue, Mar 14, 2017 at 02:29:01AM +0100, Benoit GEORGELIN – Association Web4all wrote:</p>

<blockquote><p>----- Original message -----</p>

<blockquote><p>From: “Simos Xenitellis” &lt;simos.lists@googlemail.com&gt; To: “lxc-users” &lt;lxc-users@lists.linuxcontainers.org&gt; Sent: Monday, 13 March 2017 20:22:03 Subject: Re: [lxc-users] Experience with large number of LXC/LXD containers On Sun, Mar 12, 2017 at 11:28 PM, Benoit GEORGELIN – Association Web4all &lt;benoit.georgelin@web4all.fr&gt; wrote:</p>

<blockquote><p>Hi lxc-users, I would like to know if you have any experience with a large number of LXC/LXD containers, in terms of performance, stability, and limitations. I'm wondering, for example, whether 100 containers behave the same as 1,000 or 10,000 with the same configuration (leaving aside what the containers are used for). I have been looking around for a couple of days for any user/admin feedback, but I'm not able to find any large deployments. Are there any resource limits or any maximum number that can be deployed on the same node? Besides the physical performance of the node, is there any specific behavior that a large number of LXC/LXD containers can experience? I'm not aware of any tests or limits that can occur besides the number of processes. But I'm sure there might be some technical constraints on the LXC/LXD side? Maybe on namespace availability, or any other technical layer used by LXC/LXD. I would be interested to hear about your experience, or any links/books/stories about such large deployments.</p></blockquote>

<p>This would be interesting to hear if someone can talk publicly about their large deployment. In any case, it should be possible to create, for example, 1000 web servers and then try to access each one and check any issues regarding the response time. Another test would be to install 1000 Wordpress installations and check again for the response time and resource usage. Such scripts to create this massive number of containers would also be helpful to replicate any issues in order to solve them. Simos</p></blockquote></blockquote></blockquote>

<p>Been reading this + here's a bit of info.</p>

<p>I've been running LXC since early deployment + now LXD.</p>

<p>There are a few big performance killers related to WordPress. If you keep these issues in mind, you'll be good.</p>

<p>1) I run 100s of sites across many containers on many machines.</p>

<pre>My business is private, high speed hosting, so I eat from my efforts.

No theory here.</pre>

<pre>I target WordPress site speed at 3000+ reqs/second, measured locally

using ab (ApacheBench). This is a crude tool + sufficient, as I run

5 concurrent keep-alive connections against a server for 30 seconds, with

the request cap (-n) set high enough that the time limit ends the run.</pre>

<pre>ab -k -t 30 -n 10000000 -c 5 $URL</pre>

<pre>This will crash most machines, unless they're tuned well.</pre>
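<p>To compare a run against the 3000+ reqs/second target without reading the report by eye, here's a minimal sketch that pulls the "Requests per second" line out of ab's text output. The function names and the <code>TARGET_RPS</code> constant are my own, not part of ab.</p>

<pre>
```python
import re
from typing import Optional

# Target taken from the post above; adjust to taste.
TARGET_RPS = 3000.0


def requests_per_second(ab_output: str) -> Optional[float]:
    """Return the mean requests/second reported by ab, or None if absent."""
    match = re.search(r"^Requests per second:\s+([0-9.]+)", ab_output, re.MULTILINE)
    return float(match.group(1)) if match else None


def meets_target(ab_output: str, target: float = TARGET_RPS) -> bool:
    """True when the report contains a rate at or above the target."""
    rps = requests_per_second(ab_output)
    return rps is not None and rps >= target
```
</pre>

<p>One way to use it: capture the ab report with <code>subprocess.run(..., capture_output=True, text=True)</code> and pass its <code>stdout</code> to <code>meets_target()</code>.</p>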

<p>2) Memory + CPU. The big killer of performance anywhere is swap thrash.</p>

<pre>If top shows swapping for more than a few seconds, your system is likely

heading toward a crash.</pre>

<pre>Fix: I tend to deploy OVH machines with 128G of memory, as this is enough

memory to handle huge spikes of memory usage across many sites, during

traffic spikes... then recover...</pre>

<pre>For example, running 100s of sites across many LXD containers, I've had

machines sustain 250,000+ reqs/hour every day for months.</pre>

<pre>At these traffic levels, &lt;1 core used sustained + 50%ish memory use.</pre>

<pre>Sites still show 3000+ reqs/sec using ab test above.</pre>
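<p>The "is it swapping for more than a few seconds" check can also be scripted rather than watched in top. This is a rough sketch that assumes a Linux host exposing the <code>pswpin</code> and <code>pswpout</code> counters in <code>/proc/vmstat</code>. The function names and the 5 second interval are illustrative choices, not anything from my own tooling.</p>

<pre>
```python
import time
from typing import Tuple


def parse_swap_counters(vmstat_text: str) -> Tuple[int, int]:
    """Extract the (pswpin, pswpout) page counters from /proc/vmstat text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swapped_pages(before: Tuple[int, int], after: Tuple[int, int]) -> int:
    """Pages swapped in plus pages swapped out between two samples."""
    return (after[0] - before[0]) + (after[1] - before[1])


def is_swapping(interval_s: float = 5.0, path: str = "/proc/vmstat") -> bool:
    """Sample the counters twice and report whether any swap I/O happened."""
    with open(path) as f:
        before = parse_swap_counters(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_swap_counters(f.read())
    return swapped_pages(before, after) > 0
```
</pre>

<p>A single positive sample isn't a crisis. It's repeated positives across consecutive intervals that match the "more than a few seconds" warning sign.</p>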

<p>3) Database: I run MariaDB rather than MySQL as it's smokin' fast.</p>

<pre>I also relocate /tmp to tmpfs, so temp file i/o runs at memory speed,

rather than disk speed.</pre>

<pre>This ensures all MariaDB temp select set files (for complex selects)

generate + access at memory speed.</pre>

<pre>Also PHP session /tmp files run at memory speed.</pre>

<pre>This is important to me, as many of my clients run large membership

sites. Many are &gt;40K members. These sites' performance would circle

the drain if /tmp was on disk.</pre>
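<p>To confirm /tmp really is on tmpfs inside a given container, something like the following sketch works on Linux by reading <code>/proc/mounts</code>. The function names are made up for illustration. If /tmp is not its own mount point, the lookup returns <code>None</code>, which means /tmp lives on whatever filesystem holds /.</p>

<pre>
```python
from typing import Optional


def fs_type_of(mount_point: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type mounted at mount_point, or None.

    Lines follow the /proc/mounts layout: device, mount point, type, options...
    The last matching line wins, since later mounts shadow earlier ones.
    """
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]
    return fs_type


def tmp_is_tmpfs(mounts_path: str = "/proc/mounts") -> bool:
    """True when /tmp is a separate mount backed by tmpfs."""
    with open(mounts_path) as f:
        return fs_type_of("/tmp", f.read()) == "tmpfs"
```
</pre>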

<p>4) Disk Thrash: Becomes the killer as traffic increases.</p>

<p>5) Apache Logging: For several clients I'm currently retuning my Apache logging to skip logging of successful serves of images, css, js, fonts.</p>

<pre>I'll still log non-200s, as these need to be debugged.</pre>

<pre>This can make a huge difference if memory pressure/use forces disk writes to

actually go to disk, rather than kernel filesystem i/o buffers.</pre>

<pre>Once memory pressure forces physical disk writes, disk i/o starves Apache from

quickly serving uncached content. Very ugly.</pre>

<pre>Right now I'm doing extensive filesystem testing, to reduce disk thrash during

traffic spikes + related memory pressure.</pre>
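<p>Before changing the logging, it can help to estimate how many log writes the rule would actually save. The sketch below is not Apache configuration. It only applies the same rule (skip 200s for static assets, keep everything else) to existing access log lines. It assumes the common/combined log format, and the extension list is my guess at "images, css, js, fonts".</p>

<pre>
```python
import re
from typing import Iterable
from urllib.parse import urlsplit

# Assumed set of static asset extensions; extend as needed.
STATIC_EXTENSIONS = (
    ".png", ".jpg", ".jpeg", ".gif", ".ico", ".svg",
    ".css", ".js", ".woff", ".woff2", ".ttf",
)

# Matches the quoted request line and the status code that follows it.
REQUEST_RE = re.compile(r'"\S+ (\S+) [^"]*" (\d{3}) ')


def would_skip(log_line: str) -> bool:
    """True for a 200 response to a static asset; everything else is kept."""
    match = REQUEST_RE.search(log_line)
    if not match:
        return False
    path = urlsplit(match.group(1)).path.lower()
    status = int(match.group(2))
    return status == 200 and path.endswith(STATIC_EXTENSIONS)


def skipped_fraction(lines: Iterable[str]) -> float:
    """Fraction of log lines that the rule would stop writing."""
    lines = list(lines)
    if not lines:
        return 0.0
    return sum(would_skip(line) for line in lines) / len(lines)
```
</pre>

<p>Feed it the lines of an existing access log. A high fraction suggests most log writes come from static assets.</p>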

<p>6) Net Connection: If you're running 1000s of containers, best also check adapter saturation.</p>

<pre>I use 10Gig adapters + even at extreme traffic levels, they barely

reach 10% saturation.</pre>

<pre>This means 10Gig adapters are a must for me, as 10% is 1Gig, so using 1Gig

adapters, site speed would begin to throttle, based on adapter saturation,

which would be a bear to debug.</pre>
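<p>The saturation arithmetic above (10% of 10Gig is 1Gig) can be checked against live counters. A minimal sketch, assuming a Linux host where <code>/sys/class/net/&lt;iface&gt;/statistics/tx_bytes</code> is available. The function names are mine and any interface name you pass in is your own.</p>

<pre>
```python
def read_tx_bytes(iface: str) -> int:
    """Read the transmitted byte counter for a network interface (Linux sysfs)."""
    with open(f"/sys/class/net/{iface}/statistics/tx_bytes") as f:
        return int(f.read())


def utilization(bytes_before: int, bytes_after: int,
                interval_s: float, link_gbit: float) -> float:
    """Fraction of link capacity used between two byte counter samples."""
    bits_per_second = (bytes_after - bytes_before) * 8 / interval_s
    return bits_per_second / (link_gbit * 1e9)
```
</pre>

<p>For example, 125,000,000 bytes sent in one second is 1 Gbit/s: a utilization of 0.1 on a 10Gig adapter, and 1.0 (fully saturated) on a 1Gig adapter.</p>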

<p>7) Apache: I've taken to setting up Apache to kill off processes after they've served anywhere from 10K to 100K requests.</p>

<pre>This ensures the kernel can reclaim their resources, which also helps

escape swapping.</pre>

<pre>If you have 100,000s+ Apache processes running, with no kill off, then eventually

they can eat up a massive amount of memory, which takes a long time

to reclaim, depending on other MPM config settings.</pre>

<p>So… General rule of thumb. Tune your entire WAMPL stack to run from memory, rather than disk:</p>

<pre>WAMPL - WordPress running on Apache + PHP + MariaDB + Linux</pre>

<p>If your sites run at memory speed, it makes no real difference how many containers you run. Context switching might come into play if many of the sites were high traffic sites.</p>

<p>If problems occur, just look at your Apache logs across all containers. Move the site with highest traffic to another physical machine.</p>
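<p>Finding the busiest site can be as simple as counting lines per access log. Here's a rough sketch. The mapping of site names to log paths is something you supply, since where each container's logs live depends on your setup.</p>

<pre>
```python
from typing import Dict, List, Tuple


def count_requests(log_path: str) -> int:
    """Count lines (one per request) in an access log."""
    with open(log_path, "rb") as f:
        return sum(1 for _ in f)


def rank_sites(log_paths: Dict[str, str]) -> List[Tuple[str, int]]:
    """Return (site, request count) pairs, busiest first."""
    counts = {site: count_requests(path) for site, path in log_paths.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)
```
</pre>

<p>The first entry of <code>rank_sites(...)</code> is the candidate to move to another physical machine.</p>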

<p>Or, if top shows swapping, add more memory.</p>


</body></html>