[lxc-users] System daemons don't start inside containers after some number of containers was created

Fri Jul 21 08:22:58 UTC 2017

Found it - https://github.com/lxc/lxd/blob/master/doc/production-setup.md
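The message does not say which setting on that page resolved the problem. As a rough sketch, assuming the cause was a per-user kernel limit such as the inotify limits (an assumption, not confirmed in this thread), the current values on the host can be printed like this. The `show_limits` name and the choice of keys are illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: prints the current value of a few kernel limits.
# The selection of keys is an assumption, not something stated in the thread.
show_limits() {
    for key in fs.inotify.max_user_instances \
               fs.inotify.max_user_watches \
               fs.inotify.max_queued_events; do
        # sysctl keys map to files under /proc/sys, with dots replaced by slashes
        path="/proc/sys/$(echo "$key" | tr . /)"
        if [ -r "$path" ]; then
            echo "$key = $(cat "$path")"
        else
            echo "$key = (not readable here)"
        fi
    done
}

show_limits
```

Comparing these values with the ones recommended in the linked document would show whether the host is still running with distribution defaults.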

On 21 July 2017 at 15:44, Ivan Kurnosov <zerkms at zerkms.ru> wrote:

> Hi,
>
> I have a very strange situation, and I'm not even sure this is the right
> mailing list to post to, but let's see.
>
> I have successfully reproduced it on 2 machines: on real hardware and in a
> VirtualBox VM.
>
> The host OS is Ubuntu 16.04.2 (with all updates), and I'm creating a bunch
> of containers. To reproduce the problem I'm using the following script:
>
> lxc launch ubuntu:16.04 container-1
> sleep 5
> lxc exec container-1 ps ax
>
> lxc launch ubuntu:16.04 container-2
> sleep 5
> lxc exec container-2 ps ax
>
> lxc launch ubuntu:16.04 container-3
> sleep 5
> lxc exec container-3 ps ax
> ....
>
> and so on, up to container-30.
>
> At some point (different on each machine, but consistent) a container is
> created and started but has /sbin/init as its ONLY running process: no
> systemd services or any other system daemons are running there apart from
> /sbin/init. After that point, every further container I create is broken
> in the very same way.
>
> Here is how the output from the script above looks at the boundary between
> "proper containers" and "broken containers":
>
> Creating container-12
> Starting container-12
>   PID TTY      STAT   TIME COMMAND
>     1 ?        Ss     0:00 /sbin/init
>    53 ?        Ss     0:00 /lib/systemd/systemd-udevd
>    57 ?        Ss     0:00 /lib/systemd/systemd-journald
>   236 ?        Ss     0:00 /sbin/dhclient -1 -v -pf /run/dhclient.eth0.pid -lf /var/lib/dhcp/dhclient.eth0.leases -I -df /var/lib/dhcp/dhclient6.eth0.leases eth0
>   292 ?        Rs     0:00 /usr/bin/python3 /usr/bin/cloud-init init
>   295 ?        S      0:00 /bin/sh -c tee -a /var/log/cloud-init-output.log
>   296 ?        S      0:00 tee -a /var/log/cloud-init-output.log
>   300 ?        Rs+    0:00 ps ax
> Creating container-13
> Starting container-13
>   PID TTY      STAT   TIME COMMAND
>     1 ?        Ss     0:00 /sbin/init
>   221 ?        Rs+    0:00 ps ax
>
> All containers from container-13 onward are created "broken". If I create
> another container now, it does not start properly either.
>
> But if I leave only 11 containers running and then create another one or
> restart one of the "broken" ones, it starts fine.
>
> It is only reproducible with Ubuntu 16.04 containers; 17.04 containers run
> fine (at least up to 30 simultaneously running containers).
>
> The number of containers the OS "allows" to run properly differs between
> the two machines I tried (12 on the real hardware and 20 in the VirtualBox
> VM).
> There is plenty of memory available, so memory is not the problem.
>
> There is nothing particularly interesting in the host machine's syslog or
> lxd.log.
>
> And in the container there are no logs to read, since journald and rsyslog
> were never started.
>
>
> Any suggestions on where I could dig further?
>
> --
> With best regards, Ivan Kurnosov
>
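The reproduction script quoted above repeats the same three commands for every container. A minimal sketch of the same steps as a loop, which also stops at the first container that shows only /sbin/init, could look like the following. The function names `only_init_running` and `find_first_broken` are hypothetical, and the check assumes the `ps ax` column layout shown in the quoted output.

```shell
#!/bin/sh
# Hypothetical helper: reads "ps ax" output on stdin and succeeds (exit 0)
# when /sbin/init is the only process besides the "ps ax" call itself.
only_init_running() {
    awk '
        NR == 1 { next }                       # skip the PID/TTY/STAT/TIME/COMMAND header
        $5 == "ps" && $6 == "ax" { next }      # ignore the ps invocation itself
        $5 == "/sbin/init" { init = 1; next }  # remember that init was seen
        { other = 1 }                          # any other process means daemons started
        END { exit !(init && !other) }
    '
}

# Launches container-1 .. container-N (default 30) as in the quoted script
# and reports the first container that looks broken.
find_first_broken() {
    max=${1:-30}
    i=1
    while [ "$i" -le "$max" ]; do
        name="container-$i"
        lxc launch ubuntu:16.04 "$name"
        sleep 5
        if lxc exec "$name" -- ps ax | only_init_running; then
            echo "$name: only /sbin/init is running"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $max containers started their daemons"
}
```

Calling `find_first_broken 30` on a host with LXD installed would launch the containers one by one. The 5-second sleep is taken from the original script and may be too short on slower machines.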

-- 
With best regards, Ivan Kurnosov