[lxc-users] Containers seem to cannot spawn new processes

Lukas Schulze lspcity at gmail.com
Fri Sep 19 07:54:14 UTC 2014


Hi,

I'm still experiencing problems running LXC containers on my Debian host.
They stop working at random: sometimes they run for days or weeks, and
sometimes they stop working after 10 minutes.

In July I already sent an e-mail to this list with detailed information:
https://lists.linuxcontainers.org/pipermail/lxc-users/2014-July/007383.html

I investigated further, and it seems that the container is no longer able
to spawn new processes.
I cannot log in via SSH or lxc-console, and postfix stops working as well,
but nginx is still serving the website.

I have not set any restrictions in my cgroups.
There is enough space left on the HDDs (103 GB available) and there are
enough free inodes (> 90% available).
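
The inode check could be scripted roughly as follows. This is only a
minimal sketch: the function name inode_hogs and the 90% threshold are
illustrative assumptions, not part of the setup described here. It reads
`df -i` style output on stdin and prints every mount point whose IUse% is at
or above the threshold.

```shell
# Print mount points whose inode usage (IUse%) is at or above a threshold.
# Reads `df -i` style output on stdin. The function name and the default
# threshold of 90 percent are hypothetical, chosen for illustration only.
inode_hogs() {
    threshold="${1:-90}"
    awk -v t="$threshold" '
        # Skip the header line; only consider rows with a percentage in column 5.
        NR > 1 && $5 ~ /^[0-9]+%$/ {
            use = $5
            sub(/%/, "", use)
            if (use + 0 >= t) print $6, $5
        }'
}
```

It could be run on the host and inside the container as
`df -iP | inode_hogs 90`; the -P option keeps each filesystem on a single
line, so long device names do not wrap. No output means no filesystem is
close to running out of inodes.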

At the end of this message I have attached some log information:
   - available inodes via $ df -i
   - TRACE log for a container, from starting it until I stopped it because
     it had stopped working
   - a process tree of the hanging container, obtained via $ ps -ef --forest
     from the host system

Hopefully someone can help me. This is a really annoying problem.
If you need further information, just let me know.

Best regards
Lukas



# HOST
      $ uname -a
      Linux host.example.org 3.2.0-4-amd64 #1 SMP Debian 3.2.60-1+deb7u1 x86_64 GNU/Linux

      $ df -i
      Filesystem                        Inodes  IUsed    IFree IUse% Mounted on
      rootfs                          12517376 652127 11865249    6% /
      udev                             8258632    421  8258211    1% /dev
      tmpfs                            8260286    401  8259885    1% /run
      /dev/disk/by-uuid/0281ee...     12517376 652127 11865249    6% /
      tmpfs                            8260286     18  8260268    1% /run/lock
      tmpfs                            8260286      2  8260284    1% /run/shm
      /dev/md1                          131072    453   130619    1% /boot


# CONTAINER
      $ df -i
      Filesystem                        Inodes  IUsed    IFree IUse% Mounted on
      rootfs                          12517376 652399 11864977    6% /
      /dev/disk/by-uuid/0281ee...     12517376 652399 11864977    6% /
      tmpfs                            8260286     69  8260217    1% /run
      tmpfs                            8260286      1  8260285    1% /run/lock
      tmpfs                            8260286      2  8260284    1% /run/shm


# CONFIG of a container
      lxc.rootfs = /var/lib/lxc/mail/rootfs
      lxc.include = /usr/share/lxc/config/debian.common.conf
      lxc.mount = /var/lib/lxc/mail/fstab
      lxc.utsname = mail
      lxc.arch = amd64

      lxc.network.type = veth
      lxc.network.name = veth12
      lxc.network.flags = up
      lxc.network.link = br0
      lxc.network.veth.pair = veth12-sid
      lxc.network.ipv4 = 10.1.1.12/24
      lxc.network.ipv4.gateway = 10.1.1.254


# CONFIG of /usr/share/lxc/config/debian.common.conf
      # Default pivot location
      lxc.pivotdir = lxc_putold

      # Default mount entries
      lxc.mount.entry = proc proc proc nodev,noexec,nosuid 0 0
      lxc.mount.entry = sysfs sys sysfs defaults 0 0
      lxc.mount.entry = /sys/fs/fuse/connections sys/fs/fuse/connections none bind,optional 0 0

      # Default console settings
      lxc.tty = 4
      lxc.pts = 1024

      # Default capabilities
      lxc.cap.drop = sys_module mac_admin mac_override sys_time

      # When using LXC with apparmor, the container will be confined by default.
      # If you wish for it to instead run unconfined, copy the following line
      # (uncommented) to the container's configuration file.
      #lxc.aa_profile = unconfined

      # To support container nesting on an Ubuntu host while retaining most of
      # apparmor's added security, use the following two lines instead.
      #lxc.aa_profile = lxc-container-default-with-nesting
      #lxc.hook.mount = /usr/share/lxc/hooks/mountcgroups

      # If you wish to allow mounting block filesystems, then use the following
      # line instead, and make sure to grant access to the block device and/or loop
      # devices below in lxc.cgroup.devices.allow.
      #lxc.aa_profile = lxc-container-default-with-mounting

      # Default cgroup limits
      lxc.cgroup.devices.deny = a
      ## Allow any mknod (but not using the node)
      lxc.cgroup.devices.allow = c *:* m
      lxc.cgroup.devices.allow = b *:* m
      ## /dev/null and zero
      lxc.cgroup.devices.allow = c 1:3 rwm
      lxc.cgroup.devices.allow = c 1:5 rwm
      ## consoles
      lxc.cgroup.devices.allow = c 5:0 rwm
      lxc.cgroup.devices.allow = c 5:1 rwm
      ## /dev/{,u}random
      lxc.cgroup.devices.allow = c 1:8 rwm
      lxc.cgroup.devices.allow = c 1:9 rwm
      ## /dev/pts/*
      lxc.cgroup.devices.allow = c 5:2 rwm
      lxc.cgroup.devices.allow = c 136:* rwm
      ## rtc
      lxc.cgroup.devices.allow = c 254:0 rm
      ## fuse
      lxc.cgroup.devices.allow = c 10:229 rwm
      ## tun
      lxc.cgroup.devices.allow = c 10:200 rwm
      ## full
      lxc.cgroup.devices.allow = c 1:7 rwm
      ## hpet
      lxc.cgroup.devices.allow = c 10:228 rwm
      ## kvm
      lxc.cgroup.devices.allow = c 10:232 rwm
      ## To use loop devices, copy the following line to the container's
      ## configuration file (uncommented).
      #lxc.cgroup.devices.allow = b 7:* rwm

      # Blacklist some syscalls which are not safe in privileged
      # containers
      lxc.seccomp = /usr/share/lxc/config/common.seccomp


# PROCESS TREE for a container
      root     12895     1  0 Sep15 ?        00:00:00 lxc-start -d -n mail
      root     12913 12895  0 Sep15 ?        00:00:01  \_ init [3]
      root     14007 12913  0 Sep15 ?        00:00:00      \_ /usr/sbin/syslogd --no-forward
      root     14114 12913  0 Sep15 ?        00:00:00      \_ /usr/sbin/cron
      root     32282 14114  0 15:17 ?        00:00:00      |   \_ /USR/SBIN/CRON
      root     14132 12913  0 Sep15 ?        00:00:00      \_ /usr/sbin/sshd
      root     32409 14132  0 15:17 ?        00:00:00      |   \_ sshd: root [priv]
      101      32410 32409  0 15:17 ?        00:00:00      |       \_ [sshd] <defunct>
      110      14161 12913  0 Sep15 ?        00:00:02      \_ /usr/sbin/spamass-milter -P /var/run/spamass/spamass.pid -f -p /var/spool/postfix/spamass/spamass.sock -u spamass-milter -i 127.0.0.1 -m -r -1 -I
      root     14275 12913  0 Sep15 ?        00:00:00      \_ nginx: master process /usr/sbin/nginx
      www-data 14276 14275  0 Sep15 ?        00:00:02      |   \_ nginx: worker process
      www-data 14277 14275  0 Sep15 ?        00:00:10      |   \_ nginx: worker process
      www-data 14278 14275  0 Sep15 ?        00:00:10      |   \_ nginx: worker process
      www-data 14280 14275  0 Sep15 ?        00:00:08      |   \_ nginx: worker process
      root     14310 12913  0 Sep15 ?        00:00:00      \_ /bin/sh /usr/bin/mysqld_safe
      sshd     14801 14310  0 Sep15 ?        00:01:57      |   \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306
      root     14803 14310  0 Sep15 ?        00:00:00      |   \_ logger -t mysqld -p daemon.error
      root     14595 12913  0 Sep15 ?        00:00:05      \_ php-fpm: master process (/etc/php5/fpm/php-fpm.conf)
      www-data 14599 14595  0 Sep15 ?        00:00:00      |   \_ php-fpm: pool www
      www-data 14600 14595  0 Sep15 ?        00:00:00      |   \_ php-fpm: pool www
      www-data 14604 14595  0 Sep15 ?        00:00:00      |   \_ php-fpm: pool vmail
      root     14890 12913  0 Sep15 ?        00:00:00      \_ /usr/sbin/dovecot -c /etc/dovecot/dovecot.conf
      105      14895 14890  0 Sep15 ?        00:00:00      |   \_ dovecot/anvil
      root     14896 14890  0 Sep15 ?        00:00:00      |   \_ dovecot/log
      105      29395 14890  0 15:11 ?        00:00:00      |   \_ dovecot/auth
      munin    15012 12913  0 Sep15 ?        00:07:18      \_ /usr/sbin/clamd -c /etc/clamav/clamd.conf
      munin    15276 12913  0 Sep15 ?        00:01:59      \_ /usr/bin/freshclam -d --quiet --config-file=/etc/clamav/freshclam.conf
      munin    15287 12913  0 Sep15 ?        00:00:02      \_ /usr/sbin/clamav-milter --config-file=/etc/clamav/clamav-milter.conf
      root     15356 12913  0 Sep15 ?        00:00:01      \_ /usr/lib/postfix/master
      ntp      15370 15356  0 Sep15 ?        00:00:00      |   \_ qmgr -l -t fifo -u
      ntp      15485 15356  0 Sep15 ?        00:00:00      |   \_ tlsmgr -l -t unix -u -c
      ntp       3635 15356  0 Sep17 ?        00:00:00      |   \_ anvil -l -t unix -u -c
      ntp      26072 15356  0 15:05 ?        00:00:00      |   \_ pickup -l -t fifo -u -c
      ntp      29393 15356  0 15:11 ?        00:00:00      |   \_ smtpd -n smtp -t inet -u -c -o stress= -s 2
      ntp      29394 15356  0 15:11 ?        00:00:00      |   \_ proxymap -t unix -u
      ntp      29439 15356  0 15:12 ?        00:00:00      |   \_ smtpd -n smtp -t inet -u -c -o stress= -s 2
      ntp      30408 15356  0 15:15 ?        00:00:00      |   \_ smtpd -n smtp -t inet -u -c -o stress= -s 2
      root     15364 12913  0 Sep15 pts/19   00:00:00      \_ /sbin/getty 38400 console
      root     15365 12913  0 Sep15 pts/15   00:00:00      \_ /sbin/getty 38400 tty1 linux
      root     15366 12913  0 Sep15 pts/16   00:00:00      \_ /sbin/getty 38400 tty2 linux
      root     15367 12913  0 Sep15 pts/17   00:00:00      \_ /sbin/getty 38400 tty3 linux
      root     15368 12913  0 Sep15 pts/18   00:00:00      \_ /sbin/getty 38400 tty4 linux
      root      1045 12913  0 06:44 ?        00:00:05      \_ /usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir=/var/lib/spamassassin -u debian-spamd -g debian-spamd -d --pidfile=/var/run/spamd.pid
      zabbix    1046  1045  0 06:44 ?        00:00:00          \_ spamd child
      zabbix    1047  1045  0 06:44 ?        00:00:00          \_ spamd child
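
The tree above contains one defunct entry, the [sshd] child of the
"sshd: root [priv]" process. A small filter like the following might help to
spot such entries quickly the next time a container hangs. It is only a
sketch: the function name list_defunct is hypothetical, and it assumes
unwrapped `ps -ef` output with one process per line on stdin.

```shell
# List defunct (zombie) processes together with their parent PID.
# Reads unwrapped `ps -ef` output on stdin; in that format column 1 is the
# user, column 2 the PID and column 3 the parent PID.
# The function name is hypothetical, chosen for illustration only.
list_defunct() {
    awk '/<defunct>/ { print "pid=" $2, "ppid=" $3, "user=" $1 }'
}
```

On the host it could be used as `ps -ef --forest | list_defunct`. A defunct
process has exited but has not yet been reaped by its parent, so the ppid
column shows which process to look at next.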


# TRACE log for a container from starting it till I stop it
      lxc-start 1411109478.750 INFO     lxc_start_ui - using rcfile /var/lib/lxc/mail/config
      lxc-start 1411109478.753 WARN     lxc_log - lxc_log_init called with log already initialized
      lxc-start 1411109478.759 INFO     lxc_monitor - using monitor sock name lxc/ad055575fe28ddd5//var/lib/lxc
      lxc-start 1411109478.761 DEBUG    lxc_conf - allocated pty '/dev/pts/15' (5/6)
      lxc-start 1411109478.761 DEBUG    lxc_conf - allocated pty '/dev/pts/16' (7/8)
      lxc-start 1411109478.761 DEBUG    lxc_conf - allocated pty '/dev/pts/17' (9/10)
      lxc-start 1411109478.761 DEBUG    lxc_conf - allocated pty '/dev/pts/18' (11/12)
      lxc-start 1411109478.761 INFO     lxc_conf - tty's configured
      lxc-start 1411109478.761 DEBUG    lxc_start - sigchild handler set
      lxc-start 1411109478.761 DEBUG    lxc_console - no console peer
      lxc-start 1411109478.761 INFO     lxc_start - 'mail' is initialized
      lxc-start 1411109478.869 DEBUG    lxc_start - Dropping cap_sys_boot
      lxc-start 1411109478.879 DEBUG    lxc_conf - instanciated veth 'veth12-sid/vethLDVH7K', index is '96'
      lxc-start 1411109478.879 INFO     lxc_cgroup - cgroup driver cgroupfs initing for mail
      lxc-start 1411109478.882 DEBUG    lxc_cgfs - cgroup 'devices.deny' set to 'a'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c *:* m'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'b *:* m'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:3 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:5 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 5:0 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 5:1 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:8 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:9 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 5:2 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 136:* rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 254:0 rm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:229 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:200 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:7 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:228 rwm'
      lxc-start 1411109478.883 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:232 rwm'
      lxc-start 1411109478.883 INFO     lxc_cgfs - cgroup has been setup
      lxc-start 1411109478.921 DEBUG    lxc_conf - move 'veth12' to '19912'
      lxc-start 1411109478.922 DEBUG    lxc_start - Dropped cap_sys_boot
      lxc-start 1411109478.924 DEBUG    lxc_conf - mounted '/var/lib/lxc/mail/rootfs' on '/usr/lib/lxc/rootfs'
      lxc-start 1411109478.924 INFO     lxc_conf - 'mail' hostname has been setup
      lxc-start 1411109478.950 DEBUG    lxc_conf - 'veth12' has been setup
      lxc-start 1411109478.950 INFO     lxc_conf - network has been setup
      lxc-start 1411109478.950 DEBUG    lxc_conf - Set exec command to /sbin/init
      lxc-start 1411109478.950 INFO     lxc_conf - Autodev not required.
      lxc-start 1411109478.951 INFO     lxc_conf - mount points have been setup
      lxc-start 1411109478.952 DEBUG    lxc_conf - mounted 'proc' on '/usr/lib/lxc/rootfs/proc', type 'proc'
      lxc-start 1411109478.952 DEBUG    lxc_conf - mounted 'sysfs' on '/usr/lib/lxc/rootfs/sys', type 'sysfs'
      lxc-start 1411109478.952 INFO     lxc_conf - failed to mount '/sys/fs/fuse/connections' on '/usr/lib/lxc/rootfs/sys/fs/fuse/connections' (optional): No such file or directory
      lxc-start 1411109478.952 INFO     lxc_conf - mount points have been setup
      lxc-start 1411109478.952 INFO     lxc_conf - console has been setup
      lxc-start 1411109478.953 INFO     lxc_conf - 4 tty(s) has been setup
      lxc-start 1411109478.953 INFO     lxc_conf - I am 1, /proc/self points to '1'
      lxc-start 1411109478.954 DEBUG    lxc_conf - created '/usr/lib/lxc/rootfs/lxc_putold' directory
      lxc-start 1411109478.954 DEBUG    lxc_conf - mountpoint for old rootfs is '/usr/lib/lxc/rootfs/lxc_putold'
      lxc-start 1411109478.954 DEBUG    lxc_conf - pivot_root syscall to '/usr/lib/lxc/rootfs' successful
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/dev/pts'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/run/lock'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/run/shm'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/sys/fs/cgroup'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/proc'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/boot'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/dev'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/run'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold/sys'
      lxc-start 1411109478.954 DEBUG    lxc_conf - umounted '/lxc_putold'
      lxc-start 1411109478.955 INFO     lxc_conf - created new pts instance
      lxc-start 1411109478.955 INFO     lxc_conf - set personality to '0x0'
      lxc-start 1411109478.955 DEBUG    lxc_conf - drop capability 'sys_module' (16)
      lxc-start 1411109478.955 DEBUG    lxc_conf - drop capability 'mac_admin' (33)
      lxc-start 1411109478.955 DEBUG    lxc_conf - drop capability 'mac_override' (32)
      lxc-start 1411109478.955 DEBUG    lxc_conf - drop capability 'sys_time' (25)
      lxc-start 1411109478.955 DEBUG    lxc_conf - capabilities have been setup
      lxc-start 1411109478.955 NOTICE   lxc_conf - 'mail' is setup.
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.deny' set to 'a'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c *:* m'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'b *:* m'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:3 rwm'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:5 rwm'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 5:0 rwm'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 5:1 rwm'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:8 rwm'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:9 rwm'
      lxc-start 1411109478.955 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 5:2 rwm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 136:* rwm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 254:0 rm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:229 rwm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:200 rwm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 1:7 rwm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:228 rwm'
      lxc-start 1411109478.956 DEBUG    lxc_cgfs - cgroup 'devices.allow' set to 'c 10:232 rwm'
      lxc-start 1411109478.956 INFO     lxc_cgfs - cgroup has been setup
      lxc-start 1411109478.956 NOTICE   lxc_start - exec'ing '/sbin/init'
      lxc-start 1411109478.958 NOTICE   lxc_start - '/sbin/init' started with pid '19912'
      lxc-start 1411109478.959 DEBUG    lxc_utmp - Added '/proc/19912/root/run' to inotifywatch
      lxc-start 1411109478.959 WARN     lxc_start - invalid pid for SIGCHLD
      lxc-start 1411109478.959 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109478.959 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109478.959 DEBUG    lxc_commands - 'mail' is in 'RUNNING' state
      lxc-start 1411109478.976 DEBUG    lxc_utmp - got inotify event 2 for utmp
      lxc-start 1411109478.977 DEBUG    lxc_utmp - got inotify event 256 for initctl
      lxc-start 1411109478.977 DEBUG    lxc_utmp - got inotify event 2 for utmp
      lxc-start 1411109479.023 DEBUG    lxc_utmp - got inotify event 256 for .clean
      lxc-start 1411109936.513 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.514 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.514 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.514 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.514 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.514 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.516 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109936.516 DEBUG    lxc_commands - peer has disconnected
      lxc-start 1411109996.965 DEBUG    lxc_start - container init process exited
      lxc-start 1411109997.025 DEBUG    lxc_start - unknown exit status for init: 9
      lxc-start 1411109997.025 INFO     lxc_error - child <19912> ended on signal (9)
      lxc-start 1411109997.025 WARN     lxc_conf - failed to remove interface 'veth12'
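
The second field of each log entry is a Unix timestamp, so the lifetime of
this container run can be read off the log. The following is a minimal
sketch with a hypothetical function name (log_span_seconds). It only looks
at lines whose first field is "lxc-start", so wrapped continuation lines are
ignored.

```shell
# Print the number of whole seconds between the first and the last
# lxc-start log entry read from stdin.
# The function name is hypothetical, chosen for illustration only.
log_span_seconds() {
    awk '$1 == "lxc-start" {
             ts = int($2)              # drop the fractional part
             if (first == "") first = ts
             last = ts
         }
         END { if (first != "") print last - first }'
}
```

For the log above, the first entry is at 1411109478 and the last at
1411109997, which gives 519 seconds. With GNU date, a single timestamp can
be made readable with `date -u -d @1411109478`.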