[lxc-devel] Potential deadlock with lxcfs and lxc-freeze
Fabian Grünbichler
f.gruenbichler at proxmox.com
Thu Feb 11 09:47:00 UTC 2016
Hello,
some of our users encounter a strange issue when using lxc-freeze on a container
using lxcfs. Sometimes, lxc-freeze is unable to freeze a process inside the
container that is accessing files in /proc that are provided by lxcfs. The
process(es) in question hang in FUSE's request_wait_answer(), and the associated
lxcfs process in futex_wait_queue_me (according to ps faxl).
This is quite surprising, because lxcfs is not part of the cgroup that is
frozen, and should thus not be affected by a call to lxc-freeze. A similar, but
NOT surprising, behaviour can be observed when mounting a FUSE file system in
the container itself (e.g., create /dev/fuse and mount an sshfs inside the CT),
running find in a loop on the mounted FUSE fs in the container and trying to
lxc-freeze the container. In that case, the problem is that the kernel freezer
does not know in which order the processes would need to be frozen in order to
avoid a deadlock. I don't see how this would apply to lxcfs (running on the
host) and a process accessing it (in the container) though.
A test setup that seems to work (but takes a while to trigger):
1) Log into container and do:
$ while : ; do uptime; done
2) On host do:
$ i=0; while : ; do let i++; echo freeze $i && lxc-freeze -n NAME; echo unfreeze && lxc-unfreeze -n NAME; done
At some point, the output of 2 will stop, and 'ps faxl' will show something like
this:
# ps faxl |grep lxcfs
4 0 3774 1 20 0 527956 2132 futex_wait_queue_me Ssl ?
0:10 /usr/bin/lxcfs -f -s -o allow_other /var/lib/lxcfs/
5 0 22927 3774 20 0 380220 788 wait S ?
0:00 \_ /usr/bin/lxcfs -f -s -o allow_other /var/lib/lxcfs/
1 0 22928 22927 20 0 380352 788 futex_wait_queue_me S ?
0:00 \_ /usr/bin/lxcfs -f -s -o allow_other /var/lib/lxcfs/
# (ps faxl portion for the container, no lxc-attach was used so this
includes all of it)
5 0 12569 1 20 0 38768 3448 ep_poll Ss ?
0:02 [lxc monitor] /var/lib/lxc 104
4 0 12651 12569 20 0 34080 4492 refrigerator Ds ?
0:00 \_ /sbin/init
4 0 12815 12651 20 0 30488 5436 refrigerator Ds ?
0:00 \_ /usr/lib/systemd/systemd-journald
4 81 12981 12651 20 0 34748 3444 refrigerator Ds ?
0:00 \_ /usr/bin/dbus-daemon --system --address=systemd: --nofork
--nopidfile --systemd-activation
4 0 13016 12651 20 0 15292 2424 refrigerator Ds ?
0:00 \_ /usr/lib/systemd/systemd-logind
4 193 13033 12651 20 0 19792 2688 refrigerator Ds ?
0:00 \_ /usr/lib/systemd/systemd-networkd
4 0 13052 12651 20 0 6348 1664 refrigerator Ds+ pts/7
0:00 \_ /sbin/agetty --noclear --keep-baud console 115200 38400 9600
vt220
4 0 13055 12651 20 0 6348 1544 refrigerator Ds+ pts/1
0:00 \_ /sbin/agetty --noclear --keep-baud pts/1 115200 38400 9600
vt220
4 0 13058 12651 20 0 89728 4128 refrigerator Ds ?
0:00 \_ login -- root
4 0 30296 13058 20 0 14408 3356 refrigerator Ds pts/0
0:01 | \_ -bash
0 0 22921 30296 20 0 31980 2380 request_wait_answer D+ pts/0
0:00 | \_ uptime
4 0 30127 12651 20 0 33752 4128 refrigerator Ds ?
0:00 \_ /usr/lib/systemd/systemd --user
5 0 30159 30127 20 0 96432 1316 sigtimedwait S ?
0:00 \_ (sd-pam)
Attaching gdb to the lxcfs process in question (22928 in this case) gives the
following (trimmed) backtrace:
#0 __lll_lock_wait_private () at
../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1 0x00007f9552b816db in _L_lock_11305 () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x00007f9552b7f838 in __GI___libc_realloc (oldmem=0x7f9552ea8620
<main_arena>, bytes=bytes at entry=567) at malloc.c:3025
See [1] for the full backtrace. It seems that a fork() gone wrong fails an
assertion, and the malloc() that asprintf() needs to format the error message
then waits forever on a lock. Calling lxc-unfreeze -n NAME makes both the
container and lxcfs continue without problems, and a subsequent
lxc-freeze -n NAME works (not surprising, given that it took >6000 freeze
attempts to trigger the issue with this setup).
While it takes a while to reproduce in this test setup, our users report
that it occurs quite often in "real" environments. Some common factors seem to
be: running multiple containers, running some kind of monitoring software
accessing various /proc files in the container (we have reports concerning
piwik, splunkd and monit). See [2] for a support forum thread with reports of
varying detail, and hopefully more backtraces soon. Note that Proxmox VE calls
lxc-freeze for both snapshot and suspend mode backups, so this issue affects
both modes.
Thanks in advance for checking this out,
Fabian
1:
https://gist.githubusercontent.com/Blub/72a7f432fcf8f6513919/raw/cbc22497abd95746dbb426b0674572c7ffef6a07/lxc-err1.txt
2: https://forum.proxmox.com/threads/lxc-backup-randomly-hangs-at-suspend.25345/