[lxc-users] container stuck until lxcfs restart
Fajar A. Nugraha
list at fajar.net
Wed Mar 25 05:53:56 UTC 2015
On Tue, Mar 24, 2015 at 2:10 PM, Fajar A. Nugraha <list at fajar.net> wrote:
> On Tue, Mar 24, 2015 at 5:52 AM, Norberto Bensa
> <nbensa+lxcusers at gmail.com> wrote:
>> Hello,
>>
>> from time to time, I think once per day, my containers get stuck. I found
>> that `service lxcfs restart` restores functionality.
>>
>> Is this a known bug?
>>
>> ii liblxc1 1.1.0-0ubuntu1
>> amd64 Linux Containers userspace tools (library)
>> ii lxc 1.1.0-0ubuntu1
>> amd64 Linux Containers userspace tools
>> ii lxc-templates 1.1.0-0ubuntu1
>> amd64 Linux Containers userspace tools (templates)
>> ii lxcfs 0.6-0ubuntu2
>> amd64 FUSE based filesystem for LXC
>> ii python3-lxc 1.1.0-0ubuntu1
>> amd64 Linux Containers userspace tools (Python 3.x bindings)
>
>
> I had some problems with a simple test that start-stops a vivid
> container for several cycles. Similar to your experience, restarting
> lxcfs solves the problem. Serge suggested running lxcfs under gdb to
> find out what's wrong when the problem occurs. I haven't had time to
> do so though, since debugging with gdb can be somewhat complicated.
>
> If you're just a "normal" non-developer user, and using it for
> "production" use, then my best advice for now is "stop lxcfs, don't
> use any systemd-based containers". upstart/sysvinit-based containers
> will work even without lxcfs.
>
> If you're a developer, then compiling lxcfs with debugging info, and
> poking around with gdb when the problem occurs might help.
I believe I have a reproducer script:
- create an Ubuntu vivid container, c1, with systemd
# lxc-create -t download -n c1 -- -d ubuntu -r vivid -a amd64
# lxc-attach -n c1 -- apt-get update
# lxc-attach -n c1 -- apt-get install systemd-sysv
- run this script
# n=0; while true; do n=$((n+1)); echo "$(date) -- Try #$n"; sleep 1; \
    lxc-start -n c1 && sleep 10 && lxc-attach -n c1 -- poweroff && \
    lxc-console -n c1 -t console; done
It's now stuck on try #30
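Because the loop simply blocks when the hang occurs, a variant that puts a time limit on each step may be easier to leave running unattended. Below is a minimal sketch, assuming GNU coreutils timeout is installed; the helper name run_step and the limits passed to it are illustrative and not part of lxc or of the reproducer above.

```shell
#!/bin/sh
# Hypothetical helper (not part of lxc): run a command with a time limit
# and report whether it failed or timed out.
# Usage: run_step SECONDS COMMAND [ARGS...]
run_step() {
    limit=$1
    shift
    timeout "$limit" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        # GNU timeout exits with status 124 when the time limit is reached.
        echo "HUNG after ${limit}s: $*"
    elif [ "$rc" -ne 0 ]; then
        echo "FAILED (exit $rc): $*"
    fi
    return "$rc"
}
```

For example, replacing the poweroff step in the loop with "run_step 60 lxc-attach -n c1 -- poweroff || break" would stop the loop at the first hang instead of blocking forever. Note that timeout terminates the stuck lxc-attach, so this is more useful for detecting the hang than for inspecting it.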
# lxc-ls -f c1
NAME STATE IPV4 IPV6 GROUPS AUTOSTART
-------------------------------------------------------
c1 RUNNING 192.168.124.120 - - NO
# lxc-attach -n c1 -> hangs
# ps -ef | grep attach
root 1026 19966 0 12:17 pts/14 00:00:00 lxc-attach -n c1 -- poweroff
root 1633 25540 0 12:48 pts/29 00:00:00 lxc-attach -n c1
The first lxc-attach (PID 1026) is the one from the script
# strace -f -p 1026
Process 1026 attached
wait4(1028,
# ps -ef | grep 1028
root 1028 1026 0 12:17 pts/14 00:00:00 poweroff
# strace -f -p 1028
Process 1028 attached
ppoll([{fd=3, events=POLLIN}], 1, NULL, NULL, 8
# ps -ef | grep lxcfs
root 383 1 5 12:08 ? 00:02:26 /usr/bin/lxcfs -s -f -o allow_other /var/lib/lxcfs
root 1312 383 0 12:17 ? 00:00:00 /usr/bin/lxcfs -s -f -o allow_other /var/lib/lxcfs
root 1313 1312 0 12:17 ? 00:00:00 /usr/bin/lxcfs -s -f -o allow_other /var/lib/lxcfs
root 1764 1654 0 12:50 pts/28 00:00:00 grep --color=auto lxcfs
# strace -f -p 383
Process 383 attached
wait4(1312,
# strace -f -p 1312
Process 1312 attached
wait4(1313,
# strace -f -p 1313
Process 1313 attached
recvmsg(5,
The "poweroff" command (PID 1028) hangs polling fd 3 (lxcfs-related?), and
lxcfs (PID 1313, a descendant of the main lxcfs process 383) is stuck in
recvmsg (waiting for something from the poweroff command?)
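
One way to check the guess about fd 3 would be to look at what the descriptor actually refers to, via /proc. This is a minimal sketch, assuming a Linux /proc filesystem and sufficient privileges to read another process's fd directory; fd_target is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# Hypothetical helper: print what file descriptor FD of process PID refers to
# (a path, or something like "socket:[inode]" / "pipe:[inode]").
# Usage: fd_target PID FD
fd_target() {
    readlink "/proc/$1/fd/$2"
}
```

With the PIDs from the session above, "fd_target 1028 3" would show what poweroff is polling, and "fd_target 1313 5" would show what lxcfs is reading from in recvmsg. Whether the two turn out to be related is exactly the open question here.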
--
Fajar