[lxc-users] Containers won't start under stretch-backport kernel reboot
Tony Lewis
tony at lewistribe.com
Tue Aug 14 06:54:43 UTC 2018
Apologies in advance for the bump, but does anyone have any insights on this?
On 11/08/18 16:23, Tony Lewis wrote:
> I have been running LXD/LXC on a stock Debian Stretch kernel
> (4.9.0-7-amd64), and it's working fine. But it's beneficial for me to
> move to the Stretch Backports kernel (4.17.0-0.bpo.1-amd64). When I do,
> my containers won't start until I forcefully kill the LXD daemon and
> restart the service.
>
> Details are below, and I'd appreciate some help figuring out what is
> going wrong.
>
> Details...
>
> I was originally running LXD from packages then later migrated to
> snap. It didn't go smoothly so there's a chance there is some
> package-related cruft remaining. It seems there are only snap-related
> lxd binaries on my system:
>
> root@server:~# find /usr/ /lib /snap /var /bin /sbin -name lxd -type f -print
> /snap/lxd/7651/bin/lxd
> /snap/lxd/7651/commands/lxd
> /snap/lxd/7792/bin/lxd
> /snap/lxd/7792/commands/lxd
> /snap/lxd/8011/bin/lxd
> /snap/lxd/8011/commands/lxd
>
> root@server:~# lxd --version
> 3.3
> root@server:~# lxc --version
> 3.3
>
> root@server:~# dpkg -l | grep lx
> ii  libgl1-mesa-glx:amd64  13.0.6-1+b2                        amd64  free implementation of the OpenGL API -- GLX runtime
> ii  libxcb-glx0:amd64      1.12-1                             amd64  X C Binding, glx extension
> rc  lxc-common             2.1.0-0ubuntu1~ubuntu17.04.1~ppa1  amd64  Linux Containers userspace tools (common tools)
> rc  lxc1                   2.1.0-0ubuntu1~ubuntu17.04.1~ppa1  amd64  Linux Containers userspace tools
> rc  lxcfs                  2.0.7-1                            amd64  FUSE based filesystem for LXC
> rc  lxd                    2.18-0ubuntu5~ubuntu17.04.1~ppa1   amd64  Container hypervisor based on LXC - daemon
>
> Right after a reboot, LXD is running:
>
> root@server:~# ps -ef | grep lx
> root 1823 1 0 11:42 ? 00:00:00 /bin/sh
> /snap/lxd/8011/commands/daemon.start
> root 2436 1 0 11:42 ? 00:00:00 lxcfs
> /var/snap/lxd/common/var/lib/lxcfs -p /var/snap/lxd/common/lxcfs.pid
> root 2452 1823 5 11:42 ? 00:00:08 lxd --logfile
> /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
> root 2453 1823 1 11:42 ? 00:00:02 lxd waitready
> root 2454 1823 0 11:42 ? 00:00:00 /bin/sh
> /snap/lxd/8011/commands/daemon.start
> lxd 2938 1 0 11:42 ? 00:00:00 dnsmasq --strict-order
> --bind-interfaces
> --pid-file=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.pid
> --except-interface=lo --interface=lxdnet1 --quiet-dhcp --quiet-dhcp6
> --quiet-ra --listen-address=10.1.99.1 --dhcp-no-override
> --dhcp-authoritative
> --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.leases
> --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.hosts
> --dhcp-range 10.1.99.2,10.1.99.254,1h
> --listen-address=fd42:6727:ccbe:877f::1 --enable-ra --dhcp-range
> ::,constructor:lxdnet1,ra-stateless,ra-names -s lxd -S /lxd/
> --conf-file=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.raw -u lxd
> lxd 3207 1 0 11:42 ? 00:00:00 dnsmasq --strict-order
> --bind-interfaces
> --pid-file=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.pid
> --except-interface=lo --interface=lxdnet0 --quiet-dhcp --quiet-dhcp6
> --quiet-ra --listen-address=10.1.100.1 --dhcp-no-override
> --dhcp-authoritative
> --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.leases
> --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.hosts
> --dhcp-range 10.1.100.2,10.1.100.254,1h
> --listen-address=fd42:5dd8:266d:cfea::1 --enable-ra --dhcp-range
> ::,constructor:lxdnet0,ra-stateless,ra-names -s lxd -S /lxd/
> --conf-file=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.raw -u lxd
> root 4163 4040 0 11:45 pts/1 00:00:00 grep lx
>
> There are two LXD-ish looking services:
>
> root@server:~# systemctl list-units | grep lx
> sys-devices-virtual-net-lxdnet0.device loaded active plugged /sys/devices/virtual/net/lxdnet0
> sys-devices-virtual-net-lxdnet1.device loaded active plugged /sys/devices/virtual/net/lxdnet1
> sys-subsystem-net-devices-lxdnet0.device loaded active plugged /sys/subsystem/net/devices/lxdnet0
> sys-subsystem-net-devices-lxdnet1.device loaded active plugged /sys/subsystem/net/devices/lxdnet1
> run-snapd-ns-lxd.mnt.mount loaded active mounted /run/snapd/ns/lxd.mnt
> snap-lxd-7651.mount loaded active mounted Mount unit for lxd
> snap-lxd-7792.mount loaded active mounted Mount unit for lxd
> snap-lxd-8011.mount loaded active mounted Mount unit for lxd
> lxd.service loaded active exited LSB: Container hypervisor based on LXC
> snap.lxd.daemon.service loaded active running Service for snap application lxd.daemon
>
> lxd.service is not running, but snap.lxd.daemon.service is:
>
> root@server:~# systemctl status snap.lxd.daemon.service
> ● snap.lxd.daemon.service - Service for snap application lxd.daemon
> Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.service;
> enabled; vendor preset: enabled)
> Active: active (running) since Sat 2018-08-11 11:42:31 AEST; 3min
> 22s ago
> Main PID: 1823 (daemon.start)
> Tasks: 0 (limit: 4915)
> CGroup: /system.slice/snap.lxd.daemon.service
> ‣ 1823 /bin/sh /snap/lxd/8011/commands/daemon.start
>
> Aug 11 11:42:34 server snap[1823]: 2: fd: 8: perf_event
> Aug 11 11:42:34 server snap[1823]: 3: fd: 9: blkio
> Aug 11 11:42:34 server snap[1823]: 4: fd: 10: freezer
> Aug 11 11:42:34 server snap[1823]: 5: fd: 11: devices
> Aug 11 11:42:34 server snap[1823]: 6: fd: 12: cpu,cpuacct
> Aug 11 11:42:34 server snap[1823]: 7: fd: 13: net_cls,net_prio
> Aug 11 11:42:34 server snap[1823]: 8: fd: 14: memory
> Aug 11 11:42:34 server snap[1823]: 9: fd: 15: name=systemd
> Aug 11 11:42:35 server snap[1823]: lvl=warn msg="CGroup memory swap
> accounting is disabled, swap limits will be ignored."
> t=2018-08-11T01:42:35+0000
> Aug 11 11:42:38 server snap[1823]: lvl=warn msg="Unable to update
> backup.yaml at this time" name=backuptests t=2018-08-11T01:42:38+0000
>
> root@server:~# systemctl status lxd.service
> ● lxd.service - LSB: Container hypervisor based on LXC
> Loaded: loaded (/etc/init.d/lxd; generated; vendor preset: enabled)
> Active: active (exited) since Sat 2018-08-11 11:42:24 AEST; 3min
> 49s ago
> Docs: man:systemd-sysv-generator(8)
> Process: 1412 ExecStart=/etc/init.d/lxd start (code=exited,
> status=0/SUCCESS)
> Tasks: 0 (limit: 4915)
> CGroup: /system.slice/lxd.service
>
> Aug 11 11:42:24 server systemd[1]: Starting LSB: Container hypervisor
> based on LXC...
> Aug 11 11:42:24 server systemd[1]: Started LSB: Container hypervisor
> based on LXC.
>
> I can try stopping both services (though lxd.service already reports
> as exited):
>
> root@server:~# systemctl stop snap.lxd.daemon.service
>
> root@server:~# systemctl stop lxd
>
> root@server:~# systemctl stop snap.lxd.daemon.service
>
> root@server:~# systemctl list-units | grep lx
> sys-devices-virtual-net-lxdnet0.device loaded active plugged /sys/devices/virtual/net/lxdnet0
> sys-devices-virtual-net-lxdnet1.device loaded active plugged /sys/devices/virtual/net/lxdnet1
> sys-subsystem-net-devices-lxdnet0.device loaded active plugged /sys/subsystem/net/devices/lxdnet0
> sys-subsystem-net-devices-lxdnet1.device loaded active plugged /sys/subsystem/net/devices/lxdnet1
> run-snapd-ns-lxd.mnt.mount loaded active mounted /run/snapd/ns/lxd.mnt
> snap-lxd-7651.mount loaded active mounted Mount unit for lxd
> snap-lxd-7792.mount loaded active mounted Mount unit for lxd
> snap-lxd-8011.mount loaded active mounted Mount unit for lxd
>
> root@server:~# systemctl status snap.lxd.daemon.service
> ● snap.lxd.daemon.service - Service for snap application lxd.daemon
> Loaded: loaded (/etc/systemd/system/snap.lxd.daemon.service;
> enabled; vendor preset: enabled)
> Active: inactive (dead) since Sat 2018-08-11 11:46:49 AEST; 25s ago
> Process: 4304 ExecStop=/usr/bin/snap run --command=stop lxd.daemon
> (code=exited, status=0/SUCCESS)
> Process: 1823 ExecStart=/usr/bin/snap run lxd.daemon (code=killed,
> signal=TERM)
> Main PID: 1823 (code=killed, signal=TERM)
>
> Aug 11 11:42:38 server snap[1823]: lvl=warn msg="Unable to update
> backup.yaml at this time" name=backuptests t=2018-08-11T01:42:38+0000
> Aug 11 11:46:48 server systemd[1]: Stopping Service for snap
> application lxd.daemon...
> Aug 11 11:46:48 server /usr/bin/snap[4304]: cmd.go:105: DEBUG:
> restarting into "/snap/core/current/usr/bin/snap"
> Aug 11 11:46:48 server snap[4320]: cmd.go:105: DEBUG: restarting into
> "/snap/core/current/usr/bin/snap"
> Aug 11 11:46:48 server snap[4304]: error: no changes found
> Aug 11 11:46:48 server snap[4304]: => Stop reason is: host shutdown
> Aug 11 11:46:48 server snap[4304]: => Stopping LXD (with container
> shutdown)
> Aug 11 11:46:48 server snap[4304]: lxd: error while loading shared
> libraries: liblxc.so.1: cannot open shared object file: No such file
> or directory
> Aug 11 11:46:48 server snap[4304]: => Stopping LXCFS
> Aug 11 11:46:49 server systemd[1]: Stopped Service for snap
> application lxd.daemon.
>
> But LXD is still running:
>
> root@server:~# ps -ef | grep lx
> root 2452 1 4 11:42 ? 00:00:13 lxd --logfile
> /var/snap/lxd/common/lxd/logs/lxd.log --group lxd
> root 2453 1 1 11:42 ? 00:00:03 lxd waitready
> root 2454 1 0 11:42 ? 00:00:00 /bin/sh
> /snap/lxd/8011/commands/daemon.start
> lxd 2938 1 0 11:42 ? 00:00:00 dnsmasq --strict-order
> --bind-interfaces
> --pid-file=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.pid
> --except-interface=lo --interface=lxdnet1 --quiet-dhcp --quiet-dhcp6
> --quiet-ra --listen-address=10.1.99.1 --dhcp-no-override
> --dhcp-authoritative
> --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.leases
> --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.hosts
> --dhcp-range 10.1.99.2,10.1.99.254,1h
> --listen-address=fd42:6727:ccbe:877f::1 --enable-ra --dhcp-range
> ::,constructor:lxdnet1,ra-stateless,ra-names -s lxd -S /lxd/
> --conf-file=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.raw -u lxd
> lxd 3207 1 0 11:42 ? 00:00:00 dnsmasq --strict-order
> --bind-interfaces
> --pid-file=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.pid
> --except-interface=lo --interface=lxdnet0 --quiet-dhcp --quiet-dhcp6
> --quiet-ra --listen-address=10.1.100.1 --dhcp-no-override
> --dhcp-authoritative
> --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.leases
> --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.hosts
> --dhcp-range 10.1.100.2,10.1.100.254,1h
> --listen-address=fd42:5dd8:266d:cfea::1 --enable-ra --dhcp-range
> ::,constructor:lxdnet0,ra-stateless,ra-names -s lxd -S /lxd/
> --conf-file=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.raw -u lxd
> root 4454 4040 0 11:47 pts/1 00:00:00 grep lx
>
> I can list my containers, and with debug I can verify that
> communication with the socket works. But if I attempt to manually start
> a container, the command blocks and nothing happens.
>
> root@server:~# lxc ls
> +-------------+---------+------+------+------------+-----------+
> | NAME        | STATE   | IPV4 | IPV6 | TYPE       | SNAPSHOTS |
> +-------------+---------+------+------+------------+-----------+
> | container1  | STOPPED |      |      | PERSISTENT | 0         |
> +-------------+---------+------+------+------------+-----------+
> | container2  | STOPPED |      |      | PERSISTENT | 0         |
> +-------------+---------+------+------+------------+-----------+
> | container3  | STOPPED |      |      | PERSISTENT | 0         |
> +-------------+---------+------+------+------------+-----------+
> | container4  | STOPPED |      |      | PERSISTENT | 0         |
> +-------------+---------+------+------+------------+-----------+
> | container5  | STOPPED |      |      | PERSISTENT | 0         |
> +-------------+---------+------+------+------------+-----------+
>
> If I manually kill the LXD process and restart the service with
> systemctl, my containers automatically start in turn and I am good to go:
>
> root@server:~# kill 2452
> root@server:~# ps -ef | grep lx
> lxd 2938 1 0 11:42 ? 00:00:00 dnsmasq --strict-order
> --bind-interfaces
> --pid-file=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.pid
> --except-interface=lo --interface=lxdnet1 --quiet-dhcp --quiet-dhcp6
> --quiet-ra --listen-address=10.1.99.1 --dhcp-no-override
> --dhcp-authoritative
> --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.leases
> --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.hosts
> --dhcp-range 10.1.99.2,10.1.99.254,1h
> --listen-address=fd42:6727:ccbe:877f::1 --enable-ra --dhcp-range
> ::,constructor:lxdnet1,ra-stateless,ra-names -s lxd -S /lxd/
> --conf-file=/var/snap/lxd/common/lxd/networks/lxdnet1/dnsmasq.raw -u lxd
> lxd 3207 1 0 11:42 ? 00:00:00 dnsmasq --strict-order
> --bind-interfaces
> --pid-file=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.pid
> --except-interface=lo --interface=lxdnet0 --quiet-dhcp --quiet-dhcp6
> --quiet-ra --listen-address=10.1.100.1 --dhcp-no-override
> --dhcp-authoritative
> --dhcp-leasefile=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.leases
> --dhcp-hostsfile=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.hosts
> --dhcp-range 10.1.100.2,10.1.100.254,1h
> --listen-address=fd42:5dd8:266d:cfea::1 --enable-ra --dhcp-range
> ::,constructor:lxdnet0,ra-stateless,ra-names -s lxd -S /lxd/
> --conf-file=/var/snap/lxd/common/lxd/networks/lxdnet0/dnsmasq.raw -u lxd
> root 4468 4040 0 11:47 pts/1 00:00:00 grep lx
> root@server:~# lxc ls
> Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
> root@server:~# systemctl start lxd
> root@server:~# lxc ls
> Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
> root@server:~# lxc ls
> Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
> root@server:~# lxc ls
> Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
> root@server:~# lxc ls
> Error: Get http://unix.socket/1.0: dial unix /var/snap/lxd/common/lxd/unix.socket: connect: connection refused
> root@server:~# systemctl start snap.lxd.daemon
> root@server:~# systemctl list-units | grep lx
> sys-devices-virtual-net-lxdnet0.device loaded active plugged /sys/devices/virtual/net/lxdnet0
> sys-devices-virtual-net-lxdnet1.device loaded active plugged /sys/devices/virtual/net/lxdnet1
> sys-subsystem-net-devices-lxdnet0.device loaded active plugged /sys/subsystem/net/devices/lxdnet0
> sys-subsystem-net-devices-lxdnet1.device loaded active plugged /sys/subsystem/net/devices/lxdnet1
> run-snapd-ns-lxd.mnt.mount loaded active mounted /run/snapd/ns/lxd.mnt
> snap-lxd-7651.mount loaded active mounted Mount unit for lxd
> snap-lxd-7792.mount loaded active mounted Mount unit for lxd
> snap-lxd-8011.mount loaded active mounted Mount unit for lxd
> lxd.service loaded active exited LSB: Container hypervisor based on LXC
> snap.lxd.daemon.service loaded active running Service for snap application lxd.daemon
>
> root@server:~# lxc ls
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
> | NAME       | STATE   | IPV4                | IPV6                                          | TYPE       | SNAPSHOTS |
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
> | container1 | RUNNING | 10.1.100.49 (eth0)  | fd42:5dd8:266d:cfea:216:3eff:fe17:904c (eth0) | PERSISTENT | 0         |
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
> | container2 | RUNNING | 10.1.100.182 (eth0) | fd42:5dd8:266d:cfea:216:3eff:fe1c:91d0 (eth0) | PERSISTENT | 0         |
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
> | container3 | RUNNING | 10.1.100.56 (eth0)  | fd42:5dd8:266d:cfea:216:3eff:fec6:6816 (eth0) | PERSISTENT | 0         |
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
> | container4 | RUNNING | 10.1.100.209 (eth0) | fd42:5dd8:266d:cfea:216:3eff:fe27:6f8f (eth0) | PERSISTENT | 0         |
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
> | container5 | RUNNING | 10.1.100.43 (eth0)  | fd42:5dd8:266d:cfea:216:3eff:fea4:a034 (eth0) | PERSISTENT | 0         |
> +------------+---------+---------------------+-----------------------------------------------+------------+-----------+
>
> _______________________________________________
> lxc-users mailing list
> lxc-users at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-users
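
For anyone checking for the same symptom: in the quoted output, the main
"lxd --logfile ..." process ends up with parent PID 1 after the snap service
has been stopped. Below is a minimal sketch of a filter that picks such a
process out of a process listing. It assumes the usual "ps -ef" column order
(UID, PID, PPID, C, STIME, TTY, TIME, CMD) with one process per line, and the
function name find_orphaned_lxd is hypothetical, not part of LXD or snapd. A
parent of PID 1 is only a hint, not proof that the daemon was left behind.

```shell
#!/bin/sh
# Hypothetical helper (not an LXD or snapd command): read "ps -ef"-style
# lines on stdin and print the PID of every "lxd --logfile ..." process
# whose parent PID is 1.
# Assumed columns: UID PID PPID C STIME TTY TIME CMD...
find_orphaned_lxd() {
    # $2 = PID, $3 = PPID, $8 = command name, $9 = its first argument
    awk '$3 == 1 && $8 == "lxd" && $9 == "--logfile" { print $2 }'
}
```

It would be used as "ps -ef | find_orphaned_lxd". The "lxd waitready" helper
and the dnsmasq processes are skipped because their command or first argument
does not match.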