[lxc-users] "error: LXD still not running after 5 minutes" - failed lxd.migrate - how to recover?

Stéphane Graber stgraber at ubuntu.com
Wed Aug 8 21:48:16 UTC 2018


On Wed, Aug 08, 2018 at 11:26:10PM +0200, Tomasz Chmielewski wrote:
> On 2018-08-08 22:26, Stéphane Graber wrote:
> 
> > > Not sure how to recover now? The containers seem intact in
> > > /var/lib/lxd/
> > 
> > What do you get if you do "journalctl -u snap.lxd.daemon -n 300" and
> 
> -- Logs begin at Thu 2018-07-12 06:07:13 UTC, end at Wed 2018-08-08 21:07:13
> UTC. --
> Aug 08 18:21:12 b1 systemd[1]: Started Service for snap application
> lxd.daemon.
> Aug 08 18:21:12 b1 lxd.daemon[12581]: => Preparing the system
> Aug 08 18:21:12 b1 lxd.daemon[12581]: ==> Creating missing snap
> configuration
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Loading snap configuration
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Setting up mntns symlink
> (mnt:[4026532794])
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Setting up kmod wrapper
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Preparing /boot
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Preparing a clean copy of /run
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Preparing a clean copy of /etc
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Setting up ceph configuration
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Setting up LVM configuration
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Rotating logs
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Setting up ZFS (0.7)
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Escaping the systemd cgroups
> Aug 08 18:21:13 b1 lxd.daemon[12581]: ==> Escaping the systemd process
> resource limits
> Aug 08 18:21:41 b1 systemd[1]: Stopping Service for snap application
> lxd.daemon...
> Aug 08 18:21:42 b1 lxd.daemon[13595]: => Stop reason is: host shutdown
> Aug 08 18:21:42 b1 lxd.daemon[13595]: => Stopping LXD (with container
> shutdown)
> Aug 08 18:21:42 b1 lxd.daemon[13595]: => Stopping LXCFS
> Aug 08 18:21:43 b1 systemd[1]: Stopped Service for snap application
> lxd.daemon.
> Aug 08 18:21:44 b1 systemd[1]: Started Service for snap application
> lxd.daemon.
> Aug 08 18:21:44 b1 lxd.daemon[13676]: => Preparing the system
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Loading snap configuration
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Setting up mntns symlink
> (mnt:[4026532794])
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Setting up kmod wrapper
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Preparing /boot
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Preparing a clean copy of /run
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Preparing a clean copy of /etc
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Setting up ceph configuration
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Setting up LVM configuration
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Rotating logs
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Setting up ZFS (0.7)
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Escaping the systemd cgroups
> Aug 08 18:21:44 b1 lxd.daemon[13676]: ==> Escaping the systemd process
> resource limits
> Aug 08 18:21:44 b1 lxd.daemon[13676]: => Starting LXCFS
> Aug 08 18:21:44 b1 lxd.daemon[13676]: => Starting LXD
> Aug 08 18:21:44 b1 lxd.daemon[13676]: lvl=warn msg="AppArmor support has
> been disabled because of lack of kernel support" t=2018-08-08T18:21:44+0000
> Aug 08 18:21:44 b1 lxd.daemon[13676]: lvl=warn msg="CGroup memory swap
> accounting is disabled, swap limits will be ignored."
> t=2018-08-08T18:21:44+0000
> Aug 08 18:21:44 b1 lxd.daemon[13676]: mount namespace: 5
> Aug 08 18:21:44 b1 lxd.daemon[13676]: hierarchies:
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   0: fd:   6: hugetlb
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   1: fd:   7: pids
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   2: fd:   8: cpuset
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   3: fd:   9: perf_event
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   4: fd:  10: freezer
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   5: fd:  11: memory
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   6: fd:  12: devices
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   7: fd:  13: blkio
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   8: fd:  14: cpu,cpuacct
> Aug 08 18:21:44 b1 lxd.daemon[13676]:   9: fd:  15: net_cls,net_prio
> Aug 08 18:21:44 b1 lxd.daemon[13676]:  10: fd:  16: rdma
> Aug 08 18:21:44 b1 lxd.daemon[13676]:  11: fd:  17: name=systemd
> Aug 08 18:21:44 b1 lxd.daemon[13676]:  12: fd:  18: unified
> Aug 08 18:28:07 b1 systemd[1]: Stopping Service for snap application
> lxd.daemon...
> Aug 08 18:28:07 b1 lxd.daemon[18773]: => Stop reason is: host shutdown
> Aug 08 18:28:07 b1 lxd.daemon[18773]: => Stopping LXD (with container
> shutdown)
> Aug 08 18:37:24 b1 lxd.daemon[18773]: => Stopping LXCFS
> Aug 08 18:37:25 b1 systemd[1]: Stopped Service for snap application
> lxd.daemon.
> -- Reboot --
> Aug 08 18:38:55 b1 systemd[1]: Started Service for snap application
> lxd.daemon.
> Aug 08 18:38:57 b1 lxd.daemon[2345]: => Preparing the system
> Aug 08 18:38:57 b1 lxd.daemon[2345]: ==> Loading snap configuration
> Aug 08 18:38:57 b1 lxd.daemon[2345]: ==> Setting up mntns symlink
> (mnt:[4026532463])
> Aug 08 18:38:57 b1 lxd.daemon[2345]: ==> Setting up kmod wrapper
> Aug 08 18:38:58 b1 lxd.daemon[2345]: ==> Preparing /boot
> Aug 08 18:38:58 b1 lxd.daemon[2345]: ==> Preparing a clean copy of /run
> Aug 08 18:38:58 b1 lxd.daemon[2345]: ==> Preparing a clean copy of /etc
> Aug 08 18:38:58 b1 lxd.daemon[2345]: ==> Setting up ceph configuration
> Aug 08 18:38:58 b1 lxd.daemon[2345]: ==> Setting up LVM configuration
> Aug 08 18:38:59 b1 lxd.daemon[2345]: ==> Rotating logs
> Aug 08 18:38:59 b1 lxd.daemon[2345]: ==> Setting up ZFS (0.7)
> Aug 08 18:38:59 b1 lxd.daemon[2345]: ==> Escaping the systemd cgroups
> Aug 08 18:38:59 b1 lxd.daemon[2345]: ==> Escaping the systemd process
> resource limits
> Aug 08 18:38:59 b1 lxd.daemon[2345]: => Starting LXCFS
> Aug 08 18:38:59 b1 lxd.daemon[2345]: => Starting LXD
> Aug 08 18:38:59 b1 lxd.daemon[2345]: mount namespace: 5
> Aug 08 18:38:59 b1 lxd.daemon[2345]: hierarchies:
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   0: fd:   6: devices
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   1: fd:   7: cpuset
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   2: fd:   8: blkio
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   3: fd:   9: memory
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   4: fd:  10: rdma
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   5: fd:  11: cpu,cpuacct
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   6: fd:  12: freezer
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   7: fd:  13: hugetlb
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   8: fd:  14: net_cls,net_prio
> Aug 08 18:38:59 b1 lxd.daemon[2345]:   9: fd:  15: perf_event
> Aug 08 18:38:59 b1 lxd.daemon[2345]:  10: fd:  16: pids
> Aug 08 18:38:59 b1 lxd.daemon[2345]:  11: fd:  17: name=systemd
> Aug 08 18:38:59 b1 lxd.daemon[2345]:  12: fd:  18: unified
> Aug 08 18:38:59 b1 lxd.daemon[2345]: lvl=warn msg="AppArmor support has been
> disabled because of lack of kernel support" t=2018-08-08T18:38:59+0000
> Aug 08 18:38:59 b1 lxd.daemon[2345]: lvl=warn msg="CGroup memory swap
> accounting is disabled, swap limits will be ignored."
> t=2018-08-08T18:38:59+0000
> 
> 
> > anything useful looking in /var/snap/lxd/common/lxd/logs/lxd.log?
> 
> # cat /var/snap/lxd/common/lxd/logs/lxd.log
> lvl=info msg="LXD 3.3 is starting in normal mode"
> path=/var/snap/lxd/common/lxd t=2018-08-08T18:38:59+0000
> lvl=info msg="Kernel uid/gid map:" t=2018-08-08T18:38:59+0000
> lvl=info msg=" - u 0 0 4294967295" t=2018-08-08T18:38:59+0000
> lvl=info msg=" - g 0 0 4294967295" t=2018-08-08T18:38:59+0000
> lvl=info msg="Configured LXD uid/gid map:" t=2018-08-08T18:38:59+0000
> lvl=info msg=" - u 0 1000000 1000000000" t=2018-08-08T18:38:59+0000
> lvl=info msg=" - g 0 1000000 1000000000" t=2018-08-08T18:38:59+0000
> lvl=warn msg="AppArmor support has been disabled because of lack of kernel
> support" t=2018-08-08T18:38:59+0000
> lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be
> ignored." t=2018-08-08T18:38:59+0000
> lvl=info msg="Initializing local database" t=2018-08-08T18:38:59+0000
> lvl=info msg="Initializing database gateway" t=2018-08-08T18:39:07+0000
> address= id=1 lvl=info msg="Start database node" t=2018-08-08T18:39:07+0000
> lvl=info msg="Raft: Initial configuration (index=1): [{Suffrage:Voter ID:1
> Address:0}]" t=2018-08-08T18:39:07+0000
> lvl=info msg="Raft: Node at 0 [Leader] entering Leader state"
> t=2018-08-08T18:39:07+0000
> lvl=info msg="LXD isn't socket activated" t=2018-08-08T18:39:07+0000
> lvl=info msg="Starting /dev/lxd handler:" t=2018-08-08T18:39:07+0000
> lvl=info msg=" - binding devlxd socket"
> socket=/var/snap/lxd/common/lxd/devlxd/sock t=2018-08-08T18:39:07+0000
> lvl=info msg="REST API daemon:" t=2018-08-08T18:39:07+0000
> lvl=info msg=" - binding Unix socket"
> socket=/var/snap/lxd/common/lxd/unix.socket t=2018-08-08T18:39:07+0000
> lvl=info msg="Initializing global database" t=2018-08-08T18:39:07+0000
> lvl=info msg="Initializing storage pools" t=2018-08-08T18:39:08+0000
> lvl=info msg="Initializing networks" t=2018-08-08T18:39:08+0000
> lvl=info msg="Loading configuration" t=2018-08-08T18:39:08+0000
> lvl=info msg="Connected to MAAS controller" t=2018-08-08T18:39:08+0000
> lvl=info msg="Pruning expired images" t=2018-08-08T18:39:08+0000
> lvl=info msg="Done pruning expired images" t=2018-08-08T18:39:08+0000
> lvl=info msg="Updating instance types" t=2018-08-08T18:39:08+0000
> lvl=info msg="Expiring log files" t=2018-08-08T18:39:08+0000
> lvl=info msg="Updating images" t=2018-08-08T18:39:08+0000
> lvl=info msg="Done updating images" t=2018-08-08T18:39:08+0000
> lvl=info msg="Done expiring log files" t=2018-08-08T18:39:08+0000
> lvl=info msg="Done updating instance types" t=2018-08-08T18:39:10+0000
> 
> 
> > It's expected that "systemctl start lxd" won't work anymore since the
> > data was moved over to the snap which then likely caused your database
> > to be upgraded, making it unreadable for your older deb version of LXD.
> > 
> > I'd recommend you do:
> >  - systemctl stop lxd lxd.socket
> >  - systemctl mask lxd lxd.socket
> > 
> > To prevent any accidental startup of your old LXD until the snap
> > migration is done and it can be safely removed.
> 
> Let's try:
> 
> # systemctl stop lxd lxd.socket
> # systemctl mask lxd lxd.socket
> Created symlink /etc/systemd/system/lxd.service → /dev/null.
> Created symlink /etc/systemd/system/lxd.socket → /dev/null.
> 
> # lxd.migrate
> => Connecting to source server
> error: Unable to connect to the source LXD: Get http://unix.socket/1.0: dial
> unix /var/lib/lxd/unix.socket: connect: no such file or directory
> 
> 
> # /snap/bin/lxc list
> +------+-------+------+------+------+-----------+
> | NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
> +------+-------+------+------+------+-----------+
> 
> 
> 
> So I did the following:
> 
> - mv /var/snap/lxd/common/lxd /var/snap/lxd/common/lxd.orig
> 
> - modified /etc/fstab to mount the previous /var/lib/lxd to
> /var/snap/lxd/common/lxd
> 
> - rebooted
> 
> - removed lxd deb packages
> 
> And it now started properly.

Ah yeah, not sure why it didn't work immediately after upgrade, the logs
don't help much there.

The fstab change was indeed necessary and was hinted by lxd.migrate with:

    WARNING: /var/lib/lxd is a mountpoint. You will need to update that mount location after the migration.

We preferred that over trying to mess with people's fstab ourselves :)

> Thanks for helping!
> 
> 
> Tomasz Chmielewski
> https://lxadm.com

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://lists.linuxcontainers.org/pipermail/lxc-users/attachments/20180808/dfe87916/attachment-0001.sig>


More information about the lxc-users mailing list