[lxc-users] How to recover from ERROR state
Christian Brauner
christian at brauner.io
Mon Sep 24 13:30:17 UTC 2018
On Mon, Sep 24, 2018 at 02:11:38PM +0200, Christian Brauner wrote:
> On Mon, Sep 24, 2018, 14:03 Kees Bakker <keesb at ghs.com> wrote:
>
> > Same question again: what is the best approach to recover
> > from a container in an ERROR state?
So another thing I would like to see is the current stack of the hung
monitor process. Could you please paste (or send privately) the output
of:
cat /proc/<pid-of-hung-monitor-process>/stack
Also, in what state is the monitor hung? Again in D state?
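In case it helps with collecting this, here is a rough sketch of how the PID, the state and the stack could be gathered in one go. The helper names hung_monitor_pids and dump_hung_monitors are made up for illustration. It assumes a procps-style ps and that the monitor shows up as "[lxc monitor]" in the process list, as in your output. Reading /proc/<pid>/stack normally needs root.

```shell
# Sketch only: hung_monitor_pids and dump_hung_monitors are hypothetical helpers.

# Read "PID STAT ARGS..." lines on stdin and print the PIDs of "[lxc monitor]"
# processes whose state starts with D (uninterruptible sleep).
hung_monitor_pids() {
    awk '$2 ~ /^D/ && /\[lxc monitor\]/ { print $1 }'
}

# For each match, print the State line from /proc/<pid>/status and the
# kernel stack from /proc/<pid>/stack (the latter normally requires root).
dump_hung_monitors() {
    for pid in $(ps -eo pid=,stat=,args= | hung_monitor_pids); do
        echo "== monitor pid $pid =="
        grep '^State:' "/proc/$pid/status"
        cat "/proc/$pid/stack"
    done
}

# Usage, as root:
#   dump_hung_monitors
```

Nothing runs until dump_hung_monitors is called, so the functions can be pasted into a root shell first and invoked afterwards.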
Christian
> >
>
> Please show me the dmesg output. If it is a kernel bug you're hitting
> there's nothing that LXD can do to help you.
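If the full dmesg is too long to paste, a filter along these lines might narrow it down. This is only a sketch: hung_task_lines is a made-up helper name, and the patterns are the usual wording of the kernel's hung-task reports, which may not cover whatever your kernel actually logged.

```shell
# Sketch only: hung_task_lines is a hypothetical helper.
# Read kernel log text on stdin and keep lines typical of hung-task reports.
hung_task_lines() {
    grep -E 'blocked for more than [0-9]+ seconds|hung_task_timeout_secs|Call Trace'
}

# Usage (may need root, depending on kernel.dmesg_restrict):
#   dmesg | hung_task_lines
```

The filtered lines only show whether a hung-task report exists; the full trace that follows each match is still the interesting part.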
>
>
> > This time it happened with Ubuntu 18.04 and LVM storage.
> >
> > The steps leading to this were as follows. It's just an FYI, I don't think
> > it really matters, except for the stop and start.
> >
> > lvextend -L 20G local/containers_xyz
> > resize2fs /dev/local/containers_xyz
> > lxc stop xyz
> > e2fsck -f /dev/local/containers_
> > lxc start xyz
> >
> > ... the start command hung.
> >
> > Some output of ps auxfwww:
> >
> > root 6224 0.0 0.0 22912 4096 pts/1 S sep06 0:00 | \_ -bash
> > root 20900 0.0 0.0 1136140 12092 pts/1 Sl+ 12:19 0:00 | \_ lxc start xyz
> > --
> > root 18157 3.5 4.2 5581444 1398904 ? Ssl sep12 611:36 /usr/lib/lxd/lxd --group lxd --logfile=/var/log/lxd/lxd.log
> > root 20918 0.0 0.0 521720 19780 ? Sl 12:19 0:00 \_ /usr/lib/lxd/lxd forkstart xyz /var/lib/lxd/containers /var/log/lxd/xyz/lxc.conf
> > root 20925 0.0 0.0 0 0 ? Z 12:19 0:00 \_ [lxd] <defunct>
> > --
> > root 20926 0.0 0.0 530432 7280 ? Ss 12:19 0:00 [lxc monitor] /var/lib/lxd/containers xyz
> > root 20943 0.0 0.0 530432 3484 ? D 12:19 0:00 \_ [lxc monitor] /var/lib/lxd/containers xyz
> >
> >
> >
> > On 11-09-18 15:13, Kees Bakker wrote:
> > > Hey,
> > >
> > > Every now and then we have one or more containers in state ERROR.
> > > Is there a clever method to recover from that, other than
> > > rebooting the LXD server?
> > >
> > > Killing the monitor and the forkstart does help. And also a kworker
> > > process (kworker/u16:0) is eating up one of the CPUs with 100% load.
> > > lxc info gives "error: Monitor is hung"
> > >
> > > I'm running Ubuntu 16.04 with BTRFS. The kernel is 4.15.0-33-generic
> >
> > _______________________________________________
> > lxc-users mailing list
> > lxc-users at lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-users