[lxc-users] Live migration mkdtemp failure

jjs - mainphrame jjs at mainphrame.com
Wed Jun 22 18:09:43 UTC 2016


Hi Tycho.

Filing a bug for this limit has been on my to-do list, but I hadn't gotten
around to it.

The file sizes are shown in the error messages below:

root at olympia:~# lxc list
+--------+---------+-----------------------+------+------------+-----------+
|  NAME  |  STATE  |         IPV4          | IPV6 |    TYPE    | SNAPSHOTS |
+--------+---------+-----------------------+------+------------+-----------+
| akita  | RUNNING | 192.168.111.22 (eth0) |      | PERSISTENT | 0         |
+--------+---------+-----------------------+------+------------+-----------+
| kangal | RUNNING | 192.168.111.44 (eth0) |      | PERSISTENT | 0         |
+--------+---------+-----------------------+------+------------+-----------+
root at olympia:~# lxc move akita lxd1:
error: Error transferring container data: checkpoint failed:
(02.064728) Error (files-reg.c:683): Can't dump ghost file /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21 of 1566440 size, increase limit
(02.064730) Error (cr-dump.c:1356): Dump mappings (pid: 4685) failed with -1
(02.068126) Error (cr-dump.c:1600): Dumping FAILED.
root at olympia:~# lxc move kangal lxd1:
error: Error transferring container data: checkpoint failed:
(14.495544) Error (files-reg.c:683): Can't dump ghost file /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.1 of 1465592 size, increase limit
(14.495547) Error (cr-dump.c:1356): Dump mappings (pid: 11956) failed with -1
(14.500840) Error (cr-dump.c:1600): Dumping FAILED.
root at olympia:~#
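
For reference, the sizes in those errors can be pulled out and compared
against a limit with a short script. This is only a sketch: the 1 MiB value
is an assumed default for CRIU's ghost-file limit (the output above does not
state the limit), and ghost_file_errors is a hypothetical helper name, not
part of lxc or CRIU.

```python
import re

# Assumption: CRIU's built-in ghost-file limit is 1 MiB unless it is raised.
ASSUMED_GHOST_LIMIT = 1 << 20

# Matches "Can't dump ghost file <path> of <size> size" once wrapped
# log lines have been joined with single spaces.
GHOST_RE = re.compile(r"Can't dump ghost file\s+(\S+)\s+of\s+(\d+)\s+size")


def ghost_file_errors(log_text, limit=ASSUMED_GHOST_LIMIT):
    """Return (path, size, bytes_over_limit) for each ghost-file error."""
    joined = " ".join(log_text.split())  # undo mail/terminal line wrapping
    results = []
    for path, size in GHOST_RE.findall(joined):
        size = int(size)
        results.append((path, size, size - limit))
    return results


if __name__ == "__main__":
    sample = """
    (02.064728) Error (files-reg.c:683): Can't dump ghost file
    /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21 of 1566440 size, increase
    limit
    (14.495544) Error (files-reg.c:683): Can't dump ghost file
    /usr/lib/x86_64-linux-gnu/libxml2.so.2.9.1 of 1465592 size, increase limit
    """
    for path, size, over in ghost_file_errors(sample):
        print(f"{path}: {size} bytes, {over} bytes over the assumed limit")
```

Under that assumption, both libraries are roughly 0.4 to 0.5 MB over the
limit, so any raised default would need to clear about 1.6 MB to cover these
two cases.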

Regards,

Jake


On Wed, Jun 22, 2016 at 8:04 AM, Tycho Andersen <
tycho.andersen at canonical.com> wrote:

> On Tue, Jun 21, 2016 at 09:27:21AM -0700, jjs - mainphrame wrote:
> > That particular error was resolved, but the lxc live migration doesn't
> > work for a different reason now. We now get an error that says "can't
> > dump ghost file" because of apparent size limitations - a limit less
> > than the size of any lxc container we have running here.
>
> How big is the ghost file that you're running into and what
> application is it from? Perhaps we should just increase the default
> limit.
>
> Tycho
>
> > (In contrast, live migration on all of our OpenVZ 7 containers works
> > reliably)
> >
> > Jake
> >
> >
> >
> >
> > On Tue, Jun 21, 2016 at 4:19 AM, McDonagh, Ed <Ed.McDonagh at rmh.nhs.uk>
> > wrote:
> >
> > >
> > >
> > > > On Tue, Mar 29, 2016 at 09:30:19AM -0700, jjs - mainphrame wrote:
> > > > > On Tue, Mar 29, 2016 at 7:18 AM, Tycho Andersen <
> > > > > tycho.andersen at canonical.com> wrote:
> > > > >
> > > > > > On Mon, Mar 28, 2016 at 08:47:24PM -0700, jjs - mainphrame wrote:
> > > > > > > > I've looked at ct migration between 2 ubuntu 16.04 hosts today,
> > > > > > > > and had some interesting problems; I find that migration of
> > > > > > > > stopped containers works fairly reliably; but live migration,
> > > > > > > > well, it transfers a lot of data, then exits with a failure
> > > > > > > > message. I can then move the same container, stopped, with no
> > > > > > > > problem.
> > > > > > >
> > > > > > > The error is the same every time, a failure of "mkdtemp" -
> > > > > >
> > > > > > > It looks like your host /tmp isn't writable by the uid map that
> > > > > > > the container is being restored as?
> > > > > >
> > > > >
> > > > > > Which is odd, since /tmp has 1777 perms on both hosts, so I don't
> > > > > > see how it could be a permissions problem. Surely the default
> > > > > > apparmor profile is not the cause? You did give me a new idea
> > > > > > though, and I'll set up a test with privileged containers for
> > > > > > comparison. Is there a switch to enable verbose logging?
> > > >
> > > > It already is enabled, you can find the full logs in
> > > > /var/log/lxd/$container/migration_*
> > > >
> > > > Perhaps the pwd of the CRIU task is what's broken instead, since CRIU
> > > > isn't supplying a full mkdtemp template. I'll have a deeper look in a
> > > > bit.
> > > >
> > > > Tycho
> > > >
> > > > >
> > > > > > >
> > > > > > > root at ronnie:~# lxc move third lxd:
> > > > > > > error: Error transferring container data: restore failed:
> > > > > > > (00.033172)      1: Error (cr-restore.c:1489): mkdtemp failed
> > > > > > > crtools-proc.x9p5OH: Permission denied
> > > > > > > (00.060072) Error (cr-restore.c:1352): 9188 killed by signal 9
> > > > > > > (00.117126) Error (cr-restore.c:2182): Restoring FAILED.
> > >
> > > I've been getting the same error - was the issue ever resolved for
> > > non-privileged containers?
> > >
> > > Kind regards
> > > Ed
> > >
> > > _______________________________________________
> > > lxc-users mailing list
> > > lxc-users at lists.linuxcontainers.org
> > > http://lists.linuxcontainers.org/listinfo/lxc-users