[lxc-devel] LXC live migrate

Marian Marinov mm at yuhu.biz
Tue Nov 26 16:19:38 UTC 2013


On 11/26/2013 05:29 PM, Dwight Engen wrote:
> On Mon, 25 Nov 2013 21:58:13 -0500
> Stéphane Graber <stgraber at ubuntu.com> wrote:
>
>> On Tue, Nov 26, 2013 at 04:04:36AM +0200, Marian Marinov wrote:
>>> Hey guys,
>>> I just read on LWN about the checkpoint/restore tool:
>>>     http://lwn.net/Articles/574917/
>>>
>>> With this, it seems possible to freeze and restore a whole
>>> container from one node to another.
>>>
>>> I'll give it a try this week to give more details on how it
>>> actually works.
>>>
>>> Marian
>>
>> I think I last tried it with CRIU 0.8 without much success but I took
>> an action item during Ubuntu's planning event last week to try with a
>> newer release and get in touch with Pavel if I'm still having issues.
>
> Hi all,
>
> I also started looking into this (just trying to dump a simple busybox
> container) and the first thing I ran into is that criu can't dump
> init's fd 0 -> /dev/zero. I believe this is because that inode is
> outside the container (i.e. it's the host's /dev/zero). I'm looking into
> having lxc_start open std[in,out,err] in do_start after it has cloned
> into the namespace. This means the container would have to have
> a /dev/zero and /dev/null.

On my test setup, criu works for standalone processes such as apache, dovecot and mysql.

However, it does not work with containers:

root@s321:~# criu dump -D deb1 -t 19332 --file-locks
(00.004962) Error (namespaces.c:155): Can't dump nested pid namespace for 28352
(00.004985) Error (namespaces.c:321): Can't make pidns id
(00.005327) Error (cr-dump.c:1811): Dumping FAILED.
root@s321:~#

When I try to dump the init process (which I believe I should not do), here is what I see:
   http://pastebin.com/DFC0ADpp

(00.291294) Error (tty.c:222): tty: Unexpected format on path /dev/tty1
(00.291315) Error (cr-dump.c:1491): Dump files (pid: 29702) failed with -1
(00.291892) Error (cr-dump.c:1811): Dumping FAILED.

This is my setup:
19332 ?        Ss     0:00 lxc-start -n deb1 -d
28352 ?        Ss     0:00  \_ init [3]
28393 ?        Ss     0:00      \_ /usr/sbin/apache2 -k start
28419 ?        S      0:00      |   \_ /usr/sbin/apache2 -k start
28422 ?        Sl     0:00      |   \_ /usr/sbin/apache2 -k start
28423 ?        Sl     0:00      |   \_ /usr/sbin/apache2 -k start
28489 ?        S      0:00      \_ /bin/sh /usr/bin/mysqld_safe
28620 ?        Sl     0:00      |   \_ /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port
28621 ?        S      0:00      |   \_ logger -t mysqld -p daemon.error
28598 ?        Ss     0:00      \_ /usr/sbin/sshd
29702 pts/0    Ss+    0:00      \_ /sbin/getty 38400 tty1 linux
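
One possible reading of the "nested pid namespace" error is that the dump target (19332, lxc-start) sits in the host's pid namespace while its child (28352, the container's init) sits in the container's. A minimal Python sketch to check where that boundary falls, assuming a kernel that exposes /proc/<pid>/ns/pid; the function names are made up for illustration and are not part of criu or LXC:

```python
import os


def pid_namespace(pid, readlink=os.readlink):
    """Return the pid-namespace identifier of a process, e.g. 'pid:[4026531836]'."""
    return readlink("/proc/%d/ns/pid" % pid)


def first_in_other_pidns(root_pid, descendants, readlink=os.readlink):
    """Return the first descendant whose pid namespace differs from root_pid's.

    Returns None when every descendant shares root_pid's namespace.
    """
    root_ns = pid_namespace(root_pid, readlink)
    for pid in descendants:
        if pid_namespace(pid, readlink) != root_ns:
            return pid
    return None
```

Run as root on the host with something like first_in_other_pidns(19332, [28352, 28393]): if it returns 28352, the process tree rooted at lxc-start spans two pid namespaces. The readlink parameter exists only so the check can be tried against canned data.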

I rebooted the container without getty on tty1 and then I got this:

(00.260757) Error (mount.c:255): 86:/dev/tty4 doesn't have a proper root mount
(00.261007) Error (namespaces.c:445): Namespaces dumping finished with error 65280
(00.261454) Error (cr-dump.c:1811): Dumping FAILED.
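
The mount.c message hints that /dev/tty4 is a mount point of its own inside the container, possibly one of the ptys that LXC bind-mounts over /dev/ttyN. A small Python sketch for listing such mounts from /proc/<pid>/mountinfo follows. It relies on the field layout documented in proc(5). The sample line in the usage note is hypothetical, and the assumption that "86" in the log is a mount ID is not confirmed by the output above:

```python
def parse_mountinfo(text):
    """Parse the contents of /proc/<pid>/mountinfo into a list of dicts."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        # Six fixed fields, optional fields, "-", fstype, source, super options.
        if len(fields) < 10 or "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        mounts.append({
            "id": int(fields[0]),
            "parent": int(fields[1]),
            "dev": fields[2],
            "root": fields[3],         # path inside the filesystem that is mounted
            "mountpoint": fields[4],
            "fstype": fields[sep + 1],
            "source": fields[sep + 2],
        })
    return mounts


def non_root_mounts(mounts):
    """Return mounts that expose only a sub-path of their filesystem (root != '/')."""
    return [m for m in mounts if m["root"] != "/"]
```

For example, a hypothetical line such as "86 70 0:11 /4 /dev/tty4 rw,relatime - devpts devpts rw,gid=5,mode=620" would be reported by non_root_mounts() with mountpoint /dev/tty4 and root /4. Feeding it open("/proc/28352/mountinfo").read() from the host would show what the container's init actually sees.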

This is the relevant container config:
## Device config
lxc.cgroup.devices.deny = a
# /dev/null and zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# consoles
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 4:0 rwm
lxc.cgroup.devices.allow = c 4:1 rwm
# /dev/{,u}random
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 5:2 rwm
# rtc
lxc.cgroup.devices.allow = c 254:0 rm

# mount points
lxc.mount.entry = devpts dev/pts devpts gid=5,mode=620 0 0
lxc.mount.auto = proc:mixed sys:ro
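
To make the device whitelist above easier to read against the errors (for instance, to confirm that /dev/null and /dev/zero are allowed inside the container), here is a minimal Python sketch that translates each lxc.cgroup.devices.allow line into a conventional device name. The lookup table covers only the statically assigned numbers used in this config, and the helper names are invented for illustration:

```python
# Conventional Linux character device numbers (from the kernel's device list).
# 254:0 is left out on purpose: major 254 is dynamically assigned, so the
# "rtc" label in the config cannot be checked from a static table.
KNOWN_CHAR_DEVICES = {
    "1:3": "/dev/null",
    "1:5": "/dev/zero",
    "1:8": "/dev/random",
    "1:9": "/dev/urandom",
    "4:0": "/dev/tty0",
    "4:1": "/dev/tty1",
    "5:0": "/dev/tty",
    "5:1": "/dev/console",
    "5:2": "/dev/ptmx",
    "136:*": "/dev/pts/* (Unix98 pty slaves)",
}


def parse_allow(line):
    """Split an 'lxc.cgroup.devices.allow = c MAJ:MIN ACCESS' line into its parts.

    Returns (type, major, minor, access) or None for any other line.
    """
    key, _, value = line.partition("=")
    if key.strip() != "lxc.cgroup.devices.allow":
        return None
    parts = value.split()
    if len(parts) != 3 or ":" not in parts[1]:
        return None
    major, minor = parts[1].split(":", 1)
    return parts[0], major, minor, parts[2]


def device_name(line):
    """Return the conventional name for an allowed char device, or None if unknown."""
    parsed = parse_allow(line)
    if parsed is None or parsed[0] != "c":
        return None
    return KNOWN_CHAR_DEVICES.get("%s:%s" % (parsed[1], parsed[2]))
```

For example, device_name("lxc.cgroup.devices.allow = c 1:5 rwm") gives "/dev/zero", while the deny line and comment lines give None.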


Am I doing something wrong?

Marian

>
>> From what we discussed at Linux Plumbers, CRIU should indeed let you
>> dump a full container and restore it on the same machine or on another
>> so long as the filesystem and any other external dependency of the
>> container matches.
>>
>> If I can get this working and they've resolved a few of the known
>> issues (specifically the fact that it'd only build on x86_64), then
>> the plan is to add API calls to LXC's API that'll implement the
>> checkpoint/restore feature using CRIU.
>
> Assuming we can get it to work, I think we'd rather link to some sort
> of libcriu than to system() out to criu? If that is the case I think
> we'll need to do a bit of packaging work to make such a lib in crtools.
>
>




