[lxc-devel] [PATCH] Add support for checkpoint and restore via CRIU

Tycho Andersen tycho.andersen at canonical.com
Wed Sep 17 20:19:26 UTC 2014


Hi Krystof,

On Wed, Sep 17, 2014 at 03:08:44PM +0000, Zmudzinski, Krystof C wrote:
> test-lxc.conf:
> 
> lxc.utsname = test-lxc
> lxc.mount = /root/centos/etc/fstab
> lxc.rootfs = /root/centos/

Ah, I'm not sure if anyone has tried to dump a CentOS container yet :)

> lxc.console = none
> lxc.tty = 0
> lxc.network.type = veth
> lxc.network.flags = up
> lxc.network.link = lxcbr0
> lxc.network.name = eth0
> 
> # hax for criu
> lxc.console = none
> lxc.tty = 0
> lxc.cgroup.devices.deny = c 5:1 rwm
> lxc.aa_profile = lxc-container-default-with-mounting
> 
> 
> I meant -F not -V. Sorry.
> 
> ps axf:
> 24602 pts/3    S      0:00              \_ ./lxc-checkpoint -n test-lxc -v -D /tmp/checkpoint -r -F -d
> 24603 pts/3    S      0:00              |   \_ /usr/local/sbin/criu restore --tcp-established --evasive-devices --file-locks --link-remap --manage-cgroups 
> 24605 ?        Ss     0:00              |       \_ /sbin/init
> 24635 ?        Zs     0:00              |           \_ [criu] <defunct>
> 24636 ?        Ss     0:00              |           \_ /usr/sbin/httpd
> 24646 ?        S      0:00              |           |   \_ /usr/sbin/httpd
> 24637 ?        Z      0:00              |           \_ [criu] <defunct>
> 24638 ?        Zs     0:00              |           \_ [criu] <defunct>
> 24639 ?        Zs     0:00              |           \_ [criu] <defunct>

Looks like criu didn't wait() on some of its child processes (the
<defunct> entries above), which probably means criu itself hung. lxc
is still waiting for the criu process to exit, which is why lxc-info
and the other tools don't respond. What does restore.log look like?
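
To make the un-reaped children easy to spot in a longer listing, here is
a minimal sketch that picks out processes in state Z from `ps axf`
output. The function name find_defunct is hypothetical, and it assumes
the default `ps axf` column order (PID TTY STAT TIME COMMAND); it is not
part of lxc or criu.

```python
def find_defunct(ps_output):
    """Return (pid, command) for every process whose STAT starts with 'Z'."""
    zombies = []
    for line in ps_output.splitlines():
        # Drop the "> " quoting prefix if the listing was pasted from an email.
        fields = line.lstrip("> ").split(None, 4)
        # Assumed columns of `ps axf`: PID TTY STAT TIME COMMAND
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        pid, _tty, stat, _time, command = fields
        if stat.startswith("Z"):
            # Strip the tree-drawing characters that the `f` option adds.
            zombies.append((int(pid), command.lstrip("|\\_ ").strip()))
    return zombies
```

It can be fed live output with
find_defunct(subprocess.check_output(["ps", "axf"], text=True)); any
[criu] entries it returns are children that exited without being
waited on.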

Tycho

> 24640 ?        Ss     0:00              |           \_ /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2
> 24645 ?        S      0:00              |           |   \_ /usr/sbin/saslauthd -m /var/run/saslauthd -a pam -n 2
> 24641 ?        Ss     0:00              |           \_ xinetd -stayalive -pidfile /var/run/xinetd.pid
> 24642 ?        Ss     0:00              |           \_ /usr/sbin/sshd
> 24643 ?        S<s    0:00              |           \_ /sbin/udevd -d
> 
> The full criu command line:
> /usr/local/sbin/criu restore --tcp-established --evasive-devices --file-locks --link-remap --manage-cgroups --action-script /usr/local/libexec/lxc/lxc-restore-net -D /tmp/checkpoint -o /tmp/checkpoint/restore.log -vvvvvv --root /usr/local/lib/lxc/rootfs --restore-detached --pidfile /tmp/fileGCxb5C --veth-pair eth0 vethSID6CM
> 
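
The -o option in the command line above puts the restore log at
/tmp/checkpoint/restore.log. A minimal sketch for pulling the failure
lines out of a log that was written with -vvvvvv follows. It assumes
that each line starts with a "(seconds.microseconds)" timestamp,
optionally followed by a "pid:" prefix, and that failures are marked
with the word "Error"; the helper name error_lines is hypothetical.

```python
import re

# Assumption: lines look like "(00.000087) message" or
# "(00.000087)   1234: message", and failures begin with "Error".
ERROR_LINE = re.compile(r"^\(\d+\.\d+\)\s+(?:\d+:\s+)?Error\b")


def error_lines(lines):
    """Return the log lines that match the assumed error format."""
    return [line.rstrip("\n") for line in lines if ERROR_LINE.match(line)]
```

For example:
with open("/tmp/checkpoint/restore.log") as f: print(error_lines(f)).
If the format assumption does not hold for a given criu version, a
plain search for "Error" in the file is the fallback.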
> -----Original Message-----
> From: lxc-devel [mailto:lxc-devel-bounces at lists.linuxcontainers.org] On Behalf Of Tycho Andersen
> Sent: Tuesday, September 16, 2014 4:55 PM
> To: LXC development mailing-list
> Cc: criu at openvz.org
> Subject: Re: [lxc-devel] [PATCH] Add support for checkpoint and restore via CRIU
> 
> On Tue, Sep 16, 2014 at 11:11:04PM +0000, Zmudzinski, Krystof C wrote:
> > I've added DECLARE_ARG("--evasive-devices"); in lxccontainer.c/exec_criu and I was finally able to dump the container.
> 
> Ok, I've been trying to produce a situation where this is necessary but I couldn't. Can you paste your lxc configuration file?
> 
> > It also restored but only when both -V and -d were passed to lxc-checkpoint.
> 
> -V isn't an argument to lxc-checkpoint. (Perhaps you mean -v? That /shouldn't/ affect things, it is just logging.) What happens when you don't pass -d?
> 
> > But lxc-stop, lxc-attach, etc. hang  after the container is restored.  But that is expected at this point, isn't it?
> 
> No, those should work. Can you show the output of ps auxf?
> 
> > The interesting part is that something like this is not needed but it 
> > is used in run.sh
> > 
> > DECLARE_ARG("-n net -n mnt -n ipc -n pid");
> 
> That's not needed, it is an old criu option (CRIU's wiki is outdated).
> 
> > Lastly, could criu dump the entire command line to the logs when it is executed?  So the beginning of the log starts with something like:
> > 
> > (00.000047) ========================================
> > (00.000057) /usr/local/sbin/criu dump --tcp-established --evasive-devices --file-locks --link-remap --manage-cgroups.......
> > (00.000087) Dumping processes (pid: 22614)
> > (00.000093) ========================================
> 
> The log itself is generated by criu, so that is probably a question for the criu list, not the lxc list :)
> 
> Tycho
> 
> > Krystof
> > 
> 
> > _______________________________________________
> > lxc-devel mailing list
> > lxc-devel at lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-devel
> 

