[Lxc-users] [Dmtcp-forum] Running DMTCP inside a LXC container

Alexandre Gravier al.gravier at gmail.com
Sat Oct 5 21:24:31 UTC 2013


Hi Kapil,

Thank you for your insights. You are spot on. The /proc/<PID>/maps of
the processes I try to serialise with DMTCP contain the exact list of
incorrect filepaths that DMTCP spits out.
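
For reference, here is a minimal sketch of the check I did by hand,
assuming Python is available inside the container. The function names
are my own (hypothetical, not part of DMTCP or LXC). It lists the
file-backed mappings in a maps dump and reports those that cannot be
found from inside the container:

```python
import os


def mapped_files(maps_text):
    """Return the unique absolute file paths listed in a /proc/<PID>/maps dump."""
    paths = []
    for line in maps_text.splitlines():
        # Each line is: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname column
        path = fields[5].strip()
        # The kernel appends " (deleted)" to unlinked files; drop that suffix.
        if path.endswith(" (deleted)"):
            path = path[: -len(" (deleted)")]
        # Skip pseudo-entries such as [heap], [stack] or [vdso].
        if path.startswith("/") and path not in paths:
            paths.append(path)
    return paths


def missing_files(maps_text, exists=os.path.exists):
    """Return the mapped paths that do not exist from the caller's point of view."""
    return [p for p in mapped_files(maps_text) if not exists(p)]


def missing_files_for_pid(pid="self"):
    """Convenience wrapper: read /proc/<pid>/maps and report unreachable paths."""
    with open("/proc/%s/maps" % pid) as f:
        return missing_files(f.read())
```

Calling missing_files_for_pid(<PID>) on a process started under
dmtcp_launch should print the same paths that DMTCP complains about,
if the diagnosis above is right.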

Making checkpoints does generate non-empty *.dmtcp files together with
the restart scripts, but running dmtcp_restart_script.sh coredumps
with the following (expected) last words:

[30211] mtcp_restart_nolibc.c:1053 fix_filename_if_new_cwd:
  error 13 creating directory /var/lib/lxc in path of
/var/lib/lxc/rootfs/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
Segmentation fault (core dumped)

I do not know the answers to your other questions about LXC and path
translation.
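
The only idea I can offer on translation is a guess: if the host-side
prefix of the container's root is known, stripping it might recover the
path as seen from inside. A minimal Python sketch under that assumption
follows. The function name and the prefix list are hypothetical; the
prefixes are simply the ones that appear in the error messages in this
thread, and nothing here is an LXC or DMTCP API.

```python
import posixpath


def to_container_path(host_path, host_prefixes):
    """Map a host-side absolute path to the path seen inside the container.

    host_prefixes is a list of host directories assumed to back the
    container's root (an assumption, e.g. "/var/lib/lxc/rootfs").
    Paths that match no prefix are returned unchanged.
    """
    for prefix in host_prefixes:
        prefix = prefix.rstrip("/")
        if host_path == prefix:
            return "/"
        if host_path.startswith(prefix + "/"):
            # Keep only the part below the prefix and re-root it at "/".
            rest = host_path[len(prefix):].lstrip("/")
            return posixpath.normpath("/" + rest)
    return host_path
```

With the prefix "/var/lib/lxc/rootfs", the gconv-modules.cache path in
the error above would map to
/usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache. Whether the real
layout (rootfs plus overlay) is that simple, I cannot say.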

Thanks again for your help,
Alexandre

On Sat, Oct 5, 2013 at 10:51 PM, Kapil Arya <kapil at ccs.neu.edu> wrote:
> Hi Alexandre,
>
> Thanks for contacting us about this issue.
>
> We haven't tried DMTCP with LXC earlier so we can't be sure about the
> situation. However, I guess you are seeing these warnings because
> /proc/self/maps is listing those files with absolute paths. Can you verify
> this by looking at /proc/self/maps of a process running inside the
> container?
>
> In any case, was DMTCP able to generate checkpoint images? Also, what is a
> good way to figure out whether a process is running inside a container?
> Also, is there a way to translate between absolute and _fake_ paths?
>
> We might need to write a small DMTCP plugin for LXC that takes care of this.
>
> Kapil
>
>
> On Sat, Oct 5, 2013 at 4:31 PM, Alexandre Gravier <al.gravier at gmail.com>
> wrote:
>>
>> Hello,
>>
>> I'm trying to use DMTCP to save my user sessions on an online service
>> that offers LXC virtual machines. The service shuts down the container
>> after 15 minutes of inactivity on my side, and I would like to be able
>> to save my tmux session and all related processes to disk before that
>> happens.
>>
>> So, I've compiled DMTCP 2.0, and 40 out of 45 tests pass*. I figured
>> that I may try to do something simple first, so I launch the
>> coordinator in a terminal, and `bin/dmtcp_launch cat` in another. The
>> coordinator notes that "NOTE at dmtcp_coordinator.cpp:1039 in
>> onConnect; REASON='worker connected'". It all seems fine, and the
>> clients list contains my cat.
>>
>> The problem comes when I actually make a checkpoint using 'c' in the
>> coordinator. The coordinator goes through what seems to be a
>> reasonable list of stages, from "starting checkpoint, suspending all
>> nodes" to "restarting all nodes", but in the terminal where the
>> dmtcp-monitored "cat" is running, there is an avalanche of errors that
>> all look like:
>>
>> [11795] mtcp_writeckpt.c:718 write_ckpt_to_file:
>>   ERROR: error statting
>> /var/lib/lxc/vm-524f2a6f88716f1d3800785a/overlay/home/agravier/dmtcp-2.0/plugin/ipc/libdmtcp_ipc.so : No such file or directory
>> [11795] mtcp_writeckpt.c:718 write_ckpt_to_file:
>>   ERROR: error statting
>> /var/lib/lxc/rootfs/lib/x86_64-linux-gnu/ld-2.17.so : No such file or
>> directory
>>
>> And there are dozens of those.
>>
>> Evidently, MTCP is looking up files by their absolute paths *outside*
>> the LXC container, which, of course, do not correspond to their
>> paths inside it.
>>
>> I would like to understand how LXC chroots in a way that still allows
>> MTCP to get hold of the original filepaths. Why does MTCP get confused
>> when other applications don't? Or rather, why doesn't it get confused
>> by LXC :) ?
>>
>> Is there anything that I might do about all that?
>>
>> Thanks and regards,
>> Alexandre
>>
>>
>> * Failing tests: shared-memory (rstr), pty2, bash (rstr), script, screen.
>>
>>
>
>



