[Lxc-users] lxc-start leaves temporary pivot dir behind
Daniel Lezcano
daniel.lezcano at free.fr
Tue May 11 16:22:13 UTC 2010
Ferenc Wagner wrote:
> Daniel Lezcano <daniel.lezcano at free.fr> writes:
>
>
>> Ferenc Wagner wrote:
>>
>>
>>> Daniel Lezcano <daniel.lezcano at free.fr> writes:
>>>
>>>
>>>> We can't simply remove it because of the pivot_root which returns
>>>> EBUSY. I suppose it's coming from: "new_root and put_old must not
>>>> be on the same file system as the current root."
>>>>
>>> Hmm, this could indeed be a problem if lxc.rootfs is on the current root
>>> file system. I didn't consider pivoting to the same FS, but looks like
>>> this is the very reason for the current complexity in the architecture.
>>>
>>> Btw. is this really a safe thing to do, to pivot into a subdirectory of
>>> a file system? Is there really no way out of that?
>>>
>> It seems pivot_root on the same fs works if an intermediate mount
>> point is inserted between old_root and new_root but at the cost of
>> having a lazy unmount when we unmount the old rootfs filesystems.
>>
>
> After pivoting? Could you please illustrate this?
>
After the pivot_root syscall, we have oldroot and newroot.
oldroot sits underneath newroot, so after pivot_root it is still
reachable as /oldroot.
We want to umount the oldroot dir of course, but before umounting it, we
have to umount all the mount points below it.
When everything below is unmounted, we finish by umounting /oldroot
itself. But in some circumstances, this umount fails with EBUSY, so we
"detach" the mount lazily with the MNT_DETACH option.
http://sourceforge.net/mailarchive/message.php?msg_name=4B5B6DA5.6050302%40free.fr
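
The ordering described above can be sketched as follows. This is only an illustration, not the lxc code (which is C): the names umount_order and umount_old_root are hypothetical, and the umount2 argument is assumed to be a caller-supplied callable that raises OSError on failure. MNT_DETACH = 2 is the Linux value of the flag.

```python
import errno

MNT_DETACH = 2  # Linux value from <sys/mount.h>: lazy ("detached") unmount


def umount_order(mount_points, old_root="/oldroot"):
    """Return old_root's submounts deepest-first, with old_root itself last."""
    root = old_root.rstrip("/")
    below = [m for m in mount_points if m.startswith(root + "/")]
    # A deeper path has more components, so it has to be unmounted earlier.
    below.sort(key=lambda m: m.count("/"), reverse=True)
    return below + [root]


def umount_old_root(mount_points, umount2, old_root="/oldroot"):
    """Unmount everything under old_root; detach old_root only on EBUSY.

    umount2(path, flags) is a hypothetical caller-supplied wrapper that
    raises OSError on failure, as a wrapper around umount2(2) would.
    """
    order = umount_order(mount_points, old_root)
    for path in order[:-1]:
        umount2(path, 0)          # submounts first, deepest one first
    try:
        umount2(order[-1], 0)     # then the old root itself
        return "unmounted"
    except OSError as err:
        if err.errno != errno.EBUSY:
            raise
        umount2(order[-1], MNT_DETACH)  # still busy: fall back to lazy umount
        return "detached"
```

On a real system the callable could wrap the umount2(2) syscall (for instance through ctypes) and would need root privileges; passing it in keeps the ordering logic testable without mounting anything.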
>> I am looking at making it possible to specify a rootfs which is a
>> file system image or a block device. I am not sure this should be
>> done by lxc but looking forward ...
>>
>
> A device could be easily mounted by the user or by an lxc.mount.entry,
> so I don't think it needs special consideration.
>
There is still the automatic detection of the image's file system type
to consider when the image is specified in the mount entry.
I already coded that, but we can postpone it for the moment and focus
on the pivot_root.
>>>> But as we will pivot_root right after, we won't reuse the real
>>>> rootfs, so we can safely use the host /tmp.
>>>>
>>> That will cause problems if rootfs is under /tmp, don't you think?
>>>
>> Right :)
>>
>
> Btw. my use case is exactly that: I mostly want to prune the namespace
> of the container, so I bind mount / to /tmp/.../jail and a couple of
> things (but not everything!) below that, and set rootfs=/tmp/.../jail.
>
Ok, will fix that.
>>> Actually, I'm not sure you can fully solve this. If rootfs is a
>>> separate file system, this is only much ado about nothing. If rootfs
>>> isn't a separate filesystem, you can't automatically find a good
>>> place and also clean it up.
>>>
>> Maybe a single /tmp/lxc directory could be used, since the mount
>> points are private to the container. So it would be acceptable to
>> have a single directory for N containers, no?
>>
>
> Then why not /usr/lib/lxc/pivotdir or something like that? Such a
> directory could belong to the lxc package and not clutter up /tmp. As
> you pointed out, this directory would always be empty in the outer name
> space, so a single one would suffice. Thus there would be no need
> to clean it up, either.
>
Agreed. Shall we consider $(prefix)/var/run/lxc?
>>> So why not require that rootfs is a separate filesystem, and let the
>>> user deal with it by doing the necessary bind mount in the lxc
>>> config?
>>>
>>
>> Hmm, that will break the existing user configurations.
>>
>
> Yes, sadly.
>
>
>> We can add a WARNING if rootfs is not a separate file system and
>> still let the user do whatever he wants. IMO if it is well
>> documented it is not a problem.
>>
>
> Sure. It adds some complexity to the code, but lxc is there to help
> with common tasks. Now the question is: if rootfs is a separate file
> system (which includes bind mounts), is the superfluous rbind of the
> original root worth skipping, or should we just do it to avoid needing
> an extra code path?
>
Good question. IMO, skipping the rbind is OK in this case, but from a
coding point of view it may be useful to have a single, well-identified
place for the rootfs (especially for mounting an image). I will
cook up a patchset to fix the rootfs location and then we can look at
removing the superfluous rbind.
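
For the WARNING discussed above, one way to decide whether rootfs is "a separate file system (which includes bind mounts)" is to look it up in /proc/self/mountinfo, since a bind mount from the same file system has the same device number as its parent and would be missed by a plain st_dev comparison. A minimal sketch, assuming the standard mountinfo layout where the fifth field is the mount point; the function names are hypothetical and this is not what lxc does:

```python
import os
import re


def _unescape(field):
    # mountinfo writes space, tab, newline and backslash as \ooo octal escapes.
    return re.sub(r"\\([0-7]{3})", lambda m: chr(int(m.group(1), 8)), field)


def mount_points(mountinfo_text):
    """Return the set of mount points listed in mountinfo-formatted text."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) > 4:
            points.add(_unescape(fields[4]))  # 5th field is the mount point
    return points


def rootfs_is_mount_point(rootfs, mountinfo_text):
    """True if rootfs is itself a mount point (bind mounts count too)."""
    return os.path.normpath(rootfs) in mount_points(mountinfo_text)


def check_rootfs(rootfs, mountinfo_text):
    """Return a warning string, or None when rootfs is a separate mount."""
    if rootfs_is_mount_point(rootfs, mountinfo_text):
        return None
    return "WARNING: rootfs '%s' is not a separate mount point" % rootfs
```

The text would normally come from reading /proc/self/mountinfo. The sketch does not resolve symlinks in rootfs, so a caller may want to pass it through os.path.realpath first.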