[lxc-devel] [PATCH 1/1] pivot_root: switch to a new mechanism (v2)

Andy Lutomirski luto at amacapital.net
Mon Sep 29 23:13:38 UTC 2014


On Mon, Sep 29, 2014 at 4:07 PM, Eric W. Biederman
<ebiederm at xmission.com> wrote:
> Andy Lutomirski <luto at amacapital.net> writes:
>
>> On Mon, Sep 29, 2014 at 3:46 PM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
>>> Quoting Andy Lutomirski (luto at amacapital.net):
>>>> On Mon, Sep 29, 2014 at 2:46 PM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
>>>> I'm not sure that "/" is well-defined.  You have oldroot mounted on
>>>
>>> Whoa.  Seems you're right.  I would have expected it to mean precisely
>>> the dentry+vfsmount which I pivot-rooted to.  Which have been overmounted,
>>> so umount(/) would umount what's been mounted over them.
>>>
>>>> top of newroot, and "/" refers to one of them (presumably oldroot on
>>>> newer kernels, and maybe newroot on older kernels).
>>>
>>> So it seems.
>>>
>>>> I think that you
>>>> want to unmount oldroot, leaving only newroot mounted.  When you call
>>>> umount2, "." reliably refers to oldroot.
>>>
>>> Right
>>>
>>>> /me wonders whether there's a vulnerability here on new kernels if the
>>>> test were adjusted a bit.  mnt_ns oughtn't to be NULL, right?
>>>
>>> Wouldn't it be in the older kernels though?  That's where mnt_ns ends
>>> up being null.  So from 3.8..3.11 an unpriv user (through CLONE_NEWUSER)
>>> can do a pivot_root causing null MNT_NS, and presumably find an interesting
>>> way to dereference it.
>>
>> Eric?
>>
>> I wonder what happens if you unmount new_root on new kernels...
>
> There is chroot_fs_refs so it is clear that "/" is well defined after
> pivot_root.  I thought that expensive loop over all of the tasks
> had been removed at some point, but it got hidden in an innocuous
> function call instead. :(
>
>
> As I recall, what happens when you unmount "/" is that you get into a
> very weird state: chroot_fs_refs isn't called, so you have a case
> where "/" refers to a lazily unmounted filesystem, or the unmount
> implicitly becomes a remount read-only.  Which smells like a userns
> permission bug.
>
> I am looking at this related issue at the moment.
> https://github.com/avagin/userns_vs_mntns

To me, this smells like MNT_DETACH does something awful when there are
mounts under the detached mount.

For example:

mount --rbind / /mnt
umount -l /mnt

does *not* end well on my system.  I find it hard to believe that this
behavior is intentional.
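
For reference, the sequence discussed earlier in the thread (oldroot
ends up mounted on top of newroot, and "." is used so that umount2
reliably hits oldroot) could be sketched roughly as below.  This is
only an illustration under assumptions, not the code from the patch:
the helper name pivot_and_detach_oldroot is made up, and calling
pivot_root with "." for both arguments is my reading of how oldroot
gets stacked on newroot.  It needs CAP_SYS_ADMIN in a private mount
namespace, and newroot_path must itself be a mount point.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and shape are assumptions, not from the patch).
 * Returns 0 on success, -1 on failure with errno set by the failing call.
 */
int pivot_and_detach_oldroot(const char *newroot_path)
{
    int oldroot = -1, newroot = -1, ret = -1;

    /* Hold fds on both roots so we never have to resolve "/" afterwards. */
    oldroot = open("/", O_DIRECTORY | O_RDONLY);
    if (oldroot < 0)
        goto out;
    newroot = open(newroot_path, O_DIRECTORY | O_RDONLY);
    if (newroot < 0)
        goto out;

    /* Enter newroot, then stack oldroot on top of it. */
    if (fchdir(newroot) < 0)
        goto out;
    /* glibc has no pivot_root() wrapper, so use the raw syscall. */
    if (syscall(SYS_pivot_root, ".", ".") < 0)
        goto out;

    /* Make "." refer to oldroot explicitly, then lazily detach it. */
    if (fchdir(oldroot) < 0)
        goto out;
    if (umount2(".", MNT_DETACH) < 0)
        goto out;

    /* Finish with the cwd inside newroot. */
    if (fchdir(newroot) < 0)
        goto out;

    ret = 0;
out:
    if (oldroot >= 0)
        close(oldroot);
    if (newroot >= 0)
        close(newroot);
    return ret;
}
```

The point of going through fds and "." is to avoid asking the kernel
what "/" means while two mounts are stacked there.  Note that this
still relies on MNT_DETACH, so whatever is wrong in the rbind example
above may matter here too if oldroot has submounts.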

--Andy

