[lxc-devel] [PATCH 1/1] pivot_root: switch to a new mechanism (v2)
Andy Lutomirski
luto at amacapital.net
Mon Sep 29 23:21:48 UTC 2014
On Mon, Sep 29, 2014 at 4:13 PM, Andy Lutomirski <luto at amacapital.net> wrote:
> On Mon, Sep 29, 2014 at 4:07 PM, Eric W. Biederman
> <ebiederm at xmission.com> wrote:
>> Andy Lutomirski <luto at amacapital.net> writes:
>>
>>> On Mon, Sep 29, 2014 at 3:46 PM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
>>>> Quoting Andy Lutomirski (luto at amacapital.net):
>>>>> On Mon, Sep 29, 2014 at 2:46 PM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
>>>>> I'm not sure that "/" is well-defined. You have oldroot mounted on
>>>>
>>>> Whoa. Seems you're right. I would have expected it to mean precisely
>>>> the dentry+vfsmount which I pivot-rooted to. Which have been overmounted,
>>>> so umount(/) would umount what's been mounted over them.
>>>>
>>>>> top of newroot, and "/" refers to one of them (presumably oldroot on
>>>>> newer kernels, and maybe newroot on older kernels).
>>>>
>>>> So it seems.
>>>>
>>>>> I think that you
>>>>> want to unmount oldroot, leaving only newroot mounted. When you call
>>>>> umount2, "." reliably refers to oldroot.
>>>>
>>>> Right
>>>>
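For reference, the sequence being discussed can be sketched roughly as below. This is only an illustration, not the code from the patch: the helper name pivot_and_drop_oldroot is made up, and the error handling is minimal. It relies on the point made above that, after pivot_root(".", "."), "." is what resolves to oldroot in the umount2 call, while "/" is ambiguous.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (name invented for this sketch).
 * Returns 0 on success, -1 with errno set on failure.
 * Needs CAP_SYS_ADMIN in the current mount namespace, and new_root
 * must already be a mount point.
 */
int pivot_and_drop_oldroot(const char *new_root)
{
    if (chdir(new_root) < 0)
        return -1;

    /* glibc has no pivot_root() wrapper, so go through syscall().
     * This stacks oldroot on top of newroot at the same path. */
    if (syscall(SYS_pivot_root, ".", ".") < 0)
        return -1;

    /* Unmount via "." rather than "/": "." is the name that
     * resolves to oldroot at this point. */
    if (umount2(".", MNT_DETACH) < 0)
        return -1;

    /* Re-resolve the working directory inside the new root. */
    return chdir("/");
}
```

A caller would invoke it once, early in container setup, e.g. pivot_and_drop_oldroot("/path/to/rootfs") (path hypothetical), and treat any -1 as fatal.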
>>>>> /me wonders whether there's a vulnerability here on new kernels if the
>>>>> test were adjusted a bit. mnt_ns oughtn't to be NULL, right?
>>>>
>>>> Wouldn't it be in the older kernels though? That's where mnt_ns ends
>>>> up being null. So from 3.8..3.11 an unpriv user (through CLONE_NEWUSER)
>>>> can do a pivot_root causing null MNT_NS, and presumably find an interesting
>>>> way to dereference it.
>>>
>>> Eric?
>>>
>>> I wonder what happens if you unmount new_root on new kernels...
>>
>> There is chroot_fs_refs so it is clear that "/" is well defined after
>> pivot_root. I thought that expensive loop over all of the tasks
>> had been removed at some point but it got hidden in an innocuous function
>> call instead. :(
>>
>>
>> As I recall what happens when you unmount "/" is that you get into a
>> very weird state where chroot_fs_refs isn't called, so you have a case
>> where "/" refers to a lazily unmounted filesystem or the unmount
>> implicitly becomes a remount read-only. Which smells like a userns
>> permission bug.
My initial attempt to make it blow up failed. But IIRC the really
weird one was mounting anything on "/" -- you end up with pwd, root,
and ns root being hopelessly out of sync.
>>
>> I am looking at this related issue at the moment.
>> https://github.com/avagin/userns_vs_mntns
>
> To me, this smells like MNT_DETACH does something awful when there are
> mounts under the detached mount.
>
> For example:
>
> mount --rbind / /mnt
> umount -l /mnt
>
> does *not* end well on my system. I find it hard to believe that this
> behavior is intentional.
By which I mean that this unmounts the world:
# mount --make-rshared /
# mount --rbind / /mnt
# umount -l /mnt
Dunno whether it's related.
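
The --make-rshared step is probably what matters here: with shared propagation, the mounts under /mnt are peers of the originals, so detaching them can propagate back to the real tree. A quick way to tell whether a given mount is shared is to look for a "shared:N" tag in its /proc/self/mountinfo line. The parser below is only a sketch; the function name is invented, and it assumes the usual mountinfo layout (six fixed fields, then optional fields, then a lone "-").

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/*
 * Hypothetical helper (name invented for this sketch).
 * Takes one line of /proc/self/mountinfo and returns:
 *   1  if the mount carries a "shared:N" optional field,
 *   0  if it does not,
 *  -1  if the line is too long or has no "-" separator.
 */
int mountinfo_line_is_shared(const char *line)
{
    char buf[4096];
    char *save = NULL;
    int field = 0;

    if (strlen(line) >= sizeof(buf))
        return -1;
    strcpy(buf, line);

    for (char *tok = strtok_r(buf, " \n", &save); tok != NULL;
         tok = strtok_r(NULL, " \n", &save)) {
        field++;
        if (field <= 6)
            continue;           /* ID, parent, dev, root, mount point, options */
        if (strcmp(tok, "-") == 0)
            return 0;           /* end of optional fields, nothing shared */
        if (strncmp(tok, "shared:", 7) == 0)
            return 1;
    }
    return -1;                  /* never saw the "-" separator */
}
```

Feeding it each line read from /proc/self/mountinfo before and after the --make-rshared step shows which mounts changed propagation type.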
--Andy