[lxc-devel] [PATCH 1/1] pivot_root: switch to a new mechanism (v2)

Dwight Engen dwight.engen at oracle.com
Wed Oct 1 22:04:25 UTC 2014


On Mon, 29 Sep 2014 22:46:26 +0000
Serge Hallyn <serge.hallyn at ubuntu.com> wrote:

> Quoting Andy Lutomirski (luto at amacapital.net):
> > On Mon, Sep 29, 2014 at 2:46 PM, Serge Hallyn
> > <serge.hallyn at ubuntu.com> wrote:
> > I'm not sure that "/" is well-defined.  You have oldroot mounted on
> 
> Whoa.  Seems you're right.  I would have expected it to mean precisely
> the dentry+vfsmount which I pivot-rooted to.  Which have been
> overmounted, so umount(/) would umount what's been mounted over them.
> 
> > top of newroot, and "/" refers to one of them (presumably oldroot on
> > newer kernels, and maybe newroot on older kernels). 
> 
> So it seems.
> 
> > I think that you
> > want to unmount oldroot, leaving only newroot mounted.  When you
> > call umount2, "." reliably refers to oldroot.
> 
> Right
> 
> > /me wonders whether there's a vulnerability here on new kernels if
> > the test were adjusted a bit.  mnt_ns oughtn't to be NULL, right?
> 
> Wouldn't it be in the older kernels though?  That's where mnt_ns ends
> up being null.  So from 3.8..3.11 an unpriv user (through
> CLONE_NEWUSER) can do a pivot_root causing null MNT_NS, and
> presumably find an interesting way to dereference it.

Yeah, the mnt_ns being NULL seems strange to me, but I can't tell
whether that is by design or not. The commit that changed the behavior
between 3.11 and 3.12 is:

8033426e6bdb2690d302 vfs: allow umount to handle mountpoints without revalidating them

I added some printk debugging to see what was going on. Prior to this
change, it looks like the umount2("/", MNT_DETACH) isn't really
working: the kern_path() walk in do_mount() afterwards gets you back
to the same struct mount whose mnt_ns field was NULL'ed out by the
umount2. After the above change, it's a different struct mount (with a
non-NULL mnt_ns), and thus the check_mnt(parent) in do_add_mount()
works and a mount can succeed.
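
To make the failure mode concrete, here is a toy userspace model of the
two checks as I read them. It is not kernel code: the toy_* names are
made up for illustration, and the real check_mnt() compares against the
current task's mount namespace rather than taking it as an argument.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel's mount namespace and struct mount. */
struct toy_ns { int id; };
struct toy_mount { struct toy_ns *mnt_ns; };

/* Models check_mnt(): the mount must belong to the caller's namespace. */
static bool toy_check_mnt(const struct toy_mount *m, const struct toy_ns *cur)
{
    return m->mnt_ns == cur;
}

/* Models IS_MNT_NEW(): a mount with no namespace attached. */
static bool toy_is_mnt_new(const struct toy_mount *m)
{
    return m->mnt_ns == NULL;
}

/* Models the effect of the detach described above: mnt_ns is cleared. */
static void toy_detach(struct toy_mount *m)
{
    m->mnt_ns = NULL;
}
```

In this model, once toy_detach() has run on a mount, toy_check_mnt()
fails for every namespace, which matches a later mount on top of that
struct mount being refused.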

However, I didn't see how having a NULL mnt_ns could be exploited. In
fact, it looks to me like the code is set up to handle mnt->mnt_ns
being NULL (i.e., see IS_MNT_NEW()), but someone who understands the
namespace code better than I do could probably say.

> > >> I'm currently having trouble finding an old enough box.  Can you
> > >> try the attached fancier test and see what it prints?
> > >
> > > Exact same as mine:
> > >
> > > ubuntu@kvm-p3:~$ sudo ./x
> > > pivoted
> > > in new root
> > > I am 1441
> > > root@kvm-p3:/# mount --bind /mnt /mnt
> > 
> > Ah, OK, I completely misunderstood your original email.
> > 
> > If I change umount2 to umount "." instead of "/" in my code, the
> > subsequent mount --bind works for me on 3.2.
> 
> Same here, so I can push a fix for lxc - thanks!
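
For anyone following along, the sequence with umount of "." can be
sketched roughly as below. This is only an illustration, not the actual
patch: pivot_to_new_root() and pivot_root_sys() are hypothetical names,
and it assumes the caller is already in its own mount namespace and
that rootfs is a mount point. The MS_PRIVATE step is the one mentioned
further down in the thread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc provides no pivot_root() wrapper, so go through syscall(2). */
static int pivot_root_sys(const char *new_root, const char *put_old)
{
    return (int)syscall(SYS_pivot_root, new_root, put_old);
}

/*
 * Hypothetical helper: make rootfs the new root and detach the old one.
 * Assumes a private mount namespace (e.g. after unshare(CLONE_NEWNS)).
 * Returns 0 on success, -1 with errno set on failure.
 */
int pivot_to_new_root(const char *rootfs)
{
    int oldroot = -1, newroot = -1, ret = -1, saved;

    /* Hold both directories by fd so neither depends on what "/" means. */
    oldroot = open("/", O_DIRECTORY | O_RDONLY);
    if (oldroot < 0)
        goto out;
    newroot = open(rootfs, O_DIRECTORY | O_RDONLY);
    if (newroot < 0)
        goto out;

    /* Stop mount events from propagating out of this namespace. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
        goto out;

    /* Enter the new root; the old root ends up stacked on top of it. */
    if (fchdir(newroot) < 0)
        goto out;
    if (pivot_root_sys(".", ".") < 0)
        goto out;

    /* Go back to the old root via its fd and detach "." instead of "/". */
    if (fchdir(oldroot) < 0)
        goto out;
    if (umount2(".", MNT_DETACH) < 0)
        goto out;

    /* Finish with the working directory inside the new root. */
    if (fchdir(newroot) < 0)
        goto out;
    ret = 0;
out:
    saved = errno;
    if (oldroot >= 0)
        close(oldroot);
    if (newroot >= 0)
        close(newroot);
    errno = saved;
    return ret;
}
```

Using fchdir() on a saved fd before the umount2 is one way to make sure
"." names the old root rather than relying on how "/" resolves.
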
> 
> > FWIW, your test does awful, awful things if I don't do the
> > MS_PRIVATE thing on top.
> 
> D'oh.  Sorry about that.
> 
> -serge
> _______________________________________________
> lxc-devel mailing list
> lxc-devel at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-devel


