[lxc-devel] [PATCH 1/1] pivot_root: switch to a new mechanism (v2)

Serge Hallyn serge.hallyn at ubuntu.com
Mon Sep 29 21:46:45 UTC 2014


Quoting Andy Lutomirski (luto at amacapital.net):
> On Mon, Sep 29, 2014 at 1:55 PM, Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
> > Quoting Dwight Engen (dwight.engen at oracle.com):
> >> On Sat, 20 Sep 2014 03:15:44 +0000
> >> Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
> >>
> >> > This idea came from Andy Lutomirski.  Instead of using a
> >> > temporary directory for the pivot_root put-old, use "." both
> >> > for new-root and old-root.  Then fchdir into the old root
> >> > temporarily in order to unmount the old-root, and finally
> >> > chdir back into our '/'.
> >> >
> >> > Drop lxc.pivotdir from the lxc.container.conf manpage.
> >> >
> >> > Warn when we see a lxc.pivotdir entry (but keep it in the
> >> > lxc.conf for now).
> >> >
> >> > Signed-off-by: Serge Hallyn <serge.hallyn at ubuntu.com>
> >>
> >> Hey Serge
> >>
> >> Just a heads up that this change makes my containers not start on a
> >> 3.8.13 kernel. Reverting this change or using 3.13.11 kernel (just
> >> tried a newer one I had handy) works. I'll try to debug further.
> >> The log from trying to start a busybox container:
> >
> > Fascinating.
> >
> > It seems this is the case from at least v3.2 to v3.11.
> >
> > If my systemtap experiment isn't completely wrong, do_mount's
> > call to kern_path() for any path to mount is returning a path
> > with mnt->mnt_ns = NULL.  But it doesn't return an error.  So
> > later on, when do_loopback checks for the path's mnt_ns being
> > the same as current's, that fails since mnt_ns is NULL.
> >
> > The easiest way to reproduce is to use the program below, call it
> > 'pivot.c", build it, and run "sudo ./pivot", then in the resulting bash
> > shell try to, say, "mount --bind /mnt /mnt".
> >
> > I've reached the end of my ability to system-tap, so I'm going
> > to build a kernel with some printk'ing :) and continue in awhile.
> >
> > Note that editing the program to do pivot_root(".", "/mnt") and
> > umount2("/", MNT_DETACH) does NOT cause a problem.
> >
> > ===================================================================
> >
> > #include <sys/types.h>
> > #include <sys/stat.h>
> > #include <fcntl.h>
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <unistd.h>
> > #include <sys/mount.h>
> > #include <linux/sched.h>
> >
> >
> > int main()
> > {
> >         int ret;
> >         ret = unshare(CLONE_NEWNS);
> >         if (ret)
> >                 exit(1);
> >         int oldroot, newroot;
> >
> >         ret = mount("/", "/mnt", "none", MS_BIND | MS_REC, NULL);
> >         if (ret)
> >                 exit(1);
> >
> >         oldroot = open("/", O_DIRECTORY | O_RDONLY);
> >         if (oldroot < 0)
> >                 exit(1);
> >         newroot = open("/mnt", O_DIRECTORY | O_RDONLY);
> >         if (newroot < 0)
> >                 exit(1);
> >         if (fchdir(newroot))
> >                 exit(1);
> >
> >         if (pivot_root(".", "."))
> >                 exit(2);
> >         printf("pivoted");
> >
> >         if (fchdir(oldroot) < 0)
> >                 exit(3);
> >         if (umount2("/", MNT_DETACH) < 0)
> >                 exit(3);
> 
> Shouldn't that be umount2(".", MNT_DETACH)?

No, unless I'm misreading the state at this point:

I want / to be umounted, . is the oldroot.  We've got 

"." = oldroot but not referencable from / (and not pinning it)

"/" is the oldroot mounted over the newroot.

> I'm currently having trouble finding an old enough box.  Can you try
> the attached fancier test and see what it prints?

Exact same as mine:

ubuntu at kvm-p3:~$ sudo ./x
pivoted
in new root
I am 1441
root at kvm-p3:/# mount --bind /mnt /mnt
mount: wrong fs type, bad option, bad superblock on /mnt,
	missing codepage or helper program, or other error
	In some cases useful info is found in syslog - try
	dmesg | tail  or so

-serge


More information about the lxc-devel mailing list