[lxc-devel] A question about chroot_into_slave()

Serge Hallyn serge.hallyn at ubuntu.com
Fri Oct 3 14:24:55 UTC 2014


Quoting Andrey Wagin (avagin at gmail.com):
> 2014-10-03 1:23 GMT+04:00 Serge Hallyn <serge.hallyn at ubuntu.com>:
> > Quoting Andrey Wagin (avagin at gmail.com):
> >> 2014-10-02 20:25 GMT+04:00 Serge Hallyn <serge.hallyn at ubuntu.com>:
> >> > Quoting Andrey Wagin (avagin at gmail.com):
> >> >> Hi All,
> >> >>
> >> >> chroot_into_slave() is called if the root / is ramfs.
> >> >>
> >> >> chroot_into_slave() mounts a tmpfs, creates a directory there and
> >> >> bind-mounts the host root into this directory.
> >> >> _ host_root
> >> >>  \_ tmp
> >> >>   \_ host_root (bind)
> >> >>    \_ ct_root
> >> >>
> >> >> Then pivot_root() exchanges "host_root (bind)" and "ct_root".
> >> >> After that "host_root (bind)" is unmounted.
> >> >> _host_root
> >> >>  \_ tmp
> >> >>    \_ct_root
> >> >>
> >> >> Here is my question: why can't we do chroot(conf->rootfs.mount)
> >> >> instead of chroot_into_slave() & pivot_root(conf->rootfs.mount)? I
> >> >> think the result would be the same, with fewer non-obvious
> >> >> steps. Have I missed something?
> >> >
> >> > Do you mean not to pivot_root at all?  If so, because chroot is not
> >> > an adequate replacement:  you can trivially escape it, and /proc/mounts
> >> > will be polluted.
> >>
> >> /proc/ will not be polluted:
> >>
> >> root at ubuntu:/home/avagin# chroot centos
> >> [root at ubuntu /]# mount -t proc proc /proc/
> >> [root at ubuntu /]# cat /proc/self/mountinfo
> >> 64 21 0:3 / /proc rw,relatime - proc proc rw
> >>
> >> root at ubuntu:/home/avagin# cat /proc/9529/mountinfo
> >> 64 21 0:3 / /proc rw,relatime - proc proc rw
> >
> > cat /proc/$$/mounts
> >
> >> Now I want to show you that chroot_into_slave() & pivot_root()
> >> don't protect you against escaping from a container. Let's repeat
> >> all the actions that lxc performs for a root on ramfs. I don't have a
> >> host with root on ramfs, so I do all the actions in my VM.
> >>
> >> # chroot_into_slave()
> >> root at ubuntu:/home/avagin# unshare -m -- /bin/bash
> >> root at ubuntu:/home/avagin# mount --bind rootfs rootfs
> >> root at ubuntu:/home/avagin# mount --make-slave rootfs
> >> root at ubuntu:/home/avagin# mount -t tmpfs tmpfs rootfs
> >> root at ubuntu:/home/avagin# mkdir rootfs/root
> >> root at ubuntu:/home/avagin# mount --rbind / rootfs/root/
> >> root at ubuntu:/home/avagin# mount --make-rslave rootfs/
> >> root at ubuntu:/home/avagin# chroot rootfs/root/
> >> root at ubuntu:/# cd /
> >>
> >> # mount_rootfs() & pivot_root()
> >> root at ubuntu:/# mount /home/avagin/centos /home/avagin/rootfs/
> >> mount: /home/avagin/centos is not a block device
> >> root at ubuntu:/# mount --bind /home/avagin/centos /home/avagin/rootfs/
> >> root at ubuntu:/# cd /home/avagin/rootfs/
> >> root at ubuntu:/home/avagin/rootfs# mkdir old
> >> root at ubuntu:/home/avagin/rootfs# pivot_root . old
> >> root at ubuntu:/home/avagin/rootfs# cd /
> >> root at ubuntu:/# cat /etc/redhat-release
> >> CentOS release 5.10 (Final)
> >> root at ubuntu:/# mount -t proc proc /proc/
> >> root at ubuntu:/# umount -l old/
> >> root at ubuntu:/# cat /proc/self/mountinfo
> >> 153 125 8:1 /home/avagin/centos / rw,relatime - ext4 /dev/disk/by-uuid/291e7b1a-8396-44cf-9927-578b3401d0bd rw,errors=remount-ro,data=ordered
> >> 154 153 0:3 / /proc rw,relatime - proc proc rw
> >>
> >> Now let's try to escape from this CT. I take the code from
> >> http://www.bpfh.net/simes/computing/chroot-break.html
> >>
> >> root at ubuntu:/# ls -l /etc/redhat-release
> >> -rw-r--r-- 1 root root 28 Oct  7  2013 /etc/redhat-release
> >> root at ubuntu:/# ./breaking_chroot
> >> # ls -l /etc/redhat-release
> >> ls: cannot access /etc/redhat-release: No such file or directory
> >> #
> >> # cat /etc/lsb-release
> >> DISTRIB_ID=Ubuntu
> >> DISTRIB_RELEASE=14.04
> >> DISTRIB_CODENAME=trusty
> >> DISTRIB_DESCRIPTION="Ubuntu 14.04 LTS"
> >>
> >> It works. We are able to escape from CT.
> >
> > It doesn't work on my system.  I'm curious what the difference might be.
> 
> Did you use lxc-start to start a container? If so, did you force it


No, I did it by hand using the same steps you did above.  For
rootfs I used a utopic container rootfs sitting under
/var/lib/lxc/lb/rootfs


> to call chroot_into_slave()? The problem exists only if
> chroot_into_slave() is executed.
> 
> I use the following patch, because I don't have a host with the root on ramfs.
> diff --git a/src/lxc/conf.c b/src/lxc/conf.c
> index e8979c9..2d7ced9 100644
> --- a/src/lxc/conf.c
> +++ b/src/lxc/conf.c
> @@ -3952,7 +3952,8 @@ int do_rootfs_setup(struct lxc_conf *conf, const char *name, const char *lxcpath
>                 }
>         }
> 
> -       if (detect_ramfs_rootfs()) {
> +       if (1 || detect_ramfs_rootfs()) {
> +                       ERROR("Failed to chroot into slave /");
>                 if (chroot_into_slave(conf)) {
>                         ERROR("Failed to chroot into slave /");
>                         return -1;
> 
> >
> > what kernel are you running?  What is /proc/1/mountinfo on the host?
> 
> I can show you this data from a CRIU user (riya khanna), who has a
> problem with dumping his containers. Actually, I started thinking
> about these things to investigate his question.
> 
> From root namespace
> ------------------------------
> 
> # cat /proc/self/mountinfo
> 
> 1 1 0:2 / / rw - rootfs rootfs rw,size=373124k,nr_inodes=93281
> 10 1 0:4 / /proc rw,relatime - proc proc rw
> 11 1 0:11 / /sys rw,relatime - sysfs sysfs rw
> 12 11 0:12 / /sys/fs/cgroup rw,relatime - cgroup none rw,cpuset,debug,cpu,cpuacct,memory,devices,freezer,blkio,perf_event,clone_children
> 13 11 0:6 / /sys/kernel/debug rw,relatime - debugfs none rw
> 14 1 0:10 / /dev/pts rw,relatime - devpts devpts rw,mode=600,ptmxmode=000
> 
> 
> # lxc-info -n container
> 
> Name:           container
> State:          RUNNING
> PID:            5489
> CPU use:        487.35 seconds
> Memory use:     27.86 MiB
> KMem use:       0 bytes
> 
> # cat /proc/self/mountinfo
> 
> 26 26 0:2 / / rw - rootfs rootfs rw,size=373124k,nr_inodes=93281
> 27 26 0:4 / /proc rw,relatime - proc proc rw
> 28 26 0:11 / /sys rw,relatime - sysfs sysfs rw
> 29 28 0:12 / /sys/fs/cgroup rw,relatime - cgroup none rw,cpuset,debug,cpu,cpuacct,memory,devices,freezer,blkio,perf_event,clone_children
> 30 28 0:6 / /sys/kernel/debug rw,relatime - debugfs none rw
> 31 26 0:10 / /dev/pts rw,relatime - devpts devpts rw,mode=600,ptmxmode=000
> 35 26 0:2 /usr/local/lib/lxc/rootfs /usr/local/lib/lxc/rootfs rw - rootfs rootfs rw,size=373124k,nr_inodes=93281
> 36 35 0:16 / /usr/local/lib/lxc/rootfs rw,relatime - tmpfs none rw,size=12k,mode=755
> 25 36 0:13 / /usr/local/lib/lxc/rootfs/root rw,relatime - tmpfs tmpfs rw
> 47 25 0:15 / /usr/local/lib/lxc/rootfs/root/proc rw,relatime - proc none rw
> 48 25 0:14 / /usr/local/lib/lxc/rootfs/root/sys rw,relatime - sysfs none rw
> 49 25 0:2 /dev /usr/local/lib/lxc/rootfs/root/dev rw,relatime - rootfs rootfs rw,size=373124k,nr_inodes=93281
> 37 49 0:17 / /usr/local/lib/lxc/rootfs/root/dev rw,nosuid,relatime - tmpfs tmpfs rw,mode=755
> 38 37 0:10 / /usr/local/lib/lxc/rootfs/root/dev/pts rw,relatime - devpts devpts rw,mode=600,ptmxmode=000
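
For reference, the "root / is ramfs" condition discussed in this thread
shows up in the host mountinfo quoted above as a "rootfs" entry mounted
on /. The following is a minimal sketch of such a check in C.
root_is_rootfs() is a hypothetical name, and this is only a guess at the
kind of test detect_ramfs_rootfs() performs, not its actual code. It
assumes one complete mountinfo record per line and a mount point that
does not itself contain " - ".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an lxc function): return 1 if the
 * mountinfo-format file at `path` has an entry mounted on "/" whose
 * filesystem type is "rootfs", 0 if it has none, and -1 if the file
 * cannot be opened.
 */
int root_is_rootfs(const char *path)
{
	char line[4096];
	int found = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		char mnt_point[4096];
		char fstype[64];
		const char *sep;

		/* Fields: id, parent id, major:minor, root, mount point, ... */
		if (sscanf(line, "%*s %*s %*s %*s %4095s", mnt_point) != 1)
			continue;
		if (strcmp(mnt_point, "/") != 0)
			continue;

		/* The filesystem type is the first field after " - ". */
		sep = strstr(line, " - ");
		if (sep && sscanf(sep + 3, "%63s", fstype) == 1 &&
		    strcmp(fstype, "rootfs") == 0) {
			found = 1;
			break;
		}
	}

	fclose(f);
	return found;
}
```

Calling root_is_rootfs("/proc/self/mountinfo") on the host above should
report 1, since its first entry is "1 1 0:2 / / rw - rootfs rootfs ...".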

Ok now precisely which version of lxc is that?  If built from git,
then which git commit?  We recently switched the way we do pivot_root,
and for a few commits we were umounting "/" instead of "." for
umount old_root.  So it's possible that either updating to current
git HEAD (to get commit 479a4f14c) or resetting to commit
01db019751 would change your behavior.
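
To illustrate the "/" versus "." distinction, here is a rough sketch of
an fd-based pivot_root sequence. It is written from memory of the
general pattern, not copied from lxc; pivot_and_detach_old_root() is a
hypothetical name. It assumes the caller is root in its own mount
namespace and that new_root is already a mount point.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the lxc function): pivot into new_root and
 * detach the old root. Returns 0 on success, -1 on failure.
 */
int pivot_and_detach_old_root(const char *new_root)
{
	int ret = -1;
	int oldroot = -1;
	int newroot = open(new_root, O_DIRECTORY | O_RDONLY);

	if (newroot < 0)
		return -1; /* nothing has been changed yet */

	oldroot = open("/", O_DIRECTORY | O_RDONLY);
	if (oldroot < 0)
		goto out;

	/* Make the new root the cwd, then stack the old root on top of it. */
	if (fchdir(newroot) < 0)
		goto out;
	if (syscall(SYS_pivot_root, ".", ".") < 0)
		goto out;

	/*
	 * Return to the old root through the saved fd. After the pivot,
	 * "/" names the new root, so the old root has to be addressed
	 * as "." for the unmount.
	 */
	if (fchdir(oldroot) < 0)
		goto out;
	if (mount("", ".", "", MS_SLAVE | MS_REC, NULL) < 0)
		goto out;
	if (umount2(".", MNT_DETACH) < 0)
		goto out;

	/* Finish with the cwd inside the new root. */
	if (fchdir(newroot) < 0)
		goto out;
	ret = 0;
out:
	if (oldroot >= 0)
		close(oldroot);
	close(newroot);
	return ret;
}
```

With this ordering, passing "/" to umount2() in place of "." would
target the new root rather than the old one.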

-serge

