[lxc-devel] [PATCH] Support MS_SHARED /
Alexander Vladimirov
alexander.idkfa.vladimirov at gmail.com
Sun Jan 6 06:22:14 UTC 2013
I also noticed that device nodes get strange permissions when /dev is
being auto-populated:
[idkfa at lxc0 ~]$ ls -la /dev/{null,tty,urandom,zero,full}
crwxr-xr-x 1 root root 1, 7 Jan 6 05:56 /dev/full
crwxr-xr-x 1 root root 1, 3 Jan 6 05:56 /dev/null
crwxr-xr-x 1 root root 5, 0 Jan 6 05:56 /dev/tty
crwxr-xr-x 1 root root 1, 9 Jan 6 05:56 /dev/urandom
crwxr-xr-x 1 root root 1, 5 Jan 6 05:56 /dev/zero
I'm not really sure what could cause this.
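
To make the oddity concrete: the listing shows mode 0755 on character devices, where 0666 is the usual mode for nodes like /dev/null. The sketch below decodes both with Python's stat.filemode and shows one purely hypothetical way to arrive at 0755 (a node created with mode 0777 under a umask of 022). That cause is a guess for illustration only, not something confirmed in the lxc source, and `effective_mode` is a made-up helper name.

```python
import stat


def effective_mode(requested: int, umask: int) -> int:
    """Permission bits left after the umask is applied at node creation."""
    return requested & ~umask & 0o7777


# Mode shown in the listing above for /dev/null and friends.
observed = stat.S_IFCHR | 0o755
# Mode these nodes conventionally carry on a typical host.
conventional = stat.S_IFCHR | 0o666

print(stat.filemode(observed))      # crwxr-xr-x
print(stat.filemode(conventional))  # crw-rw-rw-

# Hypothetical cause (an assumption, not verified): mode 0777 with umask 022.
guess = stat.S_IFCHR | effective_mode(0o777, 0o022)
print(stat.filemode(guess) == stat.filemode(observed))  # True
```

If the guess holds, either a different requested mode or an explicit chmod after creation would give the expected crw-rw-rw-.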
2013/1/6 Michael H. Warfield <mhw at wittsend.com>:
> On Sun, 2013-01-06 at 06:39 +0800, Alexander Vladimirov wrote:
>> It is a separate package in Arch Linux and I don't have it installed on
>> the host or in the container, since everything works well without
>> it.
>
> Well, that would explain it. What isn't explained is why we need it.
>
> This is the run_makedev() function which is called from setup_autodev()
> in src/lxc/setup.c just before it tries to populate the .../dev
> directory in the container. There are some comments in there about making
> sure the /dev/vcs* entries are created.
>
> It's also not clear to me if it's even doing what it purports to do. It
> changes to the dev directory and then runs /sbin/MAKEDEV (without
> checking if it even exists) without a parameter (-d) for the target
> directory which would seem to me to cause MAKEDEV to attempt to create
> the devices in the host /dev and not the container .../dev directory at
> all. That actually appears consistent with the behavior I'm seeing. If
> I reboot the host system, all those tty devices do not exist in the host
> until after I fire up a container with autodev enabled. Then they
> appear in the host /dev which is not the correct behavior.
>
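
A minimal sketch of the two checks discussed above (does MAKEDEV exist, and is it told which directory to populate), written in Python for brevity rather than as a patch to run_makedev(). The helper name `build_makedev_argv` and the container dev path are hypothetical; `-d` and the `console` target are taken from this thread, and their exact semantics should be checked against the MAKEDEV shipped by the distribution.

```python
import os
from typing import List, Optional


def build_makedev_argv(dev_dir: str,
                       makedev: str = "/sbin/MAKEDEV") -> Optional[List[str]]:
    """Return an argv that targets dev_dir, or None if MAKEDEV is unusable."""
    # Skip quietly when the helper is missing (e.g. Arch without the package).
    if not (os.path.isfile(makedev) and os.access(makedev, os.X_OK)):
        return None
    # Pass the target directory explicitly so the host /dev is left alone.
    return [makedev, "-d", dev_dir, "console"]


if __name__ == "__main__":
    # Hypothetical container dev directory, for illustration only.
    argv = build_makedev_argv("/var/lib/lxc/example/rootfs/dev")
    print(argv if argv else "MAKEDEV not available, skipping")
```

The function only builds the command; whether to run it at all is the open question in this thread.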
> I don't think we should be doing this, but it is part of the earlier
> autodev patches Serge did for systemd that went into 0.9.0.a1. Maybe
> it's a difference in behavior between MAKEDEV on Ubuntu and MAKEDEV on
> Fedora (et al.), and MAKEDEV is not even guaranteed to exist.
>
> Serge?
>
> Regards,
> Mike
>
>> 2013/1/6 Michael H. Warfield <mhw at wittsend.com>:
>> > On Sun, 2013-01-06 at 06:31 +0800, Alexander Vladimirov wrote:
>> >> I can confirm it works for Arch Linux with systemd 196
>> >> However I see exactly one message saying:
>> >> sh: /sbin/MAKEDEV: No such file or directory
>> >
>> > Do you have /sbin/MAKEDEV in the host system? If not, that would make
>> > sense. I'm not sure what it's supposed to be doing in lxc.
>> >
>> > Regards,
>> > Mike
>> >
>> >> 2013/1/6 Michael H. Warfield <mhw at wittsend.com>
>> >> Hey Serge!
>> >>
>> >> Took longer for me to test this out on Fedora 18 Beta than I had
>> >> expected. I got tangled up trying to get bridge networking working and
>> >> my day job wanted to get in my way... :-P I hear that F18 final
>> >> has been delayed again but is anticipated for Jan 15. I'll test that when
>> >> it becomes available.
>> >>
>> >> IAC... I was able to confirm that the 0.9.0.a2 cut very definitely
>> >> fails on an F18Beta host with the expected pivot root error and that the
>> >> code in staging does seem to work and seems to do the right thing. This
>> >> was starting an F17 container on an F18Beta host with autodev enabled
>> >> and systemd 195 running in the container.
>> >>
>> >> I did notice a huge "pile" of MAKEDEV errors creating tty devices when I
>> >> ran lxc-start like these:
>> >>
>> >> --
>> >> MAKEDEV: /dev/ttyEQ1001: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1002: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1003: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1004: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1005: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1006: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1007: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1008: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1009: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1010: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1011: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1012: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1013: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1014: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1015: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1016: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1017: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1018: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1019: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1020: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1021: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1022: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1023: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1024: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1025: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1026: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyEQ1027: unable to set file creation context "system_u:object_r:tty_device_t:s0"
>> >> MAKEDEV: /dev/ttyUB0: unable to set file creation context "system_u:object_r:device_t:s0"
>> >> MAKEDEV: /dev/ttyUB1: unable to set file creation context "system_u:object_r:device_t:s0"
>> >> <30>systemd[1]: systemd 195 running in system mode. (+PAM +LIBWRAP +AUDIT +SELINUX +IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ; fedora)
>> >> <30>systemd[1]: Detected virtualization 'lxc'.
>> >>
>> >> Welcome to Fedora 17 (Beefy Miracle)!
>> >>
>> >> <30>systemd[1]: Set hostname to <alcove.wittsend.com>.
>> >> <28>systemd[1]: Cannot add dependency job for unit display-manager.service, ignoring: Unit display-manager.service failed to load: No such file or directory. See system logs and 'systemctl status display-manager.service' for details.
>> >> <30>systemd[1]: Started Collect Read-Ahead Data.
>> >> <30>systemd[1]: Started Replay Read-Ahead Data.
>> >> <30>systemd[1]: Starting Forward Password Requests to Wall Directory Watch.
>> >> <30>systemd[1]: Started Forward Password Requests to Wall Directory Watch.
>> >> <30>systemd[1]: Starting Syslog Socket.
>> >> [ OK ] Listening on Syslog Socket.
>> >> --
>> >>
>> >> It certainly appears to have done the right thing and that same
>> >> container on an F17 host does not emit those MAKEDEV errors and does not
>> >> contain those tty devices. Looks like an selinux issue inside the
>> >> container. But it's happening even when I set selinux to "permissive"
>> >> mode in both the host and container. Seems cosmetic, however. Nothing
>> >> showing up in the syslog messages file on either the host or the
>> >> container.
>> >>
>> >> I see a call to "/sbin/MAKEDEV console" in src/lxc/conf.c. Not sure if
>> >> it's that call that's generating the problem but there is no MAKEDEV in
>> >> the container. It's interesting that they're showing up before systemd
>> >> in the container is announcing its presence. Looks like it's running
>> >> the MAKEDEV command in the host environment and, if I run "MAKEDEV
>> >> console" in the host itself, I get a couple thousand of those tty
>> >> devices created in the host /dev, that were not present before, and I
>> >> don't get any of the context errors... Might be worth looking into just
>> >> to see what all the noise is all about.
>> >>
>> >> IAC... Looks like it works on F18Beta. I'm good.
>> >>
>> >> Regards,
>> >> Mike
>> >>
>> >> On Thu, 2012-12-27 at 22:45 -0500, Michael H. Warfield wrote:
>> >> > On Thu, 2012-12-20 at 09:03 -0600, Serge Hallyn wrote:
>> >> > > Quoting Stéphane Graber (stgraber at ubuntu.com):
>> >> > > > On 12/20/2012 06:58 AM, Serge Hallyn wrote:
>> >> > > ...
>> >> > > > /proc/mounts in the container will also end up being polluted by all the
>> >> > > > mount points from the host, this in itself doesn't cause any big
>> >> > > > problem, though the container will try (and fail) to unmount all of those.
>> >> > > > Is there anything we can do to improve that situation or is that a side
>> >> > > > effect of MS_SHARED that we can't workaround on our end?
>> >>
>> >> > > I think it's actually a side effect of pivot-root after chroot. You
>> >> > > have /orig_root/foo/chroot_root/path/new_pivot/put_old. Then you
>> >> > > chroot to /orig_root/foo/chroot_root. When you then pivot to
>> >> > > /path/new_pivot, what ends up in put_old is /orig_root/foo/chroot_root.
>> >> > > I'm actually not sure you can trim the mounts which were under
>> >> > > /orig_root. We could figure out which ones they are by following the chain
>> >> > > of mount ids in /proc/self/mountinfo, but we can't reach them to umount
>> >> > > them.
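
The "follow the chain of mount ids" idea can be sketched as below. This is a Python illustration, not lxc code: `parse_mountinfo` and `ancestor_chain` are hypothetical names, and the sample lines are copied from the Fedora 18 mountinfo posted further down in this thread. Field positions follow proc(5): mount ID, parent ID, major:minor, root, mount point.

```python
from typing import Dict, List, Tuple

SAMPLE = """\
16 34 0:14 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
23 16 0:18 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:9 - tmpfs tmpfs rw,seclabel,mode=755
27 23 0:22 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,memory
34 1 253:1 / / rw,relatime shared:1 - ext4 /dev/mapper/fedora_dwarf52-root rw,seclabel,data=ordered
"""


def parse_mountinfo(text: str) -> Dict[int, Tuple[int, str]]:
    """Map mount ID -> (parent ID, mount point)."""
    table: Dict[int, Tuple[int, str]] = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        table[int(fields[0])] = (int(fields[1]), fields[4])
    return table


def ancestor_chain(table: Dict[int, Tuple[int, str]], mount_id: int) -> List[str]:
    """Mount points from mount_id up to the last ancestor that has an entry."""
    chain: List[str] = []
    seen = set()
    while mount_id in table and mount_id not in seen:
        seen.add(mount_id)  # guard against a self-referencing root entry
        parent, mount_point = table[mount_id]
        chain.append(mount_point)
        mount_id = parent
    return chain


table = parse_mountinfo(SAMPLE)
print(ancestor_chain(table, 27))
# ['/sys/fs/cgroup/memory', '/sys/fs/cgroup', '/sys', '/']
```

The walk stops at a parent ID (1 here) that has no entry of its own, which is consistent with the point above: such a mount can be identified in the chain but not reached in order to umount it.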
>> >>
>> >> > > It's much like how when you boot a livecd, you see things like
>> >> > > the rootfs on / as well as /cow on /. You can't reach the rootfs
>> >> > > which is parent of the /cow on / any more, but it's in the mounts
>> >> > > table.
>> >> >
>> >> > > Now I tested, and with a simple setup we can use a much simpler
>> >> > > patch which just does mount("", "/", NULL, MS_SLAVE|MS_REC, 0);
>> >> > > for the whole of chroot_into_slave() (and skips the new umount2()
>> >> > > in start.c). The container then starts, and its mounts table
>> >> > > is clean.
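
For reference, the single call described above looks like this when driven from Python through ctypes. It is a sketch only: `make_rslave` is a hypothetical name, the flag values are the ones from <sys/mount.h> on Linux, and actually calling it requires root (CAP_SYS_ADMIN), ideally inside a private mount namespace so the host's propagation settings are not changed.

```python
import ctypes
import os

# Values from <sys/mount.h> on Linux.
MS_REC = 0x4000       # apply recursively to the whole subtree
MS_SLAVE = 1 << 19    # receive propagation from the peer group, send none


def make_rslave(target: bytes = b"/") -> None:
    """Equivalent of mount("", target, NULL, MS_SLAVE|MS_REC, 0)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p,
                           ctypes.c_char_p, ctypes.c_ulong, ctypes.c_void_p]
    libc.mount.restype = ctypes.c_int
    # Source, filesystem type and data are ignored for propagation changes.
    if libc.mount(b"", target, None, MS_SLAVE | MS_REC, None) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


if __name__ == "__main__":
    # Only print the combined flags; the mount call itself is not made here.
    print(hex(MS_SLAVE | MS_REC))  # 0x84000
```

The function is deliberately not invoked in the example, since it changes mount propagation for the calling process's namespace.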
>> >> >
>> >> > > Where that won't work is in a livecd or any fancy raid setup,
>> >> > > where your process's / has a parent which is MS_SHARED.
>> >> >
>> >> > > Michael, can you show me your /proc/self/mountinfo in a f18
>> >> > > box?
>> >> >
>> >> > Freshly installed clean box...
>> >> >
>> >> > [root at dwarf52 mhw]# cat /proc/self/mountinfo
>> >> > 15 34 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
>> >> > 16 34 0:14 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw,seclabel
>> >> > 17 34 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=491520k,nr_inodes=122880,mode=755
>> >> > 18 16 0:15 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw
>> >> > 19 16 0:13 / /sys/fs/selinux rw,relatime shared:8 - selinuxfs selinuxfs rw
>> >> > 20 17 0:16 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw,seclabel
>> >> > 21 17 0:10 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,seclabel,gid=5,mode=620,ptmxmode=000
>> >> > 22 34 0:17 / /run rw,nosuid,nodev shared:19 - tmpfs tmpfs rw,seclabel,mode=755
>> >> > 23 16 0:18 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:9 - tmpfs tmpfs rw,seclabel,mode=755
>> >> > 24 23 0:19 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
>> >> > 25 23 0:20 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,cpuset
>> >> > 26 23 0:21 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,cpuacct,cpu
>> >> > 27 23 0:22 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,memory
>> >> > 28 23 0:23 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,devices
>> >> > 29 23 0:24 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,freezer
>> >> > 30 23 0:25 / /sys/fs/cgroup/net_cls rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,net_cls
>> >> > 31 23 0:26 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,blkio
>> >> > 32 23 0:27 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:18 - cgroup cgroup rw,perf_event
>> >> > 34 1 253:1 / / rw,relatime shared:1 - ext4 /dev/mapper/fedora_dwarf52-root rw,seclabel,data=ordered
>> >> > 35 15 0:29 / /proc/sys/fs/binfmt_misc rw,relatime shared:20 - autofs systemd-1 rw,fd=29,pgrp=1,timeout=300,minproto=5,maxproto=5,direct
>> >> > 37 16 0:30 / /sys/kernel/config rw,relatime shared:21 - configfs configfs rw
>> >> > 39 17 0:31 / /dev/hugepages rw,relatime shared:22 - hugetlbfs hugetlbfs rw,seclabel
>> >> > 38 17 0:12 / /dev/mqueue rw,relatime shared:23 - mqueue mqueue rw,seclabel
>> >> > 36 16 0:7 / /sys/kernel/debug rw,relatime shared:24 - debugfs debugfs rw
>> >> > 40 34 0:32 / /tmp rw shared:25 - tmpfs tmpfs rw,seclabel
>> >> > 41 34 8:1 / /boot rw,relatime shared:26 - ext4 /dev/sda1 rw,seclabel,data=ordered
>> >> > 42 34 253:2 / /home rw,relatime shared:27 - ext4 /dev/mapper/fedora_dwarf52-home rw,seclabel,data=ordered
>> >> > 74 22 0:33 / /run/user/1000/gvfs rw,nosuid,nodev,relatime shared:57 - fuse.gvfsd-fuse gvfsd-fuse rw,user_id=1000,group_id=1000
>> >> > 76 16 0:34 / /sys/fs/fuse/connections rw,relatime shared:59 - fusectl fusectl rw
>> >> >
>> >> > Looks like everything has "shared".
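
That observation can be checked mechanically. The sketch below (Python; `propagation` is a hypothetical helper) reads the optional fields that sit between the mount options and the lone "-" separator in each mountinfo line and reports the propagation tags it finds, falling back to "private" when there are none, as proc(5) describes. The sample lines are taken from the listing above.

```python
from typing import List, Tuple

SAMPLE = """\
15 34 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
17 34 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,seclabel,size=491520k,nr_inodes=122880,mode=755
34 1 253:1 / / rw,relatime shared:1 - ext4 /dev/mapper/fedora_dwarf52-root rw,seclabel,data=ordered
"""

# Optional-field tags documented in proc(5).
KNOWN_TAGS = ("shared", "master", "propagate_from", "unbindable")


def propagation(line: str) -> Tuple[str, List[str]]:
    """Return (mount point, propagation tags) for one mountinfo line."""
    fields = line.split()
    sep = fields.index("-")      # a lone hyphen ends the optional fields
    optional = fields[6:sep]     # fields 0-5 are fixed, the rest are optional
    tags = [f for f in optional if f.split(":")[0] in KNOWN_TAGS]
    return fields[4], (tags or ["private"])


report = dict(propagation(l) for l in SAMPLE.splitlines() if l.strip())
all_shared = all(any(t.startswith("shared:") for t in tags)
                 for tags in report.values())
print(report)
print(all_shared)  # True
```

On a live system the same function could be fed the lines of /proc/self/mountinfo instead of the sample.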
>> >> >
>> >> > I'll be testing lxc on this beast with and without this patch over the
>> >> > next couple of days for both systemd and non-systemd containers. I've
>> >> > got to get 0.9.0a2 built on it first and then go from there.
>> >> >
>> >> > > > I didn't spend much time reviewing the code itself, but it applied to my
>> >> > > > local staging tree and built fine, so that's good enough for me :)
>> >> >
>> >> > > Thanks - TBH the extra mounts are no more wrong than they are in
>> >> > > a livecd, so I don't think it's a big problem. One we can address
>> >> > > in January.
>> >> >
>> >> > > -serge
>> >> >
>> >> > Hope you (and everyone else) had a nice holiday!
>> >> >
>> >> > Regards,
>> >> > Mike
>> >>
>> >> >
>> >> > _______________________________________________
>> >> > Lxc-devel mailing list
>> >> > Lxc-devel at lists.sourceforge.net
>> >> > https://lists.sourceforge.net/lists/listinfo/lxc-devel
>> >>
>> >>
>> >> --
>> >> Michael H. Warfield (AI4NB) | (770) 985-6132 | mhw at WittsEnd.com
>> >> /\/\|=mhw=|\/\/ | (678) 463-0932 | http://www.wittsend.com/mhw/
>> >> NIC whois: MHW9 | An optimist believes we live in the best of all
>> >> PGP Key: 0x674627FF | possible worlds. A pessimist is sure of it!
>> >>