[lxc-devel] Strange problem (stray mounts) with lxc-create...

Michael H. Warfield mhw at WittsEnd.com
Mon Oct 14 18:31:43 UTC 2013


Hey Serge,

Was out of town the last several days.  Sorry about not getting back
sooner.  Just getting back to this now...

Because my big server, Hydra, has a lot of running containers on it now,
I switched testing over to another server, MtKing, that has no other
containers on it so I had a cleaner mount table to work from.  Hydra was
just too cluttered at this point.

On Wed, 2013-10-09 at 12:08 -0500, Serge Hallyn wrote: 
> Quoting Michael H. Warfield (mhw at WittsEnd.com):
> > On Wed, 2013-10-09 at 10:10 -0500, Serge Hallyn wrote: 
> > > Quoting Michael H. Warfield (mhw at WittsEnd.com):
> > > > On Wed, 2013-10-09 at 09:50 -0500, Serge Hallyn wrote: 
> > > > > > lxc-create -n Ubuntu-test -t ubuntu
> > > > > > 
> > > > > > Bingo...
> > > > > > 
> > > > > > /dev/mapper/fedora-root on /usr/lib64/lxc/rootfs type ext4 (rw,relatime,seclabel,data=ordered)
> > > > > > 
> > > > > > Why is lxc-create even creating that mount?  I don't see any reason for
> > > > > 
> > > > > Check lxccontainer.c:785 and line 805.  We call bdev_mount() in case it's
> > > > > a blockdev.  In the case of a dir-backed container we still end up doing
> > > > > a bind mount of the rootfs.
> > > > > 
> > > > > > it.  We're never running the container in lxc-create.  Running
> > > > > > "umount /usr/lib64/lxc/rootfs" clears it and we're off to the races
> > > > > > again.
> > > > > > 
> > > > > > If I were to venture a WAG (Wild Ass Guess) some initialization code is
> > > > > > creating that bind mount that is not needed and that the cleanup code in
> > > > > > lxc-create is unaware of.  But I haven't gone to the trouble of trying
> > > > > > to track the code down yet.
> > > > 
> > > > > Now is your / still MS_SHARED?  The bdev create and templates
> > > > > run in a private namespace, but if MS_SHARED then the mounts get
> > > > > bounced back to host.  Maybe we need to manually set MS_PRIVATE every
> > > > > time after doing an unshare() in lxc code.
> > > > 
> > > > It doesn't seem to be...  Am I looking in the right spot?  I don't see
> > > > it in the options...
> > > > 
> > > > [root at hydra mhw]# mount | grep ' / '
> > > > /dev/mapper/fedora-root on / type ext4 (rw,relatime,seclabel,data=ordered)
> > 
> > > What does /proc/self/mountinfo on the host look like?

Crud...  Now I see I looked in /proc/self/mounts
not /proc/self/mountinfo.  My bad.

> > [root at hydra mhw]# grep ' / ' /proc/self/mounts
> > rootfs / rootfs rw 0 0

> What about 'grep -i shared /proc/self/mount*'?

Gave me everything.  Every entry is "shared".  I could just as well have
done a "cat" rather than a "grep -i".

Ok...  Let's do this formally...  Clean machine (MtKing) with no
containers running...

[root at mtking mhw]# grep -i shared /proc/self/mountinfo 
15 34 0:3 / /proc rw,nosuid,nodev,noexec,relatime shared:5 - proc proc rw
16 34 0:14 / /sys rw,nosuid,nodev,noexec,relatime shared:6 - sysfs sysfs rw
17 34 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw,size=1792232k,nr_inodes=448058,mode=755
18 16 0:15 / /sys/kernel/security rw,nosuid,nodev,noexec,relatime shared:7 - securityfs securityfs rw
20 17 0:16 / /dev/shm rw,nosuid,nodev shared:3 - tmpfs tmpfs rw
21 17 0:10 / /dev/pts rw,nosuid,noexec,relatime shared:4 - devpts devpts rw,gid=5,mode=620,ptmxmode=000
22 34 0:17 / /run rw,nosuid,nodev shared:19 - tmpfs tmpfs rw,mode=755
23 16 0:18 / /sys/fs/cgroup rw,nosuid,nodev,noexec shared:8 - tmpfs tmpfs rw,mode=755
24 23 0:19 / /sys/fs/cgroup/systemd rw,nosuid,nodev,noexec,relatime shared:9 - cgroup cgroup rw,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd
25 16 0:20 / /sys/fs/pstore rw,nosuid,nodev,noexec,relatime shared:18 - pstore pstore rw
26 23 0:21 / /sys/fs/cgroup/cpuset rw,nosuid,nodev,noexec,relatime shared:10 - cgroup cgroup rw,cpuset
27 23 0:22 / /sys/fs/cgroup/cpu,cpuacct rw,nosuid,nodev,noexec,relatime shared:11 - cgroup cgroup rw,cpuacct,cpu
28 23 0:23 / /sys/fs/cgroup/memory rw,nosuid,nodev,noexec,relatime shared:12 - cgroup cgroup rw,memory
29 23 0:24 / /sys/fs/cgroup/devices rw,nosuid,nodev,noexec,relatime shared:13 - cgroup cgroup rw,devices
30 23 0:25 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime shared:14 - cgroup cgroup rw,freezer
31 23 0:26 / /sys/fs/cgroup/net_cls rw,nosuid,nodev,noexec,relatime shared:15 - cgroup cgroup rw,net_cls
32 23 0:27 / /sys/fs/cgroup/blkio rw,nosuid,nodev,noexec,relatime shared:16 - cgroup cgroup rw,blkio
33 23 0:28 / /sys/fs/cgroup/perf_event rw,nosuid,nodev,noexec,relatime shared:17 - cgroup cgroup rw,perf_event
34 1 8:3 / / rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
14 15 0:13 / /proc/sys/fs/binfmt_misc rw,relatime shared:20 - autofs systemd-1 rw,fd=32,pgrp=1,timeout=300,minproto=5,maxproto=5,direct
19 17 0:12 / /dev/mqueue rw,relatime shared:21 - mqueue mqueue rw
35 17 0:29 / /dev/hugepages rw,relatime shared:22 - hugetlbfs hugetlbfs rw
36 16 0:7 / /sys/kernel/debug rw,relatime shared:23 - debugfs debugfs rw
37 34 0:30 / /tmp rw shared:24 - tmpfs tmpfs rw
38 14 0:31 / /proc/sys/fs/binfmt_misc rw,relatime shared:25 - binfmt_misc binfmt_misc rw
39 16 0:32 / /sys/kernel/config rw,relatime shared:26 - configfs configfs rw
41 34 8:1 / /boot rw,relatime shared:27 - ext4 /dev/sda1 rw,stripe=4,data=ordered
42 34 8:5 / /home rw,relatime shared:28 - ext4 /dev/sda5 rw,data=ordered
74 42 0:33 / /home/mhw/Private rw,nosuid,nodev,relatime shared:59 - ecryptfs /home/mhw/.Private rw,ecryptfs_fnek_sig=a3e7ab91e66a3511,ecryptfs_sig=a3ff22626fa0be76,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs
76 22 0:34 / /run/user/1000/gvfs rw,nosuid,nodev,relatime shared:61 - fuse.gvfsd-fuse gvfsd-fuse rw,user_id=1000,group_id=1000
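
For anyone reading the dump above by eye: the propagation state is in the
optional fields that sit between the mount options and the lone "-"
separator.  Here is a minimal Python sketch of pulling that out, assuming
the field layout documented in proc(5).  The names parse_mountinfo_line
and is_shared are hypothetical, chosen only for illustration.

```python
def parse_mountinfo_line(line):
    """Split one /proc/self/mountinfo line into named fields.

    Assumes the proc(5) layout: six fixed fields, zero or more optional
    fields (for example "shared:1"), a lone "-" separator, then the
    filesystem type, mount source and superblock options.
    """
    fields = line.split()
    sep = fields.index("-", 6)  # optional fields end at the separator
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "dev": fields[2],
        "root": fields[3],
        "mount_point": fields[4],
        "options": fields[5],
        "propagation": fields[6:sep],
        "fstype": fields[sep + 1],
        "source": fields[sep + 2],
    }


def is_shared(entry):
    """True if the mount belongs to a shared peer group."""
    return any(tag.startswith("shared:") for tag in entry["propagation"])
```

Fed the root entry from the dump (mount ID 34), this reports mount point
"/" with propagation ["shared:1"], which is the thing that does not show
up in /proc/self/mounts or in plain "mount" output.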

I dumped that to a file (mount.start) and will post diffs against it
below...

Next I ran this...

lxc-create -n Ubuntu.precise -t ubuntu -- -r precise

And I notice this...

Checking cache download in /var/cache/lxc/precise/rootfs-amd64 ... 
Copy /var/cache/lxc/precise/rootfs-amd64 to /usr/lib64/lxc/rootfs ... 
Copying rootfs to /usr/lib64/lxc/rootfs ...

Oooo hooo....  Now I see why lxc-create is creating that bind mount to
begin with...  It's not copying to the "rootfs" path, it's copying to
that mount path.  Ok.  That mystery is solved for me now.  I wouldn't
have done it that way, as it's unnecessary, but it's a reasonable way to
do it.  Judgment call.  Season to taste...

I then added...

lxc-create -n Ubuntu.quantal -t ubuntu -- -r quantal
lxc-create -n Ubuntu.raring -t ubuntu -- -r raring
lxc-create -n Ubuntu.saucy -t ubuntu -- -r saucy

I had dumped the original mount table to "mount.start" and then dumped
the resulting mount table to mount.end and ran a diff:

[root at mtking mhw]# grep -i shared /proc/self/mountinfo  > mount.end
[root at mtking mhw]# diff mount.start mount.end
30a31,40
> 176 34 8:3 /var/lib/lxc/Ubuntu.precise/rootfs /usr/lib64/lxc/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 180 34 8:3 /var/lib/lxc/Ubuntu.quantal/rootfs /var/lib/lxc/Ubuntu.precise/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 189 176 8:3 /var/lib/lxc/Ubuntu.quantal/rootfs /usr/lib64/lxc/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 196 34 8:3 /var/lib/lxc/Ubuntu.raring/rootfs /var/lib/lxc/Ubuntu.quantal/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 204 180 8:3 /var/lib/lxc/Ubuntu.raring/rootfs /var/lib/lxc/Ubuntu.precise/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 209 189 8:3 /var/lib/lxc/Ubuntu.raring/rootfs /usr/lib64/lxc/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 221 34 8:3 /var/lib/lxc/Ubuntu.saucy/rootfs /var/lib/lxc/Ubuntu.raring/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 229 196 8:3 /var/lib/lxc/Ubuntu.saucy/rootfs /var/lib/lxc/Ubuntu.quantal/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 234 204 8:3 /var/lib/lxc/Ubuntu.saucy/rootfs /var/lib/lxc/Ubuntu.precise/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered
> 239 209 8:3 /var/lib/lxc/Ubuntu.saucy/rootfs /usr/lib64/lxc/rootfs rw,relatime shared:1 - ext4 /dev/sda3 rw,data=ordered

Ouch...

The machine should now be back in its starting state.  No containers are
running, yet we have these mounts present.  Yeah, they're all "shared".
WTH, though.  The first create left one dangling mount.  The second one
left two.  The third one left three.  The fourth one left four.  What's
going on here?  That's 10 dangling mounts total for 4 creates!
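
To make the pattern in that diff easier to see, here is a small Python
sketch that tallies the ten added entries by mount point and follows the
parent IDs.  The tuples are copied from the diff output; the helper names
stack_depths and parent_chain are hypothetical.  This only summarizes the
observed table and does not explain why the mounts stack.

```python
from collections import Counter

# (mount ID, parent ID, mount point) copied from the ten added diff lines.
ADDED = [
    (176, 34, "/usr/lib64/lxc/rootfs"),
    (180, 34, "/var/lib/lxc/Ubuntu.precise/rootfs"),
    (189, 176, "/usr/lib64/lxc/rootfs"),
    (196, 34, "/var/lib/lxc/Ubuntu.quantal/rootfs"),
    (204, 180, "/var/lib/lxc/Ubuntu.precise/rootfs"),
    (209, 189, "/usr/lib64/lxc/rootfs"),
    (221, 34, "/var/lib/lxc/Ubuntu.raring/rootfs"),
    (229, 196, "/var/lib/lxc/Ubuntu.quantal/rootfs"),
    (234, 204, "/var/lib/lxc/Ubuntu.precise/rootfs"),
    (239, 209, "/usr/lib64/lxc/rootfs"),
]


def stack_depths(mounts):
    """Count how many mounts sit on each mount point."""
    return Counter(mount_point for _id, _parent, mount_point in mounts)


def parent_chain(mounts, mount_id):
    """Follow parent IDs upward until leaving the given set of mounts."""
    parents = {mid: parent for mid, parent, _mp in mounts}
    chain = [mount_id]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain
```

stack_depths(ADDED) gives 4 for /usr/lib64/lxc/rootfs, 3 for the precise
rootfs, 2 for quantal and 1 for raring, which adds up to the 10.
parent_chain(ADDED, 239) gives [239, 209, 189, 176, 34], so the four
mounts on /usr/lib64/lxc/rootfs are stacked one on top of another above
the root mount (ID 34).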

If I then run "umount /usr/lib64/lxc/rootfs" 4 times (the number of
lxc-creates) ALL of the mounts disappear, so there is some pinning of
one mount to another involved here.  When I do that command until I get
the error "umount: /usr/lib64/lxc/rootfs: not mounted", I'm then back to
my original state.
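
The cleanup described above can be sketched as a loop in Python.  It has
to run as root, and it treats any non-zero exit from umount as "nothing
left to unmount", which is an assumption.  The function name and the run
and limit parameters are hypothetical; run is only there so the loop can
be exercised without touching a real mount table.

```python
import subprocess


def umount_until_clear(target, run=subprocess.run, limit=32):
    """Call "umount target" until it fails; return the number of successes.

    limit is a safety cap so a misbehaving system cannot loop forever.
    """
    count = 0
    while count < limit:
        result = run(["umount", target], capture_output=True, text=True)
        if result.returncode != 0:
            break  # e.g. "umount: ...: not mounted"
        count += 1
    return count
```

On the machine above, umount_until_clear("/usr/lib64/lxc/rootfs") would be
expected to return 4 after the four lxc-create runs.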

Host is a current Fedora 19 host.

> > /dev/mapper/fedora-root / ext4 rw,seclabel,relatime,data=ordered 0 0
> > [root at hydra mhw]# grep ' /usr/lib64/lxc/rootfs ' /proc/self/mounts
> > /dev/mapper/fedora-root /usr/lib64/lxc/rootfs ext4 rw,seclabel,relatime,data=ordered 0 0

> What about /proc/self/mountinfo?

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 985-6132 |  mhw at WittsEnd.com
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!