[lxc-devel] mounts...
Michael Tokarev
mjt at tls.msk.ru
Sat Nov 14 11:54:20 UTC 2009
Hello!
Several questions here if I can... ;)
Why can't mountpoints in the per-container fstab be relative
to the container's rootfs? It would be trivial to implement by
allowing non-absolute pathnames there and chdir'ing into
the rootfs prior to mounting. That already works when running
lxc-start from within the container's rootfs.
I think it's the way to go really - to _require_ non-absolute
mountpoints in the container's mount file. Partly because it's
not a good idea to mount to a directory which is not visible from
within a container (but it might be useful still to grant access
to only part of the filesystem to a given container). And partly
because it's just somewhat ugly.
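To make the idea concrete, here is a minimal sketch of resolving a relative mountpoint against the rootfs. It is only an illustration, not lxc code: the function name resolve_mountpoint is made up, and the check is purely lexical, so a symlink inside the rootfs could still point outside it.

```python
import os.path


def resolve_mountpoint(rootfs, mountpoint):
    """Resolve a relative fstab mountpoint against the container's rootfs.

    Hypothetical helper for illustration; not part of lxc.
    """
    if os.path.isabs(mountpoint):
        raise ValueError("mountpoint must be relative: %r" % mountpoint)
    root = os.path.normpath(rootfs)
    target = os.path.normpath(os.path.join(root, mountpoint))
    # Reject entries such as "../etc" that would land outside the rootfs.
    # This is a lexical check only; symlinks are not resolved here.
    if target != root and not target.startswith(root.rstrip("/") + "/"):
        raise ValueError("mountpoint escapes the rootfs: %r" % mountpoint)
    return target
```

With a rootfs of /guest/lenny/t0, an entry "proc" would resolve to /guest/lenny/t0/proc, while "/proc" or "../../etc" would be rejected.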
The second question is about the "other" mountpoints that exist on
the host system when starting a container. Is it a good idea to
umount "unrelated" filesystems that are not used in a container
but are still shown in /proc/mounts? I mean, is there a way to
access these from within a container somehow, bypassing the
"container barrier"?
The whole mount tree in a container is quite ugly too. Just try
to run df(1) from a container to see why. Ideally the whole namespace
tree should be cleared to remove all the unrelated mounts, but that
isn't quite possible as long as some ttys or the console are mounted from
the host's /dev, as long as we're using /var/lib/lxc/$name as a starting
point, and as long as the container's rootfs itself is not on a dedicated
partition. But seriously, being able to run df(1) normally inside a
container is - IMHO - worth some effort. Here's an example from my
test system:
Filesystem           1K-blocks      Used Available Use% Mounted on
rootfs                67076224  17666768  49409456  27% /
/dev/md1              67076224  17666768  49409456  27% /
tmpfs                 67076224  17666768  49409456  27% /lib/init/rw
sysfs                 67076224  17666768  49409456  27% /sys
devfs                     1024         0      1024   0% /dev
tmpfs                     1024         0      1024   0% /dev/shm
cgroup                    1024         0      1024   0% /dev/cgroup
/dev/md2              67076224  17666768  49409456  27% /usr
/dev/md3              67076224  17666768  49409456  27% /var
varrun                67076224  17666768  49409456  27% /var/run
varloc                67076224  17666768  49409456  27% /var/lock
df: `/guest': No such file or directory
df: `/stage': No such file or directory
/tmp                  67076224  17666768  49409456  27% /tmp
df: `/guest/lenny/t0/dev': No such file or directory
df: `/guest/lenny/t0/dev/console': No such file or directory
df: `/guest/lenny/t0/dev/tty1': No such file or directory
df: `/guest/lenny/t0/dev/tty2': No such file or directory
df: `/guest/lenny/t0/dev/tty3': No such file or directory
df: `/guest/lenny/t0/dev/tty4': No such file or directory
/dev/md8p11           67076224  17666768  49409456  27% /
devfs                     1024         0      1024   0% /dev
That's just... nonsense ;)
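As a rough illustration of what "unrelated" could mean here, the sketch below splits the mountpoints of a /proc/mounts-style listing into those at or below the container's rootfs and everything else. It only classifies the entries; whether the unrelated ones can actually be detached is a separate question. The function name split_mounts is an assumption for this example, and it ignores the octal escapes /proc/mounts uses for spaces in paths.

```python
def split_mounts(mounts_text, rootfs):
    """Split mountpoints into (related, unrelated) relative to rootfs.

    Hypothetical helper for illustration; mounts_text uses the
    /proc/mounts layout, where the second field is the mountpoint.
    """
    root = rootfs.rstrip("/") or "/"
    prefix = "/" if root == "/" else root + "/"
    related, unrelated = [], []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        mountpoint = fields[1]
        if mountpoint == root or mountpoint.startswith(prefix):
            related.append(mountpoint)
        else:
            unrelated.append(mountpoint)

    def depth(path):
        return 0 if path == "/" else path.count("/")

    # Deepest paths first, so nested mounts come before their parents.
    unrelated.sort(key=depth, reverse=True)
    return related, unrelated
```

For a container rooted at /guest/lenny/t0, host mounts such as /usr, /var and /var/run would end up in the unrelated list, with /var/run ordered ahead of its parent.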
Another question is about what a container looks like from the
host system. For example, here's lsof(1) output for pid 1 of
the container above (bash):
COMMAND   PID USER   FD TYPE DEVICE   SIZE      NODE NAME
bash    32164 root  cwd  DIR 259,12   4096      2061 /tmp/lxc-rC7sKKP
bash    32164 root  rtd  DIR 259,12   4096      2061 /tmp/lxc-rC7sKKP
bash    32164 root  txt  REG 259,12 700492 268582935 /tmp/lxc-rC7sKKP/bin/bash
bash    32164 root  mem  REG 259,12           245904 /tmp/lxc-rC7sKKP/lib/libnss_files-2.7.so (stat: No such file or directory)
bash    32164 root  mem  REG 259,12           245906 /tmp/lxc-rC7sKKP/lib/libnss_nis-2.7.so (stat: No such file or directory)
bash    32164 root  mem  REG 259,12           245901 /tmp/lxc-rC7sKKP/lib/libnsl-2.7.so (stat: No such file or directory)
bash    32164 root  mem  REG 259,12           245902 /tmp/lxc-rC7sKKP/lib/libnss_compat-2.7.so (stat: No such file or directory)
bash    32164 root  mem  REG 259,12           245895 /tmp/lxc-rC7sKKP/lib/libc-2.7.so (stat: No such file or directory)
bash    32164 root  mem  REG 259,12           245898 /tmp/lxc-rC7sKKP/lib/libdl-2.7.so (stat: No such file or directory)
bash    32164 root  mem  REG 259,12          2924566 /tmp/lxc-rC7sKKP/lib/libncurses.so.5.7 (stat: No such file or directory)
bash    32164 root  mem  REG 259,12           245892 /tmp/lxc-rC7sKKP/lib/ld-2.7.so (stat: No such file or directory)
bash    32164 root   0u  CHR  136,8               11 /dev/pts/8
bash    32164 root   1u  CHR  136,8               11 /dev/pts/8
bash    32164 root   2u  CHR  136,8               11 /dev/pts/8
bash    32164 root 255u  CHR  136,8               11 /dev/pts/8
Where does that /tmp/lxc-rC7sKKP come from? What's the reason to
create a separate mount to start with; why not use the rootfs directly?
I _think_ I'm missing something here, and a separate mount is
actually required to serve as the rootfs for a container, in a way similar
to the somewhat-fake rootfs on a real host system (fake in the sense that
on a normal system it contains nothing).
But maybe /var/lib/lxc/rootfs is better suited for that than a random
name in /tmp? And maybe it's a good idea to actually show the whole mount tree
(at least as long as it's not modified in a container) on the host system?
And finally, isn't it simpler to run a script (or an external command) to
prepare the container's namespace (and do other necessary things) than to try
to do everything from within the conffile? I mean, instead of stuff like
the mounting (processing the mounts file or conffile entries), setting up
cgroups(*), the hostname, mounting consoles, etc., there might be a place to call
a specified shell script that does all that and other things.
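A very rough sketch of what calling such a script could look like follows. Everything here is an assumption for illustration: the environment variable names CONTAINER_NAME and CONTAINER_ROOTFS and both function names are invented, and lxc does not necessarily offer anything like this.

```python
import os
import subprocess


def build_hook_env(name, rootfs, base_env=None):
    """Build the environment handed to the setup script.

    The variable names are hypothetical, chosen only for this example.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CONTAINER_NAME"] = name
    env["CONTAINER_ROOTFS"] = rootfs
    return env


def run_setup_hook(script, name, rootfs):
    """Run the user-supplied script; a non-zero exit status raises
    subprocess.CalledProcessError, so container startup can be aborted."""
    subprocess.run([script], env=build_hook_env(name, rootfs), check=True)
```

The script itself would then do the mounts, cgroup setup and so on, reading the container name and rootfs from its environment.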
(*) For cgroups, especially for devices, it's quite ugly to specify things
by device numbers, given the dynamic nature of devices nowadays.
It should be easy to allow things like:
lxc.cgroup.devices.allow = /dev/null rwm
so that it gets translated to "c 1:3" at invocation time. That can be done
in the shell script mentioned above just fine.
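The translation itself is small. Here is a sketch, assuming a Unix system, of turning a device node path plus an access string into the "type major:minor access" form the devices cgroup expects. The function names format_device_rule and device_rule are made up for this example.

```python
import os
import stat


def format_device_rule(st_mode, st_rdev, access):
    """Format a devices-cgroup rule from stat fields of a device node."""
    if stat.S_ISCHR(st_mode):
        kind = "c"
    elif stat.S_ISBLK(st_mode):
        kind = "b"
    else:
        raise ValueError("not a device node")
    return "%s %d:%d %s" % (kind, os.major(st_rdev), os.minor(st_rdev), access)


def device_rule(path, access):
    """Look up the device node at path and format its rule.

    Hypothetical wrapper; the numbers are read at call time, so the
    rule follows whatever the node currently is on this host.
    """
    st = os.stat(path)
    return format_device_rule(st.st_mode, st.st_rdev, access)
```

On a typical Linux host, where /dev/null is character device 1:3, device_rule("/dev/null", "rwm") would give the rule for "c 1:3" with rwm access.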
Thanks!
/mjt