[lxc-devel] [PATCH] procfs in containers based on fuse

Stéphane Graber stgraber at ubuntu.com
Fri Sep 19 21:45:55 UTC 2014


Hi,

Sorry for the late reply, been kind of busy lately...


I've actually been working on something pretty similar called
cgmanagerfs (with an older python prototype as lxcfs on my github
account).

Our plan there is to provide a fuse filesystem which offers two things:
 - a set of proc files returning the expected values based on cgroup
   limits; initially those files will be meminfo, stat and cpuinfo.
 - a cgroupfs compatible representation of the cgroup hierarchy with the
   root set to the client process' own cgroup.
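To illustrate the first point, here is a minimal Python sketch of how a
cgroup-aware meminfo could be derived from the host's /proc/meminfo. It is
not the cgmanagerfs code: the function name render_meminfo and its
parameters are hypothetical, the limit and usage values are assumed to have
been fetched already (e.g. the cgroup v1 memory.limit_in_bytes and
memory.usage_in_bytes values), and only MemTotal and MemFree are remapped.

```python
def render_meminfo(host_meminfo: str, limit_bytes: int, usage_bytes: int) -> str:
    """Return meminfo text with MemTotal/MemFree derived from cgroup values.

    Hypothetical helper: all other fields are passed through unchanged.
    """
    host_total_kb = None
    for line in host_meminfo.splitlines():
        if line.startswith("MemTotal:"):
            host_total_kb = int(line.split()[1])
            break
    if host_total_kb is None:
        raise ValueError("MemTotal not found in host meminfo")

    # An unlimited cgroup reports a very large limit, so never report
    # more memory than the host actually has.
    total_kb = min(host_total_kb, limit_bytes // 1024)
    free_kb = max(total_kb - usage_bytes // 1024, 0)
    overrides = {"MemTotal": total_kb, "MemFree": free_kb}

    out = []
    for line in host_meminfo.splitlines():
        key = line.split(":", 1)[0]
        if key in overrides:
            # Keep the usual "Key:   value kB" column layout.
            out.append(f"{key + ':':<16}{overrides[key]:>8} kB")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    host = (
        "MemTotal:       16384256 kB\n"
        "MemFree:         8000000 kB\n"
        "Buffers:          100000 kB\n"
    )
    # 512 MiB limit, 128 MiB currently in use (made-up example values).
    print(render_meminfo(host, 512 * 1024 * 1024, 128 * 1024 * 1024), end="")
```

A fuse read handler for proc/meminfo would return text like this for the
cgroup of the calling process; stat and cpuinfo would need their own
mappings.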

This is called cgmanagerfs because it's done on top of cgmanager using
libfuse for the filesystem and libnih-dbus to talk to cgmanager. The
main reason for only supporting cgmanager, besides the obviously reduced
complexity of supporting a single method of accessing the cgroup
information, is so that we can allow this filesystem to be mounted in
unprivileged containers where cgroupfs can't be mounted.

This filesystem combined with Seth's unprivileged fuse and our already
existing combination of cgmanager and cgproxy in unprivileged containers
will allow getting the real values for cpuinfo, meminfo and stat in all
containers, unprivileged ones included.



We're also not planning on including this directly in LXC. Instead,
it'll be a separate daemon which will be started at system boot time and
will set up the filesystem in /var/lib/cgmanagerfs. Containers will then
simply have the following bind-mounts:
 - /var/lib/cgmanagerfs/proc/cpuinfo => proc/cpuinfo
 - /var/lib/cgmanagerfs/proc/meminfo => proc/meminfo
 - /var/lib/cgmanagerfs/proc/stat => proc/stat
 - /var/lib/cgmanagerfs/cgroupfs => sys/fs/cgroup
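
As a rough sketch of what those bind-mounts could look like in a container
configuration, the Python snippet below renders them as lxc.mount.entry
lines (fstab-style, with targets relative to the container rootfs). Whether
the mounts would actually be wired in this way is an assumption, and the
names BIND_MOUNTS and mount_entries are made up for illustration.

```python
BASE = "/var/lib/cgmanagerfs"

# (path under the fuse mount, target relative to the container rootfs)
BIND_MOUNTS = [
    ("proc/cpuinfo", "proc/cpuinfo"),
    ("proc/meminfo", "proc/meminfo"),
    ("proc/stat", "proc/stat"),
    ("cgroupfs", "sys/fs/cgroup"),
]


def mount_entries(base: str = BASE) -> list:
    """Render each bind-mount as an lxc.mount.entry configuration line."""
    return [
        f"lxc.mount.entry = {base}/{src} {dst} none bind 0 0"
        for src, dst in BIND_MOUNTS
    ]


if __name__ == "__main__":
    for entry in mount_entries():
        print(entry)
```

The targets have to exist inside the container before the bind-mounts are
applied, so in practice these would need to happen after proc and sysfs are
mounted.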


This is a pretty lightweight approach as we only need a single fuse
filesystem for the whole system. It works with unprivileged containers
and shouldn't have much overhead as only those 3 files get overridden;
the rest is still straight procfs.



The python prototype is mostly working with the exception of a few
unmapped fields in meminfo (returning 0 as a result) but that was a
quick hack I did in a couple of hours during LinuxCon North America.
Rewriting it in C will take me quite a bit longer, I suspect, especially
as I'm currently busy with other things.

Anyway, this is the way we intend to solve this issue for LXC upstream.
If you have an interest in this and would like to contribute to
cgmanagerfs, that'd be greatly appreciated (however beware, dbus isn't
very much fun to deal with initially :)).



All that to say, I will not be merging this approach but it's an issue
we're aware of and are actively working to resolve in the near future,
thanks!

On Fri, Sep 12, 2014 at 09:15:39AM +0000, Zhou Kang(研究院) wrote:
> Hi
> This patch provides a per-container procfs. It is based on Daniel Lezcano’s code, which uses fuse ( https://github.com/hallyn/procfs ), but we made the following improvements:
> 
>  - fuse_main is started by lxc_start, so it’s easier to manage the container.
> 
>  - We mount the /proc path in /tmp instead of in the rootfs; the path is unique.
> 
>  - We rewrite the following files: meminfo, stat, cpuinfo, sysrq-trigger. The command ‘top’ can show the right info in the container.
> 
>  - /proc/sysrq-trigger is unwritable.
> 
> We tested this patch on hundreds of containers for a long time, and it has worked very well.
> 
> ________________________________
> Kang Zhou



-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com