[lxc-devel] cgroup V2 and LXC
Serge Hallyn
serge.hallyn at ubuntu.com
Wed Feb 10 17:45:48 UTC 2016
Quoting Christian Brauner (christian.brauner at mailbox.org):
> On Mon, Feb 01, 2016 at 04:56:08AM +0000, Serge Hallyn wrote:
> > Quoting Kevin Wilson (wkevils at gmail.com):
> > > Hi, LXC developers,
> > >
> > > The latest kernel release (4.4) includes initial support for cgroup v2
> > > with 2 controllers (memory and io). Also it seems that the PIDs
> > > controller works in cgroup v2, but I do not know if it is officially
> > > supported in v2.
> > >
> > > Is there any intention to replace the existing cgroup v1 usage in LXC
> > > with cgroup v2, or at least to enable working with both of them?
> > >
> > > Regards,
> > > Kevin
> >
> > Replace, no, support, yes. I've added support for it to cgmanager, and have
> > used lxc with the unified hierarchy through cgmanager. Without cgmanager
> > it will currently definitely not work. It's worth discussing how we should
> > handle it - and how init wants us to handle it. With cgmanager I actually
> > built in the support so that you could treat it as a legacy hierarchy, and
> > upstart was happy with that since it used cgmanager. Systemd will not be
> > happy with that, and it will be a problem. The only exception to the "no
> > tasks in a non-leaf node" rule is for the / cgroup. So lxc would need to
> > place init in say /lxc/c1/.leaf, and systemd would have to accept that
> > /lxc/c1 is the container's cgroup. A few possibilities:
> >
> > 1. maybe if we place systemd in /lxc/c1/init.scope it will be happy
> Well, here is how I thought it could go (sticking to systemd specifics here):
> - create a slice for all LXC containers, "lxc.slice" (similar to the
> "machine.slice" of systemd-nspawn backed containers)
> - "lxc.slice" contains a scope for each container (e.g. "c1.scope")
> - "c1.scope" contains an "init.scope"
> - "init.scope" only contains the PID of "/sbin/init" as seen from the
> host (obviously)
So if we are creating container c1, are you talking about
/lxc/c1/lxc.slice/c1.scope/init.scope
or are you talking about a host-global
/lxc.slice
with container-specific
/lxc.slice/c1.scope
per container?
> - All other processes are put in another slice "c1-something.slice"
Which other processes?
AFAIK all other processes will be created by systemd. The question is what it will
do. If we put systemd in /lxc.slice/c1.scope/init.scope, will it take that
as its cgroup root and try to create and move itself into
/lxc.slice/c1.scope/init.scope ? If so it will fail since it cannot create a
cgroup while it is in it.
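
To make that failure mode concrete, here is a minimal Python sketch of the
"no tasks in a non-leaf node" rule as stated earlier in this thread. It is a
toy model over a plain dict, not kernel behaviour and not LXC or systemd code;
the function name and the pid values are made up for illustration.

```python
def non_leaf_cgroups_with_tasks(tasks_by_cgroup):
    """Return cgroups that hold tasks and also have descendant cgroups.

    tasks_by_cgroup maps a cgroup path to the list of pids in it.
    The root cgroup "/" is exempt, as described in this thread.
    """
    paths = set(tasks_by_cgroup)
    violations = []
    for path, pids in sorted(tasks_by_cgroup.items()):
        if path == "/" or not pids:
            continue  # root is the exception; empty cgroups are fine
        prefix = path.rstrip("/") + "/"
        # A cgroup is a non-leaf node if any other path lives below it.
        if any(other.startswith(prefix) for other in paths):
            violations.append(path)
    return violations
```

With init alone in /lxc.slice/c1.scope/init.scope the function returns an
empty list. Add a child /lxc.slice/c1.scope/init.scope/init.scope, which is
what systemd would be trying to create in the scenario above, and it reports
/lxc.slice/c1.scope/init.scope as the offending node.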
So I think I've convinced myself that we need to collaborate with systemd
on this. Perhaps we can agree with it on a default cgroup in which it should
be started to tell it "this is the leaf cgroup for your init". So if it sees
it is in /a/b/c/.cg_leaf, then it will know that /a/b/c is its root.
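
A rough Python sketch of what that convention could look like from init's
side. This is only an illustration of the idea: ".cg_leaf" is the placeholder
name from the example above and nothing has been agreed with systemd, and both
function names are hypothetical. The first helper assumes the unified
hierarchy shows up in /proc/self/cgroup as the entry with hierarchy ID 0 and
an empty controller list.

```python
import posixpath

LEAF_NAME = ".cg_leaf"  # placeholder marker name, not an agreed convention


def unified_cgroup_path(proc_cgroup_text):
    """Pick the unified-hierarchy path out of /proc/<pid>/cgroup contents."""
    for line in proc_cgroup_text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        if hierarchy_id == "0" and controllers == "":
            return path
    return None


def cgroup_root_for_init(own_cgroup):
    """If init was started in <root>/.cg_leaf, treat <root> as its cgroup root."""
    path = own_cgroup.rstrip("/") or "/"
    parent, name = posixpath.split(path)
    if name == LEAF_NAME:
        return parent
    return path  # no marker: fall back to the cgroup we were started in
```

So an init that finds itself in /a/b/c/.cg_leaf would treat /a/b/c as its
root and create its own cgroups next to the leaf rather than underneath it.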
> If we do not want to create scopes we are left with the option of
> forcing "init" into a separate cgroup from the rest of the container's
> processes.
>
> Christian
>
>
> > 2. maybe we can teach systemd to accept being in a leaf node
> > 3. maybe we can build an exception into cgroup namespaces such that
> > a cgns root also is an exception to the no-tasks-in-non-leaf-nodes
> > rule. But I doubt that will fly.
> > _______________________________________________
> > lxc-devel mailing list
> > lxc-devel at lists.linuxcontainers.org
> > http://lists.linuxcontainers.org/listinfo/lxc-devel