[lxc-devel] cgroup V2 and LXC

Serge Hallyn serge.hallyn at ubuntu.com
Wed Feb 10 17:45:48 UTC 2016


Quoting Christian Brauner (christian.brauner at mailbox.org):
> On Mon, Feb 01, 2016 at 04:56:08AM +0000, Serge Hallyn wrote:
> > Quoting Kevin Wilson (wkevils at gmail.com):
> > > Hi, LXC developers,
> > > 
> > > The latest kernel release (4.4) includes initial support for cgroup v2
> > > with 2 controllers (memory and io). Also it seems that the PIDs
> > > controller works in cgroup v2, but I do not know if it is officially
> > > supported in v2.
> > > 
> > > Is there any intention to replace the existing cgroup v1 usage in LXC
> > > with cgroup v2? Or at least to enable working with both of them?
> > > 
> > > Regards,
> > > Kevin
> > 
> > Replace, no; support, yes.  I've added support for it to cgmanager, and have
> > used lxc with the unified hierarchy through cgmanager.  Without cgmanager
> > it will currently definitely not work.  It's worth discussing how we should
> > handle it - and how init wants us to handle it.  With cgmanager I actually
> > built in the support so that you could treat it as a legacy hierarchy, and
> > upstart was happy with that since it used cgmanager.  Systemd will not be
> > happy with that, and it will be a problem.  The only exception to the "no
> > tasks in a non-leaf node" rule is for the / cgroup.  So lxc would need to
> > place init in say /lxc/c1/.leaf, and systemd would have to accept that
> > /lxc/c1 is the container's cgroup.  A few possibilities:
> > 
> > 1. maybe if we place systemd in /lxc/c1/init.scope it will be happy
> Well, here is how I thought it could go (sticking to systemd specifics here):
>         - create a slice for all lxc containers, "lxc.slice" (similar to the
>           "machine.slice" of systemd-nspawn backed containers)
>         - "lxc.slice" contains a scope for each container (e.g. "c1.scope")
>         - "c1.scope" contains an "init.scope"
>         - "init.scope" only contains the PID of "/sbin/init" as seen from the
>           host (obviously)

So if we are creating container c1, are you talking about

/lxc/c1/lxc.slice/c1.scope/init.scope

or are you talking about a host-global

/lxc.slice

with container-specific

/lxc.slice/c1.scope

per container?

>         - All other processes are put in another slice "c1-something.slice"

Which other processes?

AFAIK all other processes will be created by systemd.  The question is what it
will do.  If we put systemd in /lxc.slice/c1.scope/init.scope, will it take that
as its cgroup root and try to create and move itself into
/lxc.slice/c1.scope/init.scope ?  If so it will fail since it cannot create a
cgroup while it is in it.
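
To make the question concrete, here is a rough sketch of how a process could
find out which cgroup it was started in on the unified hierarchy.  It assumes
the v2 entry in /proc/<pid>/cgroup has the form "0::<path>"; the function name
and the sample text are made up for illustration, and this is not a claim
about what systemd actually does.

```python
def unified_cgroup_path(proc_cgroup_text):
    """Return the cgroup v2 path from the text of /proc/<pid>/cgroup.

    Assumption: the unified hierarchy shows up as a line "0::<path>",
    i.e. hierarchy ID 0 and an empty controller list.
    Returns None when no such line is present.
    """
    for line in proc_cgroup_text.splitlines():
        # Split into at most three fields: hierarchy ID, controllers, path.
        parts = line.split(":", 2)
        if len(parts) == 3 and parts[0] == "0" and parts[1] == "":
            return parts[2]
    return None


# Hypothetical content for a container init placed by lxc.
sample = "0::/lxc.slice/c1.scope/init.scope\n"
print(unified_cgroup_path(sample))
```

Whatever path comes back is all init has to go on, which is why the placement
convention matters.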

So I think I've convinced myself that we need to collaborate with systemd
on this.  Perhaps we can agree with it on a default cgroup in which it should
be started to tell it "this is the leaf cgroup for your init".  So if it sees
it is in /a/b/c/.cg_leaf, then it will know that /a/b/c is its root.
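
A minimal sketch of that convention, assuming the agreed leaf name is
".cg_leaf" as in the example above (the name and the helper are hypothetical,
nothing here has been agreed with systemd):

```python
import posixpath

# Hypothetical marker name, taken from the /a/b/c/.cg_leaf example.
LEAF_NAME = ".cg_leaf"


def cgroup_root_for_init(own_cgroup, leaf_name=LEAF_NAME):
    """Return the cgroup that init should treat as its root.

    If init finds itself in a cgroup whose last path component is the
    agreed leaf name, the parent is the container's cgroup.  Otherwise
    the cgroup it was started in is returned unchanged.
    """
    path = posixpath.normpath(own_cgroup)
    parent, base = posixpath.split(path)
    if base == leaf_name:
        return parent
    return path


print(cgroup_root_for_init("/a/b/c/.cg_leaf"))
print(cgroup_root_for_init("/lxc/c1"))
```

The point of a fixed name is that init needs no extra configuration: the
path alone tells it where its root is.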

>         If we do not want to create scopes we are left with the option of
>         forcing "init" in a separate cgroup from the rest of the container's
>         processes.
> 
> Christian
> 
> 
> > 2. maybe we can teach systemd to accept being in a leaf node
> > 3. maybe we can build an exception into cgroup namespaces such that
> > a cgns root also is an exception to the no-tasks-in-non-leaf-nodes
> > rule.  But I doubt that will fly.

