[lxc-devel] [PATCH 3/4] cgroup: Add lxc_setup_mount_cgroup to setup /sys/fs/cgroup inside the container

Thu Sep 12 20:17:51 UTC 2013

Quoting Christian Seiler (christian at iwakd.de):
> Hi Serge,
> 
> >> I could get behind the following:
> >>
> >>    proc            - always read-write (no harm AFAICT)
> >>    sys             - default: read-only
> >>    sys:rw          - read-write
> >>    sys:ro          - explicit read-only
> >>    cgroup:ro       - completely ro (including paths)
> >>    cgroup:rw       - completely rw (including paths)
> > 
> > That sounds good.
> > 
> >>    cgroup:mixed    - paths ro, other rw
> > 
> > what is 'paths' vs. 'other' here?  There's
> > 
> > /sys/fs/cgroup
> > 
> > itself,
> > 
> > /sys/fs/cgroup/$subsys
> > 
> > then the paths up to the container's own path, and then
> > there's the stuff under the container's own path.  I'm not
> > clear on which you're calling what.
> 
> What I meant is that mixed is the current staging behaviour, i.e.
> 
>   - /sys/fs/cgroup:                           tmpfs, ro after setup
>   - /sys/fs/cgroup/$subsys/$container_cgroup: bind-mount, rw
> 
> So that /sys/fs/cgroup is r/o, /sys/fs/cgroup/$subsys is r/o,
> /sys/fs/cgroup/$subsys/$parent_of_container_cgroup is r/o but
> /sys/fs/cgroup/$subsys/$container_cgroup is r/w.
> 
> >>    cgroup-full:ro    - mount complete tree read-only (not just partial)
> >>    cgroup-full:rw    - mount complete tree read-write (not just partial)
> >>    cgroup-full:mixed - mount complete tree read-only but bind-mount
> >>                        partial tree read-write
> >>    cgroup-full       - defaults to cgroup-full:mixed
> > 
> > Hm, but you're doing the full tree by default.  What is the difference
> > between this and cgroup:ro?
> 
> cgroup-full:mixed would be:
> 
>  - /sys/fs/cgroup:                           tmpfs, ro
>  - /sys/fs/cgroup/$subsys:                   bind-mount, ro
>  - /sys/fs/cgroup/$subsys/$container_cgroup: bind-mount, rw
> 
> That has the advantage that /sys/fs/cgroup/$subsys is actually a cgroup
> filesystem (even though it's read-only), which may improve compatibility
> compared to the current behavior, but the disadvantage that the names of
> all cgroups of the host (including those in other containers) leak into
> the container (even though the container can't really do anything with
> them, if it doesn't have mount permissions).
> 
> cgroup-full:rw would just mount everything into /sys/fs/cgroup as it
> should be according to the standard and make everything read-write.
> 
> cgroup-full:ro would do the same as cgroup-full:rw but read-only.
> 
> It then depends on the policy of the administrator and the compatibility
> level of software that is to be run in the container what option should
> be chosen.
> 
> Would you agree?

Yup, sounds good.  This email should probably be cut-pasted into the
lxc.conf man page then :)
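
For whoever writes that man page text, here is a rough sketch of the option
table as I read it from the thread. It is plain Python for illustration only,
not LXC code: the names LAYOUTS, DEFAULTS, with_access and resolve are made
up, and the /proc and /sys mount points are my assumption. The ":ro" and ":rw"
rows simply reuse the mixed layout with uniform access, which is one reading of
"completely ro/rw"; the real set of mounts for those modes may differ. Bare
"cgroup" is left out because no default for it is given above.

```python
# Hypothetical model of the lxc.mount.auto options discussed in this thread.
# Each layout entry is (mount point, kind, access).

def with_access(layout, access):
    """Return a copy of layout with every entry forced to the given access."""
    return [(path, kind, access) for path, kind, _ in layout]

LAYOUTS = {
    # Assumed mount points for proc and sys.
    "proc": [("/proc", "proc", "rw")],
    "sys:ro": [("/sys", "sysfs", "ro")],
    "sys:rw": [("/sys", "sysfs", "rw")],
    # Current staging behaviour: only the container's own cgroup is writable.
    "cgroup:mixed": [
        ("/sys/fs/cgroup", "tmpfs", "ro"),
        ("/sys/fs/cgroup/$subsys/$container_cgroup", "bind-mount", "rw"),
    ],
    # Full hierarchy visible read-only, container's own cgroup writable.
    "cgroup-full:mixed": [
        ("/sys/fs/cgroup", "tmpfs", "ro"),
        ("/sys/fs/cgroup/$subsys", "bind-mount", "ro"),
        ("/sys/fs/cgroup/$subsys/$container_cgroup", "bind-mount", "rw"),
    ],
}

# Uniform variants: same paths as the mixed layout, one access mode throughout.
for prefix in ("cgroup", "cgroup-full"):
    for access in ("ro", "rw"):
        LAYOUTS[f"{prefix}:{access}"] = with_access(
            LAYOUTS[f"{prefix}:mixed"], access
        )

# Options that may be written without an explicit mode.
DEFAULTS = {"sys": "sys:ro", "cgroup-full": "cgroup-full:mixed"}

def resolve(option):
    """Map an option string to its layout, applying the defaults above."""
    option = DEFAULTS.get(option, option)
    try:
        return LAYOUTS[option]
    except KeyError:
        raise ValueError(f"no layout described for {option!r}") from None
```

With this, resolve("cgroup-full") and resolve("cgroup-full:mixed") return the
same three entries, and resolve("cgroup") raises ValueError rather than
guessing a default.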

Should I apply patch 4/4 as it stands now, with the rest as a
separate patch?

Oh, one other thing: lxc.mount.auto needs to be added to
write_config().  Otherwise lxc-clone won't work on these containers,
since the new container's config won't have lxc.mount.auto.
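
To illustrate the failure mode, a toy sketch in Python (not the actual C
code): it assumes the clone's config is produced by writing out a fixed set of
known keys, so any key the writer does not know about is silently dropped. The
write_config, parse_config and clone functions here are hypothetical
stand-ins, and the config values are made up.

```python
def write_config(config, known_keys):
    """Serialize only the keys the writer knows about (toy stand-in)."""
    return "".join(
        f"{key} = {config[key]}\n" for key in known_keys if key in config
    )

def parse_config(text):
    """Parse 'key = value' lines back into a dict."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            result[key.strip()] = value.strip()
    return result

def clone(config, known_keys):
    """Clone a container config by writing it out and reading it back."""
    return parse_config(write_config(config, known_keys))

# Illustrative values only.
original = {"lxc.utsname": "c1", "lxc.mount.auto": "proc sys cgroup:mixed"}

# The writer does not know lxc.mount.auto: the clone loses the setting.
lossy = clone(original, ["lxc.utsname"])

# The writer knows lxc.mount.auto: the clone keeps it.
faithful = clone(original, ["lxc.utsname", "lxc.mount.auto"])
```

Here lossy has no lxc.mount.auto entry, while faithful matches the original.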

thanks,
-serge