[lxc-devel] [PATCH] cgfs: don't mount /sys/fs/cgroup readonly

Serge Hallyn serge.hallyn at ubuntu.com
Fri May 2 21:23:12 UTC 2014


Quoting Christian Seiler (christian at iwakd.de):
> Perhaps you could tell me what you did and what the problem was, before

Well that's what cc:ing you was for :)  Thanks for setting me straight.

Yeah so the first hunk should be dropped.  Meanwhile,

> modifying this part of the code? Then we could figure out a solution for
> that together. I put quite a bit of thought into it to make sure the
> logic behind it is sane, and I don't see the reason for your changes.

Sorry, thought I was clearer in my patch description than I was.

On an Ubuntu system, mountall wants /sys/fs/cgroup to be mounted rw.
So on container startup, mountall will see that /sys/fs/cgroup is ro
and hang startup (waiting for the user to say whether to skip
or manually fix) because it's not allowed to remount /sys/fs/cgroup
rw.
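
For illustration, the read-only state that trips up startup can be observed
from inside the container with statvfs(). This is only a minimal sketch;
mount_is_readonly() is a hypothetical helper name, not something taken from
mountall or LXC, and it says nothing about how mountall itself performs the
check.

```c
#include <sys/statvfs.h>

/* Hypothetical helper (not part of mountall or LXC).
 * Returns 1 if the filesystem holding 'path' is mounted read-only,
 * 0 if it is mounted read-write, and -1 if statvfs() fails. */
static int mount_is_readonly(const char *path)
{
	struct statvfs sv;

	if (statvfs(path, &sv) < 0)
		return -1;

	/* ST_RDONLY is set in f_flag for read-only mounts */
	return (sv.f_flag & ST_RDONLY) ? 1 : 0;
}
```

With the current cgroup automount, calling mount_is_readonly("/sys/fs/cgroup")
in the container would be expected to return 1.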

> If you just want the tmpfs to be read-write, then the following does that:
> 
> > @@ -1487,14 +1479,6 @@ static bool cgroupfs_mount_cgroup(void *hdata, const char *root, int type)
> >  		parts = NULL;
> >  	}
> >  
> > -	/* try to remount the tmpfs readonly, since the container shouldn't
> > -	 * change anything (this will also make sure that trying to create
> > -	 * new cgroups outside the allowed area fails with an error instead
> > -	 * of simply causing this to create directories in the tmpfs itself)
> > -	 */
> > -	if (type != LXC_AUTO_CGROUP_RW && type != LXC_AUTO_CGROUP_FULL_RW)
> > -		mount(NULL, path, NULL, MS_REMOUNT|MS_RDONLY, NULL);
> > -
> >  	free(path);
> >  
> >  	return true;
> 
> Although here I also see a problem for a different use case:
> 
> Let's say you have a container which is mounted cgroup:mixed (not
> cgroup-full). Let's assume for the sake of argument that we only have
> the 'cpu' hierarchy. Then the following mountpoints will exist (for the
> case cgroup:mixed):
> 
> /sys/fs/cgroup                  [tmpfs, ro]
> /sys/fs/cgroup/cpu/lxc/c1       [bind-mount of host, rw]
>
> If the container now has a script that says:
> echo $$ > /sys/fs/cgroup/cpu/tasks
> to put itself into the root cgroup, then in the current version of LXC,
> this will fail with a "read-only filesystem" error, i.e. there is an
> immediate feedback for the script that something went wrong and it
> couldn't put itself into the root cgroup.
> 
> If you now make the tmpfs (even if you just do the tmpfs part, not the
> other part of the code) read-write, then the above script will just
> create a file "tasks" in the tmpfs and put the script's current PID in
> it; the script will never know that this did not have the desired effect
> of moving itself into the root cgroup. So the script now thinks it did
> something which it actually didn't do. In order to avoid that, I
> explicitly remounted the tmpfs readonly for that case.
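
To make the silent-write case above concrete: if the tmpfs were read-write, a
caller could only notice the problem by checking what kind of filesystem it is
about to write to. A minimal sketch of such a check, assuming a cgroup v1
hierarchy; path_is_on_cgroupfs() is a hypothetical name and nothing in LXC is
claimed to do this.

```c
#include <linux/magic.h>	/* CGROUP_SUPER_MAGIC */
#include <sys/vfs.h>		/* statfs() */

/* Hypothetical helper (not part of LXC).
 * Returns 1 if 'path' lives on a cgroup (v1) filesystem,
 * 0 if it lives on any other filesystem (e.g. a plain tmpfs),
 * and -1 if statfs() fails. */
static int path_is_on_cgroupfs(const char *path)
{
	struct statfs sf;

	if (statfs(path, &sf) < 0)
		return -1;

	return sf.f_type == CGROUP_SUPER_MAGIC ? 1 : 0;
}
```

In the layout quoted above, /sys/fs/cgroup/cpu is a directory in the tmpfs, so
the helper would return 0 for it, while /sys/fs/cgroup/cpu/lxc/c1 is a bind
mount of the host's cgroup filesystem and would return 1.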

Hm.  So IIUC lxc would have to (and, you're telling me, now does not) have
the following mounts?

> /sys/fs/cgroup                  [tmpfs, rw]
> /sys/fs/cgroup/cpu              [tmpfs, ro]
> /sys/fs/cgroup/cpu/lxc/c1       [bind-mount of host, rw]

?

-serge

