[lxc-devel] [PATCH 1/1] lxc_attach: fix break with user namespaces (v2)
Serge Hallyn
serge.hallyn at canonical.com
Tue Jan 22 02:15:30 UTC 2013
Quoting Christian Seiler (christian at iwakd.de):
> Hi Serge,
>
> Just a few quick comments because I'm very interested in the lxc-attach
> utility:
>
> > + ret = lxc_cgroup_prepare_attach(my_args.name, &cgroup_data);
> > + if (ret < 0) {
> > + ERROR("failed to prepare attaching to cgroup");
> > + return -1;
> > + }
> > +
> > + ret = lxc_cgroup_finish_attach(cgroup_data, gchild);
> > + if (ret < 0) {
> > + ERROR("failed to attach process to cgroup");
> > + return -1;
> > + }
>
> Note that I made the whole cgroup attach logic so complicated a while
> back (i.e. prepare before setns/fork -> finish/dispose after fork),
> precisely so that one wouldn't need a second fork and so I didn't have
Right, that wasn't lost on me as I was doing this :)
> to play around with IPC to get the PID of the process that is to be
> added to the cgroup. If an additional fork is needed anyway due to user
> namespaces (that reminds me that I should definitely try them out...),
> the reason for making the cgroup attach logic so complicated disappears
> and one could return to the direct approach from before, probably makes
> it quite a bit easier to read.
Agreed. We should move that back out of attach.c.
> Side note 1: that I would use pid_t and not int as the data type of the
> object sent through the pipe, that seems more portable to me.
>
> Side note 2: Idea, but untested and not completely thought through, just
> wanted to put it out there: The "middle" process in your logic need not
> necessarily be kept around if the kernel version is at least 3.4 -
> because Linux then supports prctl(PR_SET_CHILD_SUBREAPER), which - if
> I'm not completely mistaken - would reparent the "inner" process (that
> will execve() to the requested program inside the container) to the
> "outer" attach process after the "middle" process exits. That way we
> might keep only one event loop around. On the other hand, this still
> requires both implementations if kernels < 3.4 are still to be
> supported... (as I said, not completely thought through, just wanted to
> put the idea out there)
>
> > + return -1;
> > +
> > + return 0;
>
> This seems a bit weird... ;-)
>
> > + /* XXX FIXME this should get the uid of the container init and setuid to that */
> > + /* XXX FIXME or perhaps try to map in the lxc-attach caller's uid? */
>
> I believe a sane default would be the first option (uid of init, on the
> other hand, that isn't so easy to get by, because one has to assume
> /proc is mounted and certain security features are turned off) and let
> the user specify a uid otherwise.
Hm, i didn't really think mounted /proc on the host was all that
optional for containers anyway. We could always just avoid the setuid
if we find /proc unmounted.
Would you care to update the patch along these lines?
-serge
More information about the lxc-devel
mailing list