[lxc-users] Running docker inside unprivileged LXC containers

Thu Jun 11 01:39:58 UTC 2015

On Wed, Jun 10, 2015 at 9:14 AM, Stewart Brodie <sbrodie at espial.com> wrote:

> Serge Hallyn <serge.hallyn at ubuntu.com> wrote:
>
> > Quoting Akshay Karle (akshay.a.karle at gmail.com):
> > > Hello,
> > >
> > > I'm currently working on a project that requires running docker
> > > containers inside unprivileged LXC containers. I've managed to run
> > > unprivileged containers on an Ubuntu 14.04 host. I've also managed to
> > > get the docker daemon running using the LXC driver instead of the
> > > native docker exec driver. Right now I'm stuck when trying to start a
> > > docker container: it attempts to create special devices, which fails
> > > because it doesn't have permission to do so in the unprivileged
> > > container.
>
> > You'll need to coordinate between the container and the host to create
> > the devices.  This is something I do want to think about, but have not
> > yet had time to do so.  It may involve updating Docker to use a service,
> > when available, to request devices be created.  This could be a dbus
> > service which gets (vetted and) passed through to the host.
>
> I've thought about this a bit too, as it's the same problem I'm facing
> (although in my case, there's very little software in the host or the
> container, just a pretty minimal busybox plus a couple of applications, so
> anything based around requiring dbus or systemd is useless for my purposes)
>
>
Well, if this route does work, then it's definitely possible. Since I create
LXC containers via Chef, which also configures the host system, this can be
easily automated.
@akshay, IIRC you too are using some automated system to build the LXC
containers. Can't you automate the upfront device creation and selectively
bind mount the devices for docker?
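
To make the "create up front, then bind mount" idea concrete, here is a minimal sketch of what such automation could look like. It is not what my Chef recipes actually do. The device list is illustrative only (the thread does not say which nodes docker needs), the helper names are made up, and it assumes an LXC version whose lxc.mount.entry accepts the `optional` and `create=file` options.

```python
import os
import stat

# Illustrative device list (path, type, major, minor); adjust to what docker needs.
DEVICES = [
    ("/dev/fuse", "c", 10, 229),
    ("/dev/net/tun", "c", 10, 200),
]


def mount_entry(host_path):
    """Return an lxc.mount.entry line that bind mounts one host device node."""
    # The container-side target is written relative to the container rootfs.
    target = host_path.lstrip("/")
    return "lxc.mount.entry = %s %s none bind,optional,create=file 0 0" % (
        host_path,
        target,
    )


def ensure_node(path, kind, major, minor, mode=0o660):
    """Create the device node on the host if it is missing (needs root).

    Returns True if a node was created, False if the path already existed.
    """
    if os.path.exists(path):
        return False
    os.makedirs(os.path.dirname(path), exist_ok=True)
    node_type = stat.S_IFCHR if kind == "c" else stat.S_IFBLK
    os.mknod(path, mode | node_type, os.makedev(major, minor))
    return True


def config_lines(devices):
    """Return one lxc.mount.entry line per device in the list."""
    return [mount_entry(path) for path, _kind, _major, _minor in devices]
```

On the host (as root), one would call ensure_node() for each entry and append the output of config_lines(DEVICES) to the container's config before starting it.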

> I'm attempting to start an unprivileged container and populate the devices
> using an autodev hook, but that doesn't work, because the user namespace
> has already been changed.  So I'm stuck with having to bind mount all the
> devices individually, which would be great - except that the device nodes
> don't all exist in the host, so I'm having to create them in the host in
> advance of starting the containers.
>
> Could lxc-start create the device nodes before the user namespace is
> changed?  It'd have to apply the uid_map and gid_map manually, but that
> might be doable.
>
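For reference, "applying the uid_map and gid_map manually" boils down to translating a container id into the host id that should own the node. A rough sketch of that translation, assuming the usual three-column "inside outside count" format of /proc/PID/uid_map; the function names are hypothetical and this is not how lxc-start is implemented:

```python
def parse_id_map(text):
    """Parse uid_map/gid_map content: one 'inside outside count' triple per line."""
    ranges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        inside, outside, count = (int(field) for field in fields)
        ranges.append((inside, outside, count))
    return ranges


def to_host_id(container_id, ranges):
    """Translate an id inside the namespace to the host id, or None if unmapped."""
    for inside, outside, count in ranges:
        if inside <= container_id < inside + count:
            return outside + (container_id - inside)
    return None
```

With a map of "0 100000 65536", container root (0) becomes host id 100000, so a node created before the namespace switch would have to be chowned to that host id to appear root-owned inside the container.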
> Of course, once the container is running, you can use lxc-device to create
> the devices inside the container, but that's no use if you need the devices
> early on.  You can't do this in the start hook, because you need lxc-start
> to release all its locks before you can run lxc-device.
>
> I considered changing lxc-start so it cached a thread that would remain
> authorised to do the mknod() calls and could be called upon as necessary,
> but didn't actually try it yet.  Perhaps that's worth looking into?
>
> An alternative idea I had was to not run /sbin/init in the container but
> instead run a shell script that communicates its readiness state to the
> host and then waits for an indication that it is safe to continue, at
> which point it would exec /sbin/init.  Meanwhile the process in the host
> would be waiting for the ready indication and run lxc-device as required
> and then send back the indication to the container to continue.  I'm
> confident that this will work, as I've done this sort of thing before, and
> I already have a signalling mechanism working so that I know when the
> container's init has finished running all the sysinit tasks from
> /etc/inittab and thus the applications are ready.  It would be quite easy
> to adapt, but it's one of those "neat, but really ugly" kind of scripts -
> a one-liner, based around inotifywait.  The big advantage is that this
> will work without any changes to lxc at all.  The main disadvantage is
> that it's ugly.
>
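The handshake described above could be sketched roughly as follows. This is only an illustration of the protocol, not Stewart's inotifywait one-liner: it polls instead of using inotify, it assumes a directory visible to both host and container (for example a bind mount), and all function and path names are hypothetical. A busybox-only container would need the same logic in shell.

```python
import os
import time


def wait_for(path, timeout=30.0, poll=0.1):
    """Poll until path exists or the timeout expires; return whether it exists."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll)
    return os.path.exists(path)


def announce_ready_and_wait(ready_path, go_path, timeout=30.0):
    """Container side: signal readiness, then block until the host says go."""
    with open(ready_path, "w") as marker:
        marker.write("ready\n")
    return wait_for(go_path, timeout)


def exec_init(init="/sbin/init"):
    """Replace the current process with the real init (never returns on success)."""
    os.execv(init, [init])
```

The container's first process would call announce_ready_and_wait() and then exec_init(). The host side would call wait_for() on the ready file, run lxc-device as needed, and then create the go file.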
> Another alternative could be to have the process that is the new PID 1 in
> the container to SIGSTOP itself just before the execve, then lxc-start and
> a new "post-start" hook could collude to run the hook without lxc-start
> holding any of the locks.  This sounds incredibly messy though, not to
> mention failure-prone.
>
> However, another far neater way of doing this could be to use the freezer
> instead.  Just give lxc-start a new command-line option to start the
> container *but* crucially, leave it frozen when lxc-start exits.  The
> caller can then just do lxc-start, lxc-device, lxc-unfreeze.  This would
> seem to be the least invasive way of doing it, and stands a good chance of
> working reliably, I would have thought, as long as you can execute the
> freeze at the right point (just before the execve of the new PID 1) and as
> long as lxc-device works on a frozen container (does it?  I can't get to
> my dev box right now to test it)
>
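From the caller's side, the freezer approach would amount to three commands in a row. A small sketch that only builds the argument lists: the frozen-start option is the one proposed above and does not exist in lxc-start, so it is passed in as a placeholder rather than given a made-up name, and the function name is hypothetical too.

```python
def frozen_start_sequence(name, devices, frozen_option):
    """Build argv lists for: start frozen, add devices, unfreeze.

    frozen_option is a placeholder for the proposed (not yet existing)
    lxc-start option that would leave the container frozen on exit.
    """
    commands = [["lxc-start", "-n", name, "-d", frozen_option]]
    for device in devices:
        # lxc-device adds one host device node to the named container.
        commands.append(["lxc-device", "-n", name, "add", device])
    commands.append(["lxc-unfreeze", "-n", name])
    return commands
```

Each list could then be handed to subprocess.check_call() in order, which would also stop the sequence if lxc-device turns out not to work on a frozen container.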
>
> --
> Stewart Brodie
> Senior Software Engineer
> Espial UK
> _______________________________________________
> lxc-users mailing list
> lxc-users at lists.linuxcontainers.org
> http://lists.linuxcontainers.org/listinfo/lxc-users