[lxc-devel] [PATCH] Add mechanism for container to notify host about end of boot

Daniel P. Berrange berrange at redhat.com
Fri Sep 14 11:46:28 UTC 2012


On Fri, Sep 14, 2012 at 12:12:57PM +0100, Christian Seiler wrote:
> >>If we want to have a back-channel, we'd need a socket, which makes just
> >>doing echo RUNNING > /dev/lxc-notify impossible, you'd need a special
> >>program for that. Having the template scripts dump an additional script
> >>or upstart job or systemd unit file or whatever in the container when
> >>creating it seems a lot easier than having to use a special program.
> >
> >FYI, the systemd team actually want to be able to expose a full socket
> >from the container to the host, so that the host systemd/systemctl cmd
> >can directly communicate with the container's systemd. So I don't think
> >that /dev/lxc-notify would be useful for systemd.
> 
> First of all, you have to separate two things - I mentioned systemd
> here in the sense that when the system reaches default.target,
> /dev/lxc-notify should be pinged so that the lxc state now changes from
> BOOTING to RUNNING. What you are talking about is a systemctl on the
> outside of the container affecting the inside. I wanted to solve the
> first problem, where /dev/lxc-notify is useful anyway, with or without
> systemd.

Actually, what I was anticipating is that it would work both ways. systemd
uses DBus over its control socket, which would allow both RPC calls from
the host into the container, and signals emitted by the container on
service status change to be received by the host.

> The use case you are describing is a bit more complicated. You want to
> expose a socket outside the container that is listened to by a program
> inside the container. The problem here is that if you want to
> bind-mount it before the pivot_root call, this will not work since
> bind() for a socket will fail if the file already exists. But as soon
> as you are already in the container, if systemd actually does listen to
> a socket somewhere, you'll have a hard time bind-mounting it back to
> the outside. How do you bridge the mount namespace? Obviously, if the
> container's filesystem is mounted on the host anyway, you don't have a
> problem, since you don't need to take care about the namespace; but
> what if the lxc config specifies a block device that is then only
> mounted inside the container's namespace?

I must admit the details aren't worked out, but the rough idea was
something like the following. On the host, have a directory per
container, in which the socket is set up:

   /var/lib/systemd/containerXXXX/

Then bind '/var/lib/systemd/containerXXXX' into the container at some
location, let's say '/var/lib/systemd/self/'. The idea is that if
systemd in the container listens on /var/lib/systemd/self/systemd.sock,
then a process on the host can connect via

  /var/lib/systemd/containerXXXX/systemd.sock

I'm a little fuzzy on exactly how UNIX domain socket paths interact
with mount namespaces though, so this idea may not actually work in
practice - I have yet to try it.
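
A minimal sketch of the shared-directory idea, in Python. It does not use
mount namespaces at all: an unprivileged symlink stands in for the bind
mount, so it only illustrates that a listener bound under one path can be
reached through a second path to the same directory. Whether this holds
across real mount namespaces is exactly the open question above. The
function name and the temporary directory layout are made up for the
example.

```python
import os
import shutil
import socket
import tempfile
import threading


def demo_shared_directory_socket() -> bytes:
    """Bind a UNIX socket via one path, connect to it via another."""
    base = tempfile.mkdtemp(prefix="sd-")
    host_dir = os.path.join(base, "containerXXXX")  # host-side view
    self_dir = os.path.join(base, "self")           # container-side view
    os.mkdir(host_dir)
    # ASSUMPTION: a symlink stands in for the bind mount so that the
    # sketch runs without root; a real setup would use mount --bind.
    os.symlink(host_dir, self_dir)

    # "Container" side: listen on the path it sees.
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(os.path.join(self_dir, "systemd.sock"))
    server.listen(1)

    def serve() -> None:
        conn, _ = server.accept()
        with conn:
            conn.sendall(b"hello from container side")

    worker = threading.Thread(target=serve)
    worker.start()

    # "Host" side: connect through the per-container directory.
    chunks = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(os.path.join(host_dir, "systemd.sock"))
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)

    worker.join()
    server.close()
    shutil.rmtree(base)
    return b"".join(chunks)
```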

Another option is actually to ignore the filesystem and have systemd
in the host simply pass in a pre-opened file descriptor when creating
the container, which systemd in the container can just inherit and
use.
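
A rough sketch of the inherited-descriptor option, in Python on a POSIX
host. A plain child process stands in for the container's init, and a
socketpair stands in for whatever channel the host would really set up;
the child learns the descriptor number from its command line, which is a
convention invented for this example and not something systemd or lxc
defines.

```python
import socket
import subprocess
import sys

# Code run by the stand-in "container init": wrap the inherited
# descriptor in a socket object and report a state over it.
CHILD_CODE = """
import socket, sys
fd = int(sys.argv[1])
sock = socket.socket(fileno=fd)
sock.sendall(b"RUNNING")
sock.close()
"""


def demo_inherited_fd() -> bytes:
    """Create a socket pair, hand one end to a child, read its report."""
    host_end, container_end = socket.socketpair()

    # pass_fds keeps this descriptor open (same number) in the child.
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(container_end.fileno())],
        pass_fds=[container_end.fileno()],
    )
    # The host no longer needs its copy of the child's end.
    container_end.close()

    chunks = []
    while True:
        chunk = host_end.recv(1024)
        if not chunk:  # child closed its end
            break
        chunks.append(chunk)

    child.wait()
    host_end.close()
    return b"".join(chunks)
```

Nothing here touches the filesystem, so the mount namespace question does
not arise for this variant.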

> That being said, if we actually implement /dev/lxc-notify (or however
> one wants to call it, perhaps /run/lxc-host-interface?) as a socket
> with an extensible protocol, it would also be possible to have a
> command that tells lxc to open a socket on the host and pass the fd
> back through the connection. systemd on the inside would then be in
> possession of a socket that listens on the outside and that an outside
> systemctl could affect. So my proposal with the modifications suggested
> by Stéphane would also be able to solve your use case.

The socket passing approach you describe is an interesting idea.
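
For reference, a small sketch of what passing a listening socket over a
connection could look like, using SCM_RIGHTS through Python's
socket.send_fds() and socket.recv_fds() (Python 3.9 or later, POSIX
only). Everything runs in one process for simplicity: a socketpair plays
the host/container connection, and the b"LISTENER" tag and the socket
path are placeholders, not part of any proposed protocol.

```python
import os
import shutil
import socket
import tempfile


def demo_fd_passing() -> tuple:
    """Open a listener on the 'host', pass it to the 'container' side."""
    workdir = tempfile.mkdtemp(prefix="lxc-")
    path = os.path.join(workdir, "systemd.sock")  # hypothetical path

    # Stand-in for the host <-> container control connection.
    host_chan, container_chan = socket.socketpair(
        socket.AF_UNIX, socket.SOCK_STREAM
    )

    # Host side: open the listening socket in the host's namespace.
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(1)

    # Host side: pass the descriptor back through the connection.
    socket.send_fds(host_chan, [b"LISTENER"], [listener.fileno()])
    listener.close()  # the receiver now holds its own reference

    # Container side: receive the descriptor and wrap it.
    msg, fds, _flags, _addr = socket.recv_fds(container_chan, 1024, 1)
    inherited = socket.socket(fileno=fds[0])

    # A host-side client connects to the path as usual.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    client.sendall(b"status?")
    client.close()

    # Container side: accept on the passed-in listener and read to EOF.
    conn, _ = inherited.accept()
    chunks = []
    while True:
        chunk = conn.recv(1024)
        if not chunk:
            break
        chunks.append(chunk)

    for sock in (conn, inherited, host_chan, container_chan):
        sock.close()
    shutil.rmtree(workdir)
    return msg, b"".join(chunks)
```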

> However, first I'd like to have the basic version just for status
> updates (because that is a useful feature anyway, independently of the
> init system) in order to keep it simple - and once that is done, one
> may think about how/whether to extend this to include other use cases
> that are more specialized.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|



