[lxc-devel] [RFC PATCH] allow multiple monitor clients

Thu Apr 18 19:18:07 UTC 2013

On Thu, 18 Apr 2013 13:48:25 -0500
Serge Hallyn <serge.hallyn at ubuntu.com> wrote:

> Quoting Dwight Engen (dwight.engen at oracle.com):
> > After the recent discussions on the lxc monitor, I took a closer
> > look at it. It seems like Serge was resigned to having some sort of
> > daemon as having multicast unix domain supported by the kernel is
> > not going to happen any time soon, and doing it over loopback has
> > some complications too.
> > 
> > Initially I thought it would be nice to just have lxc-start "be the
> > daemon" as Stéphane had suggested. I tried this out, putting the fds
> > for the consumer clients (lxc-monitors) in lxc_mainloop , but this
> > is problematic for the scenario where lxc-monitor runs first and we
> > miss some startup states.
> > 
> > So here is what I'm proposing: when lxc-monitor starts, it attempts
> > to start lxc-monitord. lxc-monitord creates a fifo and a socket on
> > lxcpath.
> 
> Thanks, Dwight.  Looks awesome.  Some comments below, but I'm only
> not adding an ack bc you say you want to make some changes first
> anyway.
> 
> When you send the next version I'll run it through my testsuite (and
> presumably ack).

Oh yeah I'm not ready for a review yet, need to do more testing and fix
those FIXME's, just wanted to get feedback on the approach :) Thanks
for the quick response.

> > Producers (containers) wishing to send messages open the fifo and
> > write their lxc_msg's on it. The reason I used a fifo is so that we
> > get permission checks on the opening of it. Right now it is created
> > 0600. This fixes the problem we have today where anyone can open
> > the unix socket and send state changes in.
> > 
> > Consumers (lxc-monitor) connect to the unix socket and each gets a
> > copy of the messages. lxc-monitord can do a PEERCRED to ensure the
> > client should be able to get these messages, this is one benefit to
> > using unix domain. I don't think it would be difficult to write a
> > lxc-monitor like client that would bridge the messages onto d-bus.
> > 
> > When lxc-monitord hasn't had a client for 30 seconds it will exit.
> > 
> > Attached is a proof of concept patch for your comments. It works for
> > the common cases that I've tried but I know it needs some cleanup
> > and testing if we like the approach. The monitor.c part is hard to
> > read because that file is mostly rewritten.
> > 
> > I need to test S.Çağlar Onur's scenario of parallel startup with his
> > test script. We need to decide if we want to monitor per lxcpath or
> > per container,
> 
> Do you mean 'per lxcpath or per host'?  We definately don't want per
> container.

Right I agree, but I think some of Caglar's patches added the container
name to the path, which would've made it per container. Just verifying
we don't actually want that.

> > today it is per path and I think that being able to do
> > lxc-monitor * would be very useful.
> 
> If by that you mean 'watch all paths', well - I don't think * is
> as useful, but certainly
> 'lxc-monitor /var/lib/lxc /home/serge/lxcbase' would be nice to do.
> But we can do that later with your monitord, just by watching both
> sockets, right?

Yep, should be trivial fd wise. For this to work with multiple uids I
guess the check in monitord should be peercred == getuid() || peercred
== 0 that way if you to monitor across uids you have to be root.

> > 
> > --
> > 
> > Signed-off-by: Dwight Engen <dwight.engen at oracle.com>
> > ---
> ...
> > +static void lxc_monitord_delete(struct lxc_monitor *mon, const
> > char *lxcpath) +{
> > +	int i;
> > +
> > +	lxc_mainloop_del_handler(&mon->descr, mon->listenfd);
> > +	close(mon->listenfd);
> 
> The ordering here might need to change to ensure we don't get any
> clients hanging between the two steps.

Hmm, good point. Have to think about this, there might be a race the
other way with still having the fd in the epoll set also.

> > +/* routines used by monitor subscribers (lxc-monitor) */
> > +int lxc_monitor_close(int fd)
> >  {
> > -	struct sockaddr_un addr = { .sun_family = AF_UNIX };
> > -	char *offset = &addr.sun_path[1];
> > -	int fd;
> > -	size_t ret, len;
> > +	return close(fd);
> > +}
> >  
> > -	/*
> > -	 * addr.sun_path is only 108 bytes.
> > -	 * should we take a hash of lxcpath?  a subset of it?
> > +int lxc_monitor_sock_name(const char *lxcpath, struct sockaddr_un
> > *addr) {
> > +	size_t len,ret;
> > +	char *sockname = &addr->sun_path[0]; // 1 for abstract
> > +
> > +	/* addr.sun_path is only 108 bytes.
> > +	 * should we take a hash of lxcpath?  a subset of it?
> > ftok()? none
> > +	 * are guaranteed.
> 
> guaranteed to what?  The hash should be pretty immune to duplicates,
> and does guarantee the length.  The problem with that is simply that
> we lose the ability to easily predict (without a tool :) the abstract
> socket name and type it by hand.

Sorry, guaranteed unique. 108 bytes is plenty for a uuid for instance,
but we really don't want collisions. Yeah I think last time this came
up we liked the obviousness of the real name. I wanted to not use unix
sockets to get rid of this limitation, but we want to count
connected clients so the daemon can go away.

> > diff --git a/src/lxc/start.c b/src/lxc/start.c
> > index aefccd6..9451fb7 100644
> > --- a/src/lxc/start.c
> > +++ b/src/lxc/start.c
> > @@ -429,6 +429,9 @@ struct lxc_handler *lxc_init(const char *name,
> > struct lxc_conf *conf, const char if (lxc_command_init(name,
> > handler, lxcpath)) goto out_free_name;
> >  
> > +	if (lxc_monitord_spawn(lxcpath))
> > +		goto out_close_maincmd_fd;
> 
> Can I ask why you're doing this?  I don't think it's necessary.
> Simply accept that we might send events that get ignored (as we do
> now).

Left over from when I was trying to make stuff work within lxc-start.
We don't need this.

> > +
> >  	if (lxc_read_seccomp_config(conf) != 0) {
> >  		ERROR("failed loading seccomp policy");
> >  		goto out_close_maincmd_fd;
> > @@ -437,7 +440,7 @@ struct lxc_handler *lxc_init(const char *name,
> > struct lxc_conf *conf, const char /* Begin the set the state to
> > STARTING*/ if (lxc_set_state(name, handler, STARTING)) {
> >  		ERROR("failed to set state '%s'",
> > lxc_state2str(STARTING));
> > -		goto out_free_name;
> > +		goto out_close_maincmd_fd;
> >  	}
> >  
> >  	/* Start of environment variable setup for hooks */
> > @@ -808,7 +811,7 @@ int lxc_spawn(struct lxc_handler *handler)
> >  	/* TODO - pass lxc.cgroup.dir (or user's pam cgroup) in
> > for first argument */ if ((handler->cgroup =
> > lxc_cgroup_path_create(NULL, name)) == NULL) goto out_delete_net;
> > -	
> > +
> >  	if (lxc_cgroup_enter(handler->cgroup, handler->pid) < 0)
> >  		goto out_delete_net;
> >  
> > -- 
> > 1.8.1.4
> > 
> > 
> > 
> > ------------------------------------------------------------------------------
> > Precog is a next-generation analytics platform capable of advanced
> > analytics on semi-structured data. The platform includes APIs for
> > building apps and a phenomenal toolset for data science. Developers
> > can use our toolset for easy data analysis & visualization. Get a
> > free account!
> > http://www2.precog.com/precogplatform/slashdotnewsletter
> > _______________________________________________ Lxc-devel mailing
> > list Lxc-devel at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/lxc-devel