[lxc-devel] versioning the container monitor api
Serge Hallyn
serge.hallyn at ubuntu.com
Tue Aug 27 21:25:03 UTC 2013
I like it, thanks :)
-serge
Quoting Christian Seiler (christian at iwakd.de):
> Hi Serge,
>
> > I start a container running a crucial mail server. I upgrade
> > lxc. The new lxc has changed the format of messages for the
> > commands api. Now I do 'lxc-list', which queries the running
> > monitor to check its init pid with LXC_CMD_GET_INIT_PID. The
> > container monitor crashes on bad input.
>
> Yes, that's a problem I have also run into frequently.
>
> > The lxc_af_unix_connect function could start with a handshake with a
> > version number, or we could tack a version # onto the lxc_cmd_req
> > struct. Best would be if we agreed the client always sends its version
> > to the monitor, then vice versa, and then both sides decide whether
> > they can proceed (so both sides can log error). We could just use
> > a monotonically increasing int, hand-inserted. However that's subject
> > to error - if we make a change without remembering to update the version
> > number, then we could still get a crash. We could automate this perhaps
> > by having a Makefile do some sort of check, i.e. hashing all the structs
> > which may be communicated over the socket.
>
> I think the real solution is far easier: previously, the command
> interface changed quite a bit because it was much more limited
> than it is now. But now the basic structure of the current command
> interface seems to be rather complete. Each request is just a tuple
> (cmd, datalen, data_ptr (mostly ignored)) + possibly additional data of
> length datalen on the line afterwards. Each response is (ret, datalen,
> data_ptr (mostly ignored)) + possibly data of length datalen on the line
> afterwards. I don't see how even quite complicated stuff couldn't in
> principle fit in there. The only question is what the semantics of
> cmd/ret, datalen, data_ptr and the data itself are.
>
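To make the framing described above concrete, here is a rough sketch of packing and unpacking one request as a fixed header followed by datalen bytes. The struct demo_req and both functions are hypothetical stand-ins, not the actual definitions in commands.h; a response would have the same shape with ret in place of cmd.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the (cmd, datalen, data_ptr) request tuple. */
struct demo_req {
    int cmd;          /* command number */
    int datalen;      /* bytes of payload that follow the header */
    const void *data; /* local pointer; meaningless to the receiver */
};

/* Copy header then payload into buf. Returns bytes written, 0 on error. */
size_t demo_pack_req(const struct demo_req *req, unsigned char *buf, size_t cap)
{
    assert(req != NULL && buf != NULL);
    if (req->datalen < 0 || (req->datalen > 0 && req->data == NULL))
        return 0;

    size_t total = sizeof(*req) + (size_t)req->datalen;
    if (total > cap)
        return 0;

    memcpy(buf, req, sizeof(*req));
    if (req->datalen > 0)
        memcpy(buf + sizeof(*req), req->data, (size_t)req->datalen);
    return total;
}

/* Read the header back and locate the payload. Returns 0, or -1 if truncated. */
int demo_unpack_req(const unsigned char *buf, size_t len,
                    struct demo_req *out, const unsigned char **payload)
{
    assert(buf != NULL && out != NULL && payload != NULL);
    if (len < sizeof(*out))
        return -1;

    memcpy(out, buf, sizeof(*out));
    if (out->datalen < 0 || (size_t)out->datalen > len - sizeof(*out))
        return -1;

    out->data = NULL;               /* the sender's pointer is ignored */
    *payload = buf + sizeof(*out);  /* payload starts right after the header */
    return 0;
}
```

Because the receiver only trusts cmd and datalen, new commands can define their own payload layout without touching the header.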
> So we should just declare that for the current commands, the semantics
> are completely fixed. Meaning that LXC_CMD_CONSOLE will always have the
> same on-the-wire semantics as it currently does.
>
> But let's suppose at some point in the future, LXC_CMD_CONSOLE is
> supposed to change semantics completely. Then we change the enum to:
>
> typedef enum {
> LXC_CMD_DEPRECATED1, // <- LXC_CMD_CONSOLE was here
> ...,
> LXC_CMD_CONSOLE, // <- newly added, gets a new number
> LXC_CMD_MAX,
> };
>
> Then we can change the semantics of datalen / data_ptr and additional
> data and we will still be backwards compatible with all the other
> options. We just have to make sure that the processing routines always
> eat up all of the data, even if the command is not recognized, so that
> the connection will be in a sane state after that and communication may
> proceed.
>
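The "eat up all of the data" requirement could look roughly like the helper below, which discards exactly datalen bytes after an unrecognized header so the next header starts at the right offset. demo_drain is a hypothetical name and the 256-byte scratch buffer is an arbitrary choice; this is a sketch, not what commands.c does today.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read and discard exactly len bytes from fd.
 * Returns 0 on success, -1 on EOF or read error.
 */
int demo_drain(int fd, size_t len)
{
    unsigned char scratch[256];

    while (len > 0) {
        size_t want = len < sizeof(scratch) ? len : sizeof(scratch);
        ssize_t got = read(fd, scratch, want);

        if (got < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, retry */
            return -1;
        }
        if (got == 0)
            return -1;      /* peer closed before sending everything */
        len -= (size_t)got;
    }
    return 0;
}
```

A receiver would call this with the header's datalen whenever cmd is unknown, then send its error response and keep the connection open.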
> If the server now doesn't recognize a command, it will issue the trivial
> response { -ENOSYS, 0, 0 } back to the client. Then the client will know
> that the server is too old / too new to support the command and will
> have to cope with it. In the case of something like LXC_CMD_GET_STATE
> and LXC_CMD_GET_INIT_PID one might want to write a fallback routine for
> the client; in the case of LXC_CMD_CONSOLE perhaps not, depending on why
> the change is required.
>
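A small sketch of the -ENOSYS convention described above, with a client-side fallback. All demo_* names and the two-entry command table are made up for illustration, and demo_dispatch stands in for a full round trip over the socket.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical response header: (ret, datalen, data pointer). */
struct demo_rsp {
    int ret;
    int datalen;
    void *data;
};

/* Hypothetical numbering; a retired command keeps its slot. */
enum demo_cmd {
    DEMO_CMD_DEPRECATED1, /* retired: no handler any more */
    DEMO_CMD_GET_STATE,
    DEMO_CMD_MAX
};

typedef void (*demo_handler)(struct demo_rsp *rsp);

static void demo_get_state(struct demo_rsp *rsp)
{
    rsp->ret = 0; /* placeholder result */
}

/* Slots without an entry stay NULL and are treated as unknown. */
static const demo_handler demo_handlers[DEMO_CMD_MAX] = {
    [DEMO_CMD_GET_STATE] = demo_get_state,
};

/* Unknown, retired, or out-of-range commands get { -ENOSYS, 0, NULL }. */
struct demo_rsp demo_dispatch(int cmd)
{
    struct demo_rsp rsp = { -ENOSYS, 0, NULL };

    if (cmd >= 0 && cmd < DEMO_CMD_MAX && demo_handlers[cmd] != NULL)
        demo_handlers[cmd](&rsp);
    return rsp;
}

/* Client side: try the preferred command, fall back on -ENOSYS. */
int demo_query(int preferred_cmd, int fallback_cmd)
{
    struct demo_rsp rsp = demo_dispatch(preferred_cmd);

    if (rsp.ret == -ENOSYS)
        rsp = demo_dispatch(fallback_cmd);
    return rsp.ret;
}
```

Whether a fallback makes sense is per command, as noted above: a state query can reasonably retry with an older command number, while a console request may just have to fail.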
> Add big fat comments in the appropriate parts of commands.h/commands.c
> to make sure that nobody changes this (+ perhaps a few unit tests) and
> there will be compatibility even between versions.
>
> > But we might want to try and accommodate newer clients talking to
> > older versions, somehow. I suspect that'd be fragile, but it might
> > be worthwhile.
>
> I think that's generally a good idea (for clients post 1.0; I think for
> 1.0 it's reasonable to say we do a final incompatible break) and at
> least for core functionality it should be policy that there will be
> compatibility.
>
> Just my 2¢.
>
> -- Christian