[lxc-devel] Integration Kubernetes and LXD/LXC

Tue Oct 2 16:04:12 UTC 2018

Oliver Schad <oliver.schad at automatic-server.com> writes:

> On Tue, 02 Oct 2018 16:49:36 +0200
> Free Ekanayaka <free.ekanayaka at canonical.com> wrote:
>
>> I know that folks do run stateful services on k8s, PostgreSQL is one
>> of those IIRC. I wouldn't expect MySQL to be fundamentally different.
>
> Sorry, I have to repeat my point: if the container engine isn't made
> to run 24/7, which includes that a container must be mutable(!), you're
> out of business.
>
> So the foundation of a stateful (and high quality) service is the
> container engine.
>
> In this area, Docker and cri-o are broken by design for that purpose.
>
>> Although LXE might be an approach that solves your immediate needs, it
>> feels like a band aid. If you haven't already, I'd recommend
>> approaching the k8s team/community describing the issues that you're
>> seeing when using standard CRI implementations such as
>> containerd/docker and cri-o.
>
> We did, no answer. But we know the answer for a lot of topics we
> researched: the design goal was to deal without state.
>
> All the stuff with state is just a gimmick for in fact restartable
> services.
>
> We did a lot of Kubernetes, read a lot of code, read a lot of comments
> on a lot of issues and the result is: 24/7 always online is *not* a
> design goal of Docker or cri-o, nor of CRI, nor of Kubernetes.

Yeah, that's probably the key point indeed. The design is based on the
"pets vs cattle" idea, so the expectation is that you are fine with any
of your pods being restarted at any time. And if you use Kubernetes,
you'll have a hard time going against this design principle.

That being said, one thing that is still not clear to me is why k8s
would issue disruptive restart requests against your (MySQL?) pods. Does
it happen as a consequence of some manifest/spec change? I'd expect k8s
not to restart stuff unless there's a good reason to do that.

If you have a cluster of stateful pods that need special orchestration
to handle restarts (say, the master must first perform a graceful
failover, or something like that), I think the recommended way these
days would be to implement your own "operator" which drives the dance in
the way you want. See:

https://coreos.com/operators/

(the example operators include stateful services such as etcd and
vault).
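
To make the "dance" a bit more concrete, here is a minimal sketch of the
ordering such an operator might enforce: restart the replicas first, fail
over, and only then restart the old master. It is plain Python and does
not talk to Kubernetes at all; the Member type, the plan_rolling_restart
function, the step strings and the choice of the first replica as the
failover target are all hypothetical names and assumptions of mine, not
taken from any real operator.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Member:
    name: str        # hypothetical pod name, e.g. "db-0"
    is_master: bool  # True for the current master


def plan_rolling_restart(members: List[Member]) -> List[str]:
    """Return ordered steps: replicas first, then failover, then old master."""
    masters = [m for m in members if m.is_master]
    if len(masters) != 1:
        raise ValueError("expected exactly one master")
    replicas = [m for m in members if not m.is_master]
    if not replicas:
        raise ValueError("need at least one replica to fail over to")

    steps: List[str] = []
    # Restart replicas one at a time, waiting for each to become ready.
    for replica in replicas:
        steps.append(f"restart {replica.name}")
        steps.append(f"wait-ready {replica.name}")

    # Move the master role away before the old master is touched.
    # Assumption: the first replica is an acceptable failover target.
    old_master, new_master = masters[0], replicas[0]
    steps.append(f"failover {old_master.name} -> {new_master.name}")
    steps.append(f"restart {old_master.name}")
    steps.append(f"wait-ready {old_master.name}")
    return steps
```

A real operator would turn each step into calls against the k8s API and
the database itself, and would re-check the cluster state between steps
instead of computing the whole plan up front.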

For MySQL specifically, Oracle itself seems to have written an operator:

https://medium.com/oracledevs/introducing-the-oracle-mysql-operator-for-kubernetes-b06bd0608726

I don't know the details, but I would expect such an operator to handle
any operation gracefully, including possible restarts (but, again, I
think those shouldn't really happen that often in a stable production
environment).

Free