"Q: How is this different than StatefulSets (previously PetSets)?
A: StatefulSets are designed to enable support in Kubernetes for applications that require the cluster to give them "stateful resources" like static IPs and storage. Applications that need this more stateful deployment model still need Operator automation to alert and act on failure, backup, or reconfiguration. So, an Operator for applications needing these deployment properties could use StatefulSets instead of leveraging ReplicaSets or Deployments."
There is inevitably some app-specific logic required to modify a complex stateful deployment; the Operator encapsulates this logic so that the external interface is a simple config file.
Disclaimer: I work at Mesosphere on Mesos.
But they work differently. The Operator does not really “schedule” containers. It implements its control logic using Kubernetes APIs. For example, it uses native Kubernetes health checking, service discovery, and deployments. It works entirely on top of the Kubernetes API, so no specialized scheduler, executor, or proxy is needed, compared to https://github.com/mesosphere/etcd-mesos/blob/master/docs/ar....
The advantage of Mesos is exposing lower-level APIs and resources to allow more control. The etcd Operator we built does not really need that. Building this kind of application operator may be simpler on k8s than on native Mesos.
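The control logic described above can be sketched as a reconcile loop. This is a minimal illustration, not the etcd-operator's actual code: `ClusterSpec` and `ClusterState` are made-up types standing in for the desired spec and the observed cluster state.

```go
package main

import "fmt"

// Hypothetical types for illustration: the desired spec (what the user
// asked for) and the observed state (what is actually running).
type ClusterSpec struct{ Size int }
type ClusterState struct{ Running int }

// reconcile compares desired vs. observed state and returns the actions
// an operator would perform through the Kubernetes API.
func reconcile(spec ClusterSpec, state ClusterState) []string {
	var actions []string
	for n := state.Running; n < spec.Size; n++ {
		actions = append(actions, "add etcd member and create its pod")
	}
	for n := state.Running; n > spec.Size; n-- {
		actions = append(actions, "remove etcd member and delete its pod")
	}
	return actions
}

func main() {
	// A real operator re-reads both spec and state from the API server
	// on every pass of its control loop (or uses a watch).
	for _, a := range reconcile(ClusterSpec{Size: 3}, ClusterState{Running: 1}) {
		fmt.Println(a)
	}
}
```

The point is that everything here happens through ordinary API calls; there is no custom scheduler or executor in the loop.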
Disclaimer: I work at CoreOS on Kubernetes and etcd.
* Isn't etcd2 required to start Kubernetes? I found that if etcd2 is not healthy, or the connection is just temporarily lost, then k8s freezes its scheduling and API. So what if the Operator and etcd2 are running on one node and it goes down? I also found that etcd2 freezes even when one node is down. Isn't that an unrecoverable situation?
* The k8s/CoreOS manual recommends keeping etcd2 servers close to each other, mostly because etcd2 has very strict network requirements (around 5ms ping), which may not hold for some pairs of servers.
* What if we lose ALL nodes and the Operator creates an almost-new cluster from backups, but we need to restore the latest version (not one from 30 minutes ago)?
2) etcd can deal with any latency, up to seconds long, for say a globally replicated etcd. But! You need to tune etcd to expect that latency so it doesn't trigger a leader election. See the tuning guide: https://coreos.com/etcd/docs/latest/tuning.html
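Concretely, that tuning means raising the heartbeat interval and election timeout flags. The values below are purely illustrative, assuming roughly 50ms round-trip latency between members; see the tuning guide for real guidance:

```shell
# Illustrative values for ~50ms RTT between members: set the heartbeat
# close to the round-trip time and the election timeout to roughly 10x
# the heartbeat (both values are in milliseconds).
etcd --heartbeat-interval=100 --election-timeout=1000
```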
3) The backups are something that we are just getting to with the etcd Operator. Our intention is to help you create backups and create new clusters from arbitrarily old backups, but that work hasn't started yet.
I also can't wait to see an open version of AWS Lambda / Google Functions appear.
It's exciting to see CoreOS working in the same direction - this looks much more elegant than what I would have hacked up.
While most k8s users are (from what I can tell) currently writing YAML config files and loading them by hand (encouraged by tools like Helm and Spread), I think that the k8s apps of the future will be more like the Operator:
1) The 'deploy scripts' are controllers that run in your k8s cluster and dynamically ensure the rest of your code is running, and the primitives that you operate on will be your custom ThirdPartyResources.
2) All of the config for your app is wrapped in a domain-specific k8s object spec; instead of writing a YAML file and uploading it as a raw Deployment, you would create a FooService API object with just the parameters that you actually care about for configuring your service.
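Sketching point 2) with the ThirdPartyResource API as it existed at the time: `FooService`, `example.com`, and the spec fields here are all made up for illustration.

```yaml
# Register the custom type with the API server (hypothetical names).
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  name: foo-service.example.com
description: "A managed FooService"
versions:
  - name: v1
---
# An instance: only the parameters you actually care about; the
# controller expands this into Deployments, Services, etc.
apiVersion: example.com/v1
kind: FooService
metadata:
  name: my-foo
spec:
  replicas: 3
  version: "2.1.0"
```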
Right now it's a pain and a lot of code (>10kloc of Go for the etcd-operator!) but I'm sure that a bunch of that could be abstracted out into a framework that makes it easy to generate/build operators for a variety of application use-cases.
Currently the solutions that build and deploy your code for you in k8s seem to be PaaS replacements (Deis, Openshift), which take a very generic approach to bundling your code. That's probably going to work for common use-cases, but I suspect the more bespoke deployments will need something more like the Operator approach, and I'm looking forward to seeing what tooling evolves in this area.
I know that much of Distelli's workflow comes from the founders' experience at AWS, so I wonder where the root of this pattern lies. Perhaps that would help unify these similar methods.
[0]: http://autopilotpattern.io/#how-do-we-do-it
[1]: https://www.distelli.com/docs/manifest/deployment-types
But it's not clear to me from a casual glance at the docs whether Distelli lives inside the container during those hooks. Part of what distinguishes the Autopilot Pattern is making the higher-level orchestration layer as thin as possible.
(As far as the root, some of it is derived from my experiences as a perhaps-foolishly-early adopter of Docker in prod at my previous gig at a streaming media startup. The rest is derived from both the principles on which Joyent's own Triton infra is built and our experiences speaking with enterprise devs and ops teams.)
I am asking because I was in a situation where I was introduced to other key-value stores, and because the team working with them was big and no process was followed to group all keys in one place, it was hard to know "what is in the store" at any moment, short of exhausting all the entry points in the code.
Will this at some point be available publicly? Although k8s ensures pods are rescheduled, many applications do not handle that well, so I think a lot of teams could benefit from having something like this.
We plan to make it a separate project once we feel good about its functionality and reliability.
If you have any potential use case, requirement in mind, please tell us. :)
I wonder if it is leveraging PetSets. I also wonder how this overlaps or plays with Deis's Helm project.
I'm looking forward to some things implemented like this: Kafka/Zookeeper, PostgreSQL, Mongodb, Vault, to name a few.
I also wonder if this means something like Chef could be retooled as a K8S controller.
The only answer I found that addresses one part of what I'm wondering about is "How is this different from configuration management like Puppet or Chef?" However, I did not ask that question.
If you read some of Mark Burgess's "Promise Theory: Principles and Applications", you'll realize that Operators (and Kubernetes controllers for that matter) are applications and implementations of specific parts of Promise Theory. This idea that Puppet or Chef is "configuration management" is a story sold to non-technical people. I would argue that Operators may be a _better_ application of Promise Theory than previous-generation tools.
Puppet or Chef running as a Kubernetes controller might be able to twiddle things. It's not exactly a great fit, because both would be calling their respective servers rather than using Third Party Resources on the kube master (and, as such, unwieldy). The DSL in each would have to be extended with things useful for controlling a Kubernetes cluster, but once in place, it could do exactly what the etcd Operator does: converge on the desired state by managing memberships and doing cleanups.
Don't get me wrong: I like the CoreOS technology as well as Kubernetes. I've deployed on CoreOS and Kubernetes before. I get that companies have a responsibility to control the story and the messaging ... but what I am asking are questions that are bigger than any single technology or company, and I like to make up my own mind about things.