"Q: How is this different than StatefulSets (previously PetSets)?
A: StatefulSets are designed to enable support in Kubernetes for applications that require the cluster to give them "stateful resources" like static IPs and storage. Applications that need this more stateful deployment model still need Operator automation to alert and act on failure, backup, or reconfiguration. So, an Operator for applications needing these deployment properties could use StatefulSets instead of leveraging ReplicaSets or Deployments."
There is inevitably some app-specific logic required to modify a complex stateful deployment; the Operator encapsulates this logic so that the external interface is a simple config file.
Disclaimer: I work at Mesosphere on Mesos.
But they work differently. The Operator does not really “schedule” containers. It implements its control logic using Kubernetes APIs. For example, it uses native Kubernetes health checking, service discovery, and deployments. It works entirely on top of the Kubernetes API, so no specialized scheduler, executor, or proxy is needed, compared to https://github.com/mesosphere/etcd-mesos/blob/master/docs/ar....
The advantage of Mesos is exposing lower-level APIs and resources to allow more control. The etcd Operator we built does not really need that. Building this kind of application operator may be simpler on k8s than on native Mesos.
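The control logic described above can be sketched as a reconcile loop. This is a minimal illustration, not the etcd-operator's actual code: `ClusterSpec` and `ClusterState` are made-up types standing in for the desired spec and the observed cluster state.

```go
package main

import "fmt"

// Hypothetical types for illustration: the desired spec (what the user
// asked for) and the observed state (what is actually running).
type ClusterSpec struct{ Size int }
type ClusterState struct{ Running int }

// reconcile compares desired vs. observed state and returns the actions
// an operator would perform through the Kubernetes API.
func reconcile(spec ClusterSpec, state ClusterState) []string {
	var actions []string
	for n := state.Running; n < spec.Size; n++ {
		actions = append(actions, "add etcd member and create its pod")
	}
	for n := state.Running; n > spec.Size; n-- {
		actions = append(actions, "remove etcd member and delete its pod")
	}
	return actions
}

func main() {
	// A real operator re-reads both spec and state from the API server
	// on every pass of its control loop (or uses a watch).
	for _, a := range reconcile(ClusterSpec{Size: 3}, ClusterState{Running: 1}) {
		fmt.Println(a)
	}
}
```

The point is that everything here happens through ordinary API calls; there is no custom scheduler or executor in the loop.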
Disclaimer: I work at CoreOS on Kubernetes and etcd.
* Isn't etcd2 required to start Kubernetes? I found that if etcd2 is not healthy, or the connection is just temporarily lost, then k8s freezes its scheduling and API. So what if the Operator and etcd2 are running on one node and it goes down? I also found that etcd2 freezes even when one node is down. Isn't that an unrecoverable situation?
* The k8s/CoreOS manual recommends keeping etcd2 servers close to each other, mostly because etcd2 has very strict network requirements (around 5ms ping), which may not hold for some pairs of servers.
* What if we lose ALL nodes and the Operator creates an almost-new cluster from backups, but we need to restore the latest version (not one from 30 minutes ago)?
2) etcd can deal with any latency, up to seconds long, for say a globally replicated etcd. But! You need to tune etcd to expect that latency so it doesn't trigger a leader election. See the tuning guide: https://coreos.com/etcd/docs/latest/tuning.html
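Concretely, that tuning means raising the heartbeat interval and election timeout flags. The values below are purely illustrative, assuming roughly 50ms round-trip latency between members; see the tuning guide for real guidance:

```shell
# Illustrative values for ~50ms RTT between members: set the heartbeat
# close to the round-trip time and the election timeout to roughly 10x
# the heartbeat (both values are in milliseconds).
etcd --heartbeat-interval=100 --election-timeout=1000
```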
3) The backups are something that we are just getting to with the etcd Operator. Our intention is to help you create backups and create new clusters from arbitrarily old backups, but that work hasn't started yet.
I also can't wait to see an open version of AWS Lambda / Google Functions appear.
It's exciting to see CoreOS working in the same direction - this looks much more elegant than what I would have hacked up.
While most k8s users are (from what I can tell) currently writing YAML config files and loading them by hand (encouraged by tools like Helm and Spread), I think that the k8s apps of the future will be more like the Operator:
1) The 'deploy scripts' are controllers that run in your k8s cluster and dynamically ensure the rest of your code is running, and the primitives that you operate on will be your custom ThirdPartyResources.
2) All of the config for your app is wrapped in a domain-specific k8s object spec; instead of writing a YAML file and uploading it as a raw Deployment, you would create a FooService API object with just the parameters that you actually care about for configuring your service.
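Sketching point 2) with the ThirdPartyResource API as it existed at the time: `FooService`, `example.com`, and the spec fields here are all made up for illustration.

```yaml
# Register the custom type with the API server (hypothetical names).
apiVersion: extensions/v1beta1
kind: ThirdPartyResource
metadata:
  name: foo-service.example.com
description: "A managed FooService"
versions:
  - name: v1
---
# An instance: only the parameters you actually care about; the
# controller expands this into Deployments, Services, etc.
apiVersion: example.com/v1
kind: FooService
metadata:
  name: my-foo
spec:
  replicas: 3
  version: "2.1.0"
```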
Right now it's a pain and a lot of code (>10kloc of Go for the etcd-operator!) but I'm sure that a bunch of that could be abstracted out into a framework that makes it easy to generate/build operators for a variety of application use-cases.
Currently the solutions that build and deploy your code for you in k8s seem to be PaaS replacements (Deis, Openshift), which take a very generic approach to bundling your code. That's probably going to work for common use-cases, but I suspect the more bespoke deployments will need something more like the Operator approach, and I'm looking forward to seeing what tooling evolves in this area.
I know that much of Distelli's workflow comes from the founders' experience at AWS, so I wonder where the root of this pattern lies. Perhaps that would help unify these similar methods.
[0]: http://autopilotpattern.io/#how-do-we-do-it
[1]: https://www.distelli.com/docs/manifest/deployment-types
But it's not clear to me from a casual glance at the docs whether Distelli lives inside the container during those hooks. Part of what distinguishes the Autopilot Pattern is making the higher-level orchestration layer as thin as possible.
(As far as the root, some of it is derived from my experiences as a perhaps-foolishly-early adopter of Docker in prod at my previous gig at a streaming media startup. The rest is derived from both the principles on which Joyent's own Triton infra is built and our experiences speaking with enterprise devs and ops teams.)
I am asking because I was in a situation where I was introduced to other key-value stores, and because the team working with them was big and no process was followed to group all keys in one place, it was hard to know "what is in the store" at any moment, short of exhausting all the entry points in the code.
Will this at some point be available publicly? Although k8s ensures pods are rescheduled, many applications do not handle that well, so I think a lot of teams could benefit from having something like this.
We plan to make it a separate project once we feel good about its functionality and reliability.
If you have any potential use case, requirement in mind, please tell us. :)
I wonder if it is leveraging PetSets. I also wonder how this overlaps or plays with Deis's Helm project.
I'm looking forward to some things implemented like this: Kafka/Zookeeper, PostgreSQL, Mongodb, Vault, to name a few.
I also wonder if this means something like Chef could be retooled as a K8S controller.
The only answer I found that addresses one part of what I'm wondering about is "How is this different from configuration management like Puppet or Chef?" However, I did not ask that question.
If you read some of Mark Burgess's "Promise Theory: Principles and Applications", you'll realize that Operators (and Kubernetes controllers for that matter) are applications and implementations of specific parts of Promise Theory. This idea that Puppet or Chef is "configuration management" is a story sold to non-technical people. I would argue that Operators may be a _better_ application of Promise Theory than previous-generation tools.
Puppet or Chef running as a Kubernetes controller might be able to twiddle things. It's not exactly a great fit, because both would be calling their respective servers rather than using Third Party Resources on the kube master (and, as such, unwieldy). The DSL in each would have to be extended with things useful for controlling a Kubernetes cluster, but once in place, it could do exactly what the etcd Operator does: converge on the desired state by managing memberships and doing cleanups.
Don't get me wrong: I like the CoreOS technology as well as Kubernetes. I've deployed on CoreOS and Kubernetes before. I get that companies have a responsibility to control the story and the messaging ... but what I am asking are questions that are bigger than any single technology or company, and I like to make up my own mind about things.