I'm the Nomad Team Engineering Lead and would be happy to answer any questions people might have.
Congratulations on this milestone release. We've been using Nomad since March this year on a single 'bare-metal' server and it serves our needs perfectly. We set it up with a simple gorilla/mux API in front and use the Nomad API to submit jobs from all our other applications, and it works flawlessly.
With regards to 1.0's features:
HCL2 is a welcome addition for us since we had a lot of repetition in our job files using artifacts for tasks.
Also the addition of the PostStop lifecycle couldn't come at a better time, we were discussing workarounds for this recently.
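For anyone curious what those two features look like together, here's a rough sketch of a job spec (the variable name, URLs, and images are all placeholders, not our actual jobs): HCL2's dynamic blocks generate the repeated artifact stanzas from a list, and a poststop task runs after the main task exits.

```hcl
# Hypothetical job spec; names, URLs, and images are placeholders.
variable "artifact_urls" {
  type = list(string)
  default = [
    "https://example.com/assets/config.tar.gz",
    "https://example.com/assets/scripts.tar.gz",
  ]
}

job "app" {
  datacenters = ["dc1"]

  group "app" {
    task "main" {
      driver = "docker"

      config {
        image = "example/app:1.0"
      }

      # HCL2: one dynamic block replaces N copy-pasted artifact stanzas.
      dynamic "artifact" {
        for_each = var.artifact_urls
        content {
          source      = artifact.value
          destination = "local/"
        }
      }
    }

    # New in 1.0: a poststop task runs after "main" stops,
    # e.g. for cleanup or deregistration.
    task "cleanup" {
      lifecycle {
        hook = "poststop"
      }

      driver = "docker"
      config {
        image = "example/cleanup:1.0"
      }
    }
  }
}
```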
One area of potential improvement would be the behaviour of file/directory permissions across different task drivers. I know this is totally dependent on the different drivers we can use with Nomad, but more than once we bumped into this while setting up our jobs (and others have too [1][2]).
[1] https://github.com/hashicorp/nomad/issues/2625 [2] https://github.com/hashicorp/nomad/issues/8892
Thanks for all the work your team did. I am a big fan of the Hashicorp ecosystem.
Thanks for mentioning these. Everyone interested should definitely +1 them as we do use reaction emoji during prioritization.
The task-driver-dependent nature does make this tricky, but since the Nomad agent is usually run as root and controls the allocation/task directory tree, we should have some options here.
My only comments would be:
I wish there was more content available (maybe on HashiCorp Learn Nomad) for working with single-instance Nomad clusters.
And a demo of a real-world dev-to-prod workflow. Something like "Okay, here's a local Docker Compose setup with Postgres, a backend API, and a frontend web app, and here's the workflow for getting it into production with Nomad."
We do "small" users a pretty big disservice by effectively dismissing single-server deployments and jumping straight to "real" production deployments (3+ servers distinct from the nodes that run workloads).
We have people who use Nomad locally as a systemd alternative. We have people who use single-scheduler-multiple-nodes clusters at home or at work, and it works quite well! If the scheduler crashes, work continues to run uninterrupted, so there's little reason not to start small and scale up.
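For reference, a single-server setup really is just one agent running both roles. A minimal sketch of an agent config (the datacenter name and data_dir path are placeholders):

```hcl
# Minimal combined server+client agent; names and paths are placeholders.
datacenter = "dc1"
data_dir   = "/opt/nomad/data"

server {
  enabled          = true
  bootstrap_expect = 1  # single scheduler, no HA quorum
}

client {
  enabled = true  # the same agent also runs workloads
}
```

Start it with `nomad agent -config=agent.hcl` (or use `nomad agent -dev` for a throwaway in-memory setup).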
The problem is largely that it's tricky to separate out and accurately address these various personas. People looking for a systemd alternative are obviously highly technical and will likely figure everything out through experimentation. However, "small" cluster users need to be carefully educated on the differences between an HA cluster (3+ schedulers) and a single scheduler cluster.
Not only that, but we would need to automate testing for each of these recommendations to ensure changes don't break an official recommendation.
It seems to me one of the advantages K8s has is that there are multiple "Kubernetes as a Service" offerings out there, such as EKS on AWS, Google Cloud's offering, even DigitalOcean's.
Are there any plans to make Nomad Enterprise more accessible? It seems a managed Nomad hits the sweet spot of simple scheduling, which removes a lot of the K8s bloat for many people.
And we aren't ignoring that. I have no idea what I can say publicly, so I'll just link to what our cofounder/CTO has already said:
> And hosted Nomad clusters are on the way
This along with Waypoint seems like a great solution for smaller side projects.
To compete with Kubernetes, Nomad will most likely acquire more features (e.g. storage) until it's as complex as Kubernetes, because that's just what some businesses seem to want.
On the other hand, Kubernetes gets simpler with projects like k3s or k0s.
That being said, Nomad's CSI support doesn't impact clusters that don't opt in to using it. Jobs that use host volumes or ephemeral volumes still work. Using Nomad only for stateless workloads still works. We try very hard to introduce new features in a way that only impacts people who use them.
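To illustrate the non-CSI path (the volume name, paths, and image here are made up): a host volume is declared once in the client's agent config and then claimed by the job, with no CSI plugin involved at all.

```hcl
# --- Client agent config (placeholder name/path) ---
client {
  enabled = true

  host_volume "pgdata" {
    path      = "/srv/pgdata"
    read_only = false
  }
}

# --- Job spec: the group claims the volume, the task mounts it ---
group "db" {
  volume "pgdata" {
    type      = "host"
    source    = "pgdata"
    read_only = false
  }

  task "postgres" {
    driver = "docker"
    config {
      image = "postgres:13"
    }

    volume_mount {
      volume      = "pgdata"
      destination = "/var/lib/postgresql/data"
    }
  }
}
```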
While the principle is the same for CNI, our migration to group networks ("groups" in Nomad are like "pods" in k8s) and away from task networks has been more painful than we had hoped. Existing jobs should still work with task networks and we're rapidly trying to fix differences in the two approaches.
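As a concrete sketch of the group-network approach (the port name and image are arbitrary): networking moved from per-task resources up to the group, so all tasks in a group share one network namespace, much like containers in a k8s pod.

```hcl
group "web" {
  # Group-level network: every task in this group shares the namespace.
  network {
    mode = "bridge"
    port "http" {
      to = 8080  # container port; the host port is dynamically assigned
    }
  }

  task "app" {
    driver = "docker"
    config {
      image = "example/web:1.0"
      ports = ["http"]  # replaces the old driver-level port_map
    }
  }

  # The older task-level form still works during the migration:
  #   task "app" {
  #     resources {
  #       network {
  #         port "http" {}
  #       }
  #     }
  #   }
}
```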
Nomad's Consul dependency does introduce complexity. The migration to group networks actually included a change that made service addressing available to servers in such a way that Nomad could offer native service discovery. Whether we want to pursue that is still being discussed, since offering multiple solutions has obvious downsides as well.