The book is clearly very unfinished still, but what exists is very good. Got my mind spinning on what a good statically typed language running on a VM could look like.
I have no experience with OTP, but I have read some books and done some toy projects.
They do not. Supervisor trees are a way to manage failures at the level of one node.
> How would I tell the cluster that there should be N processes running, each on a different node?
It's not something that is built into OTP. There are libraries that solve it, like libcluster.
> And is there anything in OTP that would help me elect a leader or do I still have to implement that myself?
Not directly in OTP, but there are libraries, for example for Raft.
Ok, so why is Erlang/Elixir/OTP good then? Well, first, it makes a single running application more robust to failures thanks to its supervision trees, but it also lets you build distributed applications more easily. GenServers let you build robust services very easily with common patterns. Local calls and remote calls to GenServers look the same, which makes it possible to scale services out. Message passing and pattern matching are part of the core of the language (no need for protobuf, for example). Observability and introspection are excellent when a problem arises (inspecting processes, their memory, their message queues, the schedulers, etc.). Immutable data structures and processes that do not share memory also make it easier to scale horizontally, at the cluster level. And probably a lot of other good things I forgot :-).
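To make the "local calls and remote calls are the same" point concrete, here is a minimal sketch of a gen_server in Erlang. The module name `counter` and its API are made up for illustration; only the `gen_server` calls are real OTP.

```erlang
-module(counter).
-behaviour(gen_server).

%% Hypothetical example module, not taken from any library.
-export([start_link/0, increment/1, value/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Start the server and register it locally under the module name.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

%% ServerRef can be the local name `counter`, or `{counter, Node}` for a
%% server registered on another connected node. The client code is the
%% same in both cases.
increment(ServerRef) ->
    gen_server:call(ServerRef, increment).

value(ServerRef) ->
    gen_server:call(ServerRef, value).

%% The server state is just an integer count.
init(Start) ->
    {ok, Start}.

%% Pattern matching on the request selects the clause to run.
handle_call(increment, _From, Count) ->
    {reply, Count + 1, Count + 1};
handle_call(value, _From, Count) ->
    {reply, Count, Count}.

handle_cast(_Msg, Count) ->
    {noreply, Count}.
```

From a shell, `counter:increment(counter)` talks to the local server, while `counter:increment({counter, 'b@otherhost'})` would talk to one on a connected node named `b@otherhost` (a placeholder node name). Note that this says nothing about where the process should run; placement is still up to you.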
What you say makes sense. I can see the benefit in message passing as a first class citizen so it allows extraction of some processes to a different node. But you still have to manage the process placements.
There is https://github.com/rabbitmq/ra which is a Raft implementation in Erlang that is Jepsen-tested. You could use it to build "etcd in Erlang", or https://github.com/rabbitmq/khepri which is built on top of Ra.
global:register_name/3 may be helpful. I haven't used it, so no direct experience. I think you would need to provide the resolution function for when a cluster merges and the name is registered on both partitions, and the logic to register a potential leader if there is none.
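For what it's worth, the three-argument registration function in `global` is `global:register_name/3`, and a rough sketch of the idea might look like the following. The module name, the registered name `cluster_leader`, and the "lowest node name wins" policy are all assumptions for illustration. This is not a real leader election: during a netsplit each partition can have its own "leader" until the partitions merge and the resolve function runs.

```erlang
-module(leader_sketch).
-export([try_become_leader/0, current_leader/0, resolve/3]).

%% Hypothetical cluster-wide name, chosen for this example only.
-define(NAME, cluster_leader).

%% Try to claim the global name for the calling process.
%% global:register_name/3 returns yes on success and no if the
%% name is already taken.
try_become_leader() ->
    case global:register_name(?NAME, self(), fun ?MODULE:resolve/3) of
        yes -> leader;
        no -> follower
    end.

%% Returns the pid currently registered under the name, or undefined.
current_leader() ->
    global:whereis_name(?NAME).

%% Called by global when two partitions merge and both have registered
%% the name. It must return the pid to keep (or none).
%% Assumed policy: keep the pid on the node whose name sorts first
%% and kill the other one.
resolve(_Name, Pid1, Pid2) ->
    {Keep, Drop} =
        case node(Pid1) =< node(Pid2) of
            true -> {Pid1, Pid2};
            false -> {Pid2, Pid1}
        end,
    exit(Drop, kill),
    Keep.
```

The "register a leader if there is none" part is not covered here: something would still have to monitor the current leader and call `try_become_leader/0` again when it goes away. `global` also ships ready-made resolvers such as `global:random_exit_name/3` if the choice of survivor doesn't matter.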
From experience with other parts of global, you'll want to be careful and test what happens on your system if a thousand nodes across several locations all try to join/register at once. Especially if one or several of those nodes are running really slow because of hardware issues.
I think some of this might be covered in distributed OTP applications with takeover[1], but where I worked with Erlang, we certainly weren't using OTP applications as the OTP team intended. I think that was a result of most of the team, including all of the early server engineers, learning Erlang on the job.
[1] https://learnyousomeerlang.com/distributed-otp-applications
and an attempt to correct it by Hans Svensson: https://erlang.org/workshop/2005/NewLeaderElection.pdf
This project attempts to modernize it: https://github.com/lehoff/gen_leader
But from what I can tell, there's no standardized solution. There are quite a few libraries out there, however.