For production servers, Pythonx is a bit more risky (and the developers aren't claiming it's the right tool for this use case). Because it's running on the same OS process as your Elixir app, you bypass the failure recovery that makes an Elixir/BEAM application so powerful.
Normally, an Elixir app has a supervision tree that can gracefully handle failures of its own BEAM processes (an internal concurrency unit -- kind of like a synthetic OS process) and keep the rest of the app's processes running. That's one of the big selling points of languages like Elixir, Erlang, and Gleam that build on the BEAM.
Because it uses NIFs (Native Implemented Functions), an unhandled exception in Pythonx could take down your whole OS process along with all other BEAM processes, making your supervision tree worthless in that regard.
There are cases when NIFs are super helpful (for instance, Rustler is a popular NIF wrapper for Rust in Elixir), but you have to architect around the fact that they can take down the whole app. Using Ports (Erlang and Elixir's long-standing external execution handler) to run other native code like Python or Rust is less risky in this respect because the non-Elixir code is still running in a separate OS process.
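For reference, the Erlang side of a port typically opens the external program with `{packet, 4}` framing, and the external side then just speaks length-prefixed messages over stdin/stdout. A minimal sketch of the Python side (the function names and echo demo are illustrative, not from any particular library):

```python
import io
import struct

def read_packet(stream):
    """Read one {packet, 4} framed message: a 4-byte big-endian length, then the payload."""
    header = stream.read(4)
    if len(header) < 4:
        return None  # port closed
    (length,) = struct.unpack(">I", header)
    return stream.read(length)

def write_packet(stream, payload: bytes):
    """Write one length-prefixed message back to the Erlang side."""
    stream.write(struct.pack(">I", len(payload)) + payload)
    stream.flush()

# Demo with an in-memory stream; a real port loops over
# sys.stdin.buffer / sys.stdout.buffer instead.
pipe = io.BytesIO()
write_packet(pipe, b"hello")
pipe.seek(0)
print(read_packet(pipe))  # b'hello'
```

If the Python process segfaults, the BEAM just sees the port close and the supervisor restarts it; nothing in the VM itself is at risk.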
This is what we use at https://transport.data.gouv.fr/ (the French National Access Point for transportation data - more background at https://elixir-lang.org/blog/2021/11/10/embracing-open-data-...).
Note that we're not using Pythonx, but running some memory-hungry processes which can sometimes take the worker node down.
Is there a class of exceptions that wouldn't be caught by Pythonx's wrapper? FTA (with emphasis added):
> Pythonx ties Python and Erlang garbage collection, so that the objects can be safely kept between evaluations. Also, it conveniently handles conversion between Elixir and Python data structures, bubbles Python exceptions and captures standard output.
And...
> Rustler is a popular NIF wrapper for Rust in Elixir
From Rustler's Git README:
> The code you write in a Rust NIF should never be able to crash the BEAM.
I haven't used Rustler, Zigler, or Pythonx (yet), so I'm genuinely asking if I'm mistaken in my understanding of their safety.
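For what it's worth, the "bubbles Python exceptions and captures standard output" behaviour quoted above is something a wrapper has to implement explicitly. In pure Python the pattern looks roughly like this (a sketch of the general idea, not Pythonx's actual implementation):

```python
import contextlib
import io

def eval_capture(code, env=None):
    """Evaluate `code`, returning (stdout, result, error).

    Instead of letting a Python exception escape (and potentially take
    the host down), it is caught and returned as a value -- "bubbled"
    to the caller, who decides what to do with it."""
    env = env if env is not None else {}
    buf = io.StringIO()
    try:
        with contextlib.redirect_stdout(buf):
            result = eval(code, env)
        return buf.getvalue(), result, None
    except Exception as exc:
        return buf.getvalue(), None, exc

print(eval_capture("1 + 1"))       # ('', 2, None)
print(eval_capture("1 / 0")[2])    # division by zero
```

The caveat the thread is circling around: this catches Python-level exceptions, but a segfault in a C extension never reaches the `except` clause at all.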
This approach targets more performance-sensitive cases, such as passing around data frames, vectors, and matrices that are often costly to serialize and deserialize.
And it seems to make for a tighter integration.
What's the Elixir equivalent of "Pythonic"? An architecture that allows a NIF to take down your entire supervision tree is the opposite of that, as it defeats the stack's philosophy.
The best practice for integrating Python into Elixir or Erlang would be to have a dedicated GenServer - or other supervision-tree element - responsible for hosting the Python NIF(s), and the design should allow each branch or leaf of that tree to be killed and restarted safely, with no loss of state. BEAM message passing is cheap.
> As a NIF library is dynamically linked into the emulator process, this is the fastest way of calling C-code from Erlang (alongside port drivers). Calling NIFs requires no context switches. But it is also the least safe, because a crash in a NIF brings the emulator down too. (https://www.erlang.org/doc/system/nif.html)
The emulator in this context is the BEAM VM that is running the whole application (including the supervisors).
Apparently Rustler has a way of running Rust NIFs while catching Rust panics before they propagate and crash the whole BEAM VM, but that seems like more of a Rust trick that Pythonx likely doesn't have.
The tl;dr is that NIFs are risky by default, and not really... Elixironic?
The difference between an off-road, "sounds dangerous" idea and its safe execution is often just a quantity of work, and our runtime encourages that work. Here it's a NIF, so there's still a bit of risk, but it's always possible to spawn a separate BEAM instance and use distribution to talk to it.
Toy example that illustrates it, first crashing with a NIF that is made to segfault:
my_nif_app iex --name my_app@127.0.0.1 --cookie cookie -S mix
iex(my_app@127.0.0.1)1> MyNifApp.crash
[1] 97437 segmentation fault
In the second example, we have a "SafeNif" module that spawns another Elixir node, connects to it, and runs the unsafe operation there.

my_nif_app iex --name my_app@127.0.0.1 --cookie cookie -S mix
iex(my_app@127.0.0.1)1> MyNifApp.SafeNif.call(MyNifApp, :crash, [])
Starting temporary node: safe_nif_4998973
Starting node with: elixir --name safe_nif_4998973@127.0.0.1 --cookie :cookie --no-halt /tmp/safe_nif_4998973_init.exs
Successfully connected to temporary node
Calling MyNifApp.crash() on temporary node
:error
iex(my_app@127.0.0.1)2>
Thankfully Python, Zig, and Rust should be good to go without that kind of dance :). The only thing I'd have liked to see added is calling a function defined in Python from Elixir, instead of only the `Pythonx.eval` example.
The `%{"binary" => binary}` is very telling, but a couple more, and more varied, examples would have been nice.
- The Erlang VM scheduler can't preempt a NIF, so a long-running Python call risks hanging the VM. This is a non-issue for ports, since Python runs in a separate OS process. A NIF can mitigate this by spawning an OS thread and yielding until it finishes; it isn't clear whether that's what this library is doing.
- The article already mentions that the GIL prevents concurrent Python execution, but this would also be a non-issue for ports, since the Erlang caller would just spin up multiple Python interpreters. Does Python allow multiple interpreters per (OS) process, like e.g. Tcl does? If so, then that'd be a possible avenue for mitigating this issue while sticking with NIFs.
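On the question above: CPython has long supported multiple interpreters in one OS process via the C API (`Py_NewInterpreter`), and since 3.12 (PEP 684) each subinterpreter can even have its own GIL, though there's still no stable pure-Python API for them. With ports, each interpreter is simply its own OS process, which can be sketched as (illustrative helper, not any library's API):

```python
import subprocess
import sys

def run_in_fresh_interpreter(expr: str) -> str:
    """Evaluate `expr` in a brand-new Python interpreter: its own OS
    process, hence its own GIL -- the port-style alternative to
    in-process subinterpreters."""
    out = subprocess.run(
        [sys.executable, "-c", f"print({expr})"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()

print(run_in_fresh_interpreter("2 ** 10"))  # 1024
```

Spinning up several of these gives true parallel Python execution at the cost of process startup and serialization overhead.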
The downside - unfortunately, while Bumblebee, Axon, and Nx are libraries that seem to have a fantastically engineered base, most of the latest models don't have native Elixir implementations yet, and making my own is still a little beyond my skill. So a lot of the models you can easily run are older.
But the advantages - easy long running processes, great multiprocessing support, solid error handling and recovery - all pair very well with AI systems.
For example, it’s very easy to make an application that grabs files, caches them locally, and runs ML tasks against them. You can use process monitoring and linking to manage the locally cached files, and there’s no runtime duration limit like you might hit in a serverless system like lambda. Interprocess messaging means you can easily run ML in a background task and stream results asynchronously to a user. Additionally, logs are automatically streamed to the parent process and it’s easy to tag logs with process metadata, so tracking what is going on in your application is dead simple.
That’s basically a whole stack for a live ML service with all the difficult infrastructure bits already taken care of.
> Also, it conveniently handles conversion between Elixir and Python data structures, bubbles Python exceptions and captures standard output
Sooo nice
It wasn't mentioned in the article, but there's an older blog post on fly.io [1] about Livebook, GPUs, and their FLAME serverless pattern [2]. Since there seems to be some common ground between these companies, I'm now hoping Pythonx support is coming to FLAME-enabled Erlang VMs. I'm just going off the blog posts, and am probably using the wrong terminology here.
For Python's GIL problem mentioned in the article I wonder if they have experimented with free threading [3].
[1] https://fly.io/blog/ai-gpu-clusters-from-your-laptop-liveboo...
[2] https://fly.io/blog/rethinking-serverless-with-flame/
[3] https://docs.python.org/3/howto/free-threading-python.html
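On the free-threading point: whether a given CPython build can actually run without the GIL can be probed at runtime. A small sketch (`Py_GIL_DISABLED` is the PEP 703 build flag; `sys._is_gil_enabled()` only exists on free-threaded 3.13+ builds, hence the `getattr` guard):

```python
import sys
import sysconfig

def free_threading_status() -> str:
    """Report whether this CPython build supports running without the GIL."""
    if not sysconfig.get_config_var("Py_GIL_DISABLED"):
        return "standard build: GIL always on"
    # On free-threaded builds the GIL can still be re-enabled at runtime,
    # e.g. by importing an incompatible C extension.
    if getattr(sys, "_is_gil_enabled", lambda: True)():
        return "free-threaded build, but GIL currently enabled"
    return "free-threaded build: GIL disabled"

print(free_threading_status())
```

Even on a free-threaded build, an embedded interpreter only helps if every C extension it loads has been made thread-safe, so it's not an automatic win for something like Pythonx.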
Chris Grainger who pushed for the value of Python in Livebook has given at least two talks about the power and value of FLAME.
And of course Chris McCord (creator of Phoenix and FLAME) works at Fly and collaborates closely with Dashbit who do Livebook and all that.
These are some of the benefits of a cohesive ecosystem. Something I enjoy a lot in Elixir. All these efforts are aligned. There is nothing weird going on, no special work you need to do.
I'll add: FLAME is probably a great addition to Pythonx. While a NIF can crash the node it is executed on, FLAME calls are executed on other nodes by default. So a crash here would only hard-crash processes on the same node (FLAME lets you group calls so that a FLAME node can have many being executed on it at any time).
Errors bubble back up to the calling process (and crash it by default but can be handled explicitly), so managing and retrying failed calls is easy.
- atoms
- everything (or most things) is a macro, even def, etc.
- pipes |>, and no, I don't want to write a "pipe" class in Python to use it like pipe(foo, bar, ...). 90% of the |> power comes from its 'flow' programming style.
- true immutability
- true parallelism and concurrency thanks to the supervision trees
- hot code reloading (you recompile the app WHILE it's running)
- fault tolerance (again, thanks for supervision trees)
Well, the whole language itself is built on macros. The following series of articles certainly helped me stop worrying and love the macros:
https://www.theerlangelist.com/article/macros_1
Some interesting insights from the article: "Elixir itself is heavily powered by macros. Many constructs, such as defmodule, def, if, unless, and even defmacro[1] are actually macros...."
[1] https://github.com/elixir-lang/elixir/blob/v1.18.2/lib/elixi...
So one could write

    class Piped:
        def __init__(self, value):
            self.value = value

        def __or__(self, func):
            return Piped(func(self.value))

        def __repr__(self):
            return f"Piped({self.value!r})"

    Piped('test') | str.upper | (lambda x: x.replace('T', 't')) | "prefix_".__add__
    # => Piped('prefix_tESt')
but whether that is a good idea is a whole different matter.

    counts = (
        lines
        | 'Split' >> (
            beam.FlatMap(
                lambda x: re.findall(r'[A-Za-z\']+', x)).with_output_types(str))
        | 'PairWithOne' >> beam.Map(lambda x: (x, 1))
        | 'GroupAndSum' >> beam.CombinePerKey(sum))

I'm not sure how I feel about it, other than the fact that I'd 100x rather write Beam pipelines in basically any other language. But that's about more than syntax.

> ...if you are using this library to integrate with Python, make sure it happens in a single Elixir process...
just kidding, this is pretty cool.
Could be better, but that is what mainstream gets.
>>> def progn(*args):
... if args:
... return args[-1]
... else:
... return None
...
>>> fun = lambda x : progn(print('abc'),
... print(x),
... print('def'))
>>>
>>> fun(42)
abc
42
def
:)