I’ve worked deeply in an async Rust codebase at a FAANG company. The vast majority of developers choose a dialect of async Rust that involves Arcs, mutexes, boxing, etc. everywhere, not to mention the giant dependency tree of crates needed to do even menial things. The ones who try to use proper lifetimes are haunted by the compiler and give up after enough suffering.
Async was an extremely impressive demo that got partially accepted before knowing the implications. The remaining 10% turned out to be orders of magnitude more complex. (If you disagree, try to explain pin projection in simple terms.) The damage to the ecosystem from fragmentation is massive.
Look, maybe it was correct to skip green threads. But the layer of abstraction for async is too invasive. It would have been better to create a “runtime backend” contract - the default would be the same as sync Rust today (ie syscalls, threads, atomic ops etc – it’s already halfway there, except it’s a bunch of conditional compilation for different targets). Then alternative runtimes could have been built independently and plugged in without changing a line of code; it’d all be behind the scenes. We could have simple single-threaded concurrent runtimes for embedded and maybe wasm, work-stealing runtimes for web servers, and so on.
I’m not saying it would be easy or solve all use-cases on a short time scale with this approach. But I do believe it would have been possible, and better for both the runtime geeks and much better for the average user.
Pin projection is the process of getting a pinned reference to a struct's field from a pinned reference to the whole struct. Simple concept, but the APIs currently on offer for it (`unsafe` code or macro hackery) are very subpar.
My argument is more along the lines of: modularity is the (only) way to reduce complexity. We already have modular runtimes in other languages (project loom in Java, webassembly etc). Most people should not care about runtimes much. The ecosystem cost of async ended up being high. Thus, runtimes should be an implementation detail for most users.
Doesn’t mean Rome has to be rebuilt. Perhaps the async we have can be saved, but even so it involves biting the apple of actually defining precisely what a runtime is so that crate authors can think of them just like they think of allocators today (ie not at all).
I followed Rust in the very early days and definitely came away with the sense in this article. I would have said (and may have said to some people) that Graydon is really great, but that the exciting things about Rust weren't the things he liked or cared about; basically the expressivity and zero cost abstractions sections of this article.
But reading the article he linked about first class modules, I think that seems pretty good, and I think he's definitely right about making borrowing "second class" without explicit lifetimes (or at least discouraging them more so than the language does today), and about existential types (I'm always surprised I don't see these more in library APIs).
I also had no idea he wanted built in bignums. In pre-1.0 (and pre-cargo) rust, I created a very incomplete library for that, and would have loved to have it built in instead. Also yeah, decimal literals would be excellent.
But I didn't find the async vs. green threads section convincing. The green thread implementation wasn't a great fit at the time it existed, and I haven't seen anything since then that convinces me there was some great solution available to make it work better. Async isn't great in rust, but it's a much better fit, and I think it can be used well. I have hopes that best practices developing over time and maybe language features or changes can push people in a more sane direction of usage (once it becomes more clear what that should be).
Is the runtime something the compiler adds to the binary to make sure it is able to correctly interact with the system it is built for?
It seems like people argue that green threads require a runtime as if async doesn't? I don't understand the arguments on either side. In terms of what code looks like I far prefer being able to just declare green threads like golang does.
Honestly I wish I understood on a deep level, but I've been programming for 17+ years and the fact that I still don't implies to me that I never will.
> I don't understand the arguments on either side. In terms of what code looks like I far prefer being able to just declare green threads like golang does.
Under the hood `async` is sugar over a function such that the function returns a `Future<T>` instead of a `T`. What is done with that future is up to the caller.
In most cases this is handed off to a runtime (your choice of runtime, generally speaking) that will figure out how to execute it. You could also manually poll the future until it's complete, which does happen sometimes if you're manually implementing the Future trait.
If you have no async code you can simply avoid having an async runtime altogether, reducing the required runtime for an arbitrary program.
> I far prefer being able to just declare green threads like golang does
This relies on an implicit runtime. That's fine - lots of Rust libraries that work the way you're suggesting will just assume a runtime exists.
That lets you write:
spawn(async { println!("hello from async"); });
And, just like a goroutine, it will be scheduled for execution by the implicit runtime, or panic if that runtime is not there. This means the reasonable behavior would be to always provide such a runtime, which would mean that even "sync" programs would need it. Otherwise you'd need to somehow determine statically that no "async" code is ever actually called and remove the runtime. That is a major reason why you wouldn't want this model in a language that tries to minimize its runtime.
> but I've been programming for 17+ years and the fact that I still don't implies to me that I never will.
I think it's just a matter of exposure. Try writing in more languages like C, C++, Rust, etc, and dig into these features.
Exactly. RAII works beautifully in regular Rust, so you create references with the static ownership rules and pass them around, before the value is dropped at a deterministic place. This is like the main value prop of Rust.
In async Rust OTOH (in fact regular threads as well) it’s much harder to use references when they normally would make sense. So instead of `&T` and `&mut T` you need `Arc<T>` and `Arc<Mutex<T>>`, respectively.
Then you lose both on performance (the initial blog post claimed that the pervasive arcing is worse than GC) but also UX. Arcs are much easier to leak, for instance.
You can use atomics if the data fits in a machine word. That's a lot faster than a full mutex.
And?
> not to mention the giant dep tree of crates to do even menial things.
Again, And?
I don't really care about having to pull in crates. That has always been how Rust does things - it prefers many small crates over fewer large crates. Async is no different.
And I don't care about Arc either. Writing `let x = blah()` is not much better than `let x = Arc::new(blah())`.
If you're talking about something else, like idk, maintaining mutability across multiple threads, yeah that's going to be more painful. It's also painful in most other languages and is generally avoided for that reason.
> The ones who try to use proper lifetimes etc are haunted by the compiler and give up after enough suffering.
You say "proper lifetimes" as if lifetimes are desirable. In async code they are not - your lifetime is often "arbitrary" and that's what an Arc gives you. The solution is, as mentioned, using an Arc or Box or Mutex.
> Async was an extremely impressive demo that got partially accepted before knowing the implications.
I think this is a totally ignorant characterization of async, which was years in the making, took lessons learned from decades of async in other languages, and was frankly led by some of the most knowledgeable people in regards to these sorts of systems.
> (If you disagree, try to explain pin projection in simple terms.)
The vast majority of people will never have to know what a pin projection is, let alone how it works. It rarely comes up, and virtually only if you're writing libraries. I could explain it but I see no reason to do so here (it is not complicated at all, `Pin` is probably the harder one to explain).
> The damage to the ecosystem from fragmentation is massive.
It's not even noticeable. What fragmentation? I've never run into an issue of fragmentation, and I've written hundreds of thousands of lines of Rust.
> It would have been better to
How nice to sit on the sidelines and throw out a paragraph sized proposal. Everything looks great when you hand wave away the complexity of the problem space.
Async Rust isn't perfect (I frankly don't think there is a "perfect" solution, that should not be contentious I hope) and I welcome criticism, but your post is totally unconstructive and unsubstantial.
That being said I still have a deep dislike of having to read through nested Arc Mutexes or whatever to figure out what the code does in principle before I figure out what is going on in detail with the ownership.
I know there are no perfect solutions and there are trade-offs to be made, but I wish there was a way to have it more readable.
So instead of this:

    let s = Arc::new(Mutex::new(Something::new("foo")));

something a bit like this:

    let s = Something::new("bar").arc().mutex();

Thinking too much, and in particular going with overcomplicated solutions from the very start because of "might", is just bad engineering.
Also, even if I do need async in a certain place, doesn't mean I need to endure the limitations and complexity of async Rust everywhere in my codebase. I can just spawn a single executor and pass around messages over channels to do what requires async in async runtime, and what doesn't in normal and simpler (and better) blocking IO Rust.
You need async IO? Great. I also need it sometimes. But that doesn't explain the fact that every single thing in the Rust ecosystem nowadays is async-only, or at best a blocking wrapper over async-only. Because "async is web-scale, and blocking is not web-scale".
Edit: Also the "just use smol" comically misses the problem. Yeah, smol might be simpler to use than tokio (it is, I like it better personally), but most stuff is based on tokio. It's an uphill battle for the same reasons using blocking IO Rust is becoming an uphill battle. Only thing better than using async when you don't want to is having to use 3 flavors (executors) of async, when you didn't want to use any in the first place.
Everything would be perfect and no one would complain about async all the time if the community defaulted to blocking, interoperable Rust, and then projects would pull in async in that few places that do actually need async. But nobody wants to write a library that isn't "web-scale" anymore, so tough luck.
This means that any crate that uses IO will be bound to a limited number of Runtimes. Everything being Tokio-only is pretty bad (though Tokio itself is great), but here we are...
[0] https://github.com/bluejekyll/trust-dns/pull/1373#issuecomme...
It's more like "I want to be able to put timeouts in my code". 99% of why I want async is so that if something takes too long I can just stop that. That is incredibly hard to do without async.
Now that’s just plainly untrue.
It's not a problem with tokio either. The author's point is specifically about the multi-threaded tokio runtime that allows tasks to be moved between worker threads, which is why it requires the tasks to be Send + 'static. Alternatively you can either a) create a single-threaded tokio runtime instead which will remove the need for tasks to be Send, or b) use a LocalSet within the current worker that will scope all tasks to that LocalSet's lifetime so they will not need to be Send or 'static.
If you go the single-threaded tokio runtime route, that doesn't mean you're limited to one worker total. You can create your own pseudo-multi-threaded tokio runtime by creating multiple OS threads and running one single-threaded tokio runtime on each. This will be similar to the real multi-threaded tokio runtime except it doesn't support moving tasks between workers, which means it won't require the tasks to be Send. This is also what the author's smol example does. But note that allowing tasks to migrate between workers prevents hotspots, so there are pros and cons to both approaches.
Looking back with 30 years of hindsight, it seems to me that Java’s greatest contribution to software reuse was efficient garbage collection. Memory allocation is a global property of an application that can’t efficiently be localized: you might want a library to use a buffer it got from the client, or vice versa. Fighting with the borrow checker all the time to do that is just saying “I choose to not be able to develop applications above a certain level of complexity.”
Static lifetimes are also a large part of the rest of Rust's safety features (like statically enforced thread-safety).
A usable Rust-without-lifetimes would end up looking a lot more like Haskell than Go.
RAII works only for the simplest case: when your cleanup takes no parameters, when the cleanup doesn't perform async operations, etc. Rust has RAII, but it's unusable in async because the drop method isn't itself async (and thus may block the whole thread if it does I/O).
Thus the complexity of handling memory is greater than that of other resources and the consequences of getting it not 100% right are frequently worse.
Has anyone built a collector that tracks the multiple types of resources an object might consume? It seems possible.
The problem is that things like closing a socket are not just generic resources, a lot of the time nonmemory stuff has to be closed at a certain point in the program, for correctness, and you can't just let GC get to it whenever.
Why is that a "problem with GC"?
Abstracting away >90% of resource management (i.e. local memory) is a significant benefit.
It's like saying the "problem with timesharing OS" is that it doesn't address 100% of concurrency/parallelism needs.
I disagree that this criticism applies to Rust. For 99% of the cases, the idiomatic combination of borrow checking, Box and Arc gets back to a unified, global, compiler-enforced convention. I agree that there's a non-trivial initial skill hurdle, one that I also struggled with, but you only have to climb that once. I don't see that there's a limit to program complexity with these mechanisms.
Lol wut. The C++ resource management paradigm is RAII. If you write a library that doesn't use RAII, it's a bad library. Not a fault of the language.
Widely used, though. Not sure if that counts as appreciation, but I think it's one of the highest forms.
It's not bad, but not great either. I miss proper sum types, and I really lament the fact that static things are nearly impossible to mock, which prompts everyone to use DI for everything instead of statics.
Your argument is looking at the advantages Java brought to development speed and entirely disregarding runtime speed
It's hip to hate on java, but at least do it from an informed position.
Java is extremely fast, which is why it's so popular for server code where performance matters.
In my experience (which, admittedly, is far less than the author, a developer of smol!) the answer to "I'm starting to do a lot of things at once" in Rust is usually to spin up a few worker threads and send messages between them to handle jobs, a la Ripgrep's beautiful implementation.
In a way, it seems like async Rust appears more often when you need to do io operations, and not so much when you just need to do work in parallel.
Of course, you surely can use async rust for work in parallel. But it's often easier to keep async out of it if you just need to split up some work across threads without bringing an entire async executor runtime into the mix.
I don't think async/await was poorly implemented in Rust - in fact, I think it avoids a lot of problems and pitfalls that could have happened. The complications arise because async/await is, kind of, ideologically antithetical to Rust's other goal of memory safety and single-writer. Rust really wants to have its cake (compile-time memory safety) and eat it too (async/await). And while you can criticize it, you have to admit they did a pretty good job given the circumstances.
Yep this makes sense to me.
If your workload is CPU-bound, then context switching to make progress on 10 tasks concurrently is going to be slower than doing them sequentially.
But if it’s IO-bound you will spend most of your time waiting, which you could use to make progress on the other tasks.
That's pretty simple. The primary goal of every software engineer is (or at least should be) ... no, not to learn a new cool technology, but to get the shit done. There are cases where async might be beneficial, but those cases are few and far between. In all other cases a simple thread model, or even a single thread, works just fine without incurring extra mental overhead. As professionals we need to think not only about whether some technology is fun, but about how much it actually costs our employer, and about those who are going to maintain our "cool" code when we leave for better pastures. I know, I know, I sound like a grandpa (and I actually am).
That's not "fun", that's table stakes.
It’s been a long time since I did this in Rust. But why do you not have access to the sockets or at least a set_timeout method? Is it a higher level lib that omits such crucial features?
In Go, the super common net.Conn interface has deadline methods. Not everyone knows their importance but generally you have something like it piped through to the higher layers.
EDIT: Oh I see you replied to my other comment. Please disregard.
Async isn't a lark, it's a workhorse. The goal is not to write sexy code, it's to achieve better utilization (which is to say, save money).
But nobody will sell you just a CPU cycle. They come in bundles of varying size.
I recently heard a successful argument that we should take the pod that's 99% unutilized and double its CPU capacity so it can be 99.9% unutilized, that way we don't get paged when the data size spikes.
When I proposed we flatten those spikes, since they're only 100ms wide, it was shot down because "implementing a queueing architecture" wasn't worth the developer time.
I suppose you could call it a queueing architecture. I'd call it a for loop.
If we presuppose that all software eventually develops an async runtime, and that we therefore should use async, would it not stand to reason that Greenspun's Tenth Rule (all sufficiently complicated software contains a Lisp) implies we must also all use Lisp?
The implicit argument doesn't stand alone though. The author goes on to write:
> It happens like this: programs are naturally complicated. Even the simple, Unix-esque atomic programs can’t help but do two or three things at once. Okay, now you set it up so, instead of waiting on read or accept or whatnot, you register your file descriptors into poll and wait on that, then switching on the result of poll to figure out what you actually want to do.
The implication is clear. Even simple programs will eventually require async, and should therefore just use it right now. unix-esque in this paragraph is supposed to evoke ls or cat. Is your program really going to be simpler than cat? No? Then you apparently need async.
Maybe Rust isn’t a good tool for massively concurrent, userspace software - https://news.ycombinator.com/item?id=37435515 - Sept 2023 (567 comments)
Now we have async/await and I'm always happy to see it.
I use it in C# and JS with no friction or mental overhead required.
In C#, I can still use channels or threads if I want to as well. But async/await is great for any I/O heavy code.
> Eventually, two or three sockets becomes a hundred, or even an unlimited amount. Guess it’s time to bring in epoll! Or, if you want to be cross-platform, it’s now time to write a wrapper around that, kqueue and, if you’re brave, IOCP.
This feels like a straw man. Nobody is saying "don't use async; use epoll!". The alternative to async is traditional OS threads. This option is weirdly not mentioned in the article at all.
And yes they have a reputation for being very hard - and they can be - but Rust makes traditional multithreading MUCH easier than in C++. And I would argue that Rust's async is equally hard.
Rust makes traditional threading way easier than other languages, and traditional async way harder than other languages, enough that threads are arguably simpler.
In some ways it's worse because you have to explicitly add them, and I have yet to see any Rust APIs that actually use them (though there is a `cancellation` crate so at least some must be).
In other ways it's better because it gives you control and explicit visibility over the cancellation points.
Yet tokio is the de facto standard and everything links against it. It’s really annoying. Rust should have either put a runtime in the standard library or made it a lot easier to be runtime neutral.
* EDIT: corrected, thanks
EDIT: I originally incorrectly claimed that stjepang also created rather than maintained crossbeam, making the same mistake as the one I was correcting.
The first transformation is that every async block / fn compiles to a generator, where `future.await` is essentially replaced by `loop { match future.poll() { Ready(value) => break value, Pending => yield } }`. I.e. either polling the inner future resolves immediately, or it returns Pending and yields the generator; the next time the generator is resumed, it goes back to the start of the loop to poll the future again.

The second transformation is that every generator compiles to, essentially, an enum. Each variant of the enum represents one region of code between two `yield`s, and the data of that variant is all the local variables that are in scope in that region.
Putting both together:
    async fn foo(i: i32, j: i32) -> i32 {
        sleep(5).await;
        i + j
    }
... essentially compiles to:

    fn foo(i: i32, j: i32) -> FooFuture {
        FooFuture::Step0 { i, j }
    }

    enum FooFuture {
        Step0 { i: i32, j: i32 },
        Step1 { i: i32, j: i32, sleep: SleepFuture },
        Step2,
    }

    impl Future for FooFuture {
        fn poll(mut self) -> Poll<i32> {
            loop {
                match self {
                    Self::Step0 { i, j } => {
                        let sleep = sleep(5);
                        self = Self::Step1 { i, j, sleep };
                    }
                    Self::Step1 { i, j, sleep } => {
                        match sleep.poll() {
                            Poll::Ready(()) => (),
                            Poll::Pending => return Poll::Pending,
                        }
                        self = Self::Step2;
                        return Poll::Ready(i + j);
                    }
                    Self::Step2 => panic!("already run to completion"),
                }
            }
        }
    }

(This is simplified: the real signature is `fn poll(self: Pin<&mut Self>, cx: &mut Context) -> Poll<Self::Output>`.)

> "There is a common sentiment I’ve seen over and over in the Rust community that I think is ignorant at best and harmful at worst."
just refuses to read the rest? If you are actually trying to make a point to people who think differently than you, why antagonize them by telling them they don't know what they are talking about?
This is only partly true -- if you want to `spawn` a task on another thread then yes it has to be Send and 'static. But if you use `spawn_local`, it spawns on the same thread, and it doesn't have to be Send (still has to be 'static).
How could I unleash all the processors on my computer on this workload and allow them to correctly avoid repeated calculation of results of shared subtasks?
For example, I’m using an outbox: im::OrdMap<String, Array2<_>> and a situation might arise where one task could avoid repeating work on a subtask because that’s already in progress elsewhere by waiting for the key/value pair (so that process could do something else)
Would it be worth going to async for that?
How could a worker function know if some key in the outbox was already being calculated and it could work on something else?
How would you share an outbox like that across a bunch of rayon processes communicating with async?
(I’ll read smol docs and try to figure it out but this article made a lot of sense, thank you)
The orchestrator can then keep track of what's going on and avoid duplicating tasks and the workers don't need to worry about any global state.
I’m tired of everyone implementing async on their own.
There is simplicity in avoiding that and having code that gets compiled to something straightforward and single-threaded.
Exercising judgement about when to use or shirk an abstraction is a lot of what being a software engineer is about.
It adds complexity, but at a level where you don't have to think about it. If you're doing something advanced enough that async is a leaky abstraction, you're probably doing something big enough that you'd want the advantages it offers.
If you're doing something simple, async is just a black box primitive that is pretty easy to use.
Furthermore, async Rust can be run single-threaded.
> When you declare a path operation function with normal def instead of async def, it is run in an external threadpool that is then awaited, instead of being called directly (as it would block the server).
https://fastapi.tiangolo.com/async/#path-operation-functions
OP either meant this, or its variation, such as async_to_sync and sync_to_async. https://github.com/django/asgiref/blob/main/asgiref/sync.py
Ofc this is a python example. I have no idea how it works in different languages.
When writing a Future that will block for 5 seconds, you need to find somewhere you can put the code that blocks for 5 seconds. You don't technically even need an executor here.