The guy's got a point in that doing a bunch of Arc, RwLock, and general sharing of state is going to get messy. Especially once you are sprinkling 'static all over the place, it infects everything, much like colored functions. I did this whole thing once back when I was starting off where I would Arc<RwLock> stuff, and try to be smart about borrow lifetimes. Total mess.
But then rust also has channels. When you read about it, it talks about "messages", which to me means little objects. Like a few bytes little. This is the solution, pretty much everything I write now is just a few tasks that service some channels. They look at what's arrived and if there's something to output, they will put a message on the appropriate channel for another task to deal with. No sharing objects or anything. If there's a large object that more than one task needs, either you put it in a task that sends messages containing the relevant query result, or you let each task construct its own copy from the stream of messages.
And yet I see a heck of a lot of articles about how to Arc or what to do about lifetimes. They seem to be things that the language needs, especially if you are implementing the async runtime, but I don't understand why the average library user needs to focus so much on this.
When moving between threads I do what you suggest here and use channels to send signals rather than having a lot of shared state. Sometimes there is a crucial piece of global state that's easier to just directly access, but then I just write a struct that manages all the Arc/RwLock or whatever other exclusive-access mechanism I need for the access patterns. From the caller's point of view everything is just a simple function call. When writing the struct I need to be thoughtful about sharing semantics, but it's a very small struct and I write it once and move on.
I also don’t understand their concern about making things Send+Sync. In my experience almost everything is easily Send+Sync, and things that aren’t shouldn’t or couldn’t be.
I get that sometimes you just want to wear sweatpants and write code without thought of the details, but most languages that offer that out of the box don’t really offer efficient concurrency and parallelism. And frankly you rarely actually need those things even if the “but it’s cool” itch is driving you. Most of the time a nodejs-esque single threaded async program is entirely sufficient, and a lot of the time Async isn’t even necessary or particularly useful. But when you need all these things, you probably need to hike up your sweatpants and write some actual computer code - because microseconds matter, profiled throughput is crucial, and nothing in life that’s complex is easy and anyone selling you otherwise is lying.
This is a recurring pattern I've started to notice with Rust: most things that repeatedly feel clunky, or noisy, or arduous, can be wrapped in an abstraction that allows your business logic to come back into focus. I've started to think this mentality is essential to any significant Rust project.
The async keyword doesn't, but Tokio's default multi-threaded runtime forces your spawned futures (and anything they hold across an await) to be Send, i.e. thread safe. And at the moment, Tokio is almost exclusively the only async runtime used today; 95% of async libraries only support Tokio. So you're basically forced to write thread-safe code even if you'd benefit more from a single-threaded event loop.
Rust async’s set up is horrid and I wish the community would pivot away to something else like Project Loom.
I write a fair amount of code in Elixir professionally and this isn't how I view it.
There are some specific Elixir/Erlang bits of ceremony you need to do to set up your supervision tree of GenServers, but then once that's done you get to write code that feels like single-threaded "ignore the rest of the world" code. Some of the function calls you're making might be "send a message and wait for a response" from GenServers etc., but the framework takes care of that.
I wrote some driver code for an NXP tag chip. Driving the inventory process is a bit involved, you have to do a series of things, set up hardware, turn on radio, wait a bit, send data, service the SPI the whole time in parallel. With the right setup for the hardware interface I just wrote the whole thing as a sequence, it was the simplest possible code you could imagine for it. And this at the same time as running a web server, and servicing hardware interrupts that cause it to reload the state of some registers and show them to each connected web session.
I imagine Rust to be a language far more similar to Go, in both use cases and functionality, than JS.
The dream of Smalltalk and true OOP is still alive.
If you say Smalltalk is better OOP I might agree, but calling it "true" is not correct.
When you need the absolute best performance sharing state is sometimes better - but you need a deep understanding of how your CPUs share state. A mutex or atomic write operation is almost always needed (the exceptions are really weird), and those will kill performance so you better spend a lot of time minimizing where you have them.
I would also suggest looking into ringbuffers and LMAX Disruptor pattern.
There is also Red Planet Lab's Rama, which takes the data flow idea and uses it to scale.
As a wise programmer once said, "Do not communicate by sharing memory; instead, share memory by communicating"
(But if you're only firing up a few tasks, why not just use threads? To get a nice wrapper around an I/O event loop?)
(This is assuming you are already switching to communicating using channels or similar abstraction.)
To get easier timers, to make cancellation at all possible (how to cancel a sync I/O operation?), and to write composable code.
There are patterns that become simpler in async code and much more complicated in sync code.
From https://news.ycombinator.com/item?id=37289579 :
> I haven't checked, but by the end of the day, I doubt eBPF is much slower than select() on a pipe()?
Channels have a per-platform implementation.
- "Patterns of Distributed Systems (2022)" (2023) https://news.ycombinator.com/item?id=36504073
Async code can scale essentially infinitely, because it can multiplex thousands of Futures onto a single thread. And you can have millions of Futures multiplexed onto a dozen threads.
This makes async ideal for situations where your program needs to handle a lot of simultaneous I/O operations... such as a web server:
http://aturon.github.io/blog/2016/08/11/futures/
Async wasn't invented for the fun of it, it was invented to solve practical real world problems.
Ultimately, it depends on your data model.
When you can guarantee sole ownership, why not put that exclusive pointer in the message? I’d think that this sort of compile-time lock would be an important advantage for the type system. (I think some VMs actually do this sort of thing dynamically, but I can’t quite remember where I read about it.)
On a multiprocessor, there’s of course a balance to be determined between the overhead of shuffling the object’s data back and forth between CPUs and the overhead of serializing and shuffling the queries and responses to the object’s owning thread. But I don’t think the latter approach always wins, does it? At least I can’t tell why it obviously should.
like “send request to channel A with message 123, make sure to get a response back from channel B exactly for that message”
But green threads were not and are not the right solution for Rust, so it's kind of beside the point. Async Rust is difficult, but it will eventually be possible to use Async Rust inside the Linux kernel, which is something you can't do with the Go approach.
Rust: it turns out that not every concurrency needs to be zero-cost abstraction
If you have a service that handles massive amounts of network calls at the core (think linkerd, nginx, etc.), or you want to have a massive amount of lightweight tasks in your game, or working on an embedded software where you want cooperative concurrency, async Rust is an amazing super-power.
Most system/application level things is not going to need async IO. Your REST app is going to be perfectly fine with a threadpool. Even when you do need async, you probably want to use it in a relatively small part of your software (network), while doing most of the things in threads, using channels to pass work around between async/blocking IO parts (aka hybrid model).
Rust community just mindlessly over-did using async literally everywhere, to the point where the blocking IO Rust (the actually better UX one) became a second class citizen in the ecosystem.
Especially visible with web frameworks, where there are N well-designed async web frameworks (Axum, Warp, etc.), and if you want a blocking one you get:
tiny_http, absolute bare bones but very well done
rouille - more wholesome, on top of tiny_http, but APIs feel very meh comparing to e.g. Axum
astra - very interesting but immature, and rather barebones

But it also praises Go for its implementation, which is also based on coroutines of a different kind: stackful coroutines, which do not have any of these problems.
Rust considered using those (and, at first, that was the project's direction). Ultimately, they went with the stackless model because stackful coroutines require a runtime that preempts coroutines (to do essentially what the kernel does with threads). This was deemed too expensive.
Most people forget, however, that almost no one is using runtime-free async Rust. Most people use Tokio, which is a runtime that does essentially everything the runtime they were trying to avoid building would have done.
So we are left in a situation where most people using async Rust have the worst of both worlds.
That being said, you can use async Rust without an async runtime (or rather, an extremely rudimentary one with extremely low overhead). People in the embedded world do. But they are few, and even they often are unconvinced by async Rust for their own reasons.
However, async Rust is not using stackless coroutines for this reason - it's using stackless coroutines because they achieve a better performance profile than stackful coroutines. You can read all about it on Aaron Turon's blog from 2016, when the futures library was first released:
http://aturon.github.io/blog/2016/08/11/futures/
http://aturon.github.io/blog/2016/09/07/futures-design/
It is not the case that people using async Rust are getting the "worst of both worlds." They are getting better performance by default and far greater control over their runtime than they would be using a stackful coroutine feature like Go provides. The trade off is that it's a lot more complicated and has a bunch of additional moving parts they have to learn about and understand. There's no free lunch.
I think that stackless coroutines are better than stackful, in particular for Rust. Everything was done correctly by the Rust team.
Again, this is all fair and good, as long as people understand the tradeoff and make good technical decisions around it. If they all jump on the async bandwagon blind to the obvious limitations, we get where the Rust ecosystem is now.
Stackful coroutines don't require a preemptive runtime. I certainly hope that we didn't end up with colored functions in Rust because of such a misconception.
I've used stackful coroutines many times in many codebases. It never required or used a runtime or preemption. I'm not sure why having a runtime that preempts them would even be useful, since it defeats the reason most people use stackful coroutines in the first place.
Yes. I just noticed that Tokio was pulled into my program as a dependency. Again. It's not being used, but I'm using a crate which has a function I'm not using which imports reqwest, which imports h2, which imports tokio.
I ask as someone who uses java and is about to rewrite a bunch of code to be able to chuck the entire async paradigm into the trash can and use a blocking model but on virtual threads where blocking is ok.
I enjoy Rust, and I love how the compiler helps me solve problems. However, the ecosystem is "async or gtfo", or "just write it yourself if you dont want async lmao", and that's not good enough.
Right now even building a library that supports multiple async runtimes is a PITA; I have done it a couple of times. So you end up supporting just Tokio, and maybe async-std.
https://docs.rs/futures/latest/futures/executor/fn.block_on....
imagine you have an:

    async fn do_things() -> Something { /* ... */ }

you can:

    use futures::executor::block_on;

    fn my_normal_code() {
        let something = block_on(do_things());
    }

but this does get messy if the async code you're running isn't runtime-agnostic :(

This is one of the goals of the async working group. Hopefully, when ready, that'll make it possible to swap out async runtimes underneath arbitrary code without issues.
If you’re learning the language, I would suggest starting out with some more vanilla sync code, loops and if statements, get used to the borrowing. Async is clearly still under heavy development, and not just from an implementation level, but also from the level of our philosophical paradigm about what async means and how it ought to work for the user. It’s entirely possible for humanity to have the wrong approach to this issue and maybe someone in this discussion will be able to answer it more effectively.
The compiler really depends on traits, and the ability for traits to handle async is not stable. Many highly intelligent people are hard at work thinking about how to make async rust more correct, readable, and accessible. For example, look here: https://blog.rust-lang.org/inside-rust/2022/11/17/async-fn-i...
I would argue, if the async functionality of traits is not stable in rust, then it is silly for us to attack rust for not having nice async code, because we’re effectively criticizing an early rough draft of what will eventually be a correct and performant and accessible book.
What does a good async API look like?
Also how do you prevent it spreading throughout a codebase?
I am trying to design a scalable architecture pattern for multithreaded and async servers. My design is that IO threads split asynchronous events into two halves, "submit" and "handle". For example, system events from liburing or epoll are routed to other components. Those IO thread event loops run and block on epoll.poll/io_uring_wait_cqe.
For example, if you create a "tcp-connection" you can subscribe to async events that are "ready-for-writing" and "ready-for-reading". Ready-for-writing would take data out of a buffer (that was written to with a regular mutex) for the IO thread to send when EPOLLOUT/io_uring_prep_writev.
We can use the LMAX Disruptor pattern - multiproducer multiconsumer ringbuffers to communicate events between threads. Your application or thread pool threads have their own event loops and they service these ringbuffers.
I am working on a syntax to describe async event firing sequences. It looks like a bash pipeline, I call it statelines:
    initialstate1 initialstate2 = state1 | {state1a state1b state1c} {state2a state2b state2d} | state3

It first waits for "initialstate1" and "initialstate2" in any order, then it waits for "state1", then it waits for the states "state1a state1b state1c" and "state2a state2b state2d" in any order.

Edit: Of course, since this is what "unstable" means, right?
The lifetime of an Arc isn’t unknowable, it’s determined by where and how you hold it.
I think maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it (such as garbage collection) rather than learning how to work with the language. It’s a common trap for anyone trying a new programming language, but Rust seems to trip people up more than most.
In the same sense that the lifetime of an object in a GC'd system has a lower bound of, "as long as it's referenced", sure. But that's nearly the opposite of what the borrow checker tries to do by statically bounding objects, at compile time.
> maybe the disconnect in this article is that the author is coming at Rust and trying to force their previous mental models on to it
The opposite actually! I spent about a decade doing systems programming in C, C++, and Rust before writing a bunch of Haskell at my current job. The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.
Arc isn't an end-run around the borrow checker. If you need mutable references to the data inside of Arc, you still need to use something like a Mutex or Atomic types as appropriate.
> The degree to which a big language runtime and GC weren't a boogeyman for some problem spaces was really eye-opening.
I have the opposite experience, actually. I was an early adopter of Go and championed Garbage Collection for a long time. Then as our Go platforms scaled, we spent increasing amounts of our time playing games to appease the garbage collector, minimize allocations, and otherwise shape the code to be kind to the garbage collector.
The Go GC situation has improved continuously over the years, but it's still common to see libraries compete to reduce allocations and add complexity like pools specifically to minimize GC burden.
It was great when we were small, but as the GC became a bigger part of our performance narrative it started to feel like a burden to constantly be structuring things in a way to appease the garbage collector. With Rust it's nice to be able to handle things more explicitly and, importantly, without having to explain to newcomers to the codebase why we made a lot of decisions to appease the GC that appear unnecessarily complex at first glance.
Rust will do a lot of invisible memory relocations under the covers. Which can work great in single threaded contexts. However, once you start talking about threading those invisible memory moves are a hazard. The moment shared memory comes into play everything just gets a whole lot harder with the rust async story.
Contrast that with a language like java or go. It's true that the compiler won't catch you when 2 threads access the same shared memory, but at the same time the mental burden around "Where is this in memory, how do I make sure it deallocates correctly, etc" just evaporates. A whole host of complex types are erased and the language simply cleans up stuff when nothing references it.
To me, it seems like GCs simply make a language better for concurrency. They generally solve a complex problem.
These are not the same.
The problem with GC'd systems is that you don't know when the GC will run and eat up your cpu cycles. It is impossible to determine when the memory will actually be freed in such systems. With ARC, you know exactly when you will release your last reference and that's when the resource is freed up.
In terms of performance, ARC offers massive benefits because the memory that's being dereferenced is already in the cache. It's hard to overstate how big of a deal this is. There's a reason people like ARC and stay away from GC when performance actually begins to matter. :)
The point about wrangling with Weak suggests that they're trying to build complex ownership structures (which, to be fair, would be easier to deal with in a single thread), and that isn't really something easy to express in Rust in general. I use weak smart pointers exceedingly rarely.

Outside of the first section (which isn't talking about async Rust specifically, it's just speaking about concurrency generally), channels aren't even mentioned. They're the main thing I use for communication between different parts of my program when writing async code and when interfacing between async and non-async code, plus the other signalling abstractions like Notify, semaphores, etc.

Mutexes are slow and bottlenecky, and shared state quickly gets complicated to manage; this has been known for ages. I think the problem might be more the `BIG_GLOBAL_STATIC_REF_OR_SIMILAR_HORROR` in the first place.
The comment about nothing stopping you from calling blocking code in an async context is valid, but it's relatively manageable, and you can use `tokio::task::spawn_blocking` or similar when you must do it.
I think it's a fair assumption to say that the author is aware of what Arcs are and how they work. I believe their point is more so that because of how async works in Rust, users have to reach for Arc over normal RAII far more often than in sync code. So at a certain point, if you have a program where 90% of objects are refcounted, you might as well use a tracing GC and not have the overhead of many small heap allocations/frees plus atomic ops.
Perhaps there are in fact ways around Arc-ing things for the author's use cases. But in my (limited) experience with Rust async I've definitely run into things like this, and plenty of example code out there seems to do the same thing [1].
For what it's worth, I've definitely wondered whether a real tracing GC (e.g. [2]) could meaningfully speed up many common async applications like HTTP servers. I'd assume that other async use cases like embedded state machines would likely have pretty different performance characteristics, though.
[0] https://en.wikipedia.org/wiki/Garbage_collection_(computer_s...
[1] https://tokio.rs/tokio/tutorial/shared-state
[2] https://manishearth.github.io/blog/2015/09/01/designing-a-gc...
Fair, but when reading an article like this I have to refer to what's written, not what we think the author knew but didn't write.
…on a server where you can have a ton of RAM. It's superior on client machines because it's friendlier to swapped out memory, which is why Swift doesn't have a GC.
Obviously it's not random. It's statically unknowable.
In many cases this means it's much cheaper than objects in languages with implicit reference counting.
I'm currently plumbing through some logic to call a sync method on a struct that implements Future and it's... an interesting challenge.
While we can make zero-cost async abstractions somewhat easy for users, the library developers are the ones who suffer the pain.
You cannot run scoped fibers, forcing you to "Arc shit up"; Pins are unusable without unsafe, and the tiniest change in an async function can make the future !Send across the entire codebase.
A good candidate for this is Graal. It can compile (JIT/AOT) both WASM and also LLVM bitcode directly so Rust programs can have full hardware/OS access without WASM limitations, and in theory it could allow apps to fully benefit from the work done on Loom and async. The pieces are all there. The main issue is you need to virtualize IO so that it goes back into the JVM, so the JVM controls all the code on the stack at all times. I think Graal can do this but only in the enterprise edition. Then you'd be able to run ~millions of Rust threads.
Async/await was a terrible idea for fixing JavaScript's lack of proper blocking threads, and it is currently being bolted onto every language. It splits every language and every library ecosystem in half and will cause pains for many years to come.
Everyone who worked with multi-threading outside of JavaScript knows that using actors/communicating sequential processes is the best way to do multi-threading.
I recently found an explanation for that in Joe Armstrong's thesis. He argues that the only way to understand multi-threaded programs is writing strictly sequential code for every thread and not muddling all the code for all the threads in one place:
"The structure of the program should exactly follow the structure of the problem. Each real world concurrent activity should be mapped onto exactly one concurrent process in our programming language. If there is a 1:1 mapping of the problem onto the program we say that the program is isomorphic to the problem.
It is extremely important that the mapping is exactly 1:1. The reason for this is that it minimizes the conceptual gap between the problem and the solution. If this mapping is not 1:1 the program will quickly degenerate, and become difficult to understand. This degeneration is often observed when non-CO languages ["non concurrency-oriented", looking at you JavaScript!] are used to solve concurrent problems. Often the only way to get the program to work is to force several independent activities to be controlled by the same language thread or process. This leads to a inevitable loss of clarity, and makes the programs subject to complex and irreproducible interference errors." [0]
[0] https://erlang.org/download/armstrong_thesis_2003.pdf
There is also a good rant against async/await by Ron Pressler who implemented project loom in java: https://www.youtube.com/watch?v=oNnITaBseYQ
As fun as it is to hate on JavaScript, it's really interesting to go back and watch Ryan Dahl's talk introducing Node.js to the world (https://www.youtube.com/watch?v=EeYvFl7li9E). He's pretty ambivalent about it being JavaScript. His main goal was to find an abstraction around the epoll() I/O event loop that didn't make him want to tear his eyes out, and he tried a bunch of other stuff first.
I don't think it's a "good" solution in the abstract, but in the concrete of "I have a dynamically-typed scripting language with already over a decade of development and many more years of development that will happen before the event-based stuff is really standard", it's nearly the only choice. Python's gevent was the only other thing I saw that kinda solved the problem, and I really liked it, but I'm not sure it's a sustainable model in the end as it involves writing a package that aggressively reaches into other packages to do its magic; it is a constant game of catch-up.
I do think it's a grave error in the 2020s to adopt async as the only model for a language, though. There are better choices. And I actually exclude Rust here, because async is not mandatory and not the only model; I think in some sense the community is making the error of not realizing that your task will never have more than maybe a hundred threads in it and a 2023 computer will chomp on that without you noticing. Don't scale for millions of concurrent tasks when you're only looking at a couple dozen max, no matter what language or environment you're in. Very common problem for programmers this decade. It may well be the most impactful premature optimization in programming I see today.
JS callbacks are indeed better than C callbacks because you can hold onto some state. Although I guess the capture is implicit rather than explicit, so some people might say it's more confusing.
I'm pretty sure Joyent adopted and funded node.js because they were doing lots of async code in C, and they liked the callback style in JavaScript better. It does match the kind of problems that Go is now being used for, and this was pre-Go.
But anyway it is interesting how nobody really talks about callbacks anymore. Seems like async/await has taken over in most languages, although I sorta agree with the parent that it could have been better if designed from scratch.
Agreed. JavaScript was actually my first language after TurboPascal in 1996.
I was also there listening to the first podcasts when node came out.
JavaScript is a very interesting language, especially with its prototype-based object model. And the event loop apart from the language is interesting as well. And it's no coincidence Apple went as far as baking optimizations for JavaScript primitive operations into the M1.
But I still think multithreading is best done by using blocking operations.
NIO can be implemented on top of blocking IO as far as I know but not the other way round.
Also, sidenote, I think JavaScript's only real failure is the lack of a canonical module/import system. That error led to countless re-implementations of build systems and tens of thousands of hours wasted debugging.
but I get it, you can always go back to the promises and callbacks if you want.
I actually think it was a great solution in JS/TS given it's a single threaded event loop. The lower level the language the worse of an abstraction it is though. So I think most of the complaints here about async Rust are valid.
The async patterns in Rust, especially with regards to data safety assurances for the compiler, are emblematic of this philosophy. Though there are complexities, the value proposition is a safer concurrency model that requires developers to think deeply about their data and execution flow. I do concur that Rust might not be the go-to for every massively concurrent userspace application, but for systems where robustness and safety are paramount, the trade-offs are justifiable. It's also worth noting that as the ecosystem evolves, we'll likely see more abstractions and libraries that ease these pain points.
Still, diving into the intricacies as this article does, gives developers a better foundational understanding, which in itself is invaluable.
This implies that you can't statically guarantee that a future is cleaned up properly, which means that if you spawn some async work, something may std::mem::forget a future, and then the borrow checker won't know that the references that were transitively handed out by the future are still live.
Rather than sprinkle Arc everywhere, I just use an unsafe crate like this:
https://docs.rs/async-scoped/latest/async_scoped/
This catches 99% of the bugs I would have written in C++, so it's a reasonable compromise. There's been some work to try to implement non-'static futures in a safe way. I'm hoping it succeeds.
The other big problem with Rust (but this is on the roadmap to be fixed this year) is that async traits currently require Box'ed futures, which adds a malloc/free to function call boundaries(!!!)
As for the "just use a channel" advice: I've dealt with large codebases that are structured this way. It explodes your control flow all over the place. I think of channels as the modern equivalent of GOTO. (I do use them, but not often, and certainly not in cases where I just need to run a few things in parallel and then wait for completion.)
An important distinction to make is that tokio Futures aren't 'static, you can instead only spawn (take advantage of the runtime's concurrency) 'static Futures.
> This implies that you can't statically guarantee that a future is cleaned up properly.
Futures need to be Pin'd to be poll()'d. Any `T: !Unpin` that's pinned must eventually have Drop called on it [0]. A type is `!Unpin` if it transitively contains a `PhantomPinned`. Futures generated by the compiler's `async` feature are such, and you can stick this in your own manually defined Futures. This lets you assume `mem::forget` shenanigans are UB once poll()'d, and is what allows for intrusive/self-referential Future libraries [1]. The future can still be leaked by being kept alive by an Arc/Rc, but as a library developer I don't think you can (or would care to) reasonably distinguish that from normal use.
[0]: https://doc.rust-lang.org/std/pin/#drop-guarantee
[1]: https://docs.rs/futures-intrusive/latest/futures_intrusive/
Would you prefer not to have internal mutability, not to have `Rc`, or have them but with infectious unsafe trait bounds, or something else?
If an API leaks memory, then I'd like it to be deemed unsafe. That way, leaking a future would be unsafe, so the borrow checker could infer (transitively) that freeing the future means that any references it had are now dead (as it can already infer when a synchronous function call returns).
Am I missing something subtle?
Edit: Rc with cycles would be a problem. I rarely intentionally use Rc though (certainly less often than I create a future).
Edit 2: maybe an auto trait could statically disallow Rc cycles somehow?
Concurrency's correct primitive is Hoare's Communicating Sequential Processes mapped onto green threads. Some languages that have it right are Java (since JDK 21 - virtual threads, via Project Loom), Go, and Kotlin.
The fact that a function can perform asynchronous operations matters to me and I want it reflected in the type system. I want to design my system on such a way that the asynchronous parts are kept where they belong, and I want the type system's help in doing that. "May perform asynchronous operations" is a property a calling function inherits from its callee and it is correctly modelled as such. I don't want to call functions that I don't know this about.
Now you can make an argument that you don't want to design your code this way and that's great if you have another way to think about it all that leads to code that can be maintained and reasoned about equally well (or more so). But calling the classes of functions red and blue and pretending the distinction has no more meaning than that is not such an argument. It's empty nonsense.
"We" don't all agree on this.
async doesn't tell you whether the function performs asynchronous operations, despite the name. async is an implementation detail about how the function must be invoked.
As TFA correctly points out, there's nothing stopping you from calling a blocking function inside a future, and blocking the whole runtime thread.
Either way, all of these changes are really annoying to make. We want less of these annoyances, not more.
Futures aren't a fundamental CS mistake, they're a design decision. You may disagree with that decision, but the advantage Rust brings is that you don't need to worry about thread safety once your program actually compiles, at the cost of different code styles.
Neither asynchronous processing design is fundamentally wrong, they both have their strengths and weaknesses.
Why would that ever be an issue? Instances of those classes shouldn't be shared between virtual threads just the same as when using regular threads.
true, but DateTimeFormatter has been available since Java 8, released almost 10 years ago.
VirtualThreads will be available in Java tomorrow
Also: https://docs.oracle.com/javase/8/docs/api/java/text/SimpleDa...
There is also nothing fundamentally bad with cooperative scheduling in scope of a single process.
The vast majority of them were already wrong. They only got more wrong.
You may just be used to knowing what code is "synchronous" and what isn't because it's been shoved in your face and you've adapted your thought process to it. In practice, "everything important is doing something 'asynchronously'" turns out to be the vast majority of what you need, and most of the mental energy you dedicate to splitting the world in two is a waste. For the little bit that remains, by all means use something specialized, but it's just not something that everyone, everywhere, needs to be doing all the time, any more than everyone everywhere should be manually allocating registers, or any more than programs need to have line numbers because otherwise how could they work? (One of my favorites, because I remember having that conception myself.)
Can you elaborate?
1. inability to read an async result from a sync function, which is a legitimately major architectural limitation.
2. author's opinion how function syntax should look like (fully implicit, hiding how the functions are run).
And from this there is the endless confusion and drama.
The problem 1 is mostly limited to JS. Languages that have threads can "change colour" of their functions at will, so they don't suffer from the dramatic problem described in the article.
But people see languages that don't fit opinion 2, of having magic implicit syntax, and treat it as as big a deal as the dead-end problem 1. In reality, the two syntaxes are somewhere between a minor inconvenience and an actual feature. In systems programming it's very important which type of locks you use, so you really need to know what runs async.
I’m hesitant towards not distinguishing different things anymore and let the underlying system “figure it out”. I’m sure this could work as long as you’re on the happy path, but that’s not the only path there is.
What I'm missing at the end of the article is the author's point: I believe they're advocating for the use of raw threads and manual management of concurrency, and doing away with the async paraphernalia. But, at the same time, earlier in the article they give the example of networking-related tasks as something that isn't so easy to deal with using only raw threads.
So, taking into account that await&co. are basically syntactic sugar + an API standard (iirc, I haven't used Rust so much lately), I wonder about what the alternative is. In particular, it seems to me like the alternative you could have would be everyone rolling their own "concurrency API", where each crate (inconsistently) exposes some sort of `await()` function, and you have to manually roll your async runtime every time. This would obviously also not be ideal.
> Maybe Rust isn’t a good tool for massively concurrent, userspace software. We can save it for the 99% of our projects that don’t have to be.
Personally, I'm a bit more radical than the author. You won't be able to write software like the example correctly. It should just not be done, ever. Machines can still optimize some sanely organized software into the same thing, maybe, if it happens to be a tractable problem (I'm not sure anybody knows). But people shouldn't touch that thing.
What that means is that when I'm writing async code, I have to audit every library I import to make sure that library is guaranteed to yield after a few microseconds of execution, otherwise my own core loops starve. Importing unknown code when using async rust is not safe for any application that needs to know its own threads won't starve.
A safe async language must guarantee that threads will make progress. Rust should change the scheduler so that it can pre-empt any code after that code has hogged a thread for too long.
Rust doesn't have a scheduler, and having one would be a no-go for any sufficiently low level code (e.g. in microcontrollers).
You might be looking for parallelism, not concurrency.
It was adopted because of the ineptitude of the languages where it became popular, and it's far easier to implement in GC-less languages than message-passing-based asynchrony, but it's just misery to write code in. I'd rather suffer Go's ineptitudes just to use the bastardised message passing called channels there than any of the Python/JS/Rust async.
It was created to be an improvement over the Javascript situation, and somehow every language that had a sane structure adopted it as if it was not only good, but the way to do things. This is insane.
I see this repeated everywhere in this thread. async/await originated in C# not JS.
Can you explain why, in your opinion, it is how not to do it? What are the obvious issues with this approach?
JVM's futures are a joy to work with compared to JS's promises (or Kotlin's coroutines for that matter). While similar, I don't think you can conflate them.
Other times, however, Rust stops me from writing buggy code where I didn’t quite understand what I was doing. In some sense it can help you understand your software better (when the problem isn’t an implementation detail).
I get the authors frustration, I often have the same feelings. Sometimes you just want to tell rust to get out of your way.
As an aside, I think there is room for a language similar to Go but with sum types and modules; it would be a joy.
Concurrency is a subtype of parallelism. All concurrency is parallelism, but leaving some aspects of parallelism off the table.
I've worked in both worlds: I've built codes that manage thousands of connections through the ancient select() call on single processes (classic concurrency- IO multiplexing where most channels are not active simultaneously, and the amount of CPU work per channel is small) to synchronous parallelism on enormous supercomputers using MPI to eke out that last bit from Amdahl's law.
Over time I've come to the conclusion that the sweet spot is a thread pool (possibly managed by the language runtime) that uses channels for communication, with optimizations for work stealing (to keep queues balanced) and for eliminating context switches. Although it does not reach the optimal throughput of the machine (because shared memory is faster than message passing), it's a straightforward paradigm to work with, and the developers of these concurrency/parallelism frameworks tend to be wise.
But these existential types can only be specified in function return or parameter position, so if you want to name a type for e.g.:

    let x = async { };

you can't! You can only refer to it as `impl Future<Output = ()>`, but that's not allowed in a variable binding:

    let x = || -> i32 { 1 }; // fine
    let x = || -> impl Future<Output = i32> { async { 1 } }; // error: `impl Trait` only allowed in function and inherent method return types, not in closure return types
Unless I'm missing something, sometimes you do have to name the return type of an async closure if it's returning e.g. Result<T, Box<dyn Error>>, and use of the ? operator means the return type can't be inferred without an explicit annotation.

I have some quibbles with this article:
"Rust comes at this problem with an “async/await” model"
No, it does not. It allows for that, and there's a big ... community ... around the async stuff, but in reality the language is entirely fine with operating using explicit concurrency constructs. And in fact for most applications I think standard synchronous functions combined with communicating channels is cleaner. I work in code bases that do both, and I find the explicit approach easier to reason about.
In the end, Async is something people ideally reach for only after they hit the wall with blocking on I/O. But in reality they're often reaching for it just because -- either because it's cool... or because some framework they are relying on mandates it.
But I think the pendulum will swing back the other way at some point. I don't think it's fair to tar the whole language with it.
This is like saying C++ allows for templates, and there's a big community around it. Sure, but it's the entire community.
I don't think it's "the entire community" at all. Dealing with futures across library calls is a pain and almost every library that can avoid it, will avoid it.
I try to avoid async code because of its annoying pain points and I rarely see any circumstances where spawning a new thread doesn't work. Sure, there's more overhead, and you need some kind of limiting factor to prevent spawning a billion of them, but async isn't really required in most circumstances.
It's like saying Go allows for generics. Very few people and libraries bother with them. Working with them is kind of a pain. They're there if you want to use them, but you generally don't.
Believe it or not, there's other types of things being built in Rust. Systems work, which I think Rust is more appropriate for.
Maybe async is the most popular concurrency construct there (I have no idea). But the entire population here is small.
Rust is all about lifetimes and the borrow checker. Async code (a la C#) will introduce overhead to reason about lifetime and it might not be as "fun" as it is with other languages that makes use of GCs and bigger runtimes.
The CSP vs Async/Await discussion is valid, but like in the majority of the cases, the drawbacks and benefits are not language relevant.
In CSP, the concurrent snippets behave just like linear/sequential code, as channels abstract away a lot of the ugly bits. Sequential code tends to be easier to reason about, and this might be very important for Rust considering its design.
A good tool for massively concurrent software will, as expected, depend on the aspects you're evaluating:
- Performance: the text does not show benchmarks establishing Rust as a slow language.
- Code/feature throughput: the overall conclusion from the text is that async Rust is a complex tool that exposes programmers in many ways to shooting themselves in the foot.
Assuming the "Maybe Rust..." is only talking about Async Rust, the existence of big Async Rust projects is a good counter argument. We also have the whole rest of the Rust language to code massively concurrent, userspace software.
Massively concurrent, userspace software tends to be complex and big to the point that design decisions generally impact way more the language decision.
Rust is a modern language with interesting features to prevent programmers from writing unsafe programs and this is a good head start to many when making those kind of programs, more than whether you want to use Async code or not.
* While the author states that not many apps "need" high concurrency in userspace... I would invert that and say that we may be missing so much performance, new potential applications, etc because highly concurrent code is so hard to get right. One bit of evidence of this (to me at least) is how often in my career I have had to scale things up due to memory or other resource limitations and not CPU. And when it is CPU, so often looking into it more finds bugs with concurrency that are the root cause or at least exacerbate the issue
* While I completely agree that rust is not easy with async and have myself poked around at which magical type things I need to do each time I have touched async rust code, I don't really like the suggestion being to "go use a different language", first, because if you are picking up rust, you (IMHO) should have a very good reason to already have chosen it. Rust is not easy enough or ubiquitous enough that you should be choosing it "just for fun" and your reason for using Rust should be compelling enough that you (right now) are willing to put in the effort to learn async when you need it
* What the author mentions in the body of the article, but which I think deserves to be the main suggestion: don't use async unless you need it! While I would love to see Rust (and think it should) evolve to the point where async is "easy", maybe we instead just need to get more pragmatic in what is taught and written about. I think when people start Rust they want to use all the fanciness, which includes async, and while some of that is just programmers being programmers, I think it is also how tutorials, docs, and general communication about a programming language happen, where we show the breadth of capability rather than the more realistic learning path, which leads people to feel like if they don't use async, they aren't doing it right
Finally, I do really hope Rust keeps working on the promise of these zero cost abstractions that can really simplify things... but if that doesn't work, I am at least hopeful of what people can build on top of the rust featureset/toolchain to help make things like async more realistic to be the default without the need for a complex VM/runtime.
I suspect that to take advantage of 1024-thread systems the only sane programming model will be structured concurrency with virtual threads instead of coroutines.
It’s the same progression as we saw in the industry going from unstructured imperative assembly programming to structured programming with modular features.
Both traditional mutexes and to a degree async programming are unstructured and global. They infect the whole codebase and can’t be reasoned about in isolation. This just doesn’t scale.
To your point, the C# guys seem to be interested in experimenting with green threads: https://twitter.com/davidfowl/status/1532880744732758018
It's an amazing combination.
Async functions don't have to always own their arguments. Just the outermost future that is getting spawned on another thread has to. The rest of the async program can borrow arguments as usual. You don't need to spawn() every task — there are other primitives for running multiple futures, with borrowed data, on the same thread.
In fact, this ability for a future to borrow from itself is the reason why Rust has native await instead of using callbacks. Futures can be "self-referential" in Rust, and nothing else is allowed to.
Maybe in the 2000's but I feel this reasoning is no longer valid in 2023 and should be put to rest.
The c10k problem... wouldn't modern computing break if my Linux box couldn't spin up 10k threads? htop says I'm currently at 4,000 threads on an 8-core machine.
The async case is suited to situations where you're blocking for things like network requests. In that case the thread will be doing nothing, so we want to hand off the work to another task of some kind that is active. Green threads mean you can do that without a context switch.
It got even more expensive in recent years after all the speculative execution vulnerabilities in CPUs, so now you have additional logic on every context switch with mitigations on in kernel.
I have no doubt that having a thread per core and managing the data with only non-blocking operations is much faster. But I'm pretty sure current machines can manage a thousand or so threads, blocked almost the entire time, just fine.
So do we discard existing ways of making software more efficient because we can be more wasteful on more recent hardware? What if we could develop our software such that 2000s computers are still useful, rather than letting those computers become e-waste?
> The numbers reported here paint an interesting picture on the state of Linux multi-threaded performance in 2018. I would say that the limits still exist - running a million threads is probably not going to make sense; however, the limits have definitely shifted since the past, and a lot of folklore from the early 2000s doesn't apply today. On a beefy multi-core machine with lots of RAM we can easily run 10,000 threads in a single process today, in production. As I've mentioned above, it's highly recommended to watch Google's talk on fibers; through careful tuning of the kernel (and setting smaller default stacks) Google is able to run an order of magnitude more threads in parallel.
By the 2010s the problem had been updated to C10M. The people discussing it (well, perhaps some) aren't idiots and understand that the threshold changes as hardware changes.
Also, the issue isn't creating 10k threads it's dealing with 10k concurrent users (or, again, a much higher number today).
Typically, if you want to build something with Rust, it'll have to use async, at least because gRPC and the like are implemented that way. So the vanilla (and excellent, IMO) Rust language doesn't exist there. Everything is async from the get-go.
A weird way to use Rust since you can do a lot of messaging within the process, and use the computing power much more efficiently.
RPC is essentially messaging and message-passing. Message-passing is a way to avoid mutable shared state - this is the model with which Go became successful.
RPC surely has its uses, but message passing is another, and very often inferior, solution to a problem set for which Rust has excellent solutions of its own.
If I'm implementing a library, how should I write it so that the consumer of the library doesn't have to pull in Tokio if they don't want to?
The arguments about Arc fall flat because how else would you safely manage shared references, even in other lower level languages. And so called "modern GCs" still do come with a significant hit in performance; it's not just some "bizarre psyop".
Really the only problem I've run into with Rust's async/await is the fact that there is not much support for composing async tasks in a structured way (i.e. structured programming) and the author doesn't even touch on this issue.
Ultimately the goals and criticism of the author are just downright confusing because at the end he admits that he doesn't actually care for the fact that Rust is design constrained by being a low level language and instead advocates for using Haskell or Go for any application that requires significant concurrency. So to reformulate his argument: we should never use or design into low level languages an ergonomically integrated concurrency runtime because it may have a handful of engineering challenges. When put concisely, their thesis is really quite ridiculous.
With all this in mind, I really like Swift concurrency runtime. It does automatic thread migration and compaction to reduce the overhead of context switches, balances the thread allocation system-wide taking relative priorities into account, and it appears to be based on continuations instead of state machines. A very interesting design worth studying IMO.
I've coded performant applications on an OS that used channels and it sucked. It just got in the way and was confusing to engineers used to lower level constructs. "Just get out of my way!"
I think rust async is hard.
And that's what it comes down to. 99.9% (maybe more nines) of people do not need that level of control. They need conceptually simple things, like channels, and GC, and that will work for nearly everyone. The ones who need to drop to rust either have the engineers to do that, or their problem is intractable (for them). I pity those who drop to rust prematurely because it's cool.
I'm very curious; what OS is this?
Isn't that already, in this strong generality, an almost always wrong assumption?
Sure, one can do massively parallel or embarrassingly parallel computation.
Sure, graphic cards are parallel computers.
Sure, OS kernels use multiple cores.
Sure, languages and concepts like Clojure exist and work - for a specific domain, like web services (and for that, Clojure works fascinatingly well).
But there are many, even conceptually simple algorithms which are not easy to parallelize. There is no efficient parallel Fast Fourier Transform I know of.
Try it. It'll probably work fine. It may be very expensive, memory wise, but it's easy to get a machine with a lot of memory.
It's been tried, periodically. Still sucks.
Or in other words, the goal is that you can think in abstract what the natural optimal machine code would be for a program, and you can write a Rust program that, in principle, can compile to that machine code, with as little constraints as possible on what that machine code looks like.
Unlike C, that also has this property, Rust additionally seeks to guarantee that any code will satisfy a bunch of invariants (such as that a variable of a data type actually always holds a valid value of that data type) provided the unsafe code part satisfies a bunch of invariants.
If you use Go or Haskell, that's not possible.
For example, Go requires a GC (and thus wastes CPU cycles uselessly scanning memory), and Haskell requires memory to store thunks rather than the actual data and has limited mutation (meaning you waste CPU cycles uselessly handling lazy computations and copying data). Obviously neither of these is required for the vast majority of programs, so choosing such a language means your program is unfixably handicapped in terms of efficiency, and has no chance of compiling to the machine code that any reasonable programmer would conceive as the best solution to the problem.
Out of curiosity, could Rust be limited to a language subset to mimic the simplicity of Golang (with channels and message passing) and trade-off some of the powerful features that seem to be causing pain?
Pardon a naïve question. I’m a systems engineer who occasionally dabbles with simple cli tools in all languages for fun, but don’t have a serious need for them.
From what I can gather, such projects will never happen though. That's why I moved part of my work to Golang itself.
Rust is an amazing language. Though the team really takes the "system language" thing very seriously and they're making decisions and tradeoffs based on that, so it seems us its users should adapt and not use Rust for everything. That's what I ended up doing.
Good call, re: garbage collection FUD. Ultimately many programs have to clean up memory after it is no longer needed, and at a certain scale it becomes necessary to write code that handles allocations/deallocations; you end up manually writing a garbage collector. Done well, you can get better performance for certain cases, but often it's done haphazardly and you end up with poor performance.
It seems a good amount of Rust evangelism has given up on the "no GC is required for performance" maxim. Is that the case, Rust friends?
That being said, I think it would be neat if there were a language like Haskell where there was an interface exposed by the compiler where a user could specify their own GC.
[0] https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/line...
Async Rust is many language features and behaviours all interacting with each other at the same time, creating something more complicated than the problem you're actually trying to solve (I want to do X when Y happens, and I want to do X when Y happens × the number of hardware threads). When you're using async Rust, you have to think carefully about:
* memory management (Arc) and safety and performance
* concurrency
* parallelism, arbitrary interleavings
* thread safety
* lifetimes
* function colouring
All interacting together to create a high cognitive load.
Is the assembly equivalent of multithreading and async complicated?
Multithreading, async, coroutines, concurrency and parallelism is my hobby of research I enjoy. I journal about it all the time.
* I think there's a way to design systems to be data intensive (Kleppmann) and data orientated (Mike Acton) with known-to-scale practices.
* I want programming languages to adopt these known-to-scale practices and make them easy.
* I want programs written in the model of the language to scale (linearly) by default. Rama from Red Planet Labs is an example of a model that scales.
* HN user mgaunard [0] told me about "algorithmic skeletons" which might be helpful if you're trying to parallelise. https://en.wikipedia.org/wiki/Algorithmic_skeleton
I think the concurrency primitives in programming languages are sharp edged and low level, which people reach for and build upon primitives that are too low level for the desired outcome.
[0]: https://news.ycombinator.com/item?id=36792796
[1]: https://blog.redplanetlabs.com/2023/08/15/how-we-reduced-the...
Note: You can use async Rust without threading but I assumed you're using multithreading.
Re: the conclusion, I wonder if this is a problem that can be solved over time with abstractions (i.e. async Rust is a good foundation that's just too low-level for direct use)?
(They mention this extra constraint early in the article: "But this approach has its limitations. Inter-process communication is not cheap, since most implementations copy data to OS memory and back.")
I'm familiar with writing services with large throughputs by offloading tasks onto a queue (say Redis/Rabbitmq whatever) and having a lot of single threaded "agents" or "workers" picking them off the queue and processing them.
But as implied in the earlier quote from the article, this is not an acceptable fast or cheap enough solution for the problems the author is talking about.
So now am left wondering: what are some examples of the class of (1%) problems the author is talking about in this article?
"Stackful" coroutines, on the other hand, do have runtime stacks (holding local variables) that get swapped out by the runtime on await points. It makes the code behave exactly like non-async code, but requires a runtime to manage those stacks. Rust didn't go this way, preferring the benefits of the stackless approach.
Until all the work you're trying to push is generating so many allocations that your GC goes to shit once every two minutes trying to clean up the mess you made. (https://discord.com/blog/why-discord-is-switching-from-go-to...)
I haven't investigated it deeply, but I was developing something in Rust, and whether something needs to be thread-safe or not depends entirely on the consumer's use case... it's bad separation of concerns for the provider of a generic interface to have to specify the specific type of boxed value. 100% fine if the behavior in this case is to pre-allocate the max possible boxed type memory requirement.
This is the only thing I was really frustrated with in Rust
Your generic interface just takes a reference to the value inside the box.
If it's dynamic, you can use Cow or the supercow/bos/... crates if you want Arc/Rc to be options as well.
I really want to use TypeScript, as I like the language and I want to use this as a way to learn it better. I'm not expecting to have some super successful game, but the programmer part of my mind is upset at not utilizing all the cores of the machine. So, what do people do? Split up the server into multiple independent running components, or is my choice really to just use another language?
I know parallel ATA cables were all the rage. They had a higher theoretical throughput when compared with serial ATA cables but there was too much cross-talk involved to make it actually faster in the end so now we have serial ATA cables everywhere with much higher throughput than parallel ATA cables could ever achieve.
Should we move back away from parallelism and focus on handling synchronous stuff faster instead?
> Should we move back away from parallelism and focus on handling synchronous stuff faster instead?
Rust already has excellent handling of synchronous computation, given that it can meet/sometimes exceed equivalent performance in C. The problem is when you're I/O or network bound; you can either throw threads at the problem (and by extension throw memory at the problem for the thread stacks) or use async programming.
I want to write stuff to disk (SSD these days). I can issue a request, then have to wait tens to hundreds of microseconds (in the average case; the worst case can be far longer) for that request to finish and let me know that my I/O request succeeded or failed. There's no getting around that with present-day technology.
The situation is worse and even less reliable with network I/O. If you are talking to a server on another continent, the speed of light determines the minimum time before I hear back from it, even if it (and all the intermediary network links) are lightly loaded and functioning perfectly.
Java is ok too if you want object oriented atomic joint parallelism, but I only recommend using it on the server where you need a VM anyhow.
C from 1970 and Java from 1990 still got things right.
Also Vulkan/Metal/DX12 does not really help, OpenGL 3 with VAO is enough.
Er, no. That’s not what those words mean.
“We want to use the whole computer. Code runs on CPU cores, and in 2023, even my phone has eight of the damn things. If I want to use more than 12% of the machine, I need several threads.”
Well, I wouldn't mind using the GPU for any part of my program that fits it. That's why I believe it could be a great idea for a modern programming language to include first-class GPU-accelerated types and instructions.
Please make it happen! I want my userspace software to be in Rust!
Although, if it won't happen, then even better: free real estate for a RustScript.
Not my cup of tea.
I think having to write the async keyword everywhere is a frustrating design decision
Sure, Rust is certainly verbose and very strict in how the ownership rules apply in the context of async, but this is a hard constraint of its memory safety model. We could probably do better while retaining all the performance, but this is by far one of the best implementations. Another example of pleasant-to-use async/await is C#, which trades performance/memory (state has to be boxed if it is to live across continuations) for convenience (you just write it naturally without worrying about the underlying behavior).
There is a reason Rust toyed with "green threads" at its inception but decided against them. The only popular languages of today that do this are Go and Java (which was basically forced to, because you can't go async without introducing the feature early in the lifecycle of the language, and the authors of Project Loom are simply wrong with their excuses for why this is superior to async/await).
Async/await is here to stay and is the right abstraction, git good, and it's not even difficult to use anyway.
[0] where feature name is green threads, not doing concurrency at all, doing it manually, etc.
It's probably the right abstraction for Haskell, or any other language that works well with functional programming, lambdas and monads. Loom is a better fit for Java. Rust also would have probably been better off with something else. Effect handlers might have been a good choice.
Why?