So it looks like their goal was: try adopting a new technology without changing any of the aspects designed for an old technology and optimised around it.
Your observation that computing architectures have chased fast servers for decades is apt. There's a truism in computing that those who build systems are doomed to relearn the lessons of the early ages of networking, whether they studied them in school or not. But kudos to whoever went through the exercise again.
The advice "don't use virtual threads for that, it will be inefficient" really does need some evidence.
Mildly infuriating though that people may read this and think that somehow the JVM has problems in its virtual thread implementation. I admit their 'Unexpected findings' section is very useful work, but the moral of this story is: don't use virtual threads for things they were not intended for. Use them when you want a very large number of tasks executing concurrently, those tasks have idle stages, and you want a simpler model to program with than other kinds of async.
Virtual threads are therefore useful if you're writing something like a proxy server, where you want to allow lots of concurrent connections, and you want to use the familiar thread-per-connection programming model.
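To illustrate (a minimal sketch - the class and method names are mine, and the echo handler is just a stand-in for real proxy logic): each accepted connection gets its own virtual thread, and the handler is written with plain blocking I/O.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class VtEchoServer {
    // Accept loop: one cheap virtual thread per connection,
    // classic thread-per-connection style.
    static void serve(ServerSocket server) throws IOException {
        while (!server.isClosed()) {
            Socket conn = server.accept();
            Thread.startVirtualThread(() -> handle(conn));
        }
    }

    // Blocking reads and writes here only park the virtual thread,
    // not an OS thread.
    static void handle(Socket conn) {
        try (conn) {
            conn.getInputStream().transferTo(conn.getOutputStream());
        } catch (IOException e) {
            // connection dropped; the socket is closed by try-with-resources
        }
    }

    public static void main(String[] args) throws IOException {
        serve(new ServerSocket(8080));
    }
}
```

Tens of thousands of mostly-idle connections are fine in this model, which is exactly where a pool of OS threads would fall over.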
What "CPU-intensive apps" did they test with? Surely not acmeair-authservice-java. A request does next to nothing. It authenticates a user and generates a token. I thought it at least connects to some auth provider, but if I understand it correctly, it just uses a test config with a single test user (https://openliberty.io/docs/latest/reference/config/quickSta...). Which would not be a blocking call.
If the request tasks don't block, this is not an interesting benchmark. Using virtual threads for non-blocking tasks is not useful.
So, let's hope that some of the tests were with tasks that block. The authors describe that a modest number of concurrent requests (< 10K) didn't show the increase in throughput that virtual threads promise. That's not a lot of concurrent requests, but one would expect an improvement in throughput once the number of concurrent requests exceeds the pool size. Except that may be hard to see because OpenLiberty's default is to keep spawning new threads (https://openliberty.io/blog/2019/04/03/liberty-threadpool-au...). I would imagine that in actual deployments with high concurrency, the pool size will be limited, to prevent the app from running out of memory.
If it never gets to the point where the number of concurrent requests significantly exceeds the pool size, this is not an interesting benchmark either.
Unfortunately the slides from that presentation were not uploaded to the conference site, but this article summarizes [2] the most significant metrics. The Oracle guy claimed that by using Virtual Threads Oracle was able to implement, using imperative Java, a new engine for Helidon (called Nima) that had identical performance to the old engine based on Netty, which is (at least in Oracle's opinion) the top performing reactive HTTP engine.
The conclusion of the presentation was that based on Oracle's experience imperative code is much easier to write, read and maintain with respect to reactive code. Given the identical performance achieved with Virtual Threads, Oracle was going to abandon reactive programming in favor of imperative programming and virtual threads in all its products.
[1] https://www.eclipsecon.org/2022/sessions/helidon-nima-loom-b...
[2] https://medium.com/helidon/helidon-n%C3%ADma-helidon-on-virt...
Go's goroutines are preemptive (and Go's development team went through a lot of pain to make them such).
Java's lightweight threads aren't.
Java's repeating the same mistakes that Go made (and learned from) 10 years ago.
This is crucial, because Java wouldn't necessarily require the same optimizations Go needed.
Making Virtual Threads fully preemptive could be useful, but it's probably not as crucial as it was for Go.
Go does not have a native mechanism to spawn OS threads that are separate from the scheduler pool, so if you want to run a long CPU-heavy task, you can only run it on the same pool as you run your I/O-bound Goroutines. This could lead to starvation, and adding partial preemption and later full preemption was a neat way to solve that issue.
On the other hand, Java still has OS threads, so you can put those long-running CPU-bound tasks on a separate thread-pool. Yes, it means programmers need to be extra careful with the type of code they run on Virtual Threads, but it's not the same situation as Go faced: in Java they DO have a native escape hatch.
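A rough sketch of that escape hatch (names and pool sizing are illustrative, not a recommendation): keep a bounded platform-thread pool for the CPU-heavy parts, and let virtual threads do the waiting.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PoolSplit {
    // Bounded platform-thread pool reserved for CPU-heavy work.
    static final ExecutorService cpuPool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    static long slowSum(long n) {
        long s = 0;
        for (long i = 0; i < n; i++) s += i;   // stand-in for real CPU-bound work
        return s;
    }

    public static void main(String[] args) throws Exception {
        try (ExecutorService io = Executors.newVirtualThreadPerTaskExecutor()) {
            Future<Long> f = io.submit(() -> {
                // Blocking I/O belongs here on the virtual thread; the CPU-heavy
                // part is handed off to the platform pool so it can't starve
                // the virtual-thread carriers. get() just parks this virtual thread.
                return cpuPool.submit(() -> slowSum(1_000_000)).get();
            });
            System.out.println(f.get());
        }
        cpuPool.shutdown();
    }
}
```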
I'm not saying a preemptive scheduler won't be helpful in Java; it just isn't as direly needed as it was for Go. One of the most painful issues with Java Virtual Threads right now is thread pinning when a synchronized method call is executed. Unfortunately, a lot of existing Java code uses synchronized methods heavily[1], so it's very easy to unknowingly introduce a method call that pins an OS thread. Preemption could solve this issue, but it's not the only way to solve it.
---
[1] One of my pet peeves with the Java standard library is that almost any class or method that was added before Java 5 is using synchronized methods excessively. One of the best examples is StringBuffer, the precursor of StringBuilder, where all mutating methods are synchronized, as if it was a common use case to build a string across multiple threads. I'm still running into StringBuffers today in legacy codebases, but even newer codebases tend to use synchronized methods over ReentrantLocks or atomic operations, since they're just so easy to use.
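For what it's worth, that synchronized-by-default design is easy to confirm with reflection (a quick check, not production code):

```java
import java.lang.reflect.Modifier;

public class SyncCheck {
    public static void main(String[] args) throws NoSuchMethodException {
        var bufAppend = StringBuffer.class.getMethod("append", String.class);
        var bldAppend = StringBuilder.class.getMethod("append", String.class);
        // StringBuffer's mutating methods are synchronized;
        // the StringBuilder equivalents are not.
        System.out.println(Modifier.isSynchronized(bufAppend.getModifiers()));  // true
        System.out.println(Modifier.isSynchronized(bldAppend.getModifiers()));  // false
    }
}
```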
A number of years ago I remember trying to have a sane discussion about “non blocking” and I remember saying “something” will block eventually no matter what… anything from the buffer being full on the NIC to your cpu being at anything less than 100%. Does it shake out to any real advantage?
So far, they're not quite there yet: the issue of "thread pinning" is something developers still have to be aware of. I hear the newest JVM version has removed a few more cases where it happens, but will we ever truly 100% not have to care about all that anymore?
I have to say things are already pretty awesome, however. If you avoid the few thread-pinning causes (and can avoid libraries that use them - although most if not all modern libraries have already adapted), you can write really clean code. We had to rewrite an old app that made a huge mess tracking a process where multiple event sources can act independently, and virtual threads seemed the perfect thing for it. Now our business logic looks more like a game loop and not the complicated mix of pollers, request handlers, intermediate state persisters (with their endless thirst for various mappers) and whatnot that it was before (granted, all those things weren't there just because of threading... the previous version was really, really shittily written).
It's true that virtual threads sometimes hurt performance (since their main benefit is cleaner, simpler code). Not by much, usually, but a precisely written and carefully tuned piece of performance-critical code can often still do things better than automatic threading code. And as a fun aside, some very popular libraries assumed the developer is using thread pools (before virtual threads, which non-trivial Java app didn't? - ok nobody answer that, I'm sure there are cases :D), so these libraries had performance tricks (ab)using thread-pool code specifics. So that's another possible performance issue with virtual threads - like always with performance of course: don't just assume, try it and measure! :P
Unfortunately kafka, for example, has not: https://github.com/spring-projects/spring-kafka/commit/ae775...
Also, all the database vendors provided their drivers implementing the JDBC API - good luck getting Oracle or IBM to contribute to R2DBC. (Actually, I stand corrected: there is an Oracle R2DBC driver now - it was released fairly recently though.)
EDIT: "failed miserably" is maybe too strong - but R2DBC certainly doesn't have the support and acceptance of JDBC.
My question is though: Why even do alleged “non-blocking” _at all_? What are people trying to optimize against?
To put it shortly: Writing single-threaded blocking code is far easier for most people and has many other benefits, like more understandable and readable programs: https://www.youtube.com/watch?v=449j7oKQVkc
The main reason why non-blocking IO, with its style of intertwining concurrency and algorithms, came along is that starting a thread for every request was too expensive. With virtual threads that problem is eliminated, so we can go back to writing blocking code.
I’d say that writing single-threaded code is far easier for _all_ people, even async code experts :)
Also, single-threaded code is supported by programming language facilities: you have a proper call stack, thread-local vars, exceptions bubbling up, structured concurrency, simple resource management (RAII, try-with-resources, defer). Easy to reason and debug on language level.
Async runtimes are always complicated and filled with leaky abstractions; it's like another language that one has to learn in addition, but with a less thought-out, ad-hoc design. Difficult to reason about and debug, especially in edge cases.
I think you're missing the whole point.
The reason why so many smart people invest their time on "virtual threads" is developer experience. The goal is to turn writing event-driven concurrent code into something that's as easy as writing single-threaded blocking code.
Check why C#'s async/await implementation is such a huge success and replaced all past approaches overnight. Check why node.js is such a huge success. Check why Rust's async support is such a hot mess. It's all about developer experience.
This is the core misunderstanding/dishonesty behind the Loom/Virtual Threads hype. Single-threaded blocking code is easy, yes. But that ease comes from being single-threaded, not from not having to await a few Futures.
But Loom doesn't magically solve the threading problem. It hides the Futures, but that just means that you're now writing a multi-threaded program, without the guardrails that modern Future-aware APIs provide. It's the worst of all worlds. It's the scenario that gave multi-threading such a bad reputation for inscrutable failures in the first place.
Throughput.
Some workloads are not CPU-bound or memory-bound, and spend the bulk of their time waiting for external processes to make data available.
If your workloads are expected to stay idle while waiting for external events, you can switch to other tasks while you wait for those external events to trigger.
This is particularly convenient if the other tasks you're hoping to run are also tasks that are bound to stay idle while waiting for external events.
One of the textbook scenarios that suits this pattern well is making HTTP requests. Another one is request handlers, such as the controller pattern used so often in HTTP servers.
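A toy sketch of why waiting-dominated workloads benefit (the numbers are illustrative, and sleep stands in for a network call):

```java
import java.util.concurrent.Executors;

public class IdleWorkDemo {
    // Run `tasks` concurrent "requests" that each just wait, and report
    // the total wall-clock time in milliseconds.
    static long runIdleTasks(int tasks, long waitMillis) {
        long start = System.nanoTime();
        try (var exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                exec.submit(() -> {
                    Thread.sleep(waitMillis);   // stand-in for a blocking network call
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // 10,000 tasks x 200 ms of waiting complete in roughly 200 ms of
        // wall time, because parked virtual threads cost almost nothing.
        System.out.println(runIdleTasks(10_000, 200) + " ms");
    }
}
```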
Perhaps the poster child of this pattern is Node.js. It might not be the performance king, and it is single-threaded, but it still features in the top spots of performance benchmarks such as TechEmpower's. Node.js is also highly favoured in function-as-a-service applications, as its event-driven architecture is well suited to applications involving a hefty dose of network calls running on memory- and CPU-constrained systems.
With virtual threads, you can limit the damage by using a semaphore, but you still need to tune the size. This isn't much different than sizing a traditional thread pool, and so I'm not sure what benefit virtual threads will really have in practice. You're swapping one config for another.
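For what it's worth, here's roughly what that semaphore-based bounding looks like (the limit is as arbitrary as any pool size, which is exactly the point about swapping one config for another):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedVirtualThreads {
    // Returns the highest number of tasks observed running at once.
    static int run(int tasks, int limit) {
        Semaphore permits = new Semaphore(limit);   // the knob you still have to size
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger maxSeen = new AtomicInteger();
        try (var exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                exec.submit(() -> {
                    permits.acquire();   // backpressure: parks the virtual thread cheaply
                    try {
                        maxSeen.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        Thread.sleep(5);             // stand-in for the actual work
                        inFlight.decrementAndGet();
                    } finally {
                        permits.release();
                    }
                    return null;
                });
            }
        }
        return maxSeen.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1_000, 100));   // never exceeds 100
    }
}
```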
When you spawn an OS thread you are paying at worst the full cost of it, and at best the max depth seen so far in the program, and stack overflows can happen even if the program is written correctly. Whereas a virtual thread can grow the stack to be exactly the size it needs at any point, and when GC runs it can rewrite pointers to any data on the stack safely.
Virtual/green/user space threads aka stackful coroutines have proven to be an excellent tool for scaling concurrency in real programs, while threads and processes have always played catchup.
> “something” will block eventually no matter what…
The point is to allow everything else to make progress while that resource is busy.
---
At a broader scale, as a programming model it lets you architect programs that are designed to scale horizontally. With the commoditization of compute in the cloud, that means it's very easy to write a program that can be distributed as I/O demand increases. In principle, a "virtual" thread could be spawned on a different machine entirely.
You are right that everything blocks; even going to L1 cache you have to wait a nanosecond or so. But blocking in this context means waiting for "real" IO like a network request or spinning-disk access. Virtual threads take away the problem of a thread sitting there doing nothing while it waits for data, before it is context switched.
Virtual threads won’t improve CPU-bound blocking. There the thread is actually occupying the CPU, so there is no problem of the thread doing nothing as with IO-bound blocking.
Nope. You can go async all the way down, right to the electrical signals if you want. We usually impose some amount of synchronous clocking/polling for sanity, at various levels, but you don't have to; the world is not synchronised, the fastest way to respond to a stimulus will always be to respond when it happens.
> Does it shake out to any real advantage?
Of course it does - did you miss the whole C10K discussions 20+ years ago? Whether it matters for your business is another question, but you can absolutely get a lot more throughput by being nonblocking, and if you're doing request-response across the Internet you generally can't afford not to.
In one project I had to basically turn a reactive framework into a one thread per request framework, because passing around the MDC (a kv map of extra logging information) was a horrible pain. Getting it to actually jump ship from thread to thread AND deleting it at the correct time was basically impossible.
Has that improved yet?
    class MyExecutor implements Executor {
        private final Executor delegate;

        public MyExecutor(Executor delegate) {
            this.delegate = delegate;
        }

        @Override
        public void execute(@NotNull Runnable command) {
            // Capture the caller's MDC and re-install it on whichever
            // thread the delegate runs the task on.
            var mdc = MDC.getCopyOfContextMap();
            delegate.execute(() -> {
                MDC.setContextMap(mdc);
                try {
                    command.run();
                } finally {
                    MDC.clear();
                }
            });
        }
    }

See https://openjdk.org/jeps/429
If you keep ThreadLocal variables, they get inherited by child Threads. If you make many thousands of them, the memory footprint becomes completely unacceptable. If the memory used by ThreadLocal variables is large, it also makes it more expensive to create new Threads (virtual or not), so you lose most advantages of Virtual Threads by doing that.
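A small demo of the inheritance (the array size is just to suggest a large per-request context; note that the child inherits a reference to the same value, but each child thread still materialises its own thread-local map entry, which is where the per-thread footprint comes from):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class InheritDemo {
    static final InheritableThreadLocal<byte[]> CONTEXT = new InheritableThreadLocal<>();

    public static void main(String[] args) throws InterruptedException {
        CONTEXT.set(new byte[1_000_000]);   // imagine a large per-request context
        AtomicBoolean childSawIt = new AtomicBoolean();
        // Virtual threads inherit inheritable thread-locals by default.
        Thread child = Thread.ofVirtual().unstarted(() -> childSawIt.set(CONTEXT.get() != null));
        child.start();
        child.join();
        System.out.println(childSawIt.get());   // true
    }
}
```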
[1] https://davidvlijmincx.com/posts/virtual-thread-performance-...
It’s a shame this article paints a neutral (or even negative) experience with virtual threads.
We rewrote a boring CRUD app that spent 99% of its time waiting the database to respond to be async/await from top-to-bottom. CPU and memory usage went way down on the web server because so many requests could be handled by far fewer threads.
Well, somewhat, but also not really. They are green threads like async/await tasks, but their use is more transparent than async/await.
So there are no special "async methods". You just instantiate a virtual thread where you normally instantiate a (kernel) Thread and then use it like any other (kernel) thread. This works because, for example, all blocking IO APIs are automatically converted to non-blocking IO under the hood.
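Something like this (a minimal sketch - the class name is mine): both kinds of thread share the same java.lang.Thread API, and the task code has no idea which one it's running on.

```java
public class DropIn {
    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread() + " virtual=" + Thread.currentThread().isVirtual());

        Thread platform = new Thread(task);                  // kernel thread, as always
        Thread vt = Thread.ofVirtual().unstarted(task);      // same Thread API, no "async" methods

        platform.start(); platform.join();
        vt.start();       vt.join();
        // Any blocking calls (sleep, socket reads, ...) that `task` makes on
        // the virtual thread just park it instead of an OS thread.
    }
}
```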
Links:
https://github.com/dotnet/runtimelab/issues/2398
https://github.com/dotnet/runtimelab/blob/feature/green-thre...
> Green threads introduce a completely new async programming model. The interaction between green threads and the existing async model is quite complex for .NET developers. For example, invoking async methods from green thread code requires a sync-over-async code pattern that is a very poor choice if the code is executed on a regular thread.
Also to note that even the current model is complex enough to warrant a FAQ,
https://devblogs.microsoft.com/dotnet/configureawait-faq
https://github.com/davidfowl/AspNetCoreDiagnosticScenarios/b...
Not really. What C# does is sort of similar but it has the disadvantages of splitting your code ecosystem into non-blocking/blocking code. This means you can “accidentally” start your non-blocking code. Something which may cause your relatively simple API to consume a ridiculous amount of resources. It also makes it much more complicated to update and maintain your code as it grows over the years. What is perhaps worse is that C# lacks an interruption model.
Java’s approach is much more modern but then it kind of had to be because the JVM already supported structured concurrency from Kotlin. Which means that Java’s “async/await” had to work in a way which wouldn’t break what was already there. Because Java is like that.
I think you can sort of view it as another example of how Java has overtaken C# (for now), but I imagine C# will get an improved async/await model in the next couple of years. Neither approach is something you would actually choose if concurrency is important to what you build and you don't have a legacy reason to continue building on Java/C#. This is because Go or Erlang would be the obvious choice, but it's nice that you at least have the option if your organisation is married to a specific language.
Kotlin's design had no bearing on Java's or the JVM's implementation.
C# has an interruption model through CancellationToken as far as I'm aware.
For example, you can actually share a thread with another runtime.
Cooperative threading allows for implicit critical sections that can be cumbersome in preemptive threading.
Async/await and virtual threads are solving different problems.
> What is perhaps worse is that C# lacks an interruption model
Btw, you'd just use OS threads if you really needed pre-emptively scheduled threads. Async tasks run on top of OS threads, so you get both cooperative scheduling within threads and pre-emptive scheduling of threads onto cores.
Java has the advantage that relatively more of its language and library decisions hold up without having to be fixed later. That's great value if you're not building throw-away software but SaaS or something else that has to live long.
Why Go? It has a quite anemic standard library for concurrent data structures compared to Java, and is a less expressive, and arguably worse, language on every count, verbosity included.
I seem to remember that it was some pretty basic operations (like maybe read or something) that caused the thread not to unmount and therefore just block the underlying OS thread. At that point you've just invented the world's most complicated thread pool.
The biggest difference is that C# async/await code is rewritten by the compiler into a state machine so that it can suspend. This means you see artifacts in the stack that weren't there when you wrote the code.
There are no rewrites with virtual threads and the code is presented on the stack just as you write it.
They solve the same problem but in very different ways.
Yes. Async/await is stackless, which leads to the “coloured functions” problem (because it can only suspend function calls one-by-one). Threads are stackful (the whole stack can be suspended at once), which avoids the issue.
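A quick illustration with Java virtual threads (names are mine): the blocking call sits several plain method calls deep, and there are no async annotations anywhere on the way down, because the whole stack suspends as one unit.

```java
public class DeepBlock {
    // An ordinary synchronous call chain with no async/await markers.
    static String a() throws InterruptedException { return b(); }
    static String b() throws InterruptedException { return c(); }
    static String c() throws InterruptedException {
        Thread.sleep(10);   // parks the entire stack a() -> b() -> c() at once
        return "done";
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = Thread.ofVirtual().start(() -> {
            try {
                System.out.println(a());
            } catch (InterruptedException ignored) {}
        });
        t.join();
    }
}
```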
(I guess scaling to ridiculous levels you could be approaching trouble if you have O(100k) outstanding DB queries per application server; hope you have a DB that can handle millions of outstanding DB queries then!)
No, the I/O is still blocking with respect to the application code.
But the bottom line with virtual threads, goroutines, or Kotlin's coroutines is that they indeed allow for imperative-style code that is easy to read and understand. Of course you still need to understand all the pitfalls of concurrency bugs and all the weird and wonderful ways things can fail to work as you expect. And while Java's virtual threads are designed to work like magic pixie dust, they do have some nasty failure modes where a single virtual thread can end up blocking all your virtual threads. Having a lot of synchronized blocks in legacy code could cause that.
However, my use of Java is for admin backends or heavyweight services for the enterprises or startups I've coded for, so for my taste I can't use it without Spring or JBoss, etc., and in that way I think simplicity went out the window a long, long time ago :) It took me years to learn all the quirks of these frameworks... and the worst thing about them is that they keep changing every few months...
Go is a next-gen trumpian language that rejects sum types, pattern matching, non-nil pointers, and for years, generics; it's unhinged.
> pretending Erlang does not exist
For better or worse it doesn't, to most programmers. The syntax is not nearly as approachable as Go's. Luckily Elixir exists.