With virtual threads, you can limit the damage by using a semaphore, but you still need to tune its size. That isn't much different from sizing a traditional thread pool, so I'm not sure what benefit virtual threads will really have in practice. You're swapping one config knob for another.
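For illustration, a minimal sketch of that semaphore approach on Java 21+ (the limit of 10, the task count, and the sleep as a stand-in for I/O are all arbitrary assumptions, not from the thread):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreCap {

    // Run `tasks` virtual threads, but let at most `limit` of them hold the
    // "resource" at once. Returns the peak concurrency actually observed.
    static int run(int tasks, int limit) {
        Semaphore permits = new Semaphore(limit);
        AtomicInteger current = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                exec.submit(() -> {
                    permits.acquire();              // parks the virtual thread cheaply
                    try {
                        int now = current.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5);            // stand-in for blocking I/O
                    } finally {
                        current.decrementAndGet();
                        permits.release();
                    }
                    return null;
                });
            }
        } // close() waits for every task to finish

        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency = " + run(1_000, 10));
    }
}
```

The tuning complaint holds either way: `new Semaphore(limit)` needs a number picked for you, just like a pool size would.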
The key with virtual threads is that they are so lightweight you can have thousands of them running concurrently: even when they block for I/O, it doesn't matter. They're similar to lightweight coroutines in other languages like Go or Kotlin.
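A rough sketch of what that looks like on Java 21+ (the thread count and the sleep standing in for I/O are assumptions for demonstration):

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyThreads {

    // Spawn one virtual thread per task; each blocks "on I/O" for 100 ms.
    // The whole batch finishes in roughly 100 ms of wall time, because a
    // parked virtual thread costs almost nothing while it waits.
    static int run(int n) {
        AtomicInteger done = new AtomicInteger();
        try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                exec.submit(() -> {
                    Thread.sleep(Duration.ofMillis(100)); // parks, frees the carrier
                    done.incrementAndGet();
                    return null;
                });
            }
        } // close() waits for all tasks
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(run(10_000));
    }
}
```

Doing the same with 10,000 platform threads would tie up gigabytes of stack and hammer the OS scheduler; here the blocked threads just sit in the heap.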
This is a common problem for people using Java parallel streams: by default they all share a single global thread pool, and the way to use your own pool is extremely counterintuitive. It relies on implicit thread-local magic, distributing the stream over whichever pool the parallel stream was launched from, instead of taking the pool as a parameter.
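A sketch of that counterintuitive workaround (the 4-worker pool size and the toy stream are arbitrary; the trick itself is a long-standing undocumented behavior rather than a supported API):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

public class CustomPoolStream {

    // Run a parallel stream inside our own ForkJoinPool instead of the
    // common pool. Nothing in the stream API names the pool: it is picked
    // up implicitly because the terminal operation executes on one of the
    // pool's workers.
    static long run() throws Exception {
        ForkJoinPool pool = new ForkJoinPool(4); // our own 4-worker pool
        try {
            return pool.submit(() ->
                LongStream.rangeClosed(1, 1_000).parallel().sum()
            ).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // prints 500500
    }
}
```

Had the sum been invoked directly instead of via `pool.submit(...)`, it would have run on the shared common pool, which is exactly the surprise being complained about.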
It would be best if people came up with more dynamic back-pressure strategies, because this is a more general problem that goes well beyond thread pools. In fact, one of the key problems of automatic parallelization is deciding at what point there is too much parallelism.
Memory utilization and performance are going to be similar to the async-callback mess.
Also, to my understanding, virtual threads run on a "traditional" pool of carrier threads, so you can just tweak the number of workers there to cap the total concurrency.
The benefit is that it's overall more efficient (in the general case) and lets you write straight-line blocking code, as opposed to dealing with function coloring. You don't have to use it, but it's nice that it's there. Now hopefully Project Valhalla actually makes it in eventually.
I've not yet seen a study that shows that virtual threads offer a huge benefit. The Open Liberty study suggests that they're worse than the existing platform threads.
Ideally, carrier threads would be pinned to isolated CPU cores, which would remove most of the OS scheduler from the picture.
Not exactly Java virtual threads, but a study on how userland threads beat kernel threads.
https://cs.uwaterloo.ca/~mkarsten/papers/sigmetrics2020.html
For quick results, check figures 11 and 15 of the (preprint) paper. Userland threads ("fred") achieve ~50% higher throughput with orders-of-magnitude better latency at high load levels, in a real-world application (memcached).
> Context switching cost due to CPU cache thrashing doesn't go away regardless of which type of thread you're using.
Except it's not a context switch? You're jumping to another instruction in the same program, one that should be very predictable. You might lose some cache, but how much will depend on a ton of factors.
> there's a new cost caused by allocating the internal continuation object, and copying state into it.
This is more of a problem with the implementation (not every virtual-thread language does it this way), but yeah, it is extra overhead on the application. I assume there are improvements that could be made to ease GC pressure, like using object pools.
Usually virtual threads are a memory-vs-CPU tradeoff that you typically reach for in massively concurrent, I/O-bound applications. Total throughput should overtake platform threads at hundreds of thousands of connections, but below that they probably perform worse; I'm not that surprised by the result.