If you program it like it was C then you can get good performance, which makes sense given that the language was built for embedded devices with anemic processors. Of course you undoubtedly give up some maintainability when you do that, but the tradeoff is getting 6 million packets per second through the thing.
This would also explain why Java benchmarks so well but still tends to be slow in the real world.
Java in the real world is fast. That's why it's used as the backbone of so many large organizations and so many scale out solutions (Cassandra, Kafka, Hadoop, etc) are written in Java.
... and contemporary interactive performance.
python < /dev/null 0.01s user 0.01s system 96% cpu 0.025 total
racket < /dev/null 0.17s user 0.02s system 98% cpu 0.198 total
java 0.09s user 0.02s system 94% cpu 0.118 total
clojure < /dev/null 2.07s user 0.06s system 170% cpu 1.247 total
There's of course a theoretical point about being shoehorned into the UNIX execution model, and if Java were able to run as a persistent process (e.g. Nailgun, or whatever it is these days) then things get much better. But still, when you start off a project having to suffer and design workarounds for the language's and implementation's flaws... that's not "Java in the real world". If anything I'd presume that that's "well-programmed Java".
Actually, allocating and discarding objects is not the problem. Java allocation and deallocation is extremely efficient for short-lived objects.
I would say memory layout, and the access patterns resulting from object-orientedness, are a much more important factor.
It is possible to get very close to raw performance, but you have to reinvent almost everything. The code starts looking like plain C very quickly.
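To make the "looks like plain C" point concrete, here is a minimal sketch of the struct-of-arrays style this usually implies: instead of an array of small objects (each a separate heap allocation with a header, reached through a pointer), the fields live in flat primitive arrays. The class and method names (`Points`, `sumX`) are invented for illustration.

```java
// Struct-of-arrays sketch: coordinates held in flat primitive arrays
// rather than as an Object[] of boxed Point instances.
public class Points {
    final double[] xs;
    final double[] ys;

    Points(int n) {
        xs = new double[n];
        ys = new double[n];
    }

    // A sequential walk over a primitive array is cache-friendly;
    // an array of Point objects would chase a pointer per element.
    double sumX() {
        double s = 0;
        for (double x : xs) s += x;
        return s;
    }

    public static void main(String[] args) {
        Points p = new Points(3);
        p.xs[0] = 1; p.xs[1] = 2; p.xs[2] = 3;
        System.out.println(p.sumX()); // prints 6.0
    }
}
```

The memory layout ends up essentially the same as two C arrays, which is exactly why the code stops looking like idiomatic Java.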
But in Java it always seemed that you are very much punished for having data structures that are at cross purposes to your dominant workload. I cut my teeth on implementing the last two parts of "make it work, make it right, make it fast", often on projects where the existing team had declared that everything that could be done already had been done. There are a lot of refactorings that accomplish both goals, and I often got a 2-3x speedup out of these projects by removing slow tech debt, and more by exposing a real information architecture.
It always surprised me that a language that so punished the Big Ball of Mud antipattern (especially in the early days, when it was interpreted and all object lookups were doubly indirect) nonetheless exhibited so many examples of it, so frequently.
We can do better. So much better.
Objects usually fell into one of two categories: discarded immediately or held for the entire application lifetime. Anything in between was problematic. Most things ended up using object pools. We also almost never converted anything out of its binary format; we just used wrapper objects to access the byte arrays directly.
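A minimal sketch of that wrapper-over-bytes pattern, assuming an invented wire layout (an int id at offset 0, a long timestamp at offset 4); the real formats would differ, and a reusable flyweight like this avoids deserializing each message into fresh objects:

```java
import java.nio.ByteBuffer;

// Flyweight view over a raw message: fields are read straight out of the
// byte array at fixed offsets instead of being deserialized up front.
public class MessageView {
    private ByteBuffer buf;

    // Re-point the view at the next message; no per-field objects created.
    void wrap(byte[] bytes) {
        buf = ByteBuffer.wrap(bytes);
    }

    int id()         { return buf.getInt(0); }   // bytes 0-3
    long timestamp() { return buf.getLong(4); }  // bytes 4-11

    public static void main(String[] args) {
        byte[] wire = ByteBuffer.allocate(12).putInt(42).putLong(1700000000L).array();
        MessageView view = new MessageView();
        view.wrap(wire);
        System.out.println(view.id() + " " + view.timestamp()); // prints 42 1700000000
    }
}
```

The same view object can be re-wrapped over every incoming message, which keeps the hot path allocation-free apart from the buffers themselves.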
Another trick, scheduling GC for times when the application was OK to pause, helped considerably, and made behavior more predictable as well.
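A rough sketch of that scheduling trick: request collections during windows where a pause is acceptable, so the collector is less likely to fire mid-transaction. Note that `System.gc()` is only a hint to the JVM, and the fixed 100 ms schedule here is purely for demonstration; a real system would trigger it from its own notion of "idle".

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class IdleGc {
    // Run a background task that periodically hints the JVM to collect,
    // returning how many GC requests were issued while "working".
    static int runFor(long millis) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger collections = new AtomicInteger();
        scheduler.scheduleAtFixedRate(() -> {
            System.gc();                  // hint: collect now, while we're quiet
            collections.incrementAndGet();
        }, 100, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(millis);             // stand-in for the application's work
        scheduler.shutdown();
        return collections.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("GC requests issued: " + runFor(350));
    }
}
```

In production this pairs with JVM flags that discourage the collector from running on its own schedule, but those are deployment-specific.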
It'd be tough to compare it to C/C++ given the complexity of the application. But without giving away specifics, we had solid performance afaik. But you're correct that it does end up making for some interesting Java code.
The thing about Java is that it has promised, and delivered, a much better threading experience than other programming languages. When somebody proved the Java memory model was unsafe, Sun fixed it; C and other languages have since adopted essentially this memory model, but a decade and a half later. Java provides bulletproof tools in the form of Executors, latches and other specialized concurrency constructs. It takes time to learn to use them, but you can ship fast and correct code for something like the LMAX Disruptor.
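A minimal sketch of the two tools named above: an `ExecutorService` fans work out to a pool, and a `CountDownLatch` lets the caller block until every task has finished. The names (`FanOut`, `parallelSum`) are illustrative.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class FanOut {
    // Submit `tasks` jobs to a pool and wait for all of them via a latch.
    static int parallelSum(int tasks) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        CountDownLatch done = new CountDownLatch(tasks);
        AtomicInteger sum = new AtomicInteger();

        for (int i = 1; i <= tasks; i++) {
            final int n = i;
            pool.submit(() -> {
                sum.addAndGet(n);  // the "work"
                done.countDown();  // signal this task's completion
            });
        }

        done.await();              // blocks until every countDown() has run
        pool.shutdown();
        return sum.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(parallelSum(4)); // prints 10
    }
}
```

The point is that the happens-before guarantees of these constructs come from the fixed memory model, so the code is correct by construction rather than by careful use of volatiles and fences.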
Actually, I think that talking about Java's multithreading here is a red herring; note that LMAX had to abandon multi-threading because it was destroying performance.
And this is important, because they explicitly made the very interesting point that current architectures are at odds with current thinking about best practices for concurrent programming.
Is it, though?
The TechEmpower benchmarks are a great "real world" example IMO: https://www.techempower.com/benchmarks/#section=data-r16&hw=...
A more famous example might be Minecraft, where even with its blocky graphics it can tax a high-end gaming machine when you turn the view distance up to a range that almost no other engine would consider long. The engine has been rewritten in other languages where it is much faster, notably the Microsoft version and the phone version.
"If you program it like it was C then you can get good performance" -> Note that one of the points of the post is that just programming it "like C" won't magically fix your performance!
If anything, you could say "If you program it like a system with limited memory then you can get good performance". That, I can imagine being a universally applicable thing.
My point is that, for example for the particular case of memory management, there is an associated cost, whatever your language is; and you need to deal with it. Ignorance of the law is no excuse, etc.
It's just a rudimentary integration approximation to a mathematics undergraduate, but to those in medicine unfamiliar with it, it was an achievement.
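The comment doesn't say which approximation was used; as an assumed illustration, here is the trapezoidal rule, about the simplest numeric integration there is:

```java
import java.util.function.DoubleUnaryOperator;

public class Trapezoid {
    // Approximate the integral of f over [a, b] using n trapezoids.
    static double integrate(DoubleUnaryOperator f, double a, double b, int n) {
        double h = (b - a) / n;
        // Endpoints count at half weight, interior points at full weight.
        double s = (f.applyAsDouble(a) + f.applyAsDouble(b)) / 2;
        for (int i = 1; i < n; i++) {
            s += f.applyAsDouble(a + i * h);
        }
        return s * h;
    }

    public static void main(String[] args) {
        // Integral of x^2 over [0, 1] is exactly 1/3.
        System.out.println(integrate(x -> x * x, 0, 1, 1000));
    }
}
```

With 1000 slices the result agrees with 1/3 to about seven decimal places, which is plenty for most applied uses.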