And modern GCs only have to stop the current thread to check which thread-local objects are alive. The alive ones are moved to another generation, making the space reusable for free.
And atomic writes need synchronization which is crazy expensive, I honestly don’t get why you think it isn’t.
Also, just try writing rust/c++ code that relies entirely on RC vs Java in an object heavy workload - I really don’t think it is an open question in any shape or form.
It's pretty hilarious to me that you first say "they have to move it to another generation" and then you say "it's free!" It's like saying "I paid for my dinner, and now I get to eat it for free!"
Also: what do you think `free()` does? All modern memory allocators do this trick, keeping thread-local caches. This is not an advantage of GCs when reference counting does it as well.
Almost all modern GCs are stop-the-world in at least some phases, and for good reason: stop-the-world GCs are higher performance. You pay in other ways for skipping that stop.
> And atomic writes need synchronization which is crazy expensive, I honestly don’t get why you think it isn’t.
Because I've actually benchmarked it: https://quick-bench.com/q/ISEetAHOohv-GaEuYR-7MajJgTc
18.5 nanoseconds fits under no reasonable definition of "crazy expensive", not when a regular increment clocks in at 5.9 nanoseconds. And there are extremely few situations where you increment a reference count more than, like, 5 times. It's just not an issue.
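For reference, a minimal sketch of that kind of single-threaded comparison (the function names and loop structure here are mine, not taken from the linked quick-bench harness):

```cpp
#include <atomic>
#include <cstdint>

// Plain, non-atomic increment in a loop. A real benchmark harness must
// prevent the compiler from folding this loop away (e.g. via DoNotOptimize).
uint64_t plain_increments(uint64_t n) {
    uint64_t counter = 0;
    for (uint64_t i = 0; i < n; ++i)
        counter++;  // regular increment
    return counter;
}

// Atomic read-modify-write increment, as used by a thread-safe refcount.
// Single-threaded, so the atomic is never contended.
uint64_t atomic_increments(uint64_t n) {
    std::atomic<uint64_t> counter{0};
    for (uint64_t i = 0; i < n; ++i)
        counter.fetch_add(1, std::memory_order_relaxed);  // atomic RMW
    return counter.load();
}
```

Note that both loops run on one thread, so this measures only the uncontended cost of the atomic instruction itself.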
This is like cargo cult programming: you've been told these things and never tested them in the real world, and you have all these preconceived notions that just don't stand up to two minutes of scrutiny.
> Also, just try writing rust/c++ code that relies entirely on RC vs Java in an object heavy workload - I really don’t think it is an open question in any shape or form.
Yes, of course garbage collectors are easier to use than reference counting. Nobody has ever disputed this. That is the whole raison d'etre of garbage collectors. This is not what the discussion is about, it's about performance.
I'm done with this thread now, unless anybody can show me any actual data to back anything you say up. It's really tiring. I started this by saying "I don't actually know", and everyone replying to me has been so darn certain of everything they say while being outright incorrect about most things, and without any actual data to back up the rest.
Congratulations. You tested a construct meant for multicore/threading in a single-threaded benchmark and then marveled at the low overhead.
Of course you will only start seeing the cost if there is actually contention, i.e. multiple threads running simultaneously and operating on the same value. See: https://travisdowns.github.io/blog/2020/07/06/concurrency-co...
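A minimal sketch of what the contended case looks like (my own illustrative function, not from the linked article): several threads hammering one atomic counter, so the cache line holding it ping-pongs between cores and each `fetch_add` costs far more than in the single-threaded case.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// All threads increment the SAME atomic counter. The result is always
// correct, but under contention each RMW must acquire the cache line
// exclusively, serializing the threads.
uint64_t contended_count(unsigned nthreads, uint64_t per_thread) {
    std::atomic<uint64_t> counter{0};
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < nthreads; ++t)
        workers.emplace_back([&] {
            for (uint64_t i = 0; i < per_thread; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& w : workers)
        w.join();
    return counter.load();
}
```

Timing this against the single-threaded version is exactly the comparison the single-threaded benchmark above skips.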
Indeed the issue with measuring barriers is that measuring the barrier doesn't suffice; one wants to measure how the barrier affects the rest of execution. This entails coming up with programs that are much less trivial than repeatedly incrementing a counter.
> > Also, just try writing rust/c++ code that relies entirely on RC vs Java in an object heavy workload - I really don’t think it is an open question in any shape or form.
>
> Yes, of course garbage collectors are easier to use than reference counting. Nobody has ever disputed this. That is the whole raison d'etre of garbage collectors. This is not what the discussion is about, it's about performance.
I am talking about performance exactly. Java's GC will smoke the hell out of C++'s shared pointers and Rust's (A)RC. No one said anything about productivity/ease of use.
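To make concrete what is being compared here (my own illustrative snippet): every copy of a `std::shared_ptr` does an atomic increment on the control block's reference count, and every destruction does an atomic decrement. That per-copy cost, multiplied across an object-heavy workload, is what's being weighed against a tracing GC.

```cpp
#include <memory>
#include <vector>

struct Node { int value; };

// Each push_back copies the shared_ptr, which atomically increments the
// shared control block's refcount; the vector's destruction decrements it
// back. use_count() reflects the original plus all held copies.
long use_count_after_copies(int copies) {
    auto p = std::make_shared<Node>(Node{42});
    std::vector<std::shared_ptr<Node>> held;
    for (int i = 0; i < copies; ++i)
        held.push_back(p);  // atomic refcount increment per copy
    return p.use_count();
}
```

A tracing GC pays nothing per pointer copy; it pays instead at collection time, which is the trade-off the whole thread is arguing over.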
And as mentioned by another commenter - your benchmark didn't take into account anything related to parallel execution, which would be the point.
You will see that the counter increment is about 2-5 cycles, a few hundred ps, and the atomic is on the order of 10 ns uncontended.
If you then introduce contention and a multi-socket setup, the atomic will slow down significantly. Only one thread can touch it at a time, so they have to take turns.