I think most of this is attributable to the ergonomics of compile-time expressiveness. C++ can effortlessly do things that require mountains of ugly boilerplate and macros in C or Rust. In principle they can express the same things but no one wants to write or deal with that ugly boilerplate so the equivalency is never realized in real code bases.
Zig is interesting because it slots in as a C-like language with a competent and expressive compile-time story. I don’t use Zig but I recognize its game.
To me, programming in Rust feels limiting because it lacks good compile-time metaprogramming with types. That’s the key.
It shares some of the same drawbacks as C++, though. The language is extremely powerful, so while it is easy to write performant code, it is also easy for non-experts to write very suboptimal code.
If anything, Rust has the potential to be more performant than C due to its aliasing rules (C has `restrict` but it's rarely used, standard C++ does not have even that). The current perf stats show it does make Rust code faster but just a little bit, although we don't utilize the full optimization potential currently (LLVM does not do many possible optimizations here, and `noalias` is weaker than Rust's aliasing rules). It can also affect autovectorization, and if it does the effect could be dramatic.
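To make the aliasing point concrete, here is a minimal sketch of why a compiler has to be conservative when two pointers might overlap. The function names are made up for illustration, and `__restrict` is a non-standard compiler extension (accepted by GCC, Clang and MSVC), not standard C++. In Rust, a `&mut` reference gives the compiler the same no-overlap guarantee without an annotation.

```cpp
// Without aliasing information the compiler must assume dst and src may
// point at the same int, so *src has to be reloaded after the first store.
void add_twice(int* dst, const int* src) {
    *dst += *src;
    *dst += *src;
}

// Assumption: __restrict is a compiler extension, not standard C++.
// It promises that dst and src do not overlap, so the compiler is allowed
// to load *src only once. Calling this with overlapping pointers is
// undefined behavior.
void add_twice_noalias(int* __restrict dst, const int* __restrict src) {
    *dst += *src;
    *dst += *src;
}
```

With distinct pointers both functions add `2 * *src` to `*dst`. If `add_twice` is called with the same address for both arguments, the value is quadrupled instead, which is exactly the case the optimizer has to keep working in the unannotated version.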
Working around the poor applicability of auto-vectorization is another area where C++ is strong. You can transparently generate e.g. AVX512 code from intrinsics directly in C++ in contexts that would be opaque to auto-vectorization and difficult to generalize in C. This gives you some degree of “auto-vectorization” in places where the compiler can’t see the opportunity because it works at the wrong level of abstraction.
With sufficiently heroic efforts you can write C that matches the performance of C++. I’m not arguing that. Virtually no one writes C to that standard, including myself when I was writing high-performance C, because the effort was too high, so it is a bit of a strawman.
It is the difference between theory and practice. All code bases have a finite budget. C++ can do a lot more optimization in the same budget as C.
With LTO you get many of the same advantages as C++ template code. There's nothing magic about C++ template optimizations; it's all about whether the compiler can see all function bodies in a call hierarchy.
> The only candidate is using virtualization and void* pointers instead of monomorphized generics which some C code does for the lack of better options, but that's not a problem in Rust as well.
But in fact, if speed is a concern to you, even in C you will use "templated" sorting (via macros or code generation).
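As a rough sketch of what macro-based "templated" sorting can look like, the macro below stamps out one type-specific insertion sort per use. All names here (`DEFINE_INSERTION_SORT`, `INT_LESS`, and so on) are invented for illustration, and insertion sort is chosen only to keep the example short. It is written to compile as C++ for consistency with the other examples; the same pattern works in C with `size_t` from `<stddef.h>`.

```cpp
#include <cstddef>

// Generates a type-specific insertion sort named NAME for elements of TYPE.
// LESS is a macro taking two values of TYPE; because it is expanded in place,
// the comparison is compiled inline rather than called through a pointer.
#define DEFINE_INSERTION_SORT(NAME, TYPE, LESS)                   \
    static void NAME(TYPE* data, std::size_t count) {            \
        for (std::size_t i = 1; i < count; ++i) {                \
            TYPE key = data[i];                                   \
            std::size_t j = i;                                    \
            while (j > 0 && LESS(key, data[j - 1])) {             \
                data[j] = data[j - 1];                            \
                --j;                                              \
            }                                                     \
            data[j] = key;                                        \
        }                                                         \
    }

#define INT_LESS(a, b) ((a) < (b))
#define DOUBLE_GREATER(a, b) ((a) > (b))

// Two independent instantiations, one per element type and ordering.
DEFINE_INSERTION_SORT(sort_ints_ascending, int, INT_LESS)
DEFINE_INSERTION_SORT(sort_doubles_descending, double, DOUBLE_GREATER)
```

This gets the specialization, but the line-continuation backslashes and the lack of type checking on `LESS` are the kind of boilerplate the thread is talking about.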
But this is not a valid argument, as all languages are Turing complete, and most modern languages can do low level stuff at optimum speeds. As an extreme example, in Java, you could just allocate a large chunk of memory and run an allocator inside of it and sidestep the GC entirely.
With a programming language the question is thus not what can you do with it and how fast can it run with infinite effort, but what are the ergonomics, and what performance will you get in practice.
At the compiler level, no. But as you write projects, you will for instance run into things you can do with templates which are infeasible to attempt with macros.
One example might be qsort() - a C compiler _could_ catch cases where it could create an intrinsic qsort based on the data type and function pointer being passed. However, in C++ you have the facilities to create a type safe, genericized sort that will be inlined based on the data structure.
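A small side-by-side sketch of the two approaches, using the standard `std::qsort` and `std::sort`. The wrapper function names are made up for illustration. Whether a given compiler actually inlines either comparator depends on the optimizer and build settings, so treat the comments as describing what the language makes possible rather than what is guaranteed.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// C-style: elements arrive as untyped void pointers and the comparator is
// reached through a function pointer.
static int compare_ints(const void* a, const void* b) {
    const int lhs = *static_cast<const int*>(a);
    const int rhs = *static_cast<const int*>(b);
    return (lhs > rhs) - (lhs < rhs);  // avoids the overflow of lhs - rhs
}

std::vector<int> sort_with_qsort(std::vector<int> values) {
    if (!values.empty()) {
        std::qsort(values.data(), values.size(), sizeof(int), compare_ints);
    }
    return values;
}

// C++-style: std::sort is a template, so the element type and the comparator
// are both known where it is instantiated and the comparison is type checked.
std::vector<int> sort_with_std_sort(std::vector<int> values) {
    std::sort(values.begin(), values.end(),
              [](int lhs, int rhs) { return lhs < rhs; });
    return values;
}
```

Both functions return the same sorted sequence; the difference is in how much the compiler knows at the call site.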
E.g.: `delete_scene(void *arg)` vs `delete_scene<T>(T *arg)`
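A minimal sketch of the difference between those two signatures. `Scene`, `live_scenes` and `delete_scene_erased` are hypothetical names used only for illustration; the original comment gives nothing beyond the two signatures.

```cpp
#include <type_traits>

int live_scenes = 0;  // illustration only: counts Scene objects currently alive

// Hypothetical resource type.
struct Scene {
    Scene() { ++live_scenes; }
    ~Scene() { --live_scenes; }
};

// Type-erased version: the function has to cast back by convention, and
// passing a pointer to anything else still compiles.
void delete_scene_erased(void* arg) {
    delete static_cast<Scene*>(arg);
}

// Template version: T is deduced at the call site, so the right destructor
// is known statically and a void* argument is rejected at compile time.
template <typename T>
void delete_scene(T* arg) {
    static_assert(!std::is_void<T>::value,
                  "delete_scene needs a typed pointer");
    delete arg;
}
```

Calling `delete_scene_erased` with an `int*` would compile and then misbehave at runtime, while `delete_scene` only ever runs the destructor of the type it was actually given.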
On Twitter, a user explained to me that it is common in the embedded space.
You do not need OOP, RTTI, or exceptions.
It is like C with most uses of the preprocessor replaced by generic programming.
While you can write high-performance C++, my experience is that many people will reach for shared_ptr and the like, while Rust will force them into a proper structure and ownership model, because Arc and the like have a lot higher friction.
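For readers less familiar with the C++ side of that comparison, here is a small sketch of the two ownership styles. `Mesh` and the holder types are hypothetical. The point is only that copying a `shared_ptr` silently adds another owner (with a reference count to maintain), while a `unique_ptr` has to be moved explicitly and leaves exactly one owner.

```cpp
#include <memory>
#include <utility>

struct Mesh { int vertex_count = 0; };  // hypothetical payload type

struct SharedHolder { std::shared_ptr<Mesh> mesh; };
struct UniqueHolder { std::unique_ptr<Mesh> mesh; };

// Shared ownership: every copy bumps the reference count, and nothing in
// the code says which holder is "the" owner.
long shared_owner_count_after_copy() {
    auto mesh = std::make_shared<Mesh>();
    SharedHolder a{mesh};
    SharedHolder b{mesh};
    return mesh.use_count();  // mesh, a.mesh and b.mesh all own the object
}

// Unique ownership: the transfer must be spelled out with std::move, and
// the source pointer is empty afterwards.
bool unique_transfer_empties_source() {
    std::unique_ptr<Mesh> mesh(new Mesh());
    UniqueHolder holder{std::move(mesh)};
    return mesh == nullptr && holder.mesh != nullptr;
}
```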
This really needs more real-world evidence to back up the claim. In the end the important optimizations happen down in the LLVM optimizer passes on the LLVM IR, and those optimizations are the same across C, C++, Rust (or Zig for that matter), assuming of course that the optimizer can see all function bodies, which in C can be achieved via LTO or alternatively via 'unity builds'.
If the output of one of those languages differs so much (on an LLVM-based compiler) that there are noticeable performance differences I would start investigating whether there's a compile/link setting missing somewhere instead.
https://www.godbolt.org/z/n3Y54Yhqr
This is basically the gist of C++ 'zero cost abstraction', but C-style (the bulk of what enables C++ zero-cost-abstraction doesn't happen up in the language, but down in the optimization passes).
It's true that you can express many things in C++ -- the problem is that the language deliberately doesn't distinguish whether the things you've expressed are nonsense, so you might well have written total nonsense and you only find out when, much later, diagnosing a real world event you discover oh, this is nonsense, why did this even compile? Well sorry, it was "more performant" to allow nonsense.