haberman1y ago

> C++ has been faster than C for a long time.

What is your basis for this claim? C and C++ are both built on essentially the same memory and execution model. There is a significant set of programs that are valid C and C++ both -- surely you're not suggesting that merely compiling them as C++ will make them faster?

There's basically no performance technique available in C++ that is not also available in C. I don't think it's meaningful to call one faster than the other.

jandrewrogers1y ago

This is really an “in theory” versus “in practice” argument.

Yes, you can write most things in modern C++ in roughly equivalent C with enough code, complexity, and effort. However, the economics are so lopsided that almost no one ever writes the equivalent C in complex systems. At some point, the development cost is too high due to the limited expressiveness and abstractions. Everyone has a finite budget.

I’ve written the same kinds of systems I write now in both C and modern C++. The equivalent C versions require several times the code of C++, are less safe, and are more difficult to maintain. I like C and wrote it for a long time, but the demands of modern systems software are beyond what it can efficiently express. Trying to make it work requires cutting a lot of corners in the implementation in practice. It is still suited to more classically simple systems software, though I really like what Zig is doing in that space.

I used to have a lot of nostalgia for working in C99 but C++ improved so rapidly that around C++17 I kind of lost interest in it.

habermanOP1y ago

None of this really supports your claim that "C++ has been faster than C for a long time."

You can argue that C takes more effort to write, but if you write equivalent programs in both (i.e., programs that use comparable data structures and algorithms) they are going to have comparable performance.

In practice, many best-in-class projects are written in C (Lua, LuaJIT, SQLite, LMDB). To be fair, most of these projects inhabit a design space where it's worth spending years or decades refining the implementation, but the combination of performance and code size you can get from these C projects is something that few C++ projects I have seen can match.

For code size in particular, the use of templates makes typical C++ code many times larger than equivalent C. While a careful C++ programmer could avoid this (i.e., by making templated types fall back to type-generic algorithms to save on code size), few programmers actually do this, and in practice you end up with N copies of std::vector, std::map, etc. in your program (including the slow fallback paths that get little benefit from type specialization).
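To make the "fall back to type-generic algorithms" idea concrete, here is a minimal sketch, not taken from any of the projects named above. `RawVec`, `raw_push`, and `PodVec` are hypothetical names, and the wrapper is restricted to trivially copyable types so that `memcpy` is valid. The growth logic is compiled once; each instantiation of the template only adds a couple of tiny forwarding functions.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Type-erased core: compiled once and shared by every element type.
struct RawVec {
    unsigned char* data = nullptr;
    std::size_t size = 0;      // number of stored elements
    std::size_t capacity = 0;  // number of elements the buffer can hold
};

// Appends one element of elem_size bytes; returns false if allocation fails.
bool raw_push(RawVec& v, const void* elem, std::size_t elem_size) {
    if (v.size == v.capacity) {
        std::size_t new_cap = v.capacity ? v.capacity * 2 : 8;
        void* p = std::realloc(v.data, new_cap * elem_size);
        if (!p) return false;
        v.data = static_cast<unsigned char*>(p);
        v.capacity = new_cap;
    }
    std::memcpy(v.data + v.size * elem_size, elem, elem_size);
    ++v.size;
    return true;
}

// Thin typed wrapper: the only per-type code is these small forwarders.
template <typename T>
class PodVec {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only supports trivially copyable types");
    RawVec raw_;

public:
    PodVec() = default;
    PodVec(const PodVec&) = delete;             // copying omitted for brevity
    PodVec& operator=(const PodVec&) = delete;
    ~PodVec() { std::free(raw_.data); }

    bool push_back(const T& x) { return raw_push(raw_, &x, sizeof(T)); }

    // Copies the element out instead of handing back a reference.
    T get(std::size_t i) const {
        T out;
        std::memcpy(&out, raw_.data + i * sizeof(T), sizeof(T));
        return out;
    }

    std::size_t size() const { return raw_.size; }
};
```

The trade-off is the one described above: the shared path works on `void*` and a runtime element size, so it gives up whatever the optimizer could have done with a fully specialized copy.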

WalterBright1y ago

> What is your basis for this claim?

Great question! Here's one answer:

Having written a great deal of C code, I made a discovery about it. The first algorithm and data structure selected for a C program stay there. They survive all the optimizations, refactorings and improvements. But everyone knows that finding a better algorithm and data structure is where the big wins are.

Why doesn't that happen with C code?

C code is not plastic. It is brittle. It does not bend, it breaks.

This is because C is a low level language that lacks higher level constructs and metaprogramming. (Yes, you can metaprogram with the C preprocessor, a technique right out of hell.) The implementation details of the algorithm and data structure are distributed throughout the code, and restructuring that is just too hard. So it doesn't happen.

A simple example:

Change a value to a pointer to a value. Now you have to go through your entire program changing dots to arrows, and sprinkle stars everywhere. Ick.

Or let's change a linked list to an array. Aarrgghh again.

Higher level features, like what C++ and D have, make this sort of thing vastly simpler. (D does it better than C++, as a dot serves both value and pointer uses.) And so algorithms and data structures can be quickly modified and tried out, resulting in faster code. A traversal of an array can be changed to a traversal of a linked list, a hash table, a binary tree, all without changing the traversal code at all.
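A minimal C++ sketch of that last point, assuming nothing beyond the standard containers (`sum_all` is a hypothetical name): the traversal is written once against the iterator interface, and the container can be swapped without touching it.

```cpp
#include <list>
#include <set>
#include <vector>

// Written once; works for any container that supports range-based for
// and holds values that can be added to a long.
template <typename Container>
long sum_all(const Container& c) {
    long total = 0;
    for (const auto& x : c) {
        total += x;
    }
    return total;
}
```

Switching a caller from `std::vector<int>` to `std::list<int>` or `std::set<int>` changes only the declaration of the container, not `sum_all` or its call sites.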

srcreigh1y ago

At a certain point, C++ compile time computation becomes something you really can’t do in C. https://codegolf.stackexchange.com/a/269772
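For a much smaller illustration than the linked answer (this is not the code from that link, and `count_primes_below` is a hypothetical name), a C++14 `constexpr` function can run an ordinary loop inside the compiler and have the result checked by `static_assert`:

```cpp
// Counts primes below n by trial division. Because it is constexpr,
// the compiler can evaluate it when the argument is a constant.
constexpr int count_primes_below(int n) {
    int count = 0;
    for (int i = 2; i < n; ++i) {
        bool prime = true;
        for (int d = 2; d * d <= i; ++d) {
            if (i % d == 0) {
                prime = false;
                break;
            }
        }
        if (prime) ++count;
    }
    return count;
}

// Checked at compile time; no code for this runs in the final program.
static_assert(count_primes_below(100) == 25, "there are 25 primes below 100");
```

The same function is still callable at run time with a non-constant argument.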

mawww1y ago

C and C++ do have very different memory models, C essentially follows the "types are a way to decode memory" model while C++ has an actual object model where accessing memory using the wrong type is UB and objects have actual lifetimes. Not that this would necessarily lead to performance differences.

When people claim C++ is faster than C, that is usually understood to mean that C++ provides tools that make writing fast code easier than in C, not that the fastest possible implementation in C++ is faster than the fastest possible implementation in C. The latter is trivially false, as in both cases the fastest possible implementation is the same unmaintainable soup of inline assembly.

The typical example used to claim C++ is faster than C is sorting: C, lacking templates and overloading, needs `qsort` to work with void pointers and a function pointer, which is very hard on the optimiser, whereas C++'s `std::sort` gets the actual types it works on and can directly inline the comparator, making the optimiser's work easier.
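Side by side, the two calls look roughly like this (a sketch; the wrapper function names are hypothetical). Both sort the same data. The difference is what the compiler can see at the call site: an opaque function pointer over `void*` in one case, a typed lambda in the other.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// C style: the library only sees void pointers and a function pointer.
int compare_ints(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // negative, zero, or positive
}

void sort_with_qsort(std::vector<int>& v) {
    if (!v.empty()) {
        std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    }
}

// C++ style: std::sort is instantiated for int iterators and this exact
// comparator type, so the comparison is a candidate for inlining.
void sort_with_std_sort(std::vector<int>& v) {
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
}
```

Whether this produces a measurable difference depends on the compiler, the libc, and the data, so it is worth benchmarking rather than assuming.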

ryao1y ago

Try putting objects into two linked lists in C using sys/queue.h and in C++ using the STL. Try sorting the linked lists. You will find C outperforms C++. That is because C’s data structures are intrusive, such that you do not have external nodes pointing to the objects to cause an extra random memory access. The C++ STL requires an externally allocated node that points to the object in at least one of the data structures, since only 1 container can manage the object lifetimes to be able to concatenate its node with the object as part of the allocation. If you wish to avoid having object lifetimes managed by containers, things will become even slower, because now both data structures will have an extra random memory access for every object. This is not even considering the extra allocations and deallocations needed for the external nodes.
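A hand-rolled sketch of what "intrusive" means here, written without sys/queue.h so it stays portable (`Task`, `push_front`, and `sum_ids` are hypothetical names): the links for both lists live inside the object, so walking either list touches only the objects themselves and nothing is allocated per list membership.

```cpp
// One object, two list memberships, zero extra allocations.
struct Task {
    int id;
    int priority;
    Task* next_by_id;        // link used by the first list
    Task* next_by_priority;  // link used by the second list

    Task(int i, int p)
        : id(i), priority(p), next_by_id(nullptr), next_by_priority(nullptr) {}
};

// Pushes t onto the front of the list that uses the given link field.
void push_front(Task*& head, Task* t, Task* Task::*link) {
    t->*link = head;
    head = t;
}

// Walks one of the lists; each step goes directly from object to object.
int sum_ids(const Task* head, Task* Task::*link) {
    int total = 0;
    for (const Task* t = head; t != nullptr; t = t->*link) {
        total += t->id;
    }
    return total;
}
```

By contrast, a `std::list<Task*>` stores each pointer in a separately allocated node, which is the extra indirection the comment above is describing.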

That said, external comparators are a weakness of generic C library functions. I once manually inlined them in some performance critical code using the C preprocessor:

https://github.com/openzfs/zfs/commit/677c6f8457943fe5b56d7a...

jandrewrogers1y ago

It seems like your argument is predicated on using the C++ STL. Most people don’t for anything that matters and it is trivial to write alternative implementations that have none of the weaknesses you are arguing. You have created a bit of a strawman.

One of the strengths of C++ is that it is well-suited to compile-time codegen of hyper-optimized data structures. In fact, that is one of the features that makes it much better than C for performance engineering work.

ryao1y ago

Most C++ code I have seen uses the STL. As for “hyper optimized” data structures, you already have those in C. See the B-Tree code whose binary search routine I patched to run faster. Nothing C++ adds improves upon what you can do performance-wise in C.

You have other sources of slowdowns in C++, since the abstractions have a tendency to hide bloat, such as excessive dynamic memory usage, use of exceptions, and code simply compiling inefficiently compared to similar code in C. Too much inlining can also be a problem, since it puts pressure on CPU instruction caches.

gpderetta1y ago

Unfortunately Stepanov and the STL are widely misunderstood. Stepanov's core contribution is the set of concepts underlying the STL and the iterator model for generic programming. The set of algorithms and data structures in the STL was only supposed to be a beginning, never a finished collection. Unfortunately many, if not most, treat it that way.

But if you look beyond it, you can find a whole world of libraries that extend the STL. If you are not happy with, say, unordered_map, you can find more or less drop-in replacements that use the same iterator-based interface, preserve value semantics, and use a common language to describe iterator and reference invalidation.

Regarding your specific use case, if you want intrusive lists you can use boost.intrusive, which provides containers with STL semantics except that it leaves ownership of the nodes to the user. The containers do not even need to be lists: you can put the same node in multiple linked lists, binary trees (multiple flavors), and hash maps (although this is not fully intrusive) at the same time.

These days I don't generally need boost much, but I still reach for boost.intrusive quite often.

ryao1y ago

I have met a number of people who will not use the boost libraries. It has been so long that I have long forgotten their reasons. My guess is that it had to do with binary compatibility issues.

pjmlp1y ago

Except nothing forbids me from using two linked lists in C++ via sys/queue.h. That is exactly one of the reasons why Bjarne built C++ on top of C, and also, unfortunately, a reason why we have security pain points in C++.

ryao1y ago

Yet the C++ community is continually trying to get people to stay away from anything involving C. That said, newer C headers using _Generic for example are not usable from C++.

uecker1y ago

In my experience, templates usually cause a lot of bloat that slows things down. Sure, in microbenchmarks it always looks good to specialize everything at compile time; whether this is what you want in a larger project is a different question. And a C compiler can also specialize a sort routine for your types just fine. It just needs to be able to look into it, i.e. it does not work for qsort from the libc. I agree with your point that C++ comes with fast implementations of algorithms out-of-the-box. In C you need to assemble a toolbox yourself. But once you have done this, I see no downside.

krapht1y ago

I know you're going to reply with "BUT MY PREPROCESSOR", but template specialization is a big win and improvement (see qsort vs std::sort).

ryao1y ago

I have used the preprocessor to avoid this sort of slowdown in the past in a binary search function:

https://github.com/openzfs/zfs/commit/677c6f8457943fe5b56d7a...

The performance gain comes not from eliminating the function overhead, but enabling conditional move instructions to be used in the comparator, which eliminates a pipeline hazard on each loop iteration. There is some gain from eliminating the function overhead, but it is tiny in comparison to eliminating the pipeline hazard.
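To show the shape of the idea, here is a generic branch-free lower-bound search. This is a sketch of the general technique, not the OpenZFS patch, and `lower_bound_branchless` is a hypothetical name. The loop always runs the same number of iterations for a given length, and the only data-dependent step is a select that compilers often (but not always) turn into a conditional move.

```cpp
#include <cstddef>

// Returns the index of the first element >= key in the sorted range
// a[0..n), or n if every element is smaller than key.
std::size_t lower_bound_branchless(const int* a, std::size_t n, int key) {
    const int* base = a;
    std::size_t len = n;
    while (len > 1) {
        std::size_t half = len / 2;
        // A select instead of an if/else on the comparison result.
        base = (base[half - 1] < key) ? base + half : base;
        len -= half;
    }
    // At most one candidate element is left to check.
    if (len == 1 && *base < key) {
        ++base;
    }
    return static_cast<std::size_t>(base - a);
}
```

Whether a conditional move is actually emitted depends on the compiler, flags, and target, so checking the generated assembly is the only way to be sure.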

That said, C++ has its weaknesses too, particularly in its typical data structures, its excessive use of dynamic memory allocation and its exception handling. I gave an example here:

https://news.ycombinator.com/item?id=43827857

Honestly, I think these weaknesses are more severe than qsort being unable to inline the comparator.

uecker1y ago

A comparator can be inlined just fine in C. See here where the full example is folded to a constant: https://godbolt.org/z/bnsvGjrje

Does not work if the compiler can not look into the function, but the same is true in C++.
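A small sketch of the "compiler can look into the function" case, written in a C-like style (this is not the code behind the godbolt link, and all names are hypothetical). The sort routine takes a plain function pointer, but because both it and the comparator are defined in the same translation unit, an optimizing compiler is free to inline the call.

```cpp
#include <cstddef>

typedef bool (*less_fn)(int, int);

// Generic over the ordering via a function pointer, like a C library
// routine, but with its definition visible at the call site.
static inline void insertion_sort(int* a, std::size_t n, less_fn less) {
    for (std::size_t i = 1; i < n; ++i) {
        int x = a[i];
        std::size_t j = i;
        while (j > 0 && less(x, a[j - 1])) {
            a[j] = a[j - 1];
            --j;
        }
        a[j] = x;
    }
}

static bool descending(int x, int y) { return x > y; }

// The comparator is a known constant here, so the indirect call can be
// resolved and inlined. Nothing guarantees it; it is up to the optimizer.
void sort_descending(int* a, std::size_t n) {
    insertion_sort(a, n, descending);
}
```

If `insertion_sort` lived in a separately compiled library, as `qsort` usually does, the compiler would only see the call and could not do this without link-time optimization.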

ryao1y ago

That does not show the comparator being inlined since everything was folded into a constant, although I suppose it was. Neat.

Edit: It sort of works for the bsearch() standard library function:

https://godbolt.org/z/3vEYrscof

However, it optimized the binary search into a linear search. I wanted to see it implement a binary search, so I tried with a bigger array:

https://godbolt.org/z/rjbev3xGM

Now it calls bsearch instead of inlining the comparator.
