Rust avoids those from the start by making slices idiomatic.
Another thing I commonly see is the use of suboptimal containers (like arrays with linear search), just because there's no better alternative at hand (the standard library doesn't offer one and dependency management is messy). Which also makes it less surprising that code in higher-level languages might perform better.
The reason that C programs often don't perform as well as an equivalent Rust program[1] is that it's so incredibly hard to do anything at all in C (especially anything reliable) that one can usually only do the simplest thing possible, and this typically means simple data structures, simple algorithms, and arrays.
[1] Suppose the programs are independently created at the same time by programmers of equal ability and are idiomatic in their languages. If a C program is rewritten in Rust, it is likely to be faster, but that would likely also be true if it were rewritten in C.
Bryan Cantrill talks[1] about exactly this: in his C version he was using AVL trees because they are easy to implement, while his Rust version was using B-trees just because he could. And as a result, his naive first attempt in Rust outperformed his long-optimized C code.
Reliability is very often the name of the game with C, and part of the reason you might see it written in such a simplistic fashion. In embedded systems, we often follow very strict coding conventions that impose hard requirements on how a C program is to be written. This includes everything from how memory is to be allocated, the maximum number of local variables, the maximum number of arguments, various limits on a function, constraints on handling errors, etc. We do this because it minimizes potential hazards, especially when working with limited memory and system resources.
C is an unforgiving language in that mistakes can occur silently and on mission critical hardware these mistakes can cost more than just your project milestones. These programs are very specialized and often have complex algorithms associated with them. Most of the systems I've worked with include various kinds of feedback control systems, and accompanying algorithms. If you're familiar with control systems, you'll know these algorithms are certainly not trivial by any means.
The challenge with C is that, as a developer, your C code needs to be perfect. Bugs just aren't an option like they are in other environments. Once a specialized piece of hardware ships it needs to work as intended under a myriad of conditions without powering off for the next 30 years (not always the case, but more often than you might think). Sometimes this kind of software is going to be put in a position it may never have been designed for. The best we can do is try to add various check/support mechanisms both in software and in hardware, and keep the software as safe and hazard-free as possible.
In terms of performance, again coming from an embedded environment, there is almost always a spec we are aiming for. It needs to do X tasks in N time, for example. Of course, high-level design questions like whether to poll or wait for interrupt, or how to broker data from shared resources, are decided long before you get to the question of how should I, or if I even should, compare two strings. It is nevertheless advisable to go with a trivial solution if it satisfies the requirements.
Is C time-consuming to write? Is C a "hard" language? These are subjective and depend on the nature of the project. I would certainly never use C with only libc to write a webserver that's going to be serving SaaS Co's backend API.
I assume most C++ programs heavily use more of these advanced data structures, correct?
In many programs in many domains the sizes of these data structures will rarely exceed this limit.
I have no idea whether that matters or is even easy to measure...
Doesn't that mean that null-terminated strings are idiomatic C? That is, my understanding of the term idiomatic is that it is defined by whatever is most natural to users of a language, regardless of whether it is the most performant.
It would be the equivalent of teaching people to write books without encouraging them to read anything.
To break myself of the habit I started reading some well regarded programs for fun. And oh boy, have I learned a lot from doing so. One of my first discoveries was this beauty in the Redis source code:
https://github.com/redis/redis/blob/3.0/src/sds.h
The idea is to have a string struct that stores its length and content. But the pointer passed around is a pointer to the (null terminated) contents field in the struct. The string is efficient for internal calls (the length can be queried by subtracting from the pointer). But the pointer is also an idiomatic null-terminated C string pointer, compatible with the standard library and everything else. (typedef char *sds;)
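A stripped-down sketch of that layout (the real sds header has more fields and several size-optimized variants, so treat this as an illustration of the trick, not Redis's actual code):

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The length header sits immediately before the character data, and
 * the handle passed around points at the data itself, so it doubles
 * as a plain NUL-terminated C string. */
typedef char *sds;

struct sdshdr {
    size_t len;
    char   buf[];   /* flexible array member: NUL-terminated contents */
};

sds sdsnew(const char *init) {
    size_t len = strlen(init);
    struct sdshdr *h = malloc(sizeof *h + len + 1);
    if (!h) return NULL;
    h->len = len;
    memcpy(h->buf, init, len + 1);  /* copy including the trailing NUL */
    return h->buf;                  /* hand out the contents pointer */
}

/* O(1) length: step back from the contents pointer to the header. */
size_t sdslen(sds s) {
    struct sdshdr *h = (struct sdshdr *)(s - offsetof(struct sdshdr, buf));
    return h->len;
}

void sdsfree(sds s) {
    free(s - offsetof(struct sdshdr, buf));
}
```

Any function expecting a `const char *` accepts an `sds` unchanged, which is the whole point.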
Dovecot is also a gem to read if you're looking for inspiration. The way it manages memory pools is delightful - and I'm sure much more performant than idiomatic rust. (That is, without reaching for arena allocator crates and alternate implementations of Box and Vec).
When I once tested std::sort against qsort (sorting 4-byte integers) I measured a 2x difference. So yes, definitely non-trivial, but it won't get much worse than that.
Have you ever seen a program that was slow because of a slow sorting routine?
If you ever need a fast sort (~ never) then the last thing you should do is use std::sort anyway. You should figure out what your data looks like and hand roll an implementation. For example, a radix sort is often possible to use, easy to implement, and much faster than std::sort.
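For instance, here is a minimal LSD radix sort for 32-bit unsigned keys (a sketch, not tuned): four counting passes over 8-bit digits, no comparisons at all.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stable least-significant-digit radix sort: each pass distributes
 * elements by one byte of the key, so after four passes the array is
 * fully sorted. O(n) per pass, no comparison function in sight. */
void radix_sort_u32(uint32_t *a, size_t n) {
    uint32_t *tmp = malloc(n * sizeof *tmp);
    if (!tmp) return;
    for (int shift = 0; shift < 32; shift += 8) {
        size_t count[257] = {0};
        for (size_t i = 0; i < n; i++)           /* histogram this byte */
            count[((a[i] >> shift) & 0xFFu) + 1]++;
        for (int b = 0; b < 256; b++)            /* prefix sums -> offsets */
            count[b + 1] += count[b];
        for (size_t i = 0; i < n; i++)           /* stable scatter */
            tmp[count[(a[i] >> shift) & 0xFFu]++] = a[i];
        memcpy(a, tmp, n * sizeof *a);
    }
    free(tmp);
}
```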
What is meant here? Fast? Reliable? Secure? Memory efficient? Power efficient? Easy to write? Easy to maintain? Quick to compile? Easy to debug?
I have no illusions about the reliability/security/correctness of real-world C code, especially since it's usually not compiled with a memory-safe compiler or run with memory-safe libraries and runtime environments (though it's often sandboxed to limit the damage.) It's relatively easy to introduce memory errors which are not detected by the compiler or runtime.
Certainly many algorithms and data structures (in C and other languages) exhibit tradeoffs including things like speed vs. memory use vs. code size vs. complexity, etc..
But C compilers are pretty fast, which I really like. Then there are/were environments like Turbo Pascal or Think C, which seem to have been amazingly compact while offering a rapid edit-compile-debug cycle as well as decent runtime speed and code size.
I agree, C is a really poor choice for string or other human readable content handling.
But if you have to read-shift-mask-write bytes to hardware control registers, it is a pretty easy language to use, and faster than assembly language in 90% of the cases, with modern compilers. That last note was less true in the mid 1980s, but by the mid-1990s, compiler optimization had gotten to be quite good.
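A sketch of what read-shift-mask-write looks like; the register address, field name, and layout here are invented for illustration:

```c
#include <stdint.h>

#define CTRL_REG_ADDR  0x40021000u               /* hypothetical peripheral */
#define CLKDIV_SHIFT   4
#define CLKDIV_MASK    (0x7u << CLKDIV_SHIFT)    /* 3-bit field, bits 6:4 */

/* Pure helper: clear the field, then insert the new value. */
static inline uint32_t set_clkdiv(uint32_t reg, uint32_t div) {
    reg &= ~CLKDIV_MASK;
    reg |= (div << CLKDIV_SHIFT) & CLKDIV_MASK;
    return reg;
}

/* On real hardware the read-modify-write goes through a volatile
 * pointer so the compiler can't elide or reorder the access:
 *   volatile uint32_t *ctrl = (volatile uint32_t *)CTRL_REG_ADDR;
 *   *ctrl = set_clkdiv(*ctrl, 5);
 */
```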
Cellular modem code that runs on DSPs is written in C. Maybe someday that will be written in Rust. We'll see.
for (int i=0; i < 10; i++)
We all instantly know this means loop 10 times, from 0 through 9. But on first read it takes a little work to figure out what's happening.

There is a culture of bad C code from the 90s - especially C code written in OOP-y ways where that was never necessary. Like calling malloc() + free() for every little thing instead of using more structured memory management. A lot of that code was written in C not with an efficiency or elegance mindset but simply because C was the language that you wrote programs in.
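One common "structured" alternative to a malloc/free pair per object is a bump-pointer arena: one allocation up front, many cheap sub-allocations, one free at the end. A minimal sketch (names are mine; no growth, no per-object free):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    char  *base;
    size_t used, cap;
} arena;

int arena_init(arena *a, size_t cap) {
    a->base = malloc(cap);      /* the one real allocation */
    a->used = 0;
    a->cap  = cap;
    return a->base != NULL;
}

void *arena_alloc(arena *a, size_t size) {
    size = (size + 15) & ~(size_t)15;   /* round up for 16-byte alignment */
    if (a->used + size > a->cap) return NULL;
    void *p = a->base + a->used;        /* just bump a pointer */
    a->used += size;
    return p;
}

void arena_release(arena *a) {          /* frees everything at once */
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

Allocation is a bump and a bounds check, and teardown is a single free, which is also why this style is popular in embedded and server code.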
Do not tie computational operations to string operations in C.
Pick one of many string manipulation libraries to do any serious work with them.
[1] https://github.com/rust-lang/rust/issues/54878#issuecomment-...
Still, real world performance will probably benefit from this fix so it's a positive change regardless.
Where I expect Rust to really shine performance-wise is at the larger scale of real code, where it affords code that copies less often. In C, the programmer often isn't sure in a particular function whether or not they own a value, so they just have to take a copy, or the compiler can't work out aliasing, etc. Ensuring at scale that you don't take extra copies, don't have an aliasing problem, and don't copy things just to "be sure" in multithreaded situations is hard, and it drains a lot of performance.
Most C++ STL algorithms take iterators, which in many benchmarks are just pointers because the benchmarks often only deal with arrays. Hell, most C standard APIs (e.g. memcpy) take multiple pointers.
I've done it when we really needed the performance for an inner loop (particle system). It can be a real bastard to keep the no-aliasing constraint holding in a large, multi-person codebase, and the error cases are really gnarly to chase down.
Compare that to Rust which has this knowledge built in since it naturally falls out of the ownership model.
LLVM implements only coarse per-function aliasing information needed by C, and doesn't properly preserve fine-grained aliasing information that Rust can provide.
And how do you know this? Have you actually measured the difference?
My CPU-heavy program (which takes ~30 seconds to run on a 32-core Threadripper) gains ~15% extra performance if I turn it on.
That's like saying C programmers could just write memory safe code if they felt like this would help, so clearly memory safety is not important.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
On a quick inspection:
- The Rust code is about twice as long.
- The Rust code has CPU feature detection and SSE intrinsics, while the C code is more idiomatic.
- The lookup table is larger in the Rust code.
I wouldn't be surprised if there's something similar happening here.
I like PyPy as an example: on the surface, implementing a Python runtime in Python and expecting performance gains seems crazy. PyPy manages to outperform CPython because although a C implementation should theoretically be faster, realistically the increased expressiveness of Python lets the PyPy devs opt into optimizations the CPython devs find out of reach.
I don't know C or Rust well enough to comment on these specific scenarios, but if two technologies can be fast, and one makes that speed accessible while the other encourages implementors to leave performance on the table, that's much more useful information to me than seeing a back-and-forth where folks compete to implement optimizations I'll never make in my own projects.
1. http://www.ats-lang.org/ 2. http://web.archive.org/web/20121218042116/http://shootout.al... 3. https://stackoverflow.com/questions/26958969/why-was-the-ats...
In this case it seems like benchmark code is allowed to use intrinsics, which can degenerate into a situation where a benchmark in language X is more "glorified x86 Assembly code" than actual code in language X.
This is not very useful for comparing languages IMO. Especially since all of Rust, C, C++ can use this strategy and become almost identical in both code and performance.
It is not idiomatic Haskell at all and loses all of the touted benefits.
Is there a separate benchmark that only accepts idiomatic code?
Am I missing something? Is there a list somewhere of the people who have contributed to this?
These benchmarks are almost always either useless or a scam - you either end up rewriting the same implementation n times or you don't utilize the capabilities of the language; either way you're not really measuring much of anything intrinsic to the language itself. Rust and C both have the same backends, and if you really care about performance you're going to take it to the max anyway, so inference by the compiler isn't that important.
• https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
• https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
• https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
It's possible that someone has already submitted both algorithms for both languages, and different approaches won for language-specific reasons.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
None of them have SSE intrinsics or are quite as long as the Rust version.
I find it doubtful that SSE intrinsics wouldn't help the C version, if they are indeed helping the Rust version. This seems fairly easy to check as the Rust version has a non-SSE fallback code path - I'd do it myself but am not able to at the moment.
Well, it looks like the submission process[0] and the maintainer[1] do, actually.
[0]: https://www.reddit.com/r/rust/comments/kpqmrh/rust_is_now_ov... [1]: https://www.reddit.com/r/rust/comments/kpqmrh/rust_is_now_ov...
In addition to what you've already mentioned, if I'm not mistaken, the programs also take different multi-threading approaches. One of the tasks in this benchmark involves reversing three very large strings. My C program took a fairly simple approach: it reverses each string serially but processes multiple strings in parallel. The Rust program does the opposite: it parallelizes the reversal of each string but only processes one string at a time. The algorithm the Rust program uses is more complicated, which results in a considerably larger program that is also a bit less flexible regarding input (it assumes each line of input is 60 characters, whereas my program will work with lines of any length), but it achieves better CPU utilization, less wall-clock time, and less memory use. I've been meaning to submit a faster C program using this faster algorithm but have simply been too busy with other things.
If I had to write a performant library at work, I too might rely on CPU-specific assembly wrappers in my code. But IMO, such code has no place in a general-purpose cross-language benchmark site.
My guess is that other people would take your "performant library at work" premise as justification for including that code.
C code that pretends it does not need to care about its platform is not idiomatic; it's just suboptimal.
- Code that most C books/courses would teach you how to write
- Portable C code (arguably portability is one of C's biggest successes!)
- Code that you'd expect to find in the K&R book
These are the kind of things that make working in one language different from another.
I'd actually like to see the SSE version in C so I can compare the two implementations and how much grief you have to go through.
Some years ago I submitted an n-body implementation that used the crunchy crate to unroll the inner loop. This was rejected as being non-idiomatic and obscuring what compilers are/aren't capable of. Rust is currently leading the benchmark because someone added the flag -C llvm-args='-unroll-threshold=500' which achieves the same effect. Why one of these is acceptable and the other isn't is beyond me, and all of this makes the whole project very discouraging.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
What impresses me is that the Rust version didn't do any of that stuff, just wrote very boring, straightforward code -- and got the same speed anyway. Some impressive compilation there!
1. https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
What I don't fully understand is: "GCC has the option -fstrict-aliasing which enables aliasing optimizations globally and expects you to ensure that nothing gets illegally aliased. This optimization is enabled for -O2 and -O3 I believe." (source: https://stackoverflow.com/a/7298596)
Doesn't this mean that C++ programs compiled in release mode behave as if all pointers are marked with __restrict?
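As I understand it, not quite: -fstrict-aliasing is a type-based rule (pointers to *different* types are assumed not to alias), while __restrict is a per-pointer promise that even same-typed pointers don't overlap. A sketch of the type-based assumption (the function name is mine):

```c
#include <stdint.h>

/* Under -fstrict-aliasing the compiler may assume the store through
 * fp cannot modify *ip, because float and int32_t are incompatible
 * types, so the return can be constant-folded to 1. Two int32_t*
 * parameters, by contrast, may still alias each other unless marked
 * restrict. Calling this with actually-aliased pointers would be UB. */
int32_t type_based(int32_t *ip, float *fp) {
    *ip = 1;
    *fp = 2.0f;   /* assumed not to touch *ip */
    return *ip;   /* foldable to 1 without a reload */
}
```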
I don't think there is a fair comparison between Rust and C: C is just a higher-level assembler, and if the programmer knows what he's doing, he can use the hardware to 100% of its potential. That is the reason C is still used in all the embedded applications where you have ridiculously low-power microcontrollers and you must squeeze out the best performance.
That is the difference between C and Rust to me: for each fast Rust program you are guaranteed that you can write an equivalent performant program in C (or assembly). Worst case scenario you use inline assembly in C and you get that.
But the converse is not true for Rust: if I give you a heavily optimized C program, you cannot always produce an equivalent version in Rust.
Also, these optimizations are not always what you want. In C you can choose the level of optimization, and most of the time, at least for the programs I write, I choose a low level of optimization. The reason is that a lot of the time performance is not the only thing that matters; what may matter most is the stability of the code (and code compiled with optimizations is more likely to contain bugs) or the ability to debug (and thus the readability of the assembly output of the compiler).
Rust emits horrible assembly code that is impossible to debug or to check for correctness. You just have to hope that the compiler doesn't contain bugs. For the same reason, Rust is the ideal language for writing viruses, since it's difficult to reverse engineer.
[1]: https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
A simpler example of this is Fortran, which can be faster for numerical loads because it disallows aliasing of function arguments. C, on the other hand, must pay the price of being conservative with how it treats arguments, just in case they overlap.
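C99's restrict qualifier lets a C function opt into the Fortran-style guarantee per pointer. A minimal sketch (the function name is mine):

```c
#include <stddef.h>

/* With restrict, the compiler may assume x and y never overlap -- the
 * same guarantee Fortran gives for dummy arguments -- which enables
 * vectorization without runtime overlap checks. Passing overlapping
 * pointers here would be undefined behavior. */
void saxpy(size_t n, float a,
           const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];   /* x[i] loads can't be clobbered by y stores */
}
```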
Rust is a good lang though. I am glad something else is pushing up there for the top spots. And more competition in performance is a good thing.
You don't always have to pick your sides. I just want to be able to write good code and be happy writing it :)
OpenMP is only available for C, C++, and Fortran, which is why most programs won't use it. However, most other programming languages have their own ways of doing multi-threading, and many of the programs for this benchmark do make use of them.
The rules at https://benchmarksgame-team.pages.debian.net/benchmarksgame/... request that submitters use the same algorithms as existing programs and I try to follow that rule. I believe all the binary-trees programs are using recursive functions so naturally I did the same to avoid breaking the rule about using different algorithms.
But yes, some of these do provide hints for the compiler, e.g. constexpr, ownership semantics per unique_ptr. There's nothing stopping a human from writing equivalent C, so my suspicion is that the performance gap is primarily due to the benchmark implementation.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
I know that using PCRE2 in a Python program isn't typical, but the Benchmarks Game does allow using other libraries, and many of the previously submitted programs have been doing this for a long time already. The pidigits benchmark is the other benchmark that is heavily dependent on the libraries being used, as is illustrated by there being a nearly ten-way tie for second place among a bunch of programs that all use GMP.
(Looking at just C gcc vs Rust) https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
I tried running the C code with 2 and 4 threads, wall-time and CPU time don't change much in either case which is strange (this is cygwin with gcc 10.2.0):
2 threads:
$ /usr/bin/gcc -pipe -Wall -O3 -fomit-frame-pointer -march=ivybridge -fopenmp fasta.c -o fasta.gcc-2.gcc_run && time ./fasta.gcc-2.gcc_run 25000000 | md5sum
fd55b9e8011c781131046b6dd87511e1 *-
real 0m0.724s
user 0m1.468s
sys 0m0.108s
4 threads: $ /usr/bin/gcc -pipe -Wall -O3 -fomit-frame-pointer -march=ivybridge -fopenmp fasta.c -o fasta.gcc-2.gcc_run && time ./fasta.gcc-2.gcc_run 25000000 | md5sum
fd55b9e8011c781131046b6dd87511e1 *-
real 0m0.670s
user 0m1.514s
sys 0m0.046s

I know that Python 3 can do some runtime stuff that Node.js can't, but I wonder whether that's worth so much performance. Maybe the answer is that you would include C modules in Python if you need the speed, but I don't know if that's a good answer to the problem.
Examples? This is too interesting to just throw in then hand wave away.
There are probably others as well, but this is the advantage I'm familiar with. C's ambiguity makes it harder to achieve some optimizations that can really matter on modern CPUs.
The tracking bug is https://github.com/rust-lang/rust/issues/54878
And then there are the whole LLVM and GCC based eco-systems.
It is gratifying to see C++ identified, here, as the hands-down fastest implementation language, but odd to see Rust performance still compared, in the headline, to C, as if that were the goal. The headline should say that Rust speed is approaching C++ speed. In principle, Rust speed should someday exceed C++'s, in some cases, because it leaves behind some boat anchors C++ must retain for backward compatibility. In particular, if compiler optimizers could act on what they have been told about language semantics, they should be able to do optimizations they could not do on C++ code. As it is, Rust relies on optimizers coded for C and C++.
Rust may never reliably beat C++, because Rust has itself committed to details that interfere with optimization, principally in its standard library: The C++ Standard Library offers more knobs for tuning, while the Rust libraries are simpler to use.
Some of the reasons that C++ is so reliably faster than C are subtle and, to some, surprising. As noted elsewhere in this thread, the compiler knows more about what the language is doing, and can act on that knowledge, but C and C++ compilers share their optimizer, so that makes less difference than one might guess. Mainly, the C++ code does many things you might do in C, but in exactly one way that the optimizer can recognize easily.
The biggest reason why C++ is so much faster than C is that C++ can capture optimizations in libraries and reliably deliver those optimizations to library users. The result is that people who write C++ libraries pay a great deal of attention to performance because it pays, and that attention directly benefits all users of the library, including other libraries.
Because you can capture more semantics in C++ libraries, C++ programs naturally use better algorithms than C programs can. C programs much more frequently use pointer-chasing data structures because those are easier to put into a C library, or quicker to open-code, where the corresponding C++ program will use a library that has been hand-tuned for performance without giving up anything else.
Rust gets to claim many of the same benefits, because it is also more expressive than C, and Rust libraries are often as carefully tuned. Rust is not as expressive as C++, yet, and has many, many fewer people working to produce optimal libraries, but it is doing well with what it has.
One example: a C rewrite of the popular chess engine Stockfish (written in C++) is significantly faster (10%-30%, depending on who measures it, at which time, and on what hardware). This is a piece of code that had already been heavily optimized for many years, as speed is critical for the engine's performance. One guy (although a very talented one) rewrote it in C and got the speed gains. Another guy (also very talented) rewrote it in assembly and got even bigger speed gains (see the CFish and asmFish projects).
It looks to me like you are comparing badly written C to decently written C++ and claiming speed gains. I use C for my own software (a solver for a popular game) and I wouldn't dream of using things like "pointer-chasing structures". If you need something like a hash table in your performance-critical code, you have to code it yourself anyway, as using a generic hash and memory handling algorithm is a recipe for being slow.
I suspect surprising factors are in play.
For instance, in real-world usage, you might do a bit more redundant copying in C in places you would avoid it with the borrow checker in Rust.
Conversely, enforcing RAII in Rust might cost performance in some scenarios relative to C.
The smallest "hello world" binary rustc has ever produced was 137 bytes. https://github.com/tormol/tiny-rust-executable
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
2) The home page (click the banner) has links which compare benchmarks across two language implementations.
3) Each of those comparison pages has links which compare all the programs, for all the language implementations, for each benchmark.
If the Rust implementation is faster than C's, kudos go to the compiler, not to the language.
Both C and Rust are capable of just inlining optimal assembly, so comparing "pure speed potential" is pointless.
Space.
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...
Why is Julia using 250x the memory? (And it's still fast.)
If I wrap things in a function and run it a second time in the repl, this is what I get:
$ JULIA_LLVM_ARGS="-unroll-threshold=500" ~/julia-b00e9f0bac/bin/julia -O3 --check-bounds=no
julia> include("nbody.jl")
run (generic function with 1 method)
julia> @time run(50000000)
-0.169075164
-0.169059907
2.044478 seconds (294.25 k allocations: 12.599 MiB, 8.89% compilation time)
julia> @time run(50000000)
-0.169075164
-0.169059907
1.867936 seconds (12 allocations: 1.750 KiB)
So, the second time there's no compilation overhead, just like in Rust, C, C++.

If you are willing to spend the time to perform at the highest level, Rust can bring you there.
I don't get those. What's the difference?
Basically, it's toy programs written in each of these languages. They measure how fast each one executes. This methodology does have its limitations, which that page is upfront about.
Also is there also a way to filter out posts with the word "Rust" in it so I don't see them?
You are correct that the C feature is often banned. Rust doesn’t even support it at all.
There are finite limits to the complexity cost developers are willing to pay for performance. Because the cost in each of these three languages is different to express some thing, sometimes there will be a threshold where in one or more of these languages most developers will choose a less optimal design. This manifests as practical performance differences in real software even though in theory they are equally expressive with enough effort.
This comes with another tradeoff. Efficient expressiveness relative to software performance comes at a cost of language complexity. C is a simple language that has enough efficient expressiveness for simple software architectures. C++ is at the opposite extreme: you can do mind-boggling magic with its metaprogramming facilities that can express almost anything optimally, but god help you if you are trying to learn how to do this yourself. Rust sits in the middle; much more capable than C, not as expressive as C++.
This suggests the appropriate language is partly a function of the investment a developer is willing to make in learning a language and what they need to do with it. Today, I use modern C++ even though it is a very (unnecessarily) complex language and write very complex software. Once you pay the steep price of learning C++ well, you acutely feel the limitations of what other languages can't express easily when it comes to high performance software design.
I used to write a lot of C. It still has critical niches but not for high-performance code in 2021, giving up far too much expressiveness. C++17/20 is incredibly powerful but few developers really learn how to wield that power, though usage of it has been growing rapidly in industry as scale and efficiency have become more important. Rust is in many ways an heir apparent to C and/or Java, for different reasons. Or at least, that is how I loosely categorize them in my head, having a moderate amount of contact with all three. They all have use cases where you probably wouldn't want to use the others.
Yep. Rust's safety means that in some cases, you can be more aggressive because you know the compiler has your back. And in other cases, it makes harder things tractable. The example of Stylo is instructive; Mozilla tried multiple times to pull the architecture off in C++, but couldn't manage to do it until Rust.
Heh, well. The downvotes are clearly trying to tell me something, though I'm not sure what. I can guess. I assumed that this was something widely agreed upon at this point but clearly I assumed incorrectly.
I think ATS is in that category too but it was removed from the benchmarks game at some point. It's also vastly more esoteric and complicated than Rust is from my limited knowledge. Perhaps someday Zig will get there as well. I wouldn't have expected that this niche would ever see so much exploration, as C and C++ were the only game in town for so long.
(*) of course part of the output is also looking at the Rust code and evaluating style and `unsafe`-wise what the price for winning was.