Reevaluating a low-level programming language is something that's done in a large organisation or project once every 15-25 years or so. Switching such a programming language incurs a high cost and a high risk, and is a long-term commitment. To make such a switch, the new language obviously has to be better, but that's not enough. It has to be a hell of a lot better (and, if not, at least return the investment with a profit quickly).
For some, Rust is better enough. For me, not nearly so. Even though it offers a fascinating and, I think, ingenious path to better safety, it shares what is, for me, C++'s greatest downside: both are extremely complex languages. Maybe Rust is simpler, but not enough. Rust also shares what I think is C++'s original misguided sin: the attempt to create a low-level language whose code appears high-level on the page by means of a lot of implicitness. I've become convinced that that's a very, very bad idea.
If there were no other ideas on the horizon, or if Rust seemed like a surefire success, it might have been justified to make such a switch, but that's not the case. Rust's low adoption rate in professional settings is not reassuring to "PL-cautious" people like me, and a language like Zig shows that there are other approaches that appeal to me more. While Zig is more revolutionary and ambitious than Rust in its departure from C++'s philosophy, I think it also has the potential to be better enough. Maybe it will make it, maybe it will inspire some other language that will, or maybe other ideas will turn up. Given the risk and commitment, to me it makes sense to wait. I don't like C++; I believe Rust is better. But that's not enough.
Zig might be revolutionary, but personally, Rust is why I can dare write non-GC code. I never thought I would ever be robotically precise with memory management on a large project in memory-unsafe languages, and that doesn't change with Zig. There are experts in C/C++, there will be experts in Zig, and we will continue having memory-safety CVEs because they "don't make mistakes", "can use some other static analysis tool", or "can just test" :)
I think people comfortable writing low-level code underappreciate what Rust has done for bystanders, newcomers, and large teams. It is definitely complex, but choosing C/C++/Zig over it would make even less sense, given the lack of guardrails.
Can you maybe summarize a bit more why you prefer it over Rust?
Rust's concept of being safe by default and only optimizing critical parts sounds solid to me, but after a quick skim I have not seen such a feature in Zig; possibly that's by design?
" No hidden control flow. No hidden memory allocations. No preprocessor, no macros."
This is something that you definitely can't do with C++ and probably can't do with Rust.
For some people it's more comforting to be able to understand the entirety of the language and focus on the complexities of the problem and the implementation than to have a slightly higher level and larger language taking over the details. It's the same reason people tend to like C over C++ or Go over other languages.
This also generally translates into a couple of technical benefits, such as much, much faster compilers relative to more complicated languages like C++ and Rust.
Personally I still find Rust much more comfortable to write code in generally but I really appreciate the elegance of the Zig approach.
Oh boy, and I was just reading (https://news.ycombinator.com/item?id=30022022) about ISO C's unsuitability for OS programming, especially its implicit reordering of the code's control flow, which breaks parts that are sensitive to the generated machine code.
I wonder if that point was introduced as a response to this ISO C nuance. Does this mean the machine code generated from Zig code will explicitly match what each statement does?
It's much worse in one way: interoperation with existing C++ code. Sure, that's not a fair criterion - C++ is designed in a way that makes it almost impossible for other languages to use C++ libraries without a heavyweight wrapper like SWIG. However, even though it's not fair, it's still really important. If you have a project like LLVM with millions of lines of existing C++ code, adding expensive or complicated interop boundaries between different components within your system is not an acceptable price to pay.
Of course, no language will be perfect for a specific domain. There is no objective metric: every individual has their own needs, and no universal language can cater to all of them at once. There will be trade-offs.
Rust is already better enough (and it will possibly take decades before another language "succeeds" it). Choose it, or remain in perpetual expectation of a "perfect" language.
Disagree. I think the "market" (the set of programmers) defines "enough". When a language is enough better (in some area, doesn't have to be all areas) you see widespread adoption.
C was enough better than PL/I, ALGOL, and assembly. Java was enough better than C++. (Why? Garbage collection, and the huge standard library.)
So far, Rust is not enough better than C++.
Now, I know this is kind of circular. I'm saying that a language is "enough" better if it wins in the market, and I'm saying that a language wins in the market if it's enough better. But I have some trust in programmers, that they are not just sheep. If a language is better than other languages in a way that matters to actual working programmers, a fair number of them will use it.
The problem in C++ is that the surface where you might cause a memory problem is huge. Once a problem is there, it's a lot of work to test hypotheses about where it is hiding. On top of that, these kinds of issues can escape your instrumentation in a way that other bugs tend not to: add some debug lines, things get accessed differently -> Heisenbug. Mega pain in the ass to figure out, lots of time taking everything apart, sprinkling debug lines, running long tests to catch that one time it goes wrong in a million, and so on.
He's also right that the array access thing is not a huge thing, it can't possibly be what your decision turns on, and that most of the code doesn't have a tradeoff in performance, because it's in the config stage rather than the hot path.
Personally I've had a great time with Rust, it's far more productive than other typed languages I've used. On a business level, the issue with the type of bug mentioned above is it destroys your schedule. I've spent entire weeks looking at that kind of thing, when I was expecting to be moving on with other parts of my project. With my current Rust stuff, I'm doing what I expect to be doing: addressing some issue that will soon be fixed like adjusting some component to fit a new spec.
But, it really doesn’t take a very long post to talk about this. The remainder goes off the rails, talking about “C++ apologists” (hint: if you’re being “fair”, pick words that are unlikely to cause people to be preemptively upset. This is not one of those words) and their stupid opinions. And the author just trashes them as being complete idiots, but it’s obvious that the arguments come from inexperience or strawmen, which just makes the overall thing not particularly convincing. Saying that the various UB finding tools were useless because you tried using them and didn’t get good results is stupid. Being smug about “people who use modern C++ clearly can’t do HFT, which is the thing that you said you were using C++ to do” is also insipid, just because you spotted the use of a shared_ptr somewhere and read how it’s not zero-cost. Modern C++ has other things in it, you know, many of which are zero-cost and significantly (but not entirely) safer; picking one thing and misrepresenting it does not make for a good refutation.
Anyways, coming from someone who writes a lot of C++ and would also like a lot of code to be migrated to Rust for good reasons, it’s a good idea to approach the tradeoffs honestly and without disdain for those who aren’t convinced yet. The core argument I mentioned above and the closing part of the article does do this…but there’s a lot in the middle that doesn’t, and it drags down the usefulness of the post.
On the subject of safe defaults, just to correct that Rust does not in fact have as much default memory safety with regard to buffer bleeds (e.g. variants of OpenSSL's Heartbleed) as it could [1], because it has unchecked arithmetic (integer wraparound) as the default in release builds for performance, with checked arithmetic only as an opt-in for safety.
In other words, if an attacker can get some bounds merely to underflow (as opposed to overflow) then they can still read the sensitive memory of a Rust program, even without a UAF or buffer overflow.
Bleed vulnerabilities like these are also low-hanging fruit and significantly easier to exploit.
In other words, bounds checking only ensures you are within the buffer, but checked arithmetic is still needed to ensure that your index was correctly calculated in the first place.
I believe that Rust would be much safer against memory bleeds, if it had checked arithmetic enabled by default for safety, with an opt-out at the block scope level for performance, like Zig has.
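The underflow scenario described above can be sketched in safe Rust. This is a hypothetical ring-buffer example (all names invented for illustration); `wrapping_sub` is used to model what the plain `-` operator does by default in release builds, where overflow checks are off:

```rust
fn main() {
    const LEN: usize = 8;
    // A buffer whose unused slots still hold stale "secret" bytes.
    let buf: [u8; LEN] = [b'p', b'u', b'b', b'S', b'E', b'C', b'R', b'T'];
    let write_pos: usize = 3; // only buf[0..3] is meant to be public

    // Attacker-influenced count. wrapping_sub models release-mode `-`,
    // which silently wraps on underflow (overflow-checks = off).
    let n: usize = 5;
    let idx = write_pos.wrapping_sub(n) % LEN; // underflows, then wraps back in bounds

    // The bounds check passes: idx (here 6) is a valid index into buf,
    // just not the one the programmer intended. No panic, no UB —
    // the stale byte is simply leaked.
    assert!(idx < LEN);
    println!("leaked byte: {}", buf[idx] as char);
}
```

Note the bounds check still fires if the wrapped value lands outside the buffer; the bleed only occurs when modular arithmetic (as in a ring buffer) maps the wrapped index back inside it.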
It strikes me as extremely dubious to tout Zig, which doesn't have memory safety at all, as somehow superior to Rust because Zig has this mitigation enabled by default (with a large performance cost) for an error class that Rust forestalls the vast majority of the negative consequences of, by dint of memory safety.
1. not doing the bounds check. 2. not storing the bounds with the array.
> But wait! The C++ apologists are still talking! What are they saying? How have they not been completely flummoxed?
is just one small sample. I came out of the article liking Rust a little bit less than when I went in (irrational, I know, but true).
The quote from The Big Lebowski comes to mind: you're not wrong, author, ...
One unaddressed issue in Rust is that this could easily happen inside a crate, and diagnosing it could be hard, especially due to implicit behavior introduced via procedural and attribute macros. Also, even if your code is safe, there could still be an unsafe block at the end of any long safe call chain. I haven't been able to convince myself that this isn't just an illusion of overall safety.
Experienced Rust writers try to be very clear about this, but it's a subtle point that's hard to fit into an elevator pitch for the language.
Safety in Rust is an encapsulation mechanism, and it's closely related to privacy. In fact, we can use privacy as a good metaphor. Suppose we have a private member variable x, in any language that supports such a thing. And let's say the design of our class is such that x should always be less than 10. Does the fact that x is private mean that we're guaranteed it will always follow that invariant? No of course not, because the public methods of our class could have bugs in them that screw up the value of x. However, we still get a useful guarantee here! We're guaranteed that x can only violate its invariants if our methods have a bug. We don't have to worry about what any specific caller is going to do, because privacy rules let us make guarantees about x solely based on our code.
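The privacy metaphor above can be sketched in Rust with a hypothetical `Counter` type (names invented for illustration) whose invariant is `x < 10`:

```rust
mod counter {
    pub struct Counter {
        x: u8, // private: only this module can touch it
    }

    impl Counter {
        pub fn new() -> Self {
            Counter { x: 0 }
        }

        pub fn get(&self) -> u8 {
            self.x
        }

        // Correct method: upholds the invariant x < 10.
        pub fn bump(&mut self) {
            if self.x < 9 {
                self.x += 1;
            }
        }

        // Buggy method: it can push x past the invariant. Privacy
        // doesn't prevent this bug, but it guarantees the invariant
        // can only break *here*, inside this module's own code.
        pub fn buggy_bump(&mut self, by: u8) {
            self.x = self.x.saturating_add(by);
        }
    }
}

fn main() {
    let mut c = counter::Counter::new();
    c.bump();
    assert!(c.get() < 10); // holds: bump is correct
    c.buggy_bump(200);
    assert!(c.get() >= 10); // broken — but only by the module's own bug
    // c.x = 99;            // compile error: `x` is private
}
```

Callers can never break the invariant directly; auditing it only requires reading the module's own methods, which is exactly the shape of Rust's safe/unsafe boundary.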
Safety in Rust is similar. If Vec is buggy and unsound, then its "safe API" isn't providing much value. But if I manage to cause UB using Vec, that is a bug report for the Rust standard library, and they will fix it. Once the bug is fixed, then my safe calling code cannot cause UB using Vec, no matter how hard it tries.
These end up being very useful guarantees in practice. Many nontrivial programs can be written entirely in safe code, using only high-quality dependencies that get a lot of testing. Surprise soundness holes come up occasionally, but they're kind of like miscompilation bugs, in that it's usually hard to trigger them.
You don't need to answer a question of "is my 1-million-line codebase safe?", but rather "does this 10-line function uphold Rust's invariants?". It may be a tricky question, but you can focus on it in isolation. The contract between crates is safe, so once you've proven the dependency upholds the contract, you can rely on all its usages being safe.
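A minimal sketch of what "does this 10-line function uphold Rust's invariants?" looks like in practice (the function here is hypothetical, for illustration): the entire safety argument is the one `SAFETY` comment, and it can be checked without reading any caller.

```rust
/// Returns the first half of a slice.
fn first_half(s: &[u8]) -> &[u8] {
    let mid = s.len() / 2;
    // SAFETY: mid <= s.len() always holds, since len / 2 <= len,
    // so the unchecked range access can never go out of bounds.
    unsafe { s.get_unchecked(..mid) }
}

fn main() {
    let data = [1u8, 2, 3, 4, 5];
    assert_eq!(first_half(&data), &[1, 2]);
}
```

A million lines of safe callers can then rely on `first_half` without re-proving anything; the local proof is the whole proof.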
The author spent 1 page before this statement, and the whole article after it, explaining that this is not true, so the article is a big contradiction.
Rust and C++ are not "in the exact same place".
With Rust, you get bound checking by default. If, after profiling, you find that it is a performance problem somewhere, it allows you to elide it safely. In the programs I work on, 99% of the execution time of my program is spent in 1% of the code, and Rust optimizes for this situation. Instead of debugging segmentation faults due to performance optimizations that buy you nothing in 99% of the code, you can spend your time optimizing the 1% that actually makes a difference.
This is why Rust libraries and programs are "so fast". It's not because of multithreading, or because Rust programmers are geniuses, but rather because Rust buys these programmers time to actually optimize the code that matters, and in particular, to do so without introducing new bugs.
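The workflow described above can be sketched like this (hypothetical functions, for illustration): the default indexed loop is bounds-checked, and one safe way to elide the checks in a hot path is to rewrite it with iterators, with no `unsafe` involved.

```rust
// Default style: every v[i] is a checked access. The optimizer can
// often prove i < v.len() and drop the check, but it isn't guaranteed.
fn sum_indexed(v: &[u64]) -> u64 {
    let mut total = 0;
    for i in 0..v.len() {
        total += v[i];
    }
    total
}

// Hot-path style after profiling: no index at all, so there is no
// bounds check to pay for — and still no unsafe code.
fn sum_iter(v: &[u64]) -> u64 {
    v.iter().sum()
}

fn main() {
    let v = vec![1u64, 2, 3, 4];
    assert_eq!(sum_indexed(&v), sum_iter(&v));
    assert_eq!(sum_iter(&v), 10);
}
```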
This is not the case. Due to instruction-level parallelism, throughput may be unaffected, but you will always pay a latency penalty: the CPU still needs to run the check (access the length and compare it to the index), and that adds latency. On top of that, it also increases code size, which can impact the instruction cache and binary size. It's a small penalty, but it's not 0.
I think this article ignores some arguments for array bounds checks and it ignores the importance of what the default is:
- It doesn’t matter how fast or slow bounds checking is in theory. It only matters how fast it is in practice. In practice, the results are quite surprising. For example, years ago WebKit switched its Vector<> to checking bounds by default with no perf regression, though this did mean having to opt out a handful of the thousands of Vector<> instantiations. Maybe this isn’t true for everyone’s code, but the point is, you should try out bounds checking and see if it really costs you anything rather than worrying about hypothetical nanoseconds.
- If you spend X hours optimizing a program, it will on average get Y% faster. If you don’t have bounds checks in your program and your program has any kind of security story, then you will spend Z hours per year fixing security-critical OOBs. I believe that if you switch to checking bounds then you will instead get Z hours/year of your life back. If you then spend those hours optimizing, then for most code, it’ll take less than a year to gain back whatever perf you lost to bounds checks by doing other kinds of optimizations. Hence, bounds checking is a kind of meta performance optimization, because it lets you shift some resources away from security to optimization. Since the time you gain for optimization is a recurring win and the bounds checks are a one-time cost, the bounds checks become perf-profitable over time.
- It really matters what the language does by default. C++ doesn’t check bounds by default. The most fundamental way of indexing arrays in C++ is via pointers and those don’t do any checks today. The most canonical way of accessing arrays in Rust is with a bounds check. So, I think Rust does encourage programmers to use bounds checking in a way that C++ doesn’t, and that was the right choice.
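Those defaults can be illustrated on the Rust side (example values invented): the canonical index syntax is checked, the fallible alternative is explicit, and opting out is visibly marked.

```rust
fn main() {
    let v = [10, 20, 30];

    // Default: checked. An out-of-bounds index panics;
    // it never becomes a silent out-of-bounds read.
    assert_eq!(v[1], 20);

    // Fallible alternative: an Option instead of a crash.
    assert_eq!(v.get(5), None);

    // Opting out of the check requires the `unsafe` keyword,
    // which makes the decision stand out in code review.
    let x = unsafe { *v.get_unchecked(2) };
    assert_eq!(x, 30);
}
```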
As a C++ apologist my main beef is: if bounds checks are so great then please give them to me in the language that a crapton of code is already written in rather than giving me a goofy new language with a different syntax and other shit I don’t want (like ownership and an anemic concurrency story).
I don't think "safe" or "unsafe" can be a property of code; it can only be a property of something you do, like changing code. I think that something being "unsafe" means that there is a risk with doing it. Programming is always a risk, even if you write Rust code without using the "unsafe" keyword. You can even have arbitrary code execution bugs in Rust programs without using the "unsafe" keyword; think about bugs like SQL-injections.
All of this doesn't mean that I don't think the checks the Rust compiler does help. They probably help many people write less buggy code. I just think it makes no sense to call code "safe" or "unsafe".
Like if you only want to access the data inside the bounds of the array, that's one intention that the compiler will help you to check. If you never intend to access the first element in the array, that's an intention that the compiler doesn't help you to check.
So, there is no memory-safe code or language. I think the only way you could define memory-safe code is that memory-safe code can't contain any code that breaks some rules that the compiler checks for. The problem with that is that those rules could be just about anything, so that definition is pretty useless.
Languages like Rust add a layer of safety on top of the operating system's layers. The problem is that even if you have safety on one layer, the next layer will always be unsafe, and as long as you have abstractions in your code, you will always have layers.
Let's say you build some kind of abstraction on top of Rust arrays. The compiler will do bounds checks on the arrays but your abstraction will have no checks unless you implement them. Let's say that some state of your abstraction is invalid; the compiler will not help you to check that.
Therefore you can't have a safe language, because even if one layer is perfectly safe, as soon as you add an abstraction layer, you have no safety checks on that layer. SQL injections are an example of that; even if SQL were a perfectly safe language, as soon as you add a layer on top of that (a function that builds SQL code by concatenating strings) you are back to no safety.
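The layering point above can be sketched with a hypothetical `SortedVec` (all names invented for illustration): the compiler checks array bounds on the layer below, but it cannot check the abstraction's own invariant, sortedness.

```rust
/// Wrapper whose intended invariant is: the inner Vec is always sorted.
struct SortedVec(Vec<i32>);

impl SortedVec {
    fn new(mut v: Vec<i32>) -> Self {
        v.sort();
        SortedVec(v)
    }

    // Bug: silently breaks the sortedness invariant, yet this is
    // 100% "safe" Rust — the compiler has nothing to say about it.
    fn push_back(&mut self, x: i32) {
        self.0.push(x);
    }

    // binary_search assumes sortedness; on an unsorted vec its result
    // is unreliable (a logic bug), though still memory-safe.
    fn contains(&self, x: i32) -> bool {
        self.0.binary_search(&x).is_ok()
    }
}

fn main() {
    let mut v = SortedVec::new(vec![3, 1, 2]);
    assert!(v.contains(2));
    v.push_back(0); // invariant silently broken
    // contains() may now return wrong answers — a logic bug the
    // compiler never checks, but never undefined behavior either.
}
```

The defensible claim for Rust is therefore narrower than "no bugs": broken invariants on the new layer stay logic bugs instead of escalating into memory corruption.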
Outside of your simple example C code, there exists C code which can only be memory-safe if the compiler implements a heavy runtime: track pointer allocations, track where pointers originate, raise an error when a pointer is used in an undefined context. See how much work Valgrind does to achieve a subset of this.
You could consider C code safe if you included a machine-verifiable proof of memory safety with the code, but that's ridiculously more effort than using Rust.
In short, you're arguing semantics over the use of the word safe/unsafe when there's a clear definition Rust offers. You can argue that safe code still has bugs, but that's beside the point
This is interesting. Where can I read more about this?
Would like to add that, at least in plain C, doing var[index] doesn't invoke any checked() or unchecked() access function call. It's compiled straight into assembly instructions that calculate the address where the data is expected and load it from memory, in one or two instructions.
> A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).