Modern languages might do more than C to prevent programmers from writing buggy code, but if you already have bug-free code due to massive time, attention, and testing, and the rate of change is low (or zero), it doesn’t really matter what the language is. SQLite could be assembly language for all it would matter.
This jibes with a point that the Google Security Blog made last year: "The [memory safety] problem is overwhelmingly with new code...Code matures and gets safer with time."
https://security.googleblog.com/2024/09/eliminating-memory-s...
https://www.sqlite.org/cves.html
Note that although code matures, the chance of human-error bugs in C will never go to zero. We have some bad incidents like Heartbleed to show this.
Too few maintainers, too few security researchers and too little funding.
When writing systems as complicated and as sensitive as the leading encryption suite used globally, no language choice will save you from under resourcing.
Instead what I see _mostly_ is re-writes and proposed re-writes of existing software, often software that has no networking functions, and/or relatively small, easily audited software that IMHO poses little risk of memory-related bugs
This is concerning to me as an end-user who builds their software from source because the effects on compilation, e.g., increased resource requirements, increased interdependencies, increased program size, decreased compilation speed, are significant
That is, nobody perceives, say, "the silver searcher" as being some sort of nefarious plot to re-write grep, but they did with ripgrep, even though that's not what it is trying to do.
There are a few projects that are deliberately doing a re-write for reasons they consider important, but they're not "just because it's in Rust," they're things like "openssl has had memory safety issues, we should use tools that eliminate those" or "sudo is very important and has grown a lot of features and has had memory safety issues, we should consider cutting some scope and using tools that help eliminate the memory safety issues." But those number of projects are very small. Heck, even things like uutils were originally "I just want to learn Rust" projects and not some sort of "we must re-write the world in Rust."
I have a suspicion as to why this perception exists in the C++ crowd.
I don't think it's because of evangelists. C++ was that evangelized language in the early 90's, but after a period in the sun it then survived the evangelism of Java, Python, Go, and others, despite losing ground to them in general purpose tasks.
And that's because all of those other languages, while being much safer than C++, came at the cost of performance or the overhead of a runtime. There was really no other language that had the expressiveness of C++ while allowing the developer to get down to the bare metal if they needed speed.
The existence and growing popularity of Rust changes that calculus, and I imagine that makes certain developers who might have a lot of investment in the C++ ecosystem defensive, causing them to overreact to any perceived slight in a way that other languages simply don't provoke.
Designing new software is orders of magnitude more difficult than iterating on existing software
Public domain. No "copyleft" license needed
Being written in a small, fast, "old and boring" language may be part of what makes SQLite appealing. Another (related) part may be the thoughtfulness and carefulness of its author, e.g., "time, attention and testing"
The latter, i.e., the author's "time, attention and testing", may matter more than the former, i.e., the author's language choice
As suggested by the top comment, in effect the author's language choice, by itself, may not matter with respect to the issue of "safety". If true, then even an "unsafe language" may not reduce the "safety" of SQLite^1
djb's software is also public domain and written in an "unsafe language", mostly the same "old and boring" one as used to write SQLite. Like SQLite it is appealing to many people and is found in many places^2
1. But the thoughtlessness and carelessness of an author, no matter what language they choose, is still relevant to "safety". In sum, "safety" is partly a function of human effort, e.g., "time, attention and testing", not simply the result of a language choice. Perhaps "safe" and "unsafe" are adjectives that can apply to authors as well as languages
This is obviously not analogous to Rust evangelism that targets projects written in C
The author claims the program is a clone of ack; ack is written in a "safe" language
Few things in this life are novel. Regardless, when I wrote ripgrep, I was focusing on writing new and better software. I perceived several problems with similar tools at the time and set out to do something better.
Imagine if people actually listened to whinging like your comment. I'm glad I never have.
We will see. On the Rust side there is Turso which is pretty active.
It's an open sourced project! It's public domain!
If you make an open source project that is heavily used and widely lauded for a quarter century before being supplanted by a newer solution that's better, do you know what that is?
A success! You did it! You made a thing that was useful and loved!
Nothing lasts forever; there's nothing wrong with a project filling a niche and then gracefully fading away when that niche goes away.
Turso is arguably positioned slightly better as a standalone product seeing as it's using a more traditional open source "bazaar" model, as opposed to SQLite's source available "cathedral" model.
So SQLite is still the bar
I guess we could use Rust and I might be wrong on this, but it seemed like it would be a lot of work to utilize it compared to just continuing with C and gradually incorporating Zig, and we certainly don't write bug free C.
I don’t get that. You had trouble optimizing Go, so you went with Python?
I personally really like Go, but I feel like I now have a better understanding of why so many teams stick with c/c++ without even considering adopting Go, Rust or similar.
Memory bugs are often implicated in security issues, but other bugs are more likely to cause data loss, corruption, etc.
Don't get me wrong, despite not taking time to learn Rust at this time, I am aware that memory safety is the thing at this time. Yes... some software might do better being rewritten from C to Rust. However, there are other projects that have been worked on for years and years. SQLite is an example of this. That quote above is 100%
Various GNU tools are being replaced. Don't just expect them to be 100% despite being "memory safe" -- they will have to go through various tweaks to boost performance. With Rust likely (still) to go through some further changes, I am sure Linus will get frustrated at some points within the Linux Kernel. We shall see.
Point is - some things are just better left in their mature state. Though.. on the flip side, we have to think about the younger generation. Will SQLite survive if it continues to use C? It's likely to be a language that each new generation will not bother with, focusing instead on Rust, Zig.. or whatever comes out in the future.
I am just waiting for rewrite of Doom or Quake (I am sure they already exist if I can be bothered to search.. in Rust)
Not only was Apple able to launch the Mac Classic with zero lines of C code, their Pascal dialect lives on in Delphi and Free Pascal.
As one example.
Non-exhaustive list:
- proper strings with bounds checking
- proper arrays with bounds checking
- no pointer decays, you need to be explicit about getting pointers to arrays
- fewer implicit conversions, more explicit typecasts required
- reference parameters reduce the need of pointers
- variant records directly support tags
- enumerations are stronger typed without implicit conversions
- modules with better control what gets exposed
- range types
- set types
- arenas
There was plenty of Pascal code on Mac OS including a Smalltalk like OOP framework, until C++ took over Object Pascal's role at Apple, which, again, isn't C.
I love the usual "but it has used Assembly!" rebuttal, as if OSes written in C weren't full of Assembly, or inline Assembly and language extensions that certainly aren't C (ISO C proper) either.
If you prefer, Zig is a modern take on what Modula-2 in 1978, and Object Pascal in the 1980's already offered, with the addition of nullable types and comptime as the key differentiators in 40 years, packaged in a more appealing syntax for current generations.
The output is a non-portable half-a-million LoC Go file for each platform.
[Cries in Ada]
This is the C/++ delusion - "if one puts enough effort, a [complex] memory unsafe program can be made memory safe"; the year after this page was published, the Magellan series of RCEs was released.
Keeping SQLite in C is certainly a valid design choice, but it's important to be aware of the practical implications of the language.
We don't have to have one implementation of a lightweight SQL database. You can go out right now and start your own implementation in Rust or C++ or Go or Lisp or whatever you like! You can even make compatible APIs for it so that it can be a drop-in replacement for SQLite! No one can stop you! You don't need permission!
But why would we want to throw away the perfectly good C implementation, and why would we expect the C experts who have been carefully maintaining SQLite for a quarter century to be the ones to learn a new language and start over?
Because a lot of language advocacy has degraded to telling others what you want them to do instead of showing by example what to do. The idea behind this is that language adoption is some kind of zero sum game. If you're developing project 'x' in language 'y' then you are by definition not developing it in language 'z'. This reduces the stature of language 'z' and the continued existence of project 'x' in spite of not being written in language 'z' makes people wonder if language 'z' is actually as much of a necessity as its proponents claim. And never mind the fact that if the decision in what language 'x' would be written were to be revisited by the authors of 'x' that not only language 'z' would be on the menu, but also languages 'd', 'l', 'j' and 'g'.
Company I worked for decided to build out a new microservice in language Y. The whole company was writing in W and X, but they decided to write the new service in Y. When something goes wrong, or a bug needs fixing, 3 people in the company of over 100 devs know Y. Guess what management is doing.. Re-writing it in X.
I agree to what I think you're saying which is that "sqlite" has, to some degree, become so ubiquitous that it's evolved beyond a single implementation.
We, of course, have sqlite the C library but there is also sqlite the database file format and there is no reason we can't have an sqlite implementation in golang (we already do) and one in pure rust too.
I imagine that in the future that will happen (pure rust implementation) and that perhaps at some point much further in the future, that may even become the dominant implementation.
There's also the Go-wrapped WASM build of the C sqlite[0] which is handy.
But think about all those karma points here and on Reddit, or GitHub stars!
The SQLite developers are actually open to the idea of rewriting SQLite in Rust, so they must see an advantage to it:
> All that said, it is possible that SQLite might one day be recoded in Rust. Recoding SQLite in Go is unlikely since Go hates assert(). But Rust is a possibility. Some preconditions that must occur before SQLite is recoded in Rust include: […] If you are a "rustacean" and feel that Rust already meets the preconditions listed above, and that SQLite should be recoded in Rust, then you are welcomed and encouraged to contact the SQLite developers privately and argue your case.
I think we like to fool ourselves that decisions like these are based on performance considerations or maintainability or whatever, but in reality they would be based on time to market and skill availability in the areas where the team is being built.
At the end of the day, SQLite is not being rewritten because the cost of doing so is not justifiable
Huh it's not everyday that I hear a genuinely new argument. Thanks for sharing.
This feels like chasing arbitrary 100% test coverage at the expense of safety. The code quality isn’t actually improved by omitting the checks even though it makes testing coverage go up.
I don't think I would (personally) ever be comfortable asserting that a code branch in the machine instructions emitted by a compiler can't ever be taken, no matter what, with 100% confidence, during a large fraction of situations in realistic application or library development, as to do so would require a type system powerful enough to express such an invariant, and in that case, surely the compiler would not emit the branch code in the first place.
One exception might be the presence of some external formal verification scheme which certifies that the branch code can't ever be executed, which is presumably what the article authors are gesturing towards in item D on their list of preconditions.
The choices therefore are:
1. No bound check
2. Bounds check inserted, but that branch isn't covered by tests
3. Bounds check inserted, and that branch is covered by tests
I'm skeptical of the claim that if (3) is infeasible then the next best option is (1)
Because if it is indeed an impossible scenario, then the lack of coverage shouldn't matter. If it's not an impossible scenario then you have an untested case with option (1) - you've overrun the bounds of an array, which may not be a branch in the code but is definitely a different behaviour than the one you tested.
“What gets us into trouble is not what we don't know. It's what we know for sure that just ain't so.” — Mark Twain, https://www.goodreads.com/quotes/738123
If you can then come up with a scenario where you need it, well, in fully tested code you do need to test it.
// pseudocode
if (i >= array_length) panic("index out of bounds")
that are never actually run if the code is correct? But (if I understand correctly) these are checks implicitly added by the compiler. So the objection amounts to questioning the correctness of this auto-generated code, and is predicated upon mistrusting the correctness of the compiler? But presumably the Rust compiler itself would have thorough tests that these kinds of checks work?
Someone please correct me if I'm misunderstanding the argument.
Automatic array bounds checks can get hit by corrupted data. Thereby leading to a crash of exactly the kind that SQLite tries to avoid. With complete branch testing, they can guarantee that the test suite includes every kind of corruption that might hit an array bounds check, and guarantee that none of them panic. But if the compiler is inserting branches that are supposed to be inaccessible, you can't do complete branch testing. So now how do you know that you have tested every code branch that might be reached from corrupted data?
Furthermore those unused branches are there as footguns which are reachable with a cosmic ray bit flip, or a dodgy CPU. Which again undermines the principle of keeping running if at all possible.
Also you rarely need to actually access by index - you could just access using functional methods on .iter() which avoids the bounds check problem in the first place.
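To make the iterator point concrete, here is a minimal sketch. The function names `sum_all` and `sum_all_indexed` are made up for illustration and are not from the thread. The iterator version never writes an index expression, so there is no per-element bounds check in the source to reason about; whether the indexed loop keeps its checks in the final binary depends on the optimizer.

```rust
/// Hypothetical helper: sums a slice by walking it with an iterator.
/// No index expression appears, so no bounds check is requested.
fn sum_all(values: &[i64]) -> i64 {
    values.iter().sum()
}

/// Hypothetical helper: the same sum written with explicit indexing.
/// Each `values[i]` goes through the bounds-checked indexing operator
/// (the optimizer may be able to remove the check, but that is not guaranteed).
fn sum_all_indexed(values: &[i64]) -> i64 {
    let mut total = 0;
    for i in 0..values.len() {
        total += values[i];
    }
    total
}

fn main() {
    let data = [3, 1, 4, 1, 5];
    println!("iterator: {}", sum_all(&data));
    println!("indexed:  {}", sum_all_indexed(&data));
}
```

Both functions return the same result; the difference is only in which operations the source asks for.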
This is a dubious statement. In Rust, the array indexing operator arr[i] is syntactic sugar for calling the function arr.index(i), and the implementation of this function on the standard library's array types is documented to perform a bounds-check assertion and access the element.
So the checks aren't really implicitly added -- you explicitly called a function that performs a bounds check. If you want different behavior, you can call a different, slightly-less-ergonomic indexing function, such as `get` (which returns an Option, making your code responsible for handling the failure case) or `get_unchecked` (which requires an unsafe block and exhibits UB if the index is out of bounds, like C).
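A small sketch of the three access styles described above, using only standard slice methods. The wrapper names (`read_indexed`, `read_checked`, `read_unchecked`) are hypothetical and exist only to put the three forms side by side.

```rust
/// `slice[i]`: panics if `i` is out of bounds.
fn read_indexed(slice: &[u32], i: usize) -> u32 {
    slice[i]
}

/// `get`: returns `None` if `i` is out of bounds, so the caller
/// has to handle the failure case explicitly.
fn read_checked(slice: &[u32], i: usize) -> Option<u32> {
    slice.get(i).copied()
}

/// `get_unchecked`: no check at all.
///
/// # Safety
/// The caller must guarantee `i < slice.len()`; otherwise this is
/// undefined behaviour, as with a raw C array access.
unsafe fn read_unchecked(slice: &[u32], i: usize) -> u32 {
    unsafe { *slice.get_unchecked(i) }
}

fn main() {
    let data = [10u32, 20, 30];

    println!("{}", read_indexed(&data, 1));
    println!("{:?}", read_checked(&data, 1));
    println!("{:?}", read_checked(&data, 7));

    // SAFETY: 2 is less than data.len(), which is 3.
    println!("{}", unsafe { read_unchecked(&data, 2) });
}
```

The choice between the three is made at each call site, which is the sense in which the check is "explicitly called" rather than silently inserted.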
The way I was thinking about it was: if you somehow magically knew that nothing added by the compiler could ever cause a problem, it would be redundant to test those branches. Then wondering why a really well tested compiler wouldn't be equivalent to that. It sounds like the answer is, for the level of soundness sqlite is aspiring to, you can't make those assumptions.
If the check never fails, it is logically equivalent to not having the check. If the code isn't "correct" and the panic is reached, then the equivalent c code would have undefined behavior, which can be much worse than a panic.
I wouldn't put it that way. Usually when we say the compiler is "incorrect", we mean that it's generating code that breaks the observable behavior of some program. In that sense, adding extra checks that can't actually fail isn't a correctness issue; it's just an efficiency issue. I'd usually say the compiler is being "conservative" or "defensive". However, the "100% branch testing" strategy that we're talking about makes this more complicated, because this branch-that's-never-taken actually is observable, not to the program itself but to its test suite.
sure safety checks are added but
it's ignoring that many of such checks get reliably optimized away
worse it's a bit like saying "in case of a broken invariant I prefer arbitrary potential highly problematic behavior over clean aborts (or errors) because my test tooling is inadequate"
instead of saying "we haven't found adequate test tooling" for our use case
Why inadequate? Because technically test setups can use
1. fault injection to test such branches even if normally you would never hit them
2. for many of such tests (especially array bound checks) you can pretty reliably identify them and then remove them from your test coverage statistic
idk. what the tooling of rust wrt this is in 2025, but around the rust 1.0 times you mainly had C tooling you applied to rust so you had problems like that back then.
#ifdef CONTRACTS
if (i >= array_length) panic("index out of bounds")
#endif
Rust does not stop you from writing code that accesses out of bounds, at all. It just makes sure that there's an if that checks.
By the same logic one could also claim that tail recursion optimisation, or loop unrolling are also dangerous because they change the way code works, and your tests don't cover the final output.
Certainly don't get me wrong, SQLite is one of the best and most thoroughly tested libraries out there. But this was an argument to have 4 arguments. That's because 2 of the arguments break down as "Those languages didn't exist when we first wrote SQLite and we aren't going to rewrite the whole library just because a new language came around."
Any language, including C, will emit or not emit instructions that are "invisible" to the author. For example, whenever the C compiler decides it can autovectorize a section of a function it'll be introducing a complicated set of SIMD instructions and new invisible branch tests. That can also happen if the C compiler decides to unroll a loop for whatever reason.
The entire point of compilers and their optimizations is to emit instructions which keep the semantic intent of higher level code. That includes excluding branches, adding new branches, or creating complex lookup tables if the compiler believes it'll make things faster.
Dr Hipp is completely correct in rejecting Rust for SQLite. SQLite is already written and extremely well tested. Switching over to a new language now would almost certainly introduce new bugs that don't currently exist as it'd inevitably need to be changed to remain "safe".
Presumably this is why they do 100% test coverage. All of those instructions would be tested and not invisible to the test suite
0: https://doc.rust-lang.org/std/vec/struct.Vec.html#method.get...
There already is an implicit "branch" on every array access in C, it's called an access violation.
Do they test for a segfault on every single array access in the code base? No? Then they don't really have 100% branch coverage, do they?
I think a lot of projects that claim to have 100% coverage are overselling their testing, but SQLite is in another category of thoroughness entirely.
A simple array access in C:
arr[i] = 123;
...can be thought of as being equivalent to:
if (i >= array_length) UB();
else arr[i] = 123;
where the "UB" function can do literally anything. From the perspective of exhaustively testing and formally verifying software, I'd rather have the safe-language equivalent:
if (i >= array_length) panic();
else arr[i] = 123;
...because at least I can reason about what happens if the supposedly-unreachable condition occurs.
Dr. Hipp mentions that "Recoding SQLite in Go is unlikely since Go hates assert()", implying that SQLite makes use of assert statements to guard against unreachable conditions. Surely his testing infrastructure must have some way of exempting unreachable assert branches -- so why can't bounds checks (that do nothing but assert undefined behavior does not occur) be treated in the same way?
A more complex C program can have index range checking at a different place than the simple array access. The compiler's flow analysis isn't always able to confirm that the index is guaranteed to be checked. If it therefore adds a cautionary (and unneeded) range check, then this code branch can never be exercised, making the code no longer 100% branch tested.
you basically say if deeply unexpected things happen you prefer your program doing wildly arbitrary and as such potentially dangerous things over it having a clean abort or proper error. ... that doesn't seem right
worse it's due to a lack in the tooling used and not a fundamental problem, not only can you test these branches (using fault injection) you also often (not always) can separate them from relevant branches when collecting the branch statistics
so the whole argument misses the point (which is tooling is lacking, not extra checks for array bounds and similar)
lastly array bounds checking is probably the worst example they could have given as it
- often can be disabled/omitted in optimized builds
- is quite often optimized away
- has often quite low perf. overhead
- bound check branches are often very easy to identify, i.e. excluding them from a 100% branch testing statistic is viable
- out of bounds read/write are some of the most common cases of memory unsafety leading to security vulnerability (including full RCE cases)
SQLite isn't a program, it's a library used by many other programs. As such, aborting is not an option. It doesn't do "wildly arbitrary" things - it reports errors to the client application and takes it on faith that they will respond appropriately.
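As a rough illustration of "report an error instead of aborting", here is a sketch in Rust. It is not SQLite's file format or API: `PageError`, `read_payload`, and the one-byte length prefix are all invented for the example. The point is only that a bounds failure on untrusted input can be surfaced to the caller as an ordinary error value rather than as a panic.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum PageError {
    Corrupt,
}

/// Hypothetical parser: the first byte is a length, followed by that
/// many payload bytes. A length that runs past the end of the buffer
/// is reported as `Corrupt` instead of triggering a panic.
fn read_payload(page: &[u8]) -> Result<&[u8], PageError> {
    let (&len, rest) = page.split_first().ok_or(PageError::Corrupt)?;
    rest.get(..len as usize).ok_or(PageError::Corrupt)
}

fn main() {
    // Well-formed input: length 2, then two payload bytes and one spare.
    println!("{:?}", read_payload(&[2, 0xAA, 0xBB, 0xCC]));

    // Corrupted length byte: claims 9 bytes but only 1 follows.
    println!("{:?}", read_payload(&[9, 0x01]));
}
```

Both branches here are reachable from test inputs, so they can be covered by an ordinary test suite; that is different from a compiler-inserted check that is believed to be unreachable.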
It's like seat belts.
E.g. what if we drive four blocks and then the case occurs when the seatbelt is needed? Okay, we have an explicit test for that.
But we cannot test everything. We have not tested what happens if we drive four blocks, and then take a right turn, and hit something half a block later.
Screw it, just remove the seatbelts and not have this insane untested space whereby we are never sure whether the seat belt will work properly and prevent injury!
- Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.
- Rust needs to demonstrate that it can be used to create general-purpose libraries that are callable from all other programming languages.
- Rust needs to demonstrate that it can produce object code that works on obscure embedded devices, including devices that lack an operating system.
- Rust needs to pick up the necessary tooling that enables one to do 100% branch coverage testing of the compiled binaries.
- Rust needs a mechanism to recover gracefully from OOM errors.
- Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.
2. This has been demonstrated.
3. This one hinges on your definition of “obscure,” but the “without an operating system” bit is unambiguously demonstrated.
4. I am not an expert here, but given that you’re testing binaries, I’m not sure what is Rust specific. I know the Ferrocene folks have done some of this work, but I don’t know the current state of things.
5. Rust as a language does no allocation. This OOM behavior is the standard library, of which you’re not using in these embedded cases anyway. There, you’re free to do whatever you’d like, as it’s all just library code.
6. This also hinges on a lot of definitions, so it could be argued either way.
ironically if we look at how things play out in practice rust is far more suited as a general purpose language than C, to a point where I would argue C is only a general purpose language on a technicality, not on a practical IRL basis
this is especially ridiculous when they argue C is the fastest general purpose language when that has proven to simply not hold up in larger IRL projects (i.e. not micro benchmarks)
C has terrible UX for generic code re-use and memory management, this often means that in IRL projects people don't write the fastest code. Wrt. memory management it's not rare to see unnecessary clones, as not doing so too easily leads to bugs. Wrt. data structures you write the code which is maintainable, robust and fast enough and sometimes add the 10th maximally simple reimplementation (or C macro or similar) of some data structure instead of reusing data structures people spent years fine-tuning.
When people switched a lot from C to C++ most general purpose projects got faster, not slower. And even for the C++ to Rust case it's not rare that companies end up with faster projects after the switch.
Both C++ and Rust also allow more optimization in general.
So C is only fastest in micro benchmarks after excluding stuff like fortran for not being general purpose while itself not really being used much anymore for general purpose projects...
C projects avoiding dependencies entirely just end up reimplementing work. You can do that in any language.
Is it hard work? Yes, but it is not that different from what you see in certain C projects that don't use external deps either
Rust insists on its own package manager "rustup" and frowns on distro maintainers. When Rust is happy to just be packaged by the distro and rustup has gone away, then it will have matured to at least adolescence.
There are other worlds out there than Linux.
The current version of the Rust compiler definitely doesn't -- there's known issues like https://github.com/rust-lang/rust/issues/57893 -- but maybe there's some historical version from before the features that caused those problems were introduced.
Of course, two libraries that choose different no_std collection types can't communicate...but hey, we're comparing to C here.
like there are some things you can do well in C
and these things you can do in rust too, though with a bit of pain and limitations to how you write rust
and then there is the rest which looks "hard but doable" in C, but the more you learn about it the more it's a "uh wtf. nightmare" case where "let's kill+restart and have robustness even in presence of the process/error kernel dying" is nearly always the right answer.
I’d love to see rust be so stable that MSRV is an anachronism. I want it to be unthinkable you wouldn’t have any reason not to support Rust from forever ago because the feature set is so stable.
What other languages satisfy these criteria?
https://stackoverflow.com/questions/36703867/golang-preproce...
Wouldn't this work? Surely the empty function would be removed completely during compilation?
Like why defend C in 2025 when you only have to defend C in 2000 and then argue you have an old, stable, deeply tested C code base which has no problem with anything like "commonly having memory safety issues" and is maintained by a small group of people very highly skilled in C.
Like that argument alone is all you need, a win, simple straight forward, hard to contest.
But most of the other arguments they list can be picked apart and are only half true.
I'd like to see you pick the other arguments apart
Not OP, and I'm not really arguing with the post, but this struck me as a really odd thing to include in the article. Of course nothing is going to be faster than C, because it compiles straight to machine code with no garbage collection. Literally any language that does the same will be the same speed but not faster, because there's no way to be faster. It's physically impossible.
A much better statement, and one in line with the rest of the article, would be that at the time C and C++ were really the only viable languages that gave them the performance they wanted, and C++ wouldn't have given them the interoperability they wanted. So their only choice was C.
There is nothing special about C that makes this true. C has semantics, just like any language, that are higher level than assembly, and sometimes, those semantics make the code slower than other languages that have different semantics.
Consider this C function:
void redundant_store(int *a, int *b) {
int t = *a + 1;
*a = t;
*b = 0; // May clobber *a if (b == a)
*a = t;
}
Because a and b may point to the same address, you get this code (on clang trunk):
redundant_store:
mov eax, dword ptr [rdi]
inc eax
mov dword ptr [rdi], eax
mov dword ptr [rsi], 0
mov dword ptr [rdi], eax
ret
That fifth line there has to be kept in, because the final `*a = t;` there is semantically meaningful; if a == b, then a is also set to 0 on line four, and so we need to reset it to t on line five.
Consider the Rust version:
pub fn redundant_store(a: &mut i32, b: &mut i32) {
let t = *a + 1;
*a = t;
*b = 0; // a and b must not alias, so can never clobber
*a = t;
}
You get this output (on Rust 1.90.0):
redundant_store:
mov eax, dword ptr [rdi]
inc eax
mov dword ptr [rsi], 0
mov dword ptr [rdi], eax
ret
Because a and b can never alias, we know that the extra store to *a is redundant, as it's not possible for the assignment to *b to modify *a. This means we can get rid of this line.
Sure, eliminating one single store isn't going to have a meaningful difference here. But that's not the point, the point is that "it's not possible to be faster than C because C is especially low level" just simply isn't true.
You have to say OK, I allow myself platform specific intrinsics and extensions even though those aren't standard ISO C, and that includes inline assembler. I can pick any compiler and tooling. And I won't count other languages which are transpiled to C for portability because hey in theory I could just write that C myself, couldn't I so they're not really faster.
At the end you're basically begging the question. "I claim C is fastest because I don't count anything else as faster" which is no longer a claim worth disputing.
The aliasing optimisations in Fortran and Rust stand out as obvious examples where to get the same perf in C requires you do global analysis (this is what Rust is side-stepping via language rules and the borrowck) which you can't afford in practice.
But equally the monomorphisation in C++ or Rust can be beneficial in a similar way, you could in principle do all this by hand in your C project but you won't, because time is finite, so you live without the optimisations.
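For readers who have not seen monomorphisation spelled out, a minimal sketch follows. The names `apply_static` and `apply_dynamic` are hypothetical. The generic version is compiled once per concrete closure type, which gives the optimizer the chance to inline the call; the `dyn` version is compiled once and calls through a pointer, which is roughly the shape a hand-written C callback would have. No performance numbers are claimed here.

```rust
/// Generic over the callable: the compiler emits a specialised copy
/// of this function for each concrete `F` it is used with.
fn apply_static<F: Fn(u64) -> u64>(f: F, xs: &[u64]) -> u64 {
    xs.iter().map(|&x| f(x)).sum()
}

/// Dynamic dispatch: a single copy of this function exists, and each
/// call to `f` goes through a vtable pointer.
fn apply_dynamic(f: &dyn Fn(u64) -> u64, xs: &[u64]) -> u64 {
    xs.iter().map(|&x| f(x)).sum()
}

fn main() {
    let xs = [1u64, 2, 3];
    println!("{}", apply_static(|x| x * 2, &xs));
    println!("{}", apply_dynamic(&|x: u64| x + 1, &xs));
}
```

Writing the specialised copies by hand in C is possible in principle, which is the "you could, but you won't" point made above.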
Another somewhat related example is Fortran and C, where one reason Fortran could perform better than C is the restrictions Fortran places on aliasing. In theory, one could use restrict in C to replicate these aliasing restrictions, but in practice restrict is used fairly sparingly, to the point that when Rust tried to enable its equivalent it had to back out the change multiple times because it kept exposing bugs in LLVM's optimizer.
> Safe languages usually want to abort if they encounter an out-of-memory (OOM) situation. SQLite is designed to recover gracefully from an OOM. It is unclear how this could be accomplished in the current crop of safe languages.
I don't think most Rust code written today has guardrails in the case of OOM. I don't think this disqualifies Rust for most things, because I happen to find the trade-off worth it compared to the things it does protect against that C doesn't, but I don't think it's a particularly controversial take that Rust still could use some ergonomic improvements around handling allocation failures. Right now, trying to create a Box or Vec can theoretically fail at runtime if no memory is available, and those failures aren't returned from the functions called to create them. Handling panics is something you can normally do in Rust, but if you're already OOM, things get complicated pretty fast.
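For what it's worth, stable Rust does have a few fallible allocation entry points, even if most code doesn't use them. A minimal sketch, assuming a hypothetical helper called `zeroed_buffer`, using `Vec::try_reserve_exact` so that an allocation failure comes back as a `Result` instead of aborting:

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: builds a zero-filled buffer of `len` bytes,
/// returning an error instead of aborting if the memory cannot be reserved.
fn zeroed_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    // The only allocation happens here, and its failure is reported to the caller.
    buf.try_reserve_exact(len)?;
    // Stays within the capacity reserved above, so no further allocation is needed.
    buf.resize(len, 0);
    Ok(buf)
}
```

This only covers the collections that expose `try_reserve`; `Box::new` and most library code still go through the infallible path, which is the ergonomic gap described above.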
I agree that in the long run it would probably make sense to have something like this in Rust when eventually the current maintainers aren't around, but I also don't think it makes much sense to criticize them for continuing to maintain the code that already exists.
"Why is SQLite coded in C and not Rust?" is a question, which immediately makes me want to ask "Why do you need SQLite coded in Rust?".
they have a blog hinting at some answers as to "why": https://turso.tech/blog/introducing-limbo-a-complete-rewrite...
SQLite is old, huge and known for its gigantic test coverage. There’s just so much to rewrite.
DuckDB is from 2019, so new enough to jump on the “rust is safe and fast”
Maybe autovectorization works, but can I just write a few ARM64 instructions on my Mac in Rust stable (not experimental/nightly) as I can do it in C/C++ by just including a few ARM specific header files?
The practical differences are larger than the theoretical differences, so I would expect the gap to diminish over time.
Rust reminded me of when I used to write database engines in Java. It required a lot more code, which has its own costs, but never really delivered on claims of comparable performance. The "more code" part largely comes down to the more limited ability to build good abstractions compared to C++20 and more limited composability. The "slower binaries" part comes down to worse codegen, which you can't blame on Rust per se, and a lot of extra overhead introduced in the code to satisfy the Rust safety model that would simply not be required in other systems languages.
Safety is a mixed bag. Rust can check several things at compile-time that C++20 cannot. C++20 can check several things at compile-time that Rust cannot.
For high-performance database-y code, memory is allocated at startup and is accessed via managed index handles. Rust does the same thing. In these types of memory models, i.e. no dynamic allocation and no raw pointers, both Rust and C++20 offer similar memory safety guarantees. Most high-performance software is thread-per-core that is almost purely single-threaded, so thread-safety concerns are limited.
That said, stripping away all of the above, the only real advantage that C++20 has is its much more powerful toolset for building abstractions. Its performance and unique safety elements are based almost entirely on the ability to build concise, contextual, and highly composable abstractions as needed. This is not a feature that should be downplayed; I immediately miss it when I use most other languages.
https://news.ycombinator.com/item?id=28278859 - August 2021
https://news.ycombinator.com/item?id=16585120 - March 2018
The current doc no longer has any paragraphs about security, or even the word security once.
The 2021 edition of the doc contained this text which no longer appears: 'Safe languages are often touted for helping to prevent security vulnerabilities. True enough, but SQLite is not a particularly security-sensitive library. If an application is running untrusted and unverified SQL, then it already has much bigger security issues (SQL injection) that no "safe" language will fix.
It is true that applications sometimes import complete binary SQLite database files from untrusted sources, and such imports could present a possible attack vector. However, those code paths in SQLite are limited and are extremely well tested. And pre-validation routines are available to applications that want to read untrusted databases that can help detect possible attacks prior to use.'
https://web.archive.org/web/20210825025834/https%3A//www.sql...
The biggest gripe I have with a rewrite is... A lot of the time we rewrite for feature parity. Not the exact same thing. So you are kind of ignoring/missing/forgetting all those edge cases and patches that were added along the way for so many niche or other reasons.
This means broken software. Something which used to work before but not anymore. They'll have to encounter all of them again in the wild and fix it again.
Obviously if we are to rewrite an important piece of software like this, you'd emphasise more on all of these. But it's hard for me to comprehend whether it will be 100%.
But other than sqlite, think SDL. If it were to be rewritten, it's really hard for me to believe the effect would be negligible. I'm guessing horrible releases before it gets better, with users complaining about things that used to work.
My money is on C still being there long after the next Rust. And even if Rust is still present, there would be a new Rust then.
So why rewrite? Rewrites shouldn't be the default thinking, no?
I am not Dr Hipp, and therefore I like run-time checks.
Also, does it use doubly linked lists or graphs at all? Those can, in a way, be safer in C since Rust makes you roll your own virtual pointer arena.
You can implement a linked list in Rust the same as you would in C using raw pointers and some unsafe code. In fact there is one in the standard library.
You can write a linked list the same way you would in C if you wish.
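A rough sketch of what "the same way you would in C" can look like: a doubly linked list of `i32` with raw `prev`/`next` pointers. The type and method names here are made up for illustration, and this is not how the standard library's `std::collections::LinkedList` is written.

```rust
use std::ptr;

struct Node {
    value: i32,
    prev: *mut Node,
    next: *mut Node,
}

/// Illustrative C-style doubly linked list; owns every node it points to.
struct List {
    head: *mut Node,
    tail: *mut Node,
}

impl List {
    fn new() -> Self {
        List { head: ptr::null_mut(), tail: ptr::null_mut() }
    }

    fn push_back(&mut self, value: i32) {
        // Box::into_raw plays the role of malloc: we now own the raw pointer.
        let node = Box::into_raw(Box::new(Node { value, prev: self.tail, next: ptr::null_mut() }));
        if self.tail.is_null() {
            self.head = node;
        } else {
            // SAFETY: tail is non-null and points to a live node owned by this list.
            unsafe { (*self.tail).next = node };
        }
        self.tail = node;
    }

    fn pop_front(&mut self) -> Option<i32> {
        if self.head.is_null() {
            return None;
        }
        // SAFETY: head came from Box::into_raw and has not been freed yet.
        let node = unsafe { Box::from_raw(self.head) };
        self.head = node.next;
        if self.head.is_null() {
            self.tail = ptr::null_mut();
        } else {
            // SAFETY: the new head is a live node owned by this list.
            unsafe { (*self.head).prev = ptr::null_mut() };
        }
        Some(node.value)
    }

    fn pop_back(&mut self) -> Option<i32> {
        if self.tail.is_null() {
            return None;
        }
        // SAFETY: tail came from Box::into_raw and has not been freed yet.
        let node = unsafe { Box::from_raw(self.tail) };
        self.tail = node.prev;
        if self.tail.is_null() {
            self.head = ptr::null_mut();
        } else {
            // SAFETY: the new tail is a live node owned by this list.
            unsafe { (*self.tail).next = ptr::null_mut() };
        }
        Some(node.value)
    }
}

impl Drop for List {
    fn drop(&mut self) {
        // Free any remaining nodes, like a C destroy function would.
        while self.pop_front().is_some() {}
    }
}
```

The unsafe blocks are confined to the pointer dereferences, and callers only ever see the safe `push_back`/`pop_front`/`pop_back` methods.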
sure, it's an old library, so they've had pretty much everything (not because they don't know what they are doing but because shit happens)
lets check CVEs of the last few years:
- CVE-2025-29088 type confusion
- CVE-2025-29087 out of bounds write
- CVE-2025-7458 integer overflow, possible in optimized rust but test builds check for it
- CVE-2025-6965 memory corruption, rust might not have helped
- CVE-2025-3277 integer overflow, rust might have helped
- CVE-2024-0232 use after free
- CVE-2023-36191 segmentation violation, unclear if rust would have helped
- CVE-2023-7104 buffer overflow
- CVE-2022-46908 validation logic error
- CVE-2022-35737 array bounds overflow
- CVE-2021-45346 memory leak
...
as you can see the majority of CVEs of sqlite are much less likely in rust (but a rust sqlite impl. likely would use unsafe, so not impossible)
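To make the integer overflow entries concrete: in Rust, plain arithmetic panics on overflow in default debug builds and wraps in default release builds, which is the "optimized rust" caveat above. Code that wants the check in every build has to ask for it explicitly. A small sketch with made-up function names:

```rust
/// Hypothetical size computation: bytes needed for `count` records.
/// Returns None instead of silently wrapping when the product does not fit.
fn total_size(count: u32, record_size: u32) -> Option<u32> {
    count.checked_mul(record_size)
}

/// What an unchecked multiplication effectively does in an optimized build.
fn total_size_wrapping(count: u32, record_size: u32) -> u32 {
    count.wrapping_mul(record_size)
}
```

A too-small result from the wrapping version is exactly the kind of value that turns into an undersized allocation in C; in safe Rust the later out-of-bounds access would still be caught by a bounds check, but the logic bug itself remains.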
as a side note, there being so many CVEs in 2025 seems to be related to some companies (e.g. Google) having done quite a bit of fuzz testing of SQLite
other takeaways:
- 100% branch coverage is nice, but doesn't guarantee memory soundness in C
- given how deeply people look for CVEs in SQLite the number of CVEs found is not at all as bad as it might look
but also one final question:
SQLite has some of the best C programmers out there, only they merge anything into the code, and it has had a very limited degree of change compared to a typical company project. And we still have memory vulnerabilities. How is anyone still arguing for C for new projects?
Yeah I essentially agree. I'm sure there are still plenty of good cases for C, depending on project size, experience of the engineers, integration with existing libraries, target platform, etc. But it definitely seems like Rust would be the better option in scenarios where there's not some a priori thing that strongly skews toward or forces C.
It just works
Alternately, maybe there's a spectrum of undesirable behaviors, some of which are preventable by choice of language, some of which aren't, and trying to reduce a complex set of tradeoffs to a simple binary of whether it "just works" only restates the conclusion someone has already come to because you need to actually reason about those tradeoffs to come to an informed decision of where to implicitly draw the line in the first place.
It has async I/O support on Linux with io_uring, vector support, BEGIN CONCURRENT for improved write throughput using multi-version concurrency control (MVCC), Encryption at rest, incremental computation using DBSP for incremental view maintenance and query subscriptions.
Time will tell, but this may well be the future of SQLite.
Also, this is a VC backed project. Everyone has to eat, but I suspect that Turso will not go out of its way to offer a Public Domain offering or 50 year support in the way that SQLite has.
The aim is to be compatible with sqlite, and a drop-in replacement for it, so I think it's fair use.
> Also, this is a VC backed project. Everyone has to eat, but I suspect that Turso will not go out of its way to offer a Public Domain offering or 50 year support in the way that SQLite has.
It's MIT license open-source. And unlike sqlite, encourages outside contribution. For this reason, I think it can "win".
It’s absolutely inappropriate and appropriative.
They’ve been poor community members from the start when they publicized their one-sided spat with SQLite over their contribution policy.
The reality is that they are a VC-funded company focused on the “edge database” hypetrain that’s already dying out as it becomes clear that CAP theorem isn’t something you can just pretend doesn’t exist.
It’ll very likely be dead in a few years, but even if it’s not, a VC-funded project isn’t a replacement for SQLite. It would take incredibly unique advantages to shift literally the entire world away from SQLite.
It’s a new thing, not the next evolution of SQLite.
try marketing your burger company as "The Next Evolution of McDonalds" and see what happens
SQLite is NOT being rewritten in Rust!
>>Turso Database is an in-process SQL database written in Rust, compatible with SQLite.
turdso is VC funded so will probably be defunct in 2 years
Compatible with SQLite. So it's another database?
Occasionally when working in Lua I'd write something low-level in C++, wrap it in C, and then call the C wrapper from Lua. It's extra boilerplate but damn is it nice to have a REPL for your C++ code.
Edit: Because someone else will say it - Rust binary artifacts _are_ kinda big by default. You can compile libstd from scratch on nightly (it's a couple flags) or you can amortize the cost by packing more functions into the same binary, but it is gonna have more fixed overhead than C or C++.
If I want a "C Library", I want a "C Library" and not some weird abomination that has been surgically grafted to libstdc++ or similar (but be careful of which version as they're not compatible and the name mangling changes and ...).
This isn't theoretical. It's such a pain that the C++ folks started resorting to header-only libraries just to sidestep the nightmare.
This makes me less safe rather than more. Note that there is a substantial double standard here, we could never in the name of safety impose this level of burden from the C tooling side because maintainers would rightfully be very upset (even toggling a warning in the default set causes discussions). For the same reason it should be unacceptable to use Rust before this is fixed, but somehow the memory safety absolutists convinced many people that this is more important than everything else. (I also think memory safety is important, but I can't help thinking that pushing for Rust is more harmful to me than good.)
So you might think, but there is a committee actively undermining this, not to mention compiler people keeping things exciting also.
There is a dogged adherence to backward compatibility, so that you can pretend C has not gone anywhere in thirty-five years, if you like --- provided you aren't invoking too much undefined behavior. (You can't as easily pretend that your compiler has not gone anywhere in 35 years with regard to things you are doing out of spec.)
So, the argument for keeping SQLite written in C is that it gives the user the choice to either:
- Build SQLite with Yolo-C, in which case you get excellent performance and lots of tooling. And it's boring in the way that SQLite devs like. But it's not "safe" in the sense of memory safe languages.
- Build SQLite with Fil-C, in which case you get worse (but still quite good) performance and memory safety that exceeds what you'd get with a Rust/Go/Java/whatever rewrite.
Recompiling with Fil-C is safer than a rewrite into other memory safe languages because Fil-C is safe through all dependencies, including the syscall layer. Like, making a syscall in Rust means writing some unsafe code where you could screw up buffer sizes or whatnot, while making a syscall in Fil-C means going through the Fil-C runtime.
Sqlite has been recoded (automatically) in Go a while ago [1], and it is widely deployed
> would probably introduce far more bugs than would be fixed
It runs against the same test suite with no issues
> and it may also result in slower code
It is quite a lot slower, but it is still widely used as it turns out that the convenience of a native port outweighs the performance penalty in most cases.
I don't think SQLite should be rewritten in Go, Rust, Zig, Nim, Swift ... but ANSI C is a subset of the feature set of most modern programming languages. Projects such as this could be written and maintained in C indefinitely, and be automatically translated to other languages for the convenience of users in those languages
It runs against the same public test suite. The proprietary test suite is much more intensive.
> It runs against the same test suite with no issues
- that proves nothing about bugs existing or not.
That doesn't guarantee no bugs. It just means that the existing behaviour covered by the tests is still the same. It may introduce new issues in untested edge cases or performance issues
Libraries written in C do not have a huge run-time dependency.
In its minimum configuration, SQLite requires only the following routines from the standard C library:
memcmp() memcpy() memmove() memset() strcmp() strlen() strncmp()
In a more complete build, SQLite also uses library routines like malloc() and free() and operating system interfaces for opening, reading, writing, and closing files. But even then, the number of dependencies is very small. Other "modern" languages, in contrast, often require multi-megabyte runtimes loaded with thousands and thousands of interfaces."
Very laudable!
(I should also point out that SQLite could conceivably be compiled with small (in terms of lines of code) C compilers like Fabrice Bellard's Tiny C Compiler (TCC). Also SQLite's few required standard C library routines listed above could conceivably be coded inside of SQLite itself(!) (they are, after all, just additional lines of C code in a different place -- and those could conceivably be moved or copied) -- thus removing the dependency/requirement for any standard C library whatsoever!)
Anyway, we love SQLite!
Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy.
Rust needs to mature a little more, stop changing so fast, and move further toward being old and boring.
Rust needs to demonstrate that it can do the kinds of work that C does in SQLite without a significant speed penalty.
I suppose SQLite might use a C linter tool that can prove the bounds checks happen at a higher layer, and then elide redundant ones in lower layers, but... C compilers won't do that by default, they'll just write memory-unsafe machine code. Right?
This is annoying in Rust. To me array accesses aren't the most annoying, it's match{} branches that will never be invoked.
There is unreachable!() for such situations, and you would hope that:
if array_access_out_of_bounds { unreachable!(); }
is recognised by the Rust tooling and just ignored. That's effectively the same as what SQLite is doing now by not doing the check. But it isn't ignored by the tooling: unreachable!() is reported as a missed line. Then there is the test code coverage including the standard output by default, and you have to use regexes on path names to remove it.
Your example does what [] does already, it’s just a more verbose way of writing the same thing. It’s not the same behavior as sqlite.
Array access was just an example. My point was that Rust makes 100% code coverage for unit tests well-nigh impossible.
For those of us who like 100% test coverage, that is a major annoyance. For the way I use Rust that could be fixed by the tooling simply not counting unreachable!() lines as unreached. For Sqlite, which does branch coverage testing on the compiled binary, you would have to go a step further, and provide a compile time option that elides all paths that lead to unreachable!() from the binary. I recall Sqlite saying when they changed to 100% branch coverage, their bug reports dropped by a factor of 7. I hope I remember that correctly. If I do, I'm pretty sure they won't be looking at Rust until they can achieve the same outcome. They can't come close now.
Replacing every array access with get_unchecked() and the consequent explosion of unsafe areas wouldn't fly with anyone I worked with. You did say to me in another thread unsafe is perfectly fine in Rust programs, but you are literally the only person holding that opinion I have come across.
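For readers following this sub-thread, here is a sketch of the variants being argued about. All function names are made up. The first two compile to essentially the same thing (a bounds check whose failing branch is never taken in correct code), and only the last one removes the per-element check from the binary, at the cost of an unsafe block.

```rust
/// Plain indexing: the compiler inserts a bounds check and panics if it fails.
fn read_indexed(data: &[u8], i: usize) -> u8 {
    data[i]
}

/// The check written out by hand. Same behaviour as `[]`, just more verbose:
/// the never-taken branch is still present and still shows up in coverage.
fn read_explicit(data: &[u8], i: usize) -> u8 {
    if i >= data.len() {
        unreachable!("index out of bounds");
    }
    data[i]
}

/// Fallible lookup: the caller decides what an out-of-range index means.
fn read_checked(data: &[u8], i: usize) -> Option<u8> {
    data.get(i).copied()
}

/// One check at the higher layer, then unchecked reads in the loop below it.
fn sum_first(data: &[u8], n: usize) -> u32 {
    assert!(n <= data.len());
    let mut sum = 0u32;
    for i in 0..n {
        // SAFETY: i < n and n <= data.len() was asserted above.
        sum += u32::from(unsafe { *data.get_unchecked(i) });
    }
    sum
}
```

Whether the single `assert!` plus unsafe reads in `sum_first` is acceptable is exactly the disagreement above; an iterator such as `data[..n].iter()` would usually get the same effect without any unsafe code.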
https://algora.io/challenges/turso "Turso is rewriting SQLite in Rust ; Find a bug to win $1,000"
------
- Dec 10, 2024 : "Introducing Limbo: A complete rewrite of SQLite in Rust"
https://turso.tech/blog/introducing-limbo-a-complete-rewrite...
- Jan 21, 2025 - "We will rewrite SQLite. And we are going all-in"
https://turso.tech/blog/we-will-rewrite-sqlite-and-we-are-go...
- Project: https://github.com/tursodatabase/turso
Status: "Turso Database is currently under heavy development and is not ready for production use."
turso has 341 rust source files spread across tens of directories and 514 (!) external dependencies that produce (in release mode) 16 libraries and 7 binaries with tursodb at 48M and libturso_sqlite3.so at 36M.
looks roughly an order of magnitude larger to me. it would be interesting to understand the memory usage characteristics in real-world workloads. these numbers also sort of capture the character of the languages. for extreme portability and memory efficiency, probably hard to beat c and autotools though.
Talking about C99, or C++11, and then “oh you need the nightly build of rust” were juxtaposed in such a way that I never felt comfortable banging out “yum install rust” and giving it a go.
(There are some decent reasons to use the nightly toolchain in development even if you don’t rely on any unfinished features in your codebase, but that means they build on stable anyway just fine if you prefer.)
The Rust Project releases a new stable compiler every six weeks. Because it is backwards compatible, most people update fairly quickly, as it is virtually always painless. So this may mean, if you don’t update your compiler, you may try out a new package version and it may use features or standard library calls that don’t exist in the version you’re using, because the authors updated regularly. There have been some developments in Cargo to try and mitigate some of this, but since it’s not what the majority of users do, it’s taken a while and those features landed relatively recently, so they’re not widely adopted yet.
Nightly features are ones that aren’t properly accepted into the language yet, and so are allowed to break in backwards incompatible ways at any time.
"....Safe languages insert additional machine branches to do things like verify that array accesses are in-bounds. In correct code, those branches are never taken. That means that the machine code cannot be 100% branch tested, which is an important component of SQLite's quality strategy..."
"...Safe languages usually want to abort if they encounter an out-of-memory (OOM) situation. SQLite is designed to recover gracefully from an OOM. It is unclear how this could be accomplished in the current crop of safe languages..."
One stupid workaround is combining multiple columns into one, with values separated by a space, for example. This works when each column value is always a string containing no spaces
Another stupid workaround, probably slower, might be to hash the multiple columns into a new column and use ON CONFLICT(newcolumn_name)
At this point I wish the creators of the language could talk about what rust is bad at.
Not only had this fellow built a functional ISP in one of the toughest markets (at that time), in the world - but he'd also managed to build the database engine and quite a few of the other tools that ran that ISP, and was in danger of setting a few standards for a few things which, since then, have long since settled out, but .. nevertheless .. it could've been.
Anyway, this fellow wrote everything in C. His web page, his TODO.h for the day .. he had C-based tools for managing his docs, for doing syncs between various systems under his command (often in very far-away locations, and even under water a couple times) .. everything, in C.
The database system he wrote in pure C was, at the time, quite a delight. It gave a few folks further up the road a bit of a tight neck.
He went on to do an OS, because of course he did.
Just sayin', SQLite devs aren't the only ones who got this right. ;)
Zig gives the programmer more control than Rust. I think this is one of the reasons why TigerBeetle is written in Zig.
More control over what exactly? Allocations? There is nothing Zig can do that Rust can’t.
I mean yeah, allocations. Allocations are always explicit. Which is not true in C++ or Rust.
Personally I don't think it's that big of a deal, but it's a thing and maybe some people care enough.
...If you're using the alloc/std crates (which to be fair, is probably the vast majority of Rust devs). libcore and the Rust language itself do not allocate at all, so if you use appropriate crates and/or build on top of libcore yourself you too can have an explicit-allocation Rust (though perhaps not as ergonomic as Zig makes it).
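As a simple example of the "build on top of libcore" style: the sketch below formats text into storage supplied by the caller, using only items from `core`, so nothing in it can allocate behind your back. `FixedBuf` is a made-up type for illustration; a real embedded crate would also put `#![no_std]` at the crate root.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-capacity text buffer. The caller owns the storage,
/// so every byte of memory used is visible at the call site.
struct FixedBuf<'a> {
    bytes: &'a mut [u8],
    len: usize,
}

impl<'a> FixedBuf<'a> {
    fn new(bytes: &'a mut [u8]) -> Self {
        FixedBuf { bytes, len: 0 }
    }

    fn as_str(&self) -> &str {
        // Only whole &str values are copied in, so the contents are valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl Write for FixedBuf<'_> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > self.bytes.len() {
            // Out of space is reported to the caller; nothing grows implicitly.
            return Err(fmt::Error);
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}
```

With a 16-byte array as storage, `write!(buf, "id={}", 42)` succeeds and a later write that does not fit returns an error instead of reallocating. Zig makes this style the default by passing allocators explicitly; in Rust it is opt-in.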
I read your response 3 times and I truly don't know what you mean. Mind explaining with a simple example?
From section "1.2 Compatibility". How easy is it to embed a library written in Zig in, say, a small embedded system where you may not be using Zig for the rest of the work?
Also, since you're the submitter, why did you change the title? It's just "Why is SQLite Coded in C", you added the "and not Rust" part.
From the site guidelines: https://news.ycombinator.com/newsguidelines.html
Any idea what this refers to? assert is a macro in C. Is the implication that OP wants the capability of testing conditions and then turning off the tests in a production release? If so, then I think the argument is more that Go hates the idea of a preprocessor. Or have I misunderstood the point being made?
One reason I enjoy Go is because of the pragmatic stdlib. In most cases, I can get away without pulling in any 3p deps.
Now of course Go doesn’t work where you can’t tolerate GC pauses and need some sort of FFI. But because of the stdlib and faster compilation, Go somehow feels lighter than Rust.