> These days I thus view asserts as falling into two categories:
> 1. Checking problem domain assumptions.
> 2. Checking internal assumptions.
(1) is the category of assert that should not be an assert. That is an error to be handled, not asserted.

OK, to be fair, (1) is really a combination of two categories: (1a) assumptions about uncontrolled external input, and (1b) assumptions about supposedly controlled or known input. Both should be handled with error checking most of the time, but it's forgivable for (1b) to be asserted if proper error handling is too inconvenient. (1b) and (2) are problems that you need to fix, and the sooner and more clearly the issue is announced, the more likely it is to get fixed, and the easier the fix will be.
One thing I didn't see mentioned is that asserts, especially category (2), enable fuzz testing to be vastly more effective. A fuzz test doesn't have to stumble across something that causes a crash or a recognizably bad output; it just needs to trigger an assert. Which is another reason to not use asserts for unexpected input; fuzzers are supposed to give unexpected input, and your program is supposed to handle it reasonably gracefully. If you over-constrain the input, then first you'll be wrong because weirdness will sneak in anyway, and second the fuzzer is harder to implement correctly and is less powerful. The fuzzer is supposed to be a chaos monkey, and it works best if you allow it to be one.
For fuzz testing I go even further with asserts. I usually also write a function called dbg_check(), which actively goes through all internal data and checks that all the internal invariants hold. E.g., in a b-tree: the depth should be the same for all children, children should be in order, the width of every node should be between n/2 and n, and so on.
If anything breaks during fuzz testing (which is almost guaranteed), you want the program to crash as soon as possible - since that makes it much easier to debug. I'll wrap a lot of methods which modify the data structure in calls to dbg_check, calling it both before and after making changes. If dbg_check passes before a function runs, but fails afterwards - then I have a surefire way to narrow in on the buggy behaviour so I can fix it.
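A minimal sketch of this before/after dbg_check pattern, with a toy sorted container standing in for the b-tree (SortedSet and checked are illustrative names, not from the comment above):

```python
import random

# Toy stand-in for a b-tree: a container whose internal invariant is
# that its backing list stays sorted and duplicate-free.
class SortedSet:
    def __init__(self):
        self.items = []

    def insert(self, x):
        if x not in self.items:
            self.items.append(x)
            self.items.sort()

    def dbg_check(self):
        # Actively walk all internal data and assert every invariant.
        for a, b in zip(self.items, self.items[1:]):
            assert a < b, f"order invariant violated: {a} !< {b}"

def checked(method):
    # Wrap a mutating method so dbg_check runs before and after it;
    # a fuzz run then fails at the exact mutation that broke things.
    def wrapper(self, *args, **kwargs):
        self.dbg_check()
        result = method(self, *args, **kwargs)
        self.dbg_check()
        return result
    return wrapper

SortedSet.insert = checked(SortedSet.insert)

s = SortedSet()
for _ in range(1000):                   # crude fuzz loop
    s.insert(random.randint(0, 50))
print(s.items == sorted(set(s.items)))  # True
```

If an insert ever corrupts the ordering, the wrapper pinpoints exactly which call did it.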
But that's only worth it for some CLI tools. For many, I agree that spewing out an assert failure is plenty good enough.
It's also worth pointing out that this is the reason you should *never* put side-effects or security-relevant checks in assert statements. For example, you should never do something like this:
  assert f.read(4) == b"\x89PNG", "Not a PNG file"
  # proceed to read and parse the rest of the file

but rather, you should do:

  magic = f.read(4)
  assert magic == b"\x89PNG", "Not a PNG file"
so that your code doesn't suddenly break when someone decides to be clever and use -O.

Also, fun unrelated fact: Python does have something like a preprocessor, although it's rarely used. If you condition on the flag __debug__:
  if __debug__:
      expensive_runtime_check()
and then run Python with -O, the if statement and its body will be entirely deleted from the bytecode - even the `if` check will be deleted. It can be used for including "debug" code in hot code, where the extra flag check itself might be expensive.

If the `assert` compiles out, wouldn’t -O also possibly compile the `read()` out as well, given `magic` isn’t used after the assignment?
It could be optimised away if all following uses would invalidate (seek, but only with SEEK_SET or SEEK_END) or ignore (pread/pwrite) the file offset, but that seems like an enormous amount of fussy work for what I would guess is little to no payback.
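A quick way to see the -O behaviour described above (a sketch that shells out to the same interpreter with subprocess):

```python
import subprocess
import sys

# Small program whose behaviour differs under -O: the assert and the
# __debug__ block are both compiled out of the bytecode with -O.
code = r"""
checked = []
if __debug__:
    checked.append("debug-block")
assert False, "only fires without -O"
print(len(checked))
"""

normal = subprocess.run([sys.executable, "-c", code],
                        capture_output=True, text=True)
optimized = subprocess.run([sys.executable, "-O", "-c", code],
                           capture_output=True, text=True)

print(normal.returncode != 0)       # True: the assert fired
print(optimized.stdout.strip())     # "0": assert and __debug__ block gone
```

Note that the `f.read(4)` itself is not removed; only the assert statement and the `__debug__`-conditioned block disappear, which is exactly why the side effect must be hoisted out.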
It’s trivial to write a debug_assert macro in C, so no, you’re not forced to do that.
Rust provides some remarkably rich features to help you reason about the assumptions, preconditions and postconditions your code has, but debug_assert isn't one of them.
In the example of "min(ages) > 0", making age a NonZero type renders the assert unnecessary. Rust even has some fancy perf optimizations it can do with that information. It's a win all around.
How would one communicate properties/traits of a custom type (like a non-zero unit) that a compiler can leverage to optimize (in general, for any programming language)?
An example of the niche optimisations is in that link; if a 32bit number is NonZero, you can put that in an Option, and your Option<NonZero<u32>> will be the same size as a normal u32.
In Rust this might be a NewType: https://www.howtocodeit.com/articles/ultimate-guide-rust-new...
> It is on production runs that the security is most required, since it is the results of production runs that will actually be trusted as the basis of actions such as expenditure of money and perhaps even lives. The strategy now recommended to many programmers is equivalent to that of a sailor who wears a lifejacket during his training on dry land but takes it off when he is sailing his boat on the sea. It is small wonder that computers acquire a bad reputation when programmed in accordance with this common policy.
It is also quoted by Donald Knuth in "Structured programming with goto statements" (1974) [1] (which, incidentally, is also the source of the quote about premature optimization):
> He [Tony Hoare] points out quite correctly that the current practice of compiling subscript range checks into the machine code while a program is being tested, then suppressing the check during production runs, is like a sailor who wears his life preserver while training on land but leaves it behind when he sails!
[0]: https://ora.ox.ac.uk/objects/uuid:dff9483b-e72f-4599-bf90-76... p. 341
[1]: https://dl.acm.org/doi/pdf/10.1145/356635.356640 p. 269
If the ship sinks, they're worse than useless. But since you've decided they'll never be needed, you get more beer for your cruise.
- `assert` is disabled in unsafe `danger` mode or can be disabled with a flag for performance
- `doAssert` cannot be disabled
While I've never come across an argument for why there are two types of assert, over time I’ve naturally started using them in the same way as the author.

Note that Rust got this exactly right: assert and debug_assert. Clear and fail-safe.
They are both enabled in release AND debug modes. You would have to explicitly compile code with -d:danger flag to disable any assertions.
> And they didn't even name the safe version clearly!
In this context the safe version is clearly named: "release" mode. And the unsafe one is even more clearly named: "danger" mode, which obviously implies it should be used with caution.
I see asserts as a less temporary version of print() based debugging. Sometimes very useful as a quick and dirty fix, but 9 times out of 10 you’re better off with some combination of a real debugger, unit tests, tracing/logging, and better typing or validation.
For instance, if you assert that an incoming index into a function is within the bounds of a vector, then during the rest of the function the compiler can elide any bounds checking.
Run time checks should simply be enabled. Normally, you're checking some context on entry or exit from a function. 99% of the time it simply won't matter to performance. And, when it does, it will pop out at you and you can remove it.
The bigger issue as alluded to is assert() in libraries. As a user, you can't add an assert() to a library. And also, as a user, you can't remove an assert() that is getting in your way.
  (assert (< index (length some-vector))
          (index)
          "Can't change the element at index ~D, index must be smaller than ~D."
          index
          (length some-vector))
That will print a message when the index is too large and give you the option of providing another index.

E.g., imagine in this example that the code could throw an invalid-index exception, some calling code could catch that and supply a new index, and control flow would resume from the throw expression.
This would be a complete mess, but it would be interesting nonetheless :)
https://en.m.wikibooks.org/wiki/Common_Lisp/Advanced_topics/...
Basically there are three cases.
1. The performance hit of the assertion checks is okay in production
2. It has a cost but can be lived with
3. Cannot be tolerated
For number two there is some space to play with.
What I've wanted is something more fine-grained: not a global off/on switch, but a way to turn on asserts for certain modules/classes/subsections, certain kinds of checks, and so on.

It would also be nice to get some sort of data about the code coverage of assertions which have lived in a deployed program for months. You could then use that data to change the program (hopefully you can toggle these dynamically): maybe there is some often-hit assertion with a noticeable impact which checks a code path that has been stable for months. Turning it off would not mean losing that experience wholesale if you have some way to store the data. I mean: imagine there is some history tool that you can load the data about this code path into. You see the data that travels through it and what range it uses. Then you have empirical data, across however many runs (in various places), that this assertion is indeed never triggered. Which you can use to make a case about the likelihood of regression if you turn that assertion off.
That assumes that the code path is stable. You might need to enable the assertion again if the code path changes.
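One hypothetical way such per-category, runtime-toggleable assertions could look (a sketch; `check`, `enabled`, and the category names are invented for illustration):

```python
from collections import Counter

# Named assertion categories that can be toggled at runtime, with hit
# counters you could later export as "coverage" data for the argument
# described above.
enabled = {"btree": True, "codegen": True}
hits = Counter()

def check(category, cond, msg=""):
    # cond is a callable so an expensive condition is only evaluated
    # when its category is switched on.
    if enabled.get(category, False):
        hits[category] += 1
        if not cond():
            raise AssertionError(f"[{category}] {msg}")

check("btree", lambda: [1, 2] == sorted([2, 1]), "children in order")
enabled["btree"] = False          # stable for months -> switch it off
check("btree", lambda: 1 == 0)    # no longer evaluated, no failure
print(hits["btree"])              # 1
```

The hit counts give you exactly the empirical record needed to argue that a long-stable assertion is safe to disable, and flipping the flag back on when the code path changes is a one-line change.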
Just as a concrete example. You are working with some generated code in a language which doesn't support macros or some other, better way than code generation. You need to assert that the code is true to whatever it is modelling and doesn't fall out of sync. You could then enable some kind of reflection code that checks that every time the generated code interacts with the rest of the system. Then you eventually end up with twenty classes like that one and performance degrades. Well you could run these assertions until you have enough data to argue that the generated code is correct with a high level of certainty.
Then every time you need to regenerate and edit the code you would turn the assertions back on.
This is Spring Boot so it seems possible.
The cost of a failing assertion is often incurred at some point where it is most expensive. I know this because I've worked for companies that have charged exorbitant prices for my time to diagnose these sorts of failures.
The cost of a unit test is some learning and some time. LLMs are making the time portion drive towards zero.
As a general rule, I'd say avoid littering your code with assertions. It's a crappy engineering practice.
Asserts and (unit) tests are orthogonal things in terms of functionality, even though both are aimed at improving software correctness.
Failing an assert in production of course sucks and is costly. But what is more costly is letting the bug slip through and cause hard-to-diagnose bugs, program incorrectness and even (in some cases) silent address space corruption that will then manifest itself in all kinds of weird issues later on during the program run.
The whole point of (correct) use of asserts is to help make sure that the program stays within its own well-defined logic and doesn't veer off course, and if it does, to make that bug immediately as loud and clear as possible.
When the bugs are quick to detect and diagnose, you'll find that fewer and fewer asserts trigger in production, and thus you end up with improved product quality.
As a general rule I'd say use asserts liberally to verify things such as invariants and pre- and post-conditions, and ALWAYS have them built in.
Finally I want to point out that using assert is not ERROR checking (file not found, IP address not resolved, TCP connection failed, system resource failed to allocate, etc.) but BUG checking.
Do not write code for BUGS.
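To make the distinction concrete, a sketch (parse_peer and dedupe are invented examples, not from the comment):

```python
import ipaddress

def parse_peer(text):
    # ERROR checking: bad external input is an expected condition;
    # handle it, don't assert it.
    try:
        return ipaddress.ip_address(text)
    except ValueError:
        return None

def dedupe(xs):
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    # BUG checking: this invariant follows from the code itself, not
    # from the input. If it ever fails, the loop above is broken.
    assert len(out) == len(seen)
    return out

print(parse_peer("not-an-ip"))   # None (handled error)
print(dedupe([3, 1, 3, 2, 1]))   # [3, 1, 2] (invariant asserted)
```

No input, however malformed, should be able to trip the assert in dedupe; only a programmer's mistake can.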
> Do not write code for BUGS.
I have nothing to add, but am quoting the above because it is very well put.
No, I've had great experiences with assertions in code. People have paid my salary because the assertions are invalid and cause more problems than they solved. :D
> Failing an assert in production of course sucks and is costly. But what is more costly is letting the bug slip through and cause hard to diagnose bugs, program incorrectness and even (in some cases) silent address space corruption that will then manifest itself in all kinds of weird issues later on during the program run.
The direct counterpoint to this is that:
Any assertion that validates a runtime invariant can (and IMO should) be converted into a test which covers that same invariant, with coverage information proved by tooling.
This is possible unless the underlying design of the system under test is such that it prevents adequate testing, or your approach to testing is lacking. If you have those problems then asserts are a band-aid on broken practices. Moving quality checks to the left (design / compile time, not runtime) is a generally beneficial practice.
Put another way, I've seen many bugs which should have been caught cheaply and early with adequate testing practice, rather than at runtime where they caused system failures. It's a rare bug I see where that isn't the case.
Perhaps there are points where this broad recommendation doesn't apply. Safety engineering might be one of those, but the problem space of selling someone a widget over the internet rarely has that same level of need for runtime invariant testing that sending a rocket to space might.
---
On a different side of this, I do think that system-level assertions (i.e. real code paths that result in actions, not `debug_assert!` calls which result in crashing) can belong in systems to check that some process has reached a specific state. I prefer systems designed so that they (provably) never crash.
---
A third side to this is that assertions are code too. They are a place which is rarely if ever tested (and is generally impossible to test, because they cover invariants). This means they're a risk to your system that you can't mitigate.
A thought experiment for you, what if LeftPad[1] (instead of being deleted) added an assertion that the total number of characters was < 10. Removal caused a bunch of pain for devs. Assuming that this change rolled out through development chains as normal, this change would have broken many runtime systems, and would have been much more costly.
- You can't use the type system for some constraints.
- Asserts can be stripped in production so you don't pay the price for them.
- There is no other syntax for that.
Unfortunately, because most devs don't know about "-O" and use asserts for things you should use an exception for, you can't use it: https://www.bitecode.dev/p/the-best-python-feature-you-canno...
tratt might also be doing that more for consistency with the desugaring than as a routine behaviour.
You never use assert for conditions that are logical (error) conditions that the program is expected to handle.
For example, your browser handling a 404 HTTP error is only an error from the user's perspective. From the software-correctness perspective there's no error; there's just a logical condition that needs to be reasoned about.
Compare this to a BUG which is a mistake (an error made by the programmer), for example violating some invariant, going out of bounds on an array etc.
This is a scenario where the program is violating its own logic and constraints and as a result is no longer in a valid state. Assert is a tool to catch BUGS made by the programmer and nothing else.
Perhaps I should have been clearer that it was a programming error that allowed the cursor to get to a state where it wasn’t visible.
Assertions are basically opposite to that. Useless in development because you watch/test closely anyway and useless in “release” because they vanish.
As I rarely write #CLK-level performance-required code - like most developers, I believe - and mostly create systems that just do things rather than doing things in really tight loops many times a second, I always leave as much debug info and as many explicit failure modes as is reasonable in my production code, so that the few-weeks failure would be immediate and explained in the logs (“reasonable” being readable without holding pgdn and not filling the ssd just after a week).

It doesn’t mean that the system must crash hard as in sigsegv, ofc. It means it reports and logs all(!) steps and errors as they occur, in a form that is obviously greppable by all important params (client id, task id, module tag, etc), and it stops a logical process in which an inconsistency occurred from advancing further. If someone asks you later why it happened, or what happened at a specific time with a specific process, you always have the answer almost immediately.
Tl;dr: unless you write performance-first systems, i.e. you have a performance-requirements document in any form to follow, don’t turn off assertions, and do log everything. You’ll thank yourself later.
Nonetheless, I use asserts that are deactivated in release builds. The reason I do that is not because I need the speed. It's because it frees me from having to think about speed at all, when writing assertions. And that makes me write more assertions.
You could ask me, if I were to enable assertions in release builds today, what would the slowdown be? Would I even notice? And my answer would be, I don't know and I don't care. What I do know is that if assertions had been enabled in release builds from the start, then I would have written fewer of them.
The reason to not default to leaving all assertions and logging enabled is that performance-sensitive applications are pretty common. They're not performance-first, but the performance of any user-facing application matters. If leaving the asserts in provides good enough performance, do that. If dynamically enabled asserts provide good enough performance, do that -- at least you'll be able to quickly retry things. And since different asserts have different costs, do what the article says and distinguish between assert and debug_assert.