> These days I thus view asserts as falling into two categories:
> 1. Checking problem domain assumptions.
> 2. Checking internal assumptions.
(1) is the category of assert that should not be an assert. That is an error to be handled, not asserted.

OK, to be fair, (1) is really a combination of two categories: (1a) assumptions about uncontrolled external input, and (1b) assumptions about supposedly controlled or known input. Both should be handled with error checking most of the time, but it's forgivable for (1b) to be asserted if proper error handling is too inconvenient. (1b) and (2) are problems that you need to fix, and the sooner and more clearly the issue is announced, the more likely it is to get fixed, and the easier the fix will be.
One thing I didn't see mentioned is that asserts, especially category (2), enable fuzz testing to be vastly more effective. A fuzz test doesn't have to stumble across something that causes a crash or a recognizably bad output; it just needs to trigger an assert. Which is another reason to not use asserts for unexpected input; fuzzers are supposed to give unexpected input, and your program is supposed to handle it reasonably gracefully. If you over-constrain the input, then first you'll be wrong because weirdness will sneak in anyway, and second the fuzzer is harder to implement correctly and is less powerful. The fuzzer is supposed to be a chaos monkey, and it works best if you allow it to be one.
For fuzz testing I go even further with asserts. I usually also write a function called dbg_check(), which actively goes through all internal data and checks that all the internal invariants hold. E.g., in a b-tree: the depth should be the same for all children, children should be in order, the width of every node should be between n/2 and n, and so on.
If anything breaks during fuzz testing (which is almost guaranteed), you want the program to crash as soon as possible - since that makes it much easier to debug. I'll wrap a lot of methods which modify the data structure in calls to dbg_check, calling it both before and after making changes. If dbg_check passes before a function runs, but fails afterwards - then I have a surefire way to narrow in on the buggy behaviour so I can fix it.
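A minimal sketch of this before/after dbg_check pattern, with a toy sorted container standing in for the b-tree (SortedSet and checked are illustrative names, not from the comment above):

```python
import random

# Toy stand-in for a b-tree: a container whose internal invariant is
# that its backing list stays sorted and duplicate-free.
class SortedSet:
    def __init__(self):
        self.items = []

    def insert(self, x):
        if x not in self.items:
            self.items.append(x)
            self.items.sort()

    def dbg_check(self):
        # Actively walk all internal data and assert every invariant.
        for a, b in zip(self.items, self.items[1:]):
            assert a < b, f"order invariant violated: {a} !< {b}"

def checked(method):
    # Wrap a mutating method so dbg_check runs before and after it;
    # a fuzz run then fails at the exact mutation that broke things.
    def wrapper(self, *args, **kwargs):
        self.dbg_check()
        result = method(self, *args, **kwargs)
        self.dbg_check()
        return result
    return wrapper

SortedSet.insert = checked(SortedSet.insert)

s = SortedSet()
for _ in range(1000):                   # crude fuzz loop
    s.insert(random.randint(0, 50))
print(s.items == sorted(set(s.items)))  # True
```

If an insert ever corrupts the ordering, the wrapper pinpoints exactly which call did it.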
But that's only worth it for some CLI tools. For many, I agree that spewing out an assert failure is plenty good enough.
It's also worth pointing out that this is the reason you should *never* put side-effects or security-relevant checks in assert statements. For example, you should never do something like this:
  assert f.read(4) == b"\x89PNG", "Not a PNG file"
  # proceed to read and parse the rest of the file

but rather, you should do:

  magic = f.read(4)
  assert magic == b"\x89PNG", "Not a PNG file"
so that your code doesn't suddenly break when someone decides to be clever and use -O.

Also, fun unrelated fact: Python does have something like a preprocessor, although it's rarely used. If you condition on the flag __debug__:
  if __debug__:
      expensive_runtime_check()
and then run Python with -O, the if statement and its body will be entirely deleted from the bytecode - even the `if` check will be deleted. It can be used for including "debug" code in hot code, where the extra flag check itself might be expensive.

If the `assert` compiles out, wouldn’t -O also possibly compile the `read()` out as well, given `magic` isn’t used after the assignment?
It could be optimised away if all following uses would invalidate (seek, but only with SEEK_SET or SEEK_END) or ignore (pread/pwrite) the file offset, but that seems like an enormous amount of fussy work for what I would guess is little to no payback.
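A quick way to see the -O behaviour described above (a sketch that shells out to the same interpreter with subprocess):

```python
import subprocess
import sys

# Small program whose behaviour differs under -O: the assert and the
# __debug__ block are both compiled out of the bytecode with -O.
code = r"""
checked = []
if __debug__:
    checked.append("debug-block")
assert False, "only fires without -O"
print(len(checked))
"""

normal = subprocess.run([sys.executable, "-c", code],
                        capture_output=True, text=True)
optimized = subprocess.run([sys.executable, "-O", "-c", code],
                           capture_output=True, text=True)

print(normal.returncode != 0)       # True: the assert fired
print(optimized.stdout.strip())     # "0": assert and __debug__ block gone
```

Note that the `f.read(4)` itself is not removed; only the assert statement and the `__debug__`-conditioned block disappear, which is exactly why the side effect must be hoisted out.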
It’s trivial to write a debug_assert macro in C, so no, you’re not forced to do that.
Rust provides some remarkably rich features to help you reason about the assumptions, preconditions and postconditions your code has, but debug_assert isn't one of them.
In the example of "min(ages) > 0", making age a NonZero type renders the assert unnecessary. Rust even has some fancy perf optimizations it can do with that information. It's a win all around.
How would one communicate properties/traits of a custom type (like a non-zero unit) that a compiler can leverage to optimize (in general, for any programming language)?
An example of the niche optimisations is in that link; if a 32bit number is NonZero, you can put that in an Option, and your Option<NonZero<u32>> will be the same size as a normal u32.
In Rust this might be a NewType: https://www.howtocodeit.com/articles/ultimate-guide-rust-new...
> It is on production runs that the security is most required, since it is the results of production runs that will actually be trusted as the basis of actions such as expenditure of money and perhaps even lives. The strategy now recommended to many programmers is equivalent to that of a sailor who wears a lifejacket during his training on dry land but takes it off when he is sailing his boat on the sea. It is small wonder that computers acquire a bad reputation when programmed in accordance with this common policy.
It is also quoted by Donald Knuth in "Structured programming with goto statements" (1974) [1] (which, incidentally, is also the source of the quote about premature optimization):
> He [Tony Hoare] points out quite correctly that the current practice of compiling subscript range checks into the machine code while a program is being tested, then suppressing the check during production runs, is like a sailor who wears his life preserver while training on land but leaves it behind when he sails!
[0]: https://ora.ox.ac.uk/objects/uuid:dff9483b-e72f-4599-bf90-76... p. 341
[1]: https://dl.acm.org/doi/pdf/10.1145/356635.356640 p. 269
If the ship sinks, they're worse than useless. But since you've decided they'll never be needed, you get more beer for your cruise.
- `assert` is disabled in unsafe `danger` mode or can be disabled with a flag for performance
- `doAssert` cannot be disabled
While I've never come across an argument for why there are two types of assert, over time I’ve naturally started using them in the same way as the author.

Note that Rust got this exactly right: assert and debug_assert. Clear and fail-safe.
They are both enabled in release AND debug modes. You would have to explicitly compile code with -d:danger flag to disable any assertions.
> And they didn't even name the safe version clearly!
In this context the safe version is clearly named: "release" mode. And the unsafe one is even more clearly named: "danger" mode, which obviously implies it should be used with caution.
I see asserts as a less temporary version of print() based debugging. Sometimes very useful as a quick and dirty fix, but 9 times out of 10 you’re better off with some combination of a real debugger, unit tests, tracing/logging, and better typing or validation.
For instance, if you assert that an incoming index into a function is within the bounds of a vector, then during the rest of the function the compiler can elide any bounds checking.
Run time checks should simply be enabled. Normally, you're checking some context on entry or exit from a function. 99% of the time it simply won't matter to performance. And, when it does, it will pop out at you and you can remove it.
The bigger issue as alluded to is assert() in libraries. As a user, you can't add an assert() to a library. And also, as a user, you can't remove an assert() that is getting in your way.
  (assert (< index (length some-vector))
          (index)
          "Can't change the element at index ~D, index must be smaller than ~D."
          index
          (length some-vector))
That will print a message when the index is too large and give you the option of providing another index.

E.g., imagine in this example that the code could throw an invalid-index exception, some calling code could catch that and supply a new index, and control flow would resume from the throw expression.
This would be a complete mess, but it would be interesting nonetheless :)
https://en.m.wikibooks.org/wiki/Common_Lisp/Advanced_topics/...
Basically there are three cases.
1. The performance hit of the assertion checks is okay in production
2. It has a cost but can be lived with
3. Cannot be tolerated
For number two there is some space to play with.
What I've wanted is something more fine-grained: not a global off/on switch, but a way to turn on asserts for certain modules/classes/subsections, certain kinds of checks, and so on.

It would also be nice to get some sort of data about the code coverage of assertions which have lived in a deployed program for months. You could then use that data to change the program (hopefully you can toggle these dynamically): maybe there is some often-hit assertion with a noticeable impact which checks a code path that has been stable for months. Turning it off would not mean losing that experience wholesale if you have some way to store the data. I mean: imagine there is some history tool that you can load the data about this code path into. You see the data that travels through it and what range it uses. Then you have empirical data, across however many runs (in various places), that this assertion is indeed never triggered. Which you can use to make a case about the likelihood of regression if you turn that assertion off.
That assumes that the code path is stable. You might need to enable the assertion again if the code path changes.
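One hypothetical way such per-category, runtime-toggleable assertions could look (a sketch; `check`, `enabled`, and the category names are invented for illustration):

```python
from collections import Counter

# Named assertion categories that can be toggled at runtime, with hit
# counters you could later export as "coverage" data for the argument
# described above.
enabled = {"btree": True, "codegen": True}
hits = Counter()

def check(category, cond, msg=""):
    # cond is a callable so an expensive condition is only evaluated
    # when its category is switched on.
    if enabled.get(category, False):
        hits[category] += 1
        if not cond():
            raise AssertionError(f"[{category}] {msg}")

check("btree", lambda: [1, 2] == sorted([2, 1]), "children in order")
enabled["btree"] = False          # stable for months -> switch it off
check("btree", lambda: 1 == 0)    # no longer evaluated, no failure
print(hits["btree"])              # 1
```

The hit counts give you exactly the empirical record needed to argue that a long-stable assertion is safe to disable, and flipping the flag back on when the code path changes is a one-line change.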
Just as a concrete example. You are working with some generated code in a language which doesn't support macros or some other, better way than code generation. You need to assert that the code is true to whatever it is modelling and doesn't fall out of sync. You could then enable some kind of reflection code that checks that every time the generated code interacts with the rest of the system. Then you eventually end up with twenty classes like that one and performance degrades. Well you could run these assertions until you have enough data to argue that the generated code is correct with a high level of certainty.
Then every time you need to regenerate and edit the code you would turn the assertions back on.
This is Spring Boot so it seems possible.
The cost of a failing assertion is often incurred at some point where it is most expensive. I know this because I've worked for companies that have charged exorbitant prices for my time to diagnose these sorts of failures.
The cost of a unit test is some learning and some time. LLMs are making the time portion drive towards zero.
As a general rule, I'd say avoid littering your code with assertions. It's a crappy engineering practice.
Asserts and (unit) tests are orthogonal things in terms of functionality, even though both are aimed at improving software correctness.
Failing an assert in production of course sucks and is costly. But what is more costly is letting the bug slip through and cause hard-to-diagnose bugs, program incorrectness and even (in some cases) silent address space corruption that will then manifest itself in all kinds of weird issues later on during the program run.
The whole point of (correct) use of asserts is to help make sure that the program stays within its own well-defined logic and doesn't veer off course, and if it does, to make that bug immediately as loud and clear as possible.
When the bugs are quick to detect and diagnose, you'll find that fewer and fewer asserts trigger in production, and thus you end up with improved product quality.
As a general rule I'd say use asserts liberally to verify things such as invariants and pre- and post-conditions, and ALWAYS have them built in.
Finally I want to point out that using assert is not ERROR checking (file not found, IP address not resolved, TCP connection failed, system resource failed to allocate, etc.) but BUG checking.
Do not write code for BUGS.
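To make the distinction concrete, a sketch (parse_peer and dedupe are invented examples, not from the comment):

```python
import ipaddress

def parse_peer(text):
    # ERROR checking: bad external input is an expected condition;
    # handle it, don't assert it.
    try:
        return ipaddress.ip_address(text)
    except ValueError:
        return None

def dedupe(xs):
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    # BUG checking: this invariant follows from the code itself, not
    # from the input. If it ever fails, the loop above is broken.
    assert len(out) == len(seen)
    return out

print(parse_peer("not-an-ip"))   # None (handled error)
print(dedupe([3, 1, 3, 2, 1]))   # [3, 1, 2] (invariant asserted)
```

No input, however malformed, should be able to trip the assert in dedupe; only a programmer's mistake can.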
> Do not write code for BUGS.
I have nothing to add, but am quoting the above because it is very well put.
No, I've had great experiences with assertions in code. People have paid my salary because the assertions are invalid and cause more problems than they solved. :D
> Failing an assert in production of course sucks and is costly. But what is more costly is letting the bug slip through and cause hard to diagnose bugs, program incorrectness and even (in some cases) silent address space corruption that will then manifest itself in all kinds of weird issues later on during the program run.
The direct counterpoint to this is that:
Any assertion that validates a runtime invariant can (and IMO should) be converted into a test which covers that same invariant, with coverage information proved by tooling.
This is possible unless the underlying design of the system under test is such that it prevents adequate testing, or your approach to testing is lacking. If you have those problems then asserts are a band-aid on broken practices. Moving quality checks to the left (design / compile time, not runtime) is a generally beneficial practice.
Put another way, I've seen many bugs which should have been caught cheaply and early with adequate testing practice, rather than at runtime where they caused system failures. It's a rare bug I see where that isn't the case.
Perhaps there are points where this broad recommendation doesn't apply. Safety engineering might be one of those, but the problem space of selling someone a widget over the internet rarely has that same level of need for runtime invariant testing that sending a rocket to space might.
---
On a different side of this, I do think that system-level assertions (i.e. real code paths that result in actions, not `debug_assert!` calls which result in crashing) can belong in systems to check that some process has reached a specific state. I prefer systems designed so that they (provably) never crash.
---
A third side to this is that assertions are code too. They are a place which is rarely if ever tested (and is generally impossible to test, because they cover invariants). This means they're a risk to your system that you can't mitigate.
A thought experiment for you, what if LeftPad[1] (instead of being deleted) added an assertion that the total number of characters was < 10. Removal caused a bunch of pain for devs. Assuming that this change rolled out through development chains as normal, this change would have broken many runtime systems, and would have been much more costly.
- You can't use the type system for some constraints.
- Asserts can be stripped in production so you don't pay the price for them.
- There is no other syntax for that.
Unfortunately, because most devs don't know about "-O" and use asserts for things you should use an exception for, you can't use it: https://www.bitecode.dev/p/the-best-python-feature-you-canno...
tratt might also be doing that more for consistency with the desugaring than as a routine behaviour.
You never use assert for conditions that are logical (error) conditions that the program is expected to handle.
For example, your browser handling a 404 HTTP error is only an error from the user's perspective. From the software-correctness perspective there's no error; there's just a logical condition that needs to be reasoned about.
Compare this to a BUG which is a mistake (an error made by the programmer), for example violating some invariant, going out of bounds on an array etc.
This is a scenario where the program is violating its own logic and constraints and as a result is no longer in a valid state. Assert is a tool to catch BUGS made by the programmer and nothing else.
Perhaps I should have been clearer that it was a programming error that allowed the cursor to get to a state where it wasn’t visible.
Assertions are basically opposite to that. Useless in development because you watch/test closely anyway and useless in “release” because they vanish.
As I rarely write #CLK-level performance-required code - like most developers, I believe - and mostly create systems that just do things rather than doing things in really tight loops many times a second, I always leave as much debug info and as many explicit failure modes as is reasonable in my production code, so that the few-weeks failure would be immediate and explained in the logs (“reasonable” being readable without holding pgdn and not filling the ssd just after a week).

It doesn’t mean that the system must crash hard as in sigsegv, ofc. It means it reports and logs all(!) steps and errors as they occur, in a form that is obviously greppable by all important params (client id, task id, module tag, etc), and it stops a logical process in which an inconsistency occurred from advancing further. If someone asks you later why it happened, or what happened at a specific time with a specific process, you always have the answer almost immediately.
Tl;dr: unless you write performance-first systems, i.e. you have a performance-requirements document in any form to follow, don’t turn off assertions, and do log everything. You’ll thank yourself later.
Nonetheless, I use asserts that are deactivated in release builds. The reason I do that is not because I need the speed. It's because it frees me from having to think about speed at all, when writing assertions. And that makes me write more assertions.
You could ask me, if I were to enable assertions in release builds today, what would the slowdown be? Would I even notice? And my answer would be, I don't know and I don't care. What I do know is that if assertions had been enabled in release builds from the start, then I would have written fewer of them.
The reason to not default to leaving all assertions and logging enabled is that performance-sensitive applications are pretty common. They're not performance-first, but the performance of any user-facing application matters. If leaving the asserts in provides good enough performance, do that. If dynamically enabled asserts provide good enough performance, do that -- at least you'll be able to quickly retry things. And since different asserts have different costs, do what the article says and distinguish between assert and debug_assert.