These operations are
1. Localized, not a function-wide or program-wide flag.
2. Completely safe — unlike -ffast-math, which includes assumptions such as "there are no NaNs", where violating the assumption is undefined behavior.
So what do these algebraic operations do? Well, one by itself doesn't do much of anything compared to a regular operation. But a sequence of them is allowed to be transformed using optimizations which are algebraically justified, as-if all operations are done using real arithmetic.
The library has some merit, but the goal you've stated here is given to you with 5 compiler flags. The benefit of the library is choosing when these apply.
This can be expanded in the future as LLVM offers more flags that fall within the scope of algebraically motivated optimizations.
feenableexcept(FE_DIVBYZERO | FE_INVALID | FE_OVERFLOW);
will cause your code to get a SIGFPE whenever a NaN crawls out from under a rock. Of course it doesn't work with fast-math enabled, but if you're unknowingly getting NaNs without fast-math, you obviously need to fix those before even trying fast-math. They can be hard to find, and feenableexcept() makes finding them a lot easier. Be very careful with it in production code though [1]. If you're in a DLL, changing the FPU exception flags is a big no-no (unless you're really, really careful to restore them when your code goes out of scope).
[1]: https://randomascii.wordpress.com/2016/09/16/everything-old-...
Not being able to auto-vectorize seems like a pretty critical bug given hardware trends that have been going on for decades now; on the other hand sacrificing platform-independent determinism isn't a trivial cost to pay either.
I'm not familiar with the details of OpenCL and CUDA on this front - do they have some way to guarantee a specific order of operations such that code always has a predictable result on all platforms and nevertheless parallelizes well on a GPU?
Most popular programming languages have the defect that they impose a sequential semantics even where it is not needed. There have been programming languages without this defect, e.g. Occam, but they have not become widespread.
Because nowadays only a relatively small number of users care about computational applications, this defect has not been corrected in any mainline programming language, though for some programming languages there are extensions that can achieve this effect, e.g. OpenMP for C/C++ and Fortran. CUDA is similar to OpenMP, even if it has a very different syntax.
The IEEE standard for floating-point arithmetic has been one of the most useful standards in all history. The reason is that both hardware designers and naive programmers have always had an incentive to cheat in order to obtain better results in speed benchmarks, i.e. to introduce errors in the results with the hope that this will not matter for users, who will be more impressed by the great benchmark results.
There are always users who need correct results more than anything else, and it can even be a matter of life and death. For the very limited-in-scope uses where correctness does not matter, i.e. mainly graphics and ML/AI, it is better to use dedicated accelerators, GPUs and NPUs, which are designed by prioritizing speed over correctness. For general-purpose CPUs, not being fully compliant with the IEEE standard is a serious mistake, because in most cases the consequences of such a choice are impossible to predict, especially for people without experience in floating-point computation, who are the most likely to attempt to bypass the standard.
Regarding CUDA, OpenMP and the like, by definition if some operations are parallelizable, then the order of their execution does not matter. If the order matters, then it is impossible to provide guarantees about the results, on any platform. If the order matters, it is the responsibility of the programmer to enforce it, by synchronization of the parallel threads, wherever necessary.
Whoever wants vectorized code should never rely on programming languages like C/C++ and the like, but they should always use one of the programming language extensions that have been developed for this purpose, e.g. OpenMP, CUDA, OpenCL, where vectorization is not left to chance.
Whether it's the standard's fault or the language's fault for following the standard in a way that prevents auto-vectorization is splitting hairs; the whole point of the standard is to have predictable and usually fairly low-error ways of performing these operations, which only works when the order of operations is defined. That very aim is the problem; to the extent the standard is harmless when ordering guarantees don't exist, you're essentially applying some of those tricky -ffast-math sub-optimizations.
But to be clear in any case: there are obviously cases where order of operations is relevant enough that accuracy-altering reorderings are not valid. It's just that those are rare enough that for many of these features I'd much prefer that to be the opt-in behavior, not opt-out. There's absolutely nothing wrong with having a classic IEEE 754 mode, and I expect it's an essential feature in some niche corner cases.
However, given the obviously huge application of massively parallel processors and algorithms that accept rounding errors (or sometimes, conversely, overly precise results!), clearly most software is willing to accept rounding errors in order to run efficiently on modern chips. It just so happens that none of the languages that map floats to IEEE 754 floats in a straightforward fashion are any good at that, which seems like a bad trade-off.
There could be multiple types of floats instead; or code-local flags that delineate special sections that need precise ordering; or perhaps even expressions that clarify how much error the user is willing to accept and then just let the compiler do some but not all transformations; and perhaps even other solutions.
We have memory ordering functions to let compilers know the atomic operation preference of the programmer… couldn’t we do the same for maths and in general a set of expressions?
AFAIK GPU code is basically always written as scalar code acting on each "thing" separately, which the hardware then, in effect, loops over as a whole, the same way multithreading would (i.e. with no ordering guaranteed at all), so you physically cannot write code that would need operation reordering to vectorize. You just can't write an equivalent of "for (each element in list) accumulator += element;" (or, well, you can, by writing that and running just one thread of it, but that's going to be slower than even the non-vectorized CPU equivalent, assuming the driver respects IEEE 754).
This is slightly obfuscated by not using a keyword like "for" or "do", by the fact that the body of the loop (the "kernel") is written in one place and the header of the loop (which gives the ranges for the loop indices) is written in another place, and by the fact that the loop indices have standard names.
A "parallel for" may have as well a syntax identical with a sequential "for". The difference is that for the "parallel for" the compiler knows that the iterations are independent, so they may be scheduled to be executed concurrently.
NVIDIA has always been greatly annoying in inventing a huge number of new terms that are just new words for concepts used for decades in the computing literature, with no apparent purpose except obfuscating how their GPUs really work. Worse, AMD has imitated NVIDIA by inventing its own terms corresponding to NVIDIA's, but once again different.
Well, all standards are bad when you really get into them, sure.
But no, the problem here is that floating point code is often sensitive to precision errors. Relying on rigorous adherence to a specification doesn't fix precision errors, but it does guarantee that software behavior in the face of them is deterministic. Which 90%+ of the time is enough to let you ignore the problem as a "tuning" thing.
But no, precision errors are bugs. And the proper treatment for bugs is to fix the bugs and not ignore them via tricks with determinism. But that's hard, as it often involves design decisions and complicated math (consider gimbal lock: "fixing" that requires understanding quaternions or some other orthogonal orientation space, and that's hard!).
So we just deal with it. But IMHO --ffast-math is more good than bad, and projects should absolutely enable it, because the "problems" it discovers are bugs you want to fix anyway.
Or just avoiding gimbal lock by other means. We went to the moon using Euler angles, but I don't suppose there's much of a choice when you're using real mechanical gimbals.
What's wrong with fun, safe math optimizations?!
(:
Is any IEEE standards committee working on FP alternatives, for example Unum and Posit [1][2]?
[1] Unum & Posit:
[2] The End of Error:
https://www.oreilly.com/library/view/the-end-of/978148223986...
https://www.forth.com/starting-forth/5-fixed-point-arithmeti...
With 32- and 64-bit numbers, you can just scale decimals up. So, Torvalds was right. In dangerous contexts (super-precise medical doses) FP may have good reasons to exist, but I am not completely sure.
Also, both Forth and Lisp traditions suggest using exact rationals before floating-point numbers. Even the toy Lisps from https://t3x.org have rationals. In Scheme, you have both exact->inexact and inexact->exact, which convert rationals to FP and vice versa.
If you have a Linux/BSD distro, you may already have Guile installed as a dependency.
Hence, run it and then:
scheme@(guile-user)> (inexact->exact 2.5)
$2 = 5/2
scheme@(guile-user)> (exact->inexact (/ 5 2))
$3 = 2.5
Thus, in Forth, I have a good set of q{+,-,*,/} operations for rationals (custom coded, literally four lines) and they work great for a good 99% of cases. As for irrational numbers, NASA used up to 16 decimals, and the old 355/113 can be precise enough for 99.99% of the pieces built on Earth. Maybe not for astronomical distances, but hey...
In Scheme:
scheme@(guile-user)> (exact->inexact (/ 355 113))
$5 = 3.1415929203539825
In Forth, you would just use : pi* 355 113 m*/ ;
with great precision for most of the objects being measured against.

“The problem is how FTZ is actually implemented on most hardware: it is not set per-instruction, but is instead controlled by the floating-point environment: more specifically, by the floating-point control register, which on most systems is set at the thread level: enabling FTZ will affect all other operations in the same thread.”
“GCC with -funsafe-math-optimizations enables FTZ (and its close relation, denormals-are-zero, or DAZ), even when building shared libraries. That means simply loading a shared library can change the results in completely unrelated code, which is a fun debugging experience.”
I find the discussion of -fassociative-math particularly interesting, because I assume that most people writing code that translates a mathematical formula into a simulation will not know which order of operations would be most accurate, and will simply codify their own derivation of the equation being simulated (which could have operations in any order). So if this switch changes your results, it probably means you should take a long hard look at the equations you're simulating and at which ordering gives you the most correct results.
That said I appreciate that the considerations might be quite different for libraries and in particular simulations for mathematics.
Then all other math will be fast-math, except where annotated.
Not sure how that interacts with this fast math thing, I don't use C
ffast-math is sacrificing both the first and the second for performance. Compilers usually sacrifice the first for the second by default with things like automatic FMA contraction. This isn't a necessary trade-off; it's just easier.
There's very few cases where you actually need accuracy down to the ULP though. No robot can do anything meaningful with femtometer+ precision, for example. Instead you choose a development balance between reproducibility (relatively easy) and accuracy (extremely hard). In robotics, that will usually swing a bit towards reproducibility. CAD would swing more towards accuracy.
I find the robotics example quite surprising in particular. I think the precision of most input sensors is less than 16 bits. If your inputs have that much noise on them, how come you need so much precision in your calculations?
I work in audio software and we have some comparison tests that compare the audio output of a chain of audio effects with a previous result. If we make some small refactoring of the code and the compiler decides to re-organize the arithmetic operations then we might suddenly get a slightly different output. So of course we disable fast-math.
One thing we do enable though, is flushing denormals to zero. That is predictable behavior and it saves some execution time.
Previous discussion: Beware of fast-math (Nov 12, 2021, https://news.ycombinator.com/item?id=29201473)
https://www.jviotti.com/2017/12/05/an-introduction-to-adas-s...
http://www.ada-auth.org/standards/22rm/html/RM-3-5-7.html
http://www.ada-auth.org/standards/22rm/html/RM-A-5-3.html
Ada also has fixed point types.
EDIT: I am now reading Goldberg 1991
Double edit: Kahan Summation formula. Goldberg is always worth going back to.
Could be wrong but that’s my gut feeling.
I'm surprised by the take that FTZ is worse than reassociation. FTZ being environmental rather than per instruction is certainly unfortunate, but that's true of rounding modes generally in x86. And I would argue that most programs are unprepared to handle subnormals anyway.
By contrast, reassociation definitely allows more optimization, but it also prohibits you from specifying the order precisely:
> Allow re-association of operands in series of floating-point operations. This violates the ISO C and C++ language standard by possibly changing computation result.
I haven't followed standards work in forever, but I imagine that the introduction of std::fma gets people most of the benefit. That, combined with something akin to volatile (if it actually worked), would probably be good enough for most people. Known numerically sensitive code paths would be carefully written, while the rest of the code base can effectively be "meh, don't care".
> This is perhaps the single most frequent cause of fast-math-related StackOverflow questions and GitHub bug reports
The second line above should settle the first.
If it’s not always correct, whoever chooses to use it chooses to allow error…
Sounds worse than worthless to me.
I've spent most of my career writing trading systems that have executed 100's of billions of dollars worth of trades, and have never had any floating point related bugs.
Using some kind of fixed point math would be entirely inappropriate for most HFT or scientific computing applications.
With fixed point and at least 2 decimal places, 10.01 + 0.01 is always exactly equal to 10.02. But with FP you may end up with something like 10.0199999999, and then you have to be extra careful anywhere you convert that to a string that it doesn't get truncated to 10.01. That could be logging (not great but maybe not the end of the world if that goes wrong), or you could be generating an order message and then it is a real problem. And either way, you have to take care every time you do that, as opposed to solving the problem once at the source, in the way the value is represented.
> Using some kind of fixed point math would be entirely inappropriate for most HFT or scientific computing applications.
In the case of HFT, this would have to depend very greatly on the particulars. I know the systems I write are almost never limited by arithmetical operations, either FP or integer.
If you need to be extremely fast (like FPGA fast), you don't waste compute transforming the fixed-point representation into floating point.
May I ask why? (generally curious)
Should I be running my accounting system on units of 10 billionths of a dollar?
But you probably should run your billing in fixed point or floating decimals with a billionth of a dollar precision, yes. Either that or you should consolidate the expenses into larger bunches.
https://ethereum.stackexchange.com/questions/158517/does-sol...
My best guess for the latter proposition is that people are reacting to the default float printing logic of languages like Java, which display a float as the shortest base-10 number that would correctly round to that value, which extremely exaggerates the effect of being off by a few ULPs. By contrast, C-style printf specifies the number of decimal digits to round to, so all the numbers that are off by a few ULPs are still correct.
[1] I'm not entirely sure about the COBOL mainframe applications, given that COBOL itself predates binary floating-point. I know that modern COBOL does have some support for IEEE 754, but that tells me very little about what the applications running around in COBOL do with it.
If you're summing up the cost of items in a webshop, then you're in the domain of accounting. If the result appears to be off by a single cent because of a rounding subtlety, then you're in trouble, because even though no one should care about that single cent, it will give the appearance that you don't know what you're doing. Not to mention the trouble you could get in for computing taxes wrong.
If, on the other hand, you're doing financial forecasting or computing stock price targets, then you're not in the domain of accounting, and using floating point for money is just fine.
I'm guessing from your post that your finance people are more like the latter. I could be wrong though - accountants do tend to use Excel.
It’s really more of a concern in accounting, when monetary amounts are concrete and represent real money movement between distinct parties. A ton of financial software systems (HFT, trading in general) deal with money in a more abstract way in most of their code, and the particular kinds of imprecision that FP introduces doesn’t result in bad business outcomes that outweigh its convenience and other benefits.
https://learn.microsoft.com/en-us/office/troubleshoot/excel/...
(It nevertheless happens to work just fine for most of what Excel is used for.)
I always have a wrapper class to put the logic of converting to whole currency units when and if needed, as well as when requirements change and now you need 4 digits past the decimal instead of 2, etc.
You can think of fixed point as equivalent to ieee754 floats with a fixed exponent and a two’s complement mantissa instead of a sign bit.
Make it work. Make it right. Make it fast.
Stop trying. Let their story unfold. Let the pain commence.
Wait 30 years and see them being frustrated trying to tell the next generation.
A similar warning applies to -O3. If an optimization in -O3 were to reliably always give better results, it wouldn't be in -O3; it'd be in -O2. So blindly compiling with -O3 also doesn't seem like a great idea.
-Ofast is the 'dangerous' one. (It includes -ffast-math).
I didn't mean to imply that they result in incorrect results.
> they make a more aggressive space/speed tradeoff...
Right...so "better" becomes subjective, depends on the use case, so it doesn't make sense to choose -O3 blindly unless you understand the trade-offs and want that side of them for the particular builds you're doing. Things that everyone wants would be in -O2. That's all I'm saying.
Please don't fulminate.