(a) You're doing something wrong (usually not standardizing your data, or similar), which means you're not accounting for the fact that computers have finite numerical precision and adjusting your problem accordingly (e.g., your data has a wide dynamic range).
Or, (b) your problem is pretty ill-conditioned and there's probably no solving it in any useful way unless you crank up the precision.
Almost every problem I've encountered of this form (whenever I've worked on optimizers, or, more generally, solving problems) is of type (a). There aren't many more choices if your problem is of type (b), since you're almost always losing digits somewhere in any implementation of a numerical algorithm.
The mean of two numbers is a particularly easy example to analyze, but I don't think the implementation is "wrong," since every (even slightly more complicated) numerical algorithm will run into the same problems. I think the post is good in its content; I just don't agree with the author's final interpretation.
You're right that there are issues with precision and floating-point math (the article touches on this a little), but the implementation is definitely wrong if it promises to work, as in the case of Python's statistics library (which correctly used the fractions module to store values exactly, but did the conversion to a fraction in the wrong order).
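For reference, current versions of Python's statistics module handle the huge-float case from the article; if I'm reading the implementation right, it sums via exact fractions internally. A quick check (numbers borrowed from elsewhere in the thread):

```python
import statistics

# two floats near the top of double range; the naive sum overflows
big = 8.98846567431158e+307
print((big + big) / 2)              # inf: naive mean overflows
print(statistics.mean([big, big]))  # finite: exact internal summation
```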
The point is that there is an “unwritten contract” between the user and the library: if we want any kind of performance, then we have to promise that the inputs are “well behaved” (in a very specific sense that varies from algorithm to algorithm). The user knows this and so does the developer, because otherwise most algorithms will spend most of the actual compute time converting problems to “fully general forms” that work on every input, but are hundreds if not thousands of times slower than the original.[0]
The point is that most people want “mean” to be fast, and a simple implementation of mean will work on almost every case, without needing to resort to arbitrary precision fractional arithmetic. The claim I make above is that the latter thing is almost never what people need, because if they did then almost every numerical algorithm is doomed from the start, whenever they perform operations that are more complex than “take the mean.” Now, if they really need this to be done for the huge numbers given (or whatever the case is), then there are libraries that will do this at a huge cost in performance. (But, if they really did need it—for example, when a problem cannot be rewritten in a simple way that would work with the available implementations—my bet is that this user would know).
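For the rare case where exactness really is needed, the slow-but-exact route can be sketched with Python's fractions module (the name exact_mean is mine):

```python
from fractions import Fraction

def exact_mean(xs):
    # exact rational arithmetic: no rounding, no overflow, but far slower
    return sum(Fraction(x) for x in xs) / len(xs)

big = 8.98846567431158e+307
print(float(exact_mean([big, big])))  # the huge inputs are handled exactly
```

The cost is real: every intermediate sum carries full-precision numerators and denominators, which is exactly the overhead the "unwritten contract" lets a fast implementation skip.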
Also apologies for mistakes, I’m currently on my phone walking to a meeting.
——
[0] That isn’t to say that you shouldn’t write a fast and general version if you can! Just that it usually incurs some overhead that is often unacceptable (see, e.g., using arbitrary-precision fractions for Gaussian elimination).
"...incorrectly posed problems were considered ‘taboo’ for generations of mathematicians, until comparatively recently it became clear that there are a number of quite meaningful problems, the so-called ‘inverse problems,’ which are nearly always unstable with respect to fluctuations of input data." — H. Allison, Inverse Unstable Problems and Some of Their Applications (1979)
A ton of the cases that are "usually" ill-posed, as in compressed sensing, physical design, etc., have some extra structure which is encoded as regularizers or priors. For example, in compressed sensing, we know the input signal is sparse, so even though we have far fewer samples than the Nyquist rate would require, we know something else about the input signal, usually encoded as an l1-regularizer or another kind of sparsifying regularizer, making the resulting optimization problem well-posed.
That being said, I'm not sure how this immediately plays into (b). My claim is that you're pretty much hosed unless you increase the general precision/range of the algorithm, which is independent of the noise in the input; it depends instead on the actual implementation of the algorithm and the atoms the algorithm uses (sums, multiplications, etc., of two floating-point numbers, for example). The upshot is that you have to increase the precision of the atoms the algorithm uses (e.g., as in the article, switching to arbitrary-precision fractions when sums of large numbers are needed).
---
NOTE: I guess more generally, I'm defining ill-conditioning/numerical instability as a problem with the algorithm in the sense that, for a fixed input precision, perfect arithmetic (the "correct" result) will differ heavily from the result with truncated arithmetic (with floats, or whatever it may be).
In addition to the largest numbers that floating point can handle, a related issue can come up with numbers that are not near the edge.
A good student of floating-point numbers will notice that there is a good chance you will get a different result when summing a list of floating-point numbers in a different order. This is due to the rounding error that occurs when you add a small number to a much larger one: the small number's contribution can be rounded away entirely. If you add up all the small numbers first, their partial sum can grow large enough that it still registers when added to the next number on the list. Thus, you should sort an array of floating-point numbers and begin the summation with the smallest.
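A tiny Python illustration of the order dependence (1e16 is large enough that a lone 1.0 added to it is lost to rounding):

```python
small_first = sum([1.0, 1.0, 1e16])  # 1.0 + 1.0 accumulates to 2.0 first
large_first = sum([1e16, 1.0, 1.0])  # each lone 1.0 is rounded away
print(small_first)  # 1.0000000000000002e+16
print(large_first)  # 1e+16
```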
Some careful thought is needed if numbers are important to your computation. While the author of this article makes important points about one aspect of this, they seem to shrug off further study of the referenced 30 page paper at https://hal.archives-ouvertes.fr/file/index/docid/576641/fil....
The authors of that paper, however, seem to misstate one vital point. They note that the mathematical formulas are "unstable," but to the point of working with floating-point numbers, it isn't the mathematics that is unstable; it is the failure to understand how deeply modern floating-point numbers are an approximation. Methods that fail to take this into account will be unstable.
Is this method more accurate than Kahan summation? And if so, is it worth the extra cost of sorting?
I suspect something similar could be done here, even if it would be a few times slower, to get an ulp-accurate mean regardless of the order/size of the individual elements.
This is the TXR Lisp interactive listener of TXR 214.
Quit with :quit or Ctrl-D on empty line. Ctrl-X ? for cheatsheet.
1> (defun mean (seq) (/ (sum seq) (len seq)))
mean
2> (mean '(8.988465674311579e+307 8.98846567431158e+307))
** out-of-range floating-point result
** during evaluation at expr-1:1 of form (sum seq)
Works for me; the error is caught. Breaking news: floating point has a limited range! Detailed story at 11.

Almost any non-trivial calculation that overflows can be massaged into something that avoids overflow. One important motivation behind floating point is to have enough practical range that you don't run into overflow.
If this is an actual practical problem someone faces, the tool to reach for is wider floating-point, not to mess with the code. A bigger "E" than +307 is a thing.
Also, multi-precision floating-point is a thing.
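In Python, one stdlib way to get that wider range is decimal, whose default context allows exponents far beyond ±308 (and conversion from a float is exact):

```python
from decimal import Decimal

a = Decimal(8.988465674311579e+307)  # exact conversion from the float
b = Decimal(8.98846567431158e+307)
mean = (a + b) / 2                   # no overflow: Decimal's Emax is huge
print(mean)
```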
#lang racket/base
(define (mean seq) (/ (apply + seq) (length seq)))
(mean '(#e8.988465674311579e+307 #e8.98846567431158e+307))
89884656743115795000[...]

He shows off a function named "sumsto" which takes a single positive double-precision floating-point number "x" as an argument.
"sumsto" always returns the same vector of 2046 floating point numbers -- just in an order so that naive left-to-right summation returns "x". Just changing the order of that vector lets you sum to almost any positive double precision number.
If you want to run the code, I didn't see where "realmax" was defined, but this works:
realmax() = prevfloat(typemax(Float64))
I have a hunch there could be a bit of relation between these things.
https://arxiv.org/abs/1505.05571
Fortunately pairwise summation (Julia's default) is fast and fairly hard to trip up, although it certainly is possible to trick it.
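Pairwise summation is simple enough to sketch in a few lines of Python (Julia's built-in sum uses a tuned version of the same idea; the base-case size here is arbitrary):

```python
def pairwise_sum(xs):
    # recursively split the list; rounding error grows like O(log n)
    # instead of the O(n) of a naive left-to-right loop
    n = len(xs)
    if n <= 8:  # small base case: plain accumulation is fine here
        s = 0.0
        for x in xs:
            s += x
        return s
    return pairwise_sum(xs[:n // 2]) + pairwise_sum(xs[n // 2:])
```

Real implementations use a much larger base case (and SIMD) so the recursion overhead is negligible.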
For that you can:
- sort the numbers (clearly not the most cost-effective solution, but easy),
- use a high-precision accumulator (which can give you an exact result if you go far enough),
- use compensated summation (such as Kahan summation, though the state of the art can do much better).
For extreme cases there are algorithms that can give you an exact result, my current favorite being Accurate Floating-Point Summation Part I: Faithful Rounding : https://www.researchgate.net/publication/228664006_Accurate_...
I guess this still assumes that the largest number in the original list is less than or equal to the maximum floating point value, but otherwise you stay roughly in the same space of precision as the original data.
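For completeness, the compensated (Kahan) summation mentioned above looks like this in Python; this is a sketch of the classic version, not the improved Kahan-Babuška variants:

```python
def kahan_sum(xs):
    s = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in xs:
        y = x - c        # apply the correction from the previous step
        t = s + y        # big + small: low-order bits of y may be lost
        c = (t - s) - y  # recover exactly what was lost
        s = t
    return s
```

(In C this only works if the compiler is forbidden from reassociating the floating-point operations; Python evaluates it as written.)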
def mean(ls):
    mu = 0.0
    for i, x in enumerate(ls):
        mu = i/(i+1) * mu + x/(i+1)
    return mu

mu = mu + (x - mu) / (i+1)

This is more natural when written in C: mu += (x - mu) / (i+1)

If you are not aware of how floating-point values work, this is an excellent illustration of that, and you should spend some time learning about the risks and edge cases associated with working with them.
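Either update form sidesteps the overflow in the article's example, since the running value never exceeds the magnitude of the data itself. A quick check using the thread's huge numbers:

```python
def running_mean(ls):
    mu = 0.0
    for i, x in enumerate(ls):
        mu += (x - mu) / (i + 1)  # incremental update; mu stays in range
    return mu

big = 8.988465674311579e+307
print(running_mean([big, big]))  # finite, instead of overflowing to inf
```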
At the same time, enabling floating-point exceptions on your platform is probably the most pragmatic way of dealing with these issues. Until you are in a domain where this level of effort is a worthwhile investment, over-engineering modules that use floating point is a waste of time.
[0] https://github.com/pjungwir/aggs_for_arrays [1] https://github.com/pjungwir/aggs_for_arrays/blob/master/arra...
And if that isn't enough, you have subnormal numbers as a safety net as well, although using these kills performance on many FPUs.
You should not use SSE for that however, but the old x87 "FPU" instructions.
using known stable numeric methods? Kahan-Babuška summation?
require 'bigdecimal'
precision = Float::DIG + 1
a = BigDecimal(8.98846567431158e+307, precision)
b = BigDecimal(8.988465674311579e+307, precision)
mean = (a + b) / 2
puts "Works" if mean > [a, b].min && mean < [a, b].max
=> Works
(edit: I used 0 first and it works fine, but it converts to precision 15 instead of 16 (which is the maximum on my computer) for some reason; I may file a bug for this)

Works for small floats too:
smalls = [1.390671161567e-309, 1.390671161567e-309, 1.390671161567e-309].map { BigDecimal(_1, precision) }
mean = smalls.sum / smalls.size
puts "Works" if mean >= smalls.min && mean <= smalls.max
=> Works
One could also use BigDecimal("8.98846567431158e+307"), but it seems like the numbers are coming in as floats.

My first guess: normalize it. Find the largest exponent and divide all values by that power of two; this is a power-of-two operation, so it would not lose any precision. Then divide all numbers by the length. Perform the same normalization again. Sum the numbers, then revert the exponent that was removed in the first step.
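That first guess can be sketched with math.frexp/math.ldexp, which expose the binary exponent directly (the name mean_scaled is mine; scaling by a power of two is exact as long as the scaled values stay in range):

```python
import math

def mean_scaled(xs):
    # find the largest binary exponent among the inputs
    m = max(math.frexp(x)[1] for x in xs)
    # divide everything by 2**m -- a power-of-two scaling, hence exact
    scaled = [math.ldexp(x, -m) for x in xs]
    # average the scaled values (all <= 1 in magnitude), then scale back
    return math.ldexp(sum(scaled) / len(xs), m)
```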
My second guess (backup plan): Use the interval arithmetic. I don't remember the specifics but I remember this is the safest form of floating point operations. But perhaps I should be careful about the performance penalty.
Suppose you want to find the mean of (1, 1, 1). If you divide by the length of the list first using integer division, you're going to compute sum(1/3, 1/3, 1/3), which reduces to sum(0, 0, 0), and 0 < min(1, 1, 1). Even with floats, dividing each element first introduces a rounding error per element.
How do REPLs and databases handle this edge case?
Python has Decimal for those who need correctness under all circumstances.
And you could add the split logic to calculating the mean of the means. Just as insurance.
So would that work?
OK but then the first example is a naive computation of the mean of two enormous numbers.
All this really illustrates is that you need to be aware of the range of your inputs and choose a solution that accommodates it.
(defn average [numbers] (/ (apply + numbers) (count numbers)))
(average [8.988465674311579e+307M, 8.98846567431158e+307M])
=> 8.9884656743115795E+307M

You could say that the “hard problem” mentioned has been removed by those abstractions.
reduce(lambda x,y: (x+y)/2.0, numlist)

Why wouldn't this work?

The mean is 2.5 if you sum them all and divide by 4.
With that method it is:
mean([1, 2, 3, 4]) = mean([1.5, 3, 4]) = mean([2.25, 4]) = 3.125
X = Mean([a,b,c]) = (a+b+c)/3
Y = YourFunction([a,b,c]) = (((a+b)/2)+c)/2 = (a+b+2c)/4
X does not equal Y. It doesn't matter if the reduce is left or right, btw, it's still wrong.
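Easy to confirm in Python: the reduce version weights later elements ever more heavily, so it only agrees with the true mean by coincidence.

```python
from functools import reduce

nums = [1.0, 2.0, 3.0, 4.0]
print(reduce(lambda x, y: (x + y) / 2.0, nums))  # 3.125, not the mean
print(sum(nums) / len(nums))                     # 2.5
```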