The discussion of algorithms, being informal and well-illustrated, was the high point of the books. The code, not so much.
I had the Fortran book and so, when the C book came out, I bought that as well. The C code seemed to have been auto-generated from the Fortran code, and it was really quite terrible.
Functions in the Fortran code had arguments for what I'd call "work arrays". There was some risk in that, because if the documentation told you to provide space for 2N+1 floating point values and you gave 2N, you could run into problems. I don't think that was a real problem in practice, because people were accustomed to this general way of working and so they were careful.
The C code took a different approach. The functions allocated and deallocated space as needed. And, since lots of functions called other functions for very simple things, there was a massive amount of memory juggling going on. This made the functions wildly slow. So the point of the boycott was not terribly relevant, because I think pretty much everyone trying to use that C code just rewrote it anyway.
20+ years ago I took my NR class as part of my physics PhD program; it was a required course for all physics and astronomy students.
I don't think anybody in the class was planning to use the actual recipes from the book as a drop-in replacement in their work; rather, the course explained the methods behind the software we all used (Matlab in my group, IDL and IRAF for the astronomers, etc.), so that users could make the best choice of which curve fitting, regression, or optimisation routines to call, and avoid potential pitfalls.
To me it seems like an OS class that uses Minix to explain behind-the-scenes operating system functionality. Hardly anyone in the class will go on to use Minix as their primary OS in their future career. (I've not done any formal CS study, so forgive the analogy.)
Oh, yes. The arrays in the C code were indexed from 1, following the FORTRAN code, with functions to convert the subscripts. Pointers to arrays were offset by one element size, so they didn't point to the array at all. They pointed to an address outside of the array.
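For readers who never saw it, the trick looked roughly like this (a from-memory sketch of the style, not NR's actual `vector`/`free_vector` code):

```c
#include <stdlib.h>

/* Sketch of an NR-style 1-based "vector": allocate n elements and
   hand back a pointer shifted down by one, so v[1]..v[n] follow
   Fortran indexing.  The shifted pointer points outside the
   allocation -- formally undefined behavior in C -- which is exactly
   the objection raised above. */
float *vector1(size_t n)
{
    float *v = malloc(n * sizeof *v);
    return v ? v - 1 : NULL;  /* v - 1 is the out-of-bounds pointer */
}

void free_vector1(float *v)
{
    if (v) free(v + 1);  /* undo the offset before freeing */
}
```

In practice it worked on common platforms because the offset pointer was never dereferenced at index 0, but it is the kind of thing modern sanitizers flag immediately.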
I rewrote parts of NR in C++, but ancient, 1990s C++, with pre-STL Microsoft collection classes.
Exactly. And what's nice about that approach is that it sidesteps that ludicrous license.
As many people point out, you shouldn't roll your own numeric code. Unless, of course, you are solely responsible for knowing it's what you want, are cognizant of all the tricky edge cases, and can make informed engineering decisions.
As the article says "Numerical Recipes is a good book for learning about numerical methods." I agree. If I need a quick algorithm in my language of choice, I might refer to NR in Fortran or C as a guide, but the actual implementation must be cleaned up and integrated correctly. Probably extended and specialized and combined with whatever tricks are required for the purpose. The books are (were) useful as books, but the code was never more than executable pseudo-code. I find it helpful to understand the algorithm, and once I do, I immediately begin seeing where I need to adjust it too.
And you can't copyright algorithms. So as long as I don't make an NR.py lib (which no-one needs) that contains the entire structure, sequence and organization (SSO) of the book, the publishers can stick their crappy license where they like. I don't have to boycott the book.
(But what about open-source numerical methods? Well, EISPACK had been published in 01974, 12 years before NR came out.)
Typing in code from a book was perfectly reasonable.
The issue today, though, is not what restrictions the book's copyright holders imposed, or purported to impose, in 01986. The issue is the restrictions they impose today.
People from the year 100000 might be offended by this notation. But seriously, what is the extra 0 for? Even without it, 1974 and 10000 are perfectly distinguishable, and computers that need to store far-future dates use binary storage formats anyway, so the number of ASCII decimal digits doesn't matter there.
I'm not convinced about "black art" given, for instance, the Harwell library and later NAG, although they weren't free software. (I knew about Ian Grant's work as an undergraduate, though he didn't teach me.) Later I knew how to get code from Netlib by email before we had general trans-Atlantic network access.
I found Nick Higham has a soft spot for the undergraduate textbook we had, from rather earlier, Acton's Numerical Methods that Work.
Tapes by mail was a thing ...
I've seen it breed horrors - "senior" people instructing interns to implement its matrix inversion routine line-by-line in Matlab. Today it would be Python.
Oh, and here's a list of alternatives, from the heyday of NR: https://web.archive.org/web/19990202141044/http://math.jpl.n...
> Their code has the implicit assumption that all elliptic problems have homogeneous Dirichlet boundary conditions!
I wanted to read about algorithms for choosing a random float in [0,1], and IIRC its only suggestion was randint()/INT_MAX. This is wrong in part because the smallest nonzero value you'll ever choose is 1/INT_MAX, which is much larger than the smallest nonzero float.
Is there a better book out there? I was hoping for something fabulous like Hacker's Delight but for floating point.
    while exponent < -127 || exponent > -1 {

into

    while exponent < -127 {
It uses a single call to the RNG 255/256 = 99.61% of the time. If it does need an extra RNG call, the chance it then needs yet another after that is 2^-32, ad infinitum, which is negligible. More calls to the RNG could also be needed if the generated exponent was out of bounds, which is incredibly, incredibly unlikely (~2^-33 for [0, 1), ~2^-127 for [0, 1]). Thus the expected number of RNG calls is essentially just 255/256 + 2*1/256 ~= 1.00391.

A friend of mine also made a blog post that includes another implementation and discusses it: https://lokathor.github.io/prng/
The difference between this technique and virtually all others out there on the internet is that this one is actually unbiased and can generate every single possible floating point value in [0, 1). Unbiased here means that the procedure simulates picking a real number uniformly at random from [0, 1] (with infinite precision), and then rounding it to the closest floating point value. We don't have to use infinite RNG bits to generate our infinite precision real number because we only have to generate enough bits such that it is unambiguous which floating point number we round to. The actual number of RNG bits needed is constant in expectation (see above), but unbounded, because our real number could be arbitrarily close to the boundary between two floating point values.
    from random import Random
    from math import ldexp

    class FullRandom(Random):
        def random(self):
            mantissa = 0x10_0000_0000_0000 | self.getrandbits(52)
            exponent = -53
            x = 0
            while not x:
                x = self.getrandbits(32)
                exponent += x.bit_length() - 32
            return ldexp(mantissa, exponent)
¹ https://docs.python.org/3/library/random.html#recipes

> Returns k with probability 2^(k-1).

Did you mean 2^(-k-1)?

There are other problems with what you say their suggestion was, e.g. INT_MAX should be RAND_MAX, you need to take some care with rounding/conversion, and the choice of inclusive vs. exclusive upper bound requires careful attention. But with any simple approach you just can't use most of the small float space, so the division approach with the details done correctly is pretty common!
Edit since rate-limited: orlp, I think we're only in disagreement about what qualifies as "very efficient". It's an efficient implementation of the more complex problem, but it's also much slower than the division. I wouldn't want it to be the default float generator for a general-purpose language - even most scientific modeling or statistical analysis code won't benefit from the increased output space. OP should think very hard about how they intend to use values in that range before insisting they really need it - the division approach really is sufficient, and NR is right to recommend it by default!
I agree there is much more one could want out of a uniform random float generator than just k/N where k is an integer drawn uniformly from {0, 1, …, N}, but hitting arbitrarily small floating point numbers is not on this list.
If you are intending to sample numbers in the range [0,1] such that each float occurs with equal probability then that’s fine, but it’s certainly not uniformly random on [0,1].
It's fine to counter that most of the time the difference wouldn't matter or that it might be problematic to compute for some reason or another, but I don't think it's reasonable to critique the idea on the grounds of the result not being uniform when they didn't actually ask for each float to have equal probability.
[0] https://pages.cs.wisc.edu/~david/courses/cs552/S12/handouts/...
Even if you decide to return only RAND_MAX+1 different values, I think you should divide the [0,1] interval into RAND_MAX+1 equal-sized subintervals and return the floating point value at the center of each such subinterval. The given solution has both zero and one stand for an interval of size 1/(2·RAND_MAX), but picks them with probability 1/RAND_MAX, so it picks both of them far too often.
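The midpoint variant described above would look something like this (my sketch; `frand_midpoint` is a made-up name, and it still only reaches RAND_MAX+1 distinct values):

```c
#include <stdlib.h>

/* Map k in {0, ..., RAND_MAX} to the center of one of RAND_MAX+1
   equal-width subintervals of [0,1].  Neither endpoint is
   over-weighted, and 0.0 and 1.0 are never returned exactly. */
double frand_midpoint(void)
{
    return (rand() + 0.5) / ((double)RAND_MAX + 1.0);
}
```

Every output now represents an interval of the same width 1/(RAND_MAX+1) and is drawn with exactly that probability.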
Aside from that there’s:
- rand(), on many systems, is a very poor random number generator.
- the API of rand()* is poor. Because there’s only one global state, you can’t run multiple independent generators, and it doesn’t help you much as soon as you want to generate anything other than a discrete distribution with RAND_MAX+1 items (a number that even is implementation dependent, so that portable code needs special care even if it may not be needed on most platforms). For example, because RAND_MAX may be even, you can’t even simulate a fair coin toss with a single call of rand().
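For the coin-toss case, the usual fix is rejection sampling; a sketch (my own, not a standard-library facility) that stays fair whether RAND_MAX is even or odd:

```c
#include <stdlib.h>

/* Fair coin from rand().  If RAND_MAX is even, the number of
   outcomes (RAND_MAX+1) is odd, so parity alone is biased; rejecting
   the single outcome RAND_MAX leaves an even count that splits
   evenly between odd and even.  If RAND_MAX is odd, no rejection is
   needed (or wanted -- rejecting would introduce bias). */
int fair_coin(void)
{
    int r;
    do {
        r = rand();
    } while ((RAND_MAX % 2 == 0) && r == RAND_MAX);  /* ~1 iteration expected */
    return r & 1;
}
```

The expected cost is barely more than one rand() call, since the rejected outcome occurs with probability 1/(RAND_MAX+1) at worst.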
    float frand(int *seed)
    {
        union {
            float fres;
            unsigned int ires;
        } u;  /* named: block-scope anonymous unions are a compiler extension */

        seed[0] *= 16807;  /* multiply-and-overflow PRNG step, as in the original */
        u.ires = (((unsigned int)seed[0]) >> 9) | 0x3f800000;  /* force exponent for [1,2) */
        return u.fres - 1.0f;                                  /* shift down to [0,1) */
    }
You basically assemble a float by building a random mantissa and assigning the right exponent and sign bits. It's not portable, but it's a very neat trick!
If some grad student, postdoc or independent uses it and doesn't quite follow the license terms, I expect nothing will happen. Though they would probably do well to update their code base, eventually, for technical as well as license reasons if they "scale up" beyond using the code for a paper or two. But the tone of the OP's call to "Boycott" is a bit over the top, RMS-style, and not in a good way.
Last-Modified: Thu, 10 Aug 2006 17:08:26 GMT

Also, the venerable ccmath library by the late Daniel Atkinson is an extraordinary jewel of readability, simplicity, portability and performance among numerical codes.
If you know how to code and want a leg up on understanding an algorithm, NR is a very convenient structural cheat sheet.
You can't just use a library by experts. You have to know or learn how to evaluate said library for correctness, efficiency, and stability (numerical and operational). If you know how to do that, you may be able to use a flawed library as a tool to get the shape of your processing quickly and fill in the rest of the details yourself.
[1] They've published a major book through three editions and the book cover calls them leading scientists.
Quoting from a review: "The authors of Numerical Recipes were not specialists in numerical analysis or mathematical software prior to publication of this book and its software, and this deficiency shows WHENEVER WE TAKE A CLOSE LOOK AT A TOPIC in the book [editor's emphasis]. The authors have attempted to cover a very extensive range of topics. They have basically found `some' way to approach each topic rather than finding one of the best contemporary ways. In some cases they were apparently not aware of standard theory and algorithms, and consequently devised approaches of their own. The MEDFIT code of section 14.6 is a particularly unfortunate example of this latter situation.
"One should independently check the validity of any information or codes obtained from `Numerical Recipes'...."
Matrix computations, by Golub and van Loan.
There are NR algorithms not present in MATLAB that are very useful. When trying to find, for each element of one vector, the nearest index in another very large monotonically increasing vector, you could use interp1, but a binary search is much faster. There is no binary search function in MATLAB, and the NR-C binary search algorithm is lightning fast when ported to MATLAB. If you want to find many nearest indexes at once, then interp1 is likely faster.
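The bisection idea is the same in any language; here is a sketch of nearest-index lookup in C (my own code, not the NR routine):

```c
#include <stddef.h>

/* Index of the element of sorted ascending x[0..n-1] closest to t,
   found by bisection in O(log n) rather than a linear scan.
   Requires n >= 2. */
size_t nearest_index(const double *x, size_t n, double t)
{
    size_t lo = 0, hi = n - 1;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (x[mid] <= t)
            lo = mid;
        else
            hi = mid;
    }
    /* lo and hi now bracket t; pick whichever neighbor is closer. */
    return (t - x[lo] <= x[hi] - t) ? lo : hi;
}
```

For x = {0.0, 1.0, 2.5, 4.0, 8.0} and t = 3.9, this returns index 3 (the element 4.0).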
I don't think the algos are obsolete though.
Think about a student who learned from NR, then in grad school needs a method, and in the spirit of "I'm smart, I can do this myself", copies the method out of the book.
In larger projects, someone is more likely to be aware of alternatives, and the issues related to using source covered under the NR copyright.
Back in ... 1991? I did use the C version of the Simplex Method almost straight from the book. I needed a concise linear programming optimization module that would be called in the middle of a not-really-concise production programming system handling (automotive) engines and gearboxes production for a couple of FIAT plants.
Basically a short-period scheduler to decide what to produce (slave to an immense system running on IBM mainframes; COBOL, again).
So I quickly put together the C version (with minimal modifications) and it chugged along happily on a VAX for at least a decade.
In C++ or even better a language like D or Ada or whatever with built-in slices you can write assertions that explicitly encode assumptions and "what I meant to write, modulo bugs"-isms much more easily than in C. How long is that array? How big can that float be before the model is unstable? Maybe express the units in the type system etc.
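For instance (a hypothetical C sketch; in C++/D/Ada the length precondition could move into the slice or array type itself), assertions can make those assumptions executable:

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical smoothing pass whose entry assertions spell out the
   assumptions the caller must honor -- minimum length and a
   stability bound on the values -- instead of leaving them as
   unwritten conventions in a comment. */
void smooth3(double *v, size_t n)
{
    assert(n >= 3);  /* need a neighbor on each side */
    for (size_t i = 0; i < n; i++)
        assert(isfinite(v[i]) && fabs(v[i]) < 1e6);  /* model stability bound */

    double prev = v[0];  /* remember the old left neighbor as we overwrite */
    for (size_t i = 1; i + 1 < n; i++) {
        double cur = v[i];
        v[i] = (prev + cur + v[i + 1]) / 3.0;  /* 3-point moving average */
        prev = cur;
    }
}
```

The checks cost nothing in release builds (compile with NDEBUG) but turn silent numerical garbage into an immediate, located failure during development.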
It doesn't necessarily make the license a farce. An Oracle-like business model can be established by auditing the users of the NR library and assessing fees for license violations, including the inadvertent ones.
As an undergrad, I had a research project which used Markov methods, and I had to generate some fairly large (at the time) matrices. I wrote that code in Fortran, on a PC/AT system (1986, so ...). It also ran on an IBM 3090 VF 180 mainframe at Stony Brook.
What I recall from that time was the professor giving me a text on Markov methods, and me learning what I needed to construct the transition probabilities (matrix elements). I remember writing extensive tests within the code, called sanity checks, which I reported on. Fortran had no assert statement at the time, and it's nice to see a "COMPUTATION IS INSANE DO NOT TRUST" error message when you run into problems.
Honestly, that process taught me a number of things.
1st, the code you need may or may not have been written by someone else, and you really need to make sure you have sanity checks built in to establish the portions you can and cannot trust. Later when I used the NR code, I added sanity checks into their code, to make sure we weren't getting into strange spaces.
2nd, it is (often very) important to exploit mathematical features of your problems (symmetries, relation of derivatives to exact closed form equations, etc.) to simplify your calculation. Plug and crank is usually the wrong approach, and you need a solid understanding of your problem domain to have a fighting chance at not producing a garbage simulation.
I've followed these coding practices for decades at this point, through many languages and systems. NR was a great way to start with these things, but I picked up Hamming's, Hildebrand's, and many other numerical analysis books along the way, taught myself, and generated code that was hopefully sane.
I don't care so much about the license of NR, as I am not distributing code containing it. My thesis advisor warned me about that in early 90s. So that was removed from code I shared.
I guess I am a bit sad that I offered said book to offspring as she starts a PhD in MechE later this year, and I don't think she wants it. She did take my numerical analysis books though, so I'm happy about that.
I get that for most people just using Numpy, GSL or a similar library would be sufficient, but what if you really want to implement the methods yourself?
First make sure you're familiar with background issues in numerical analysis/computation such that you're generally familiar with IEEE floating point and its foibles, numerical stability, basic error analysis etc.
Figure out which algorithm you're interested in, find a couple of open source implementations (academics generally generate a vast array of these, though many may become abandonware), look at any papers they are associated with or mention, and read those. Now you're ready to start writing.
Once you have a basic implementation, most fields have a set of standard problems they benchmark on. Find those and see how you compare on numerical accuracy and performance against the libraries you looked at before.
So, what's the book for learning that?
I really, strongly believe that our obsession with measurement and repeatability is a net negative that has taught a whole generation to treat cooking like a standardised test rather than a creative form of human expression.
You look at the older generations from cultures that value cooking and they simply don't give a fuck about what the recipe says to do.
I hate that almost no recipe specifies salt in grams. How much salt fits in half a teaspoon is pretty arbitrary since it strongly depends on the type of salt. For many dishes it doesn't matter, since you just salt according to taste. But if you are making meatloaf, getting the amount of salt right is essential, and you can't really taste it unless you like tasting raw meat, so an amount in gram would be appreciated.
The vast majority of people just want to make some shit to eat everyday. And precise measurements are great for that.
Wouldn't that make cooking much less accessible?
To continue this off-topic rant, recipes have been receiving a lot of attention lately, and the medium is well overdue an overhaul to cut down on the fluff. A comprehensive wiki, with cooking markup (eg. Cooklang), would be amazing and would fill a gap that hasn't been filled since Larousse Gastronomique was released. The idea that recipes are 'owned' by whoever happened to publish them first is crazy, but could nonetheless be addressed with an attribution license (eg. the new MIT-0?)
At least one commenter here dismissed NR, saying that it is equivalent to an undergraduate numerical analysis book. Well, duh! That's one of its strengths! It is a good book for an audience like me -- somebody with scientific background but with no formal numerical analysis training who needs to write some numerical code.
The target reader is an engineer who needs to implement something specific. That's a very common scenario -- you are given the job of implementing some calculation as part of a larger project. You're not an expert in the particular mathematical discipline. You don't want to fool around trying to find an algo in some book since you don't know which of the zillions of numerical analysis books to look at. Also, most books will just give you a general algo, not ready-to-run code. Everybody knows how hard it can be to translate a wordy description of an algo into real code. If you had the budget you'd just hire somebody to do it for you. NR solves your problem -- you use the code and your problem is solved so you can move on to the next step of your project.
A different commenter here suggested using Golub and Van Loan instead. That's laughable. First off, as others pointed out, their "Matrix Computations" book addresses numerical linear algebra exclusively. NR is a general-purpose numerical analysis book, so covers a much larger topic set. But secondly, Matrix Computations is difficult reading for non-mathematicians. Even if I needed to implement some sort of matrix decomposition, I would find it a pain to use that book as a source.
(Yes, I am sure some testosterone-soaked HN commenter will now say that they read "Matrix Computations" in 3rd grade as a way to prove how smart they are. Allow me to roll my eyes now.)
The real complaint made in the linked post involves the licensing terms of the code in NR. I agree the license stinks by today's standards. But the book was originally written back in the 1980s. My copy is from the early 1990s. Back then the GPL was basically unknown outside of a group of happy hackers at MIT. Software licensing was certainly not a topic occupying the mindspace of ordinary scientists or engineers. There was some DOS-based "freeware" circulating on 3.5" disks, but the whole "free (libre) software" movement was still in the future -- it needed the internet to become widespread in order to take off. Finally, I can imagine the NR authors wanted some sort of copyright or license which prevented wholesale copying of their work by folks who would try to profit off it. It's reasonable for them to have wanted some monetary reward for their hard work. Their license is probably an artifact of its time.
In my travels I have seen situations where code used in commercial products incorporates this or that function presented in NR. For example, I saw the FFT algo used in a test widget. (I authored none of the copied code -- I simply observe it in projects done by others.) What's funny is that NR's FFT algo is not particularly great. However, the guys who needed an FFT probably didn't know that. They probably would not have cared anyway -- they just wanted a working solution and that's what they got from NR. I am sure that little violations of the NR license happen all the time. It may not be strictly legal, but I also see people making Xerox copies of copyrighted books and journal articles all the time too.
Finally, regarding alternatives like GSL, ccmath, or whatever is available on netlib, those projects post-date NR. In any event, the internet is what made all those things widely available. I bought my copy of NR in a bookstore before that happened.
If there’s a better book that is accessible to the masses I’d like to know about it. But I haven’t found it yet.
No, Netlib started in April 1985[1], though as a physicist or engineer you likely had access to NAG or something similar already. It must have made an impression on me as I still remember F02ABF... (I hasten to add I hit scientific computing early in life.)
1. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.73...
There are normally many subtle gotchas that affect numerical stability of a method, as well as great attention to detail required to handle degenerate edge cases. And that's before we get to subtle bugs that only trigger occasionally or are otherwise hard to detect.
It's not like NAG/IMSL/HSL etc. don't predate NR, in some cases by several decades. Libraries were always available, though back in the day you may have needed to pay for them.
This is empty advice without some context. Many of us work precisely in rolling new numerics.