I mostly used C for my (small-scale) HPC work in grad school because it’s what I knew best, but at several points I wished I had learned Fortran instead.
Probably one of the only “higher level” languages that’s ever been used for serious petascale scientific computing is Julia (first with Celeste on the astro side, possibly soon with CliMA for climate modeling), which not coincidentally follows array indexing conventions similar to Fortran’s. And while that’s what I mostly use now, I don’t see Fortran going away any time soon.
If anything, with cool new projects like LFortran [1] making it possible to use Fortran more interactively, it’s probably quite a good time to learn modern Fortran!
They map well to practically freakin' everything, for what seems like not that much effort on the language-design side, but an enormous amount of tedious, duplicated effort on the user side.
So a language with multidimensional arrays is in a lose-lose position of having to choose to either satisfy the linear algebraists at the cost of alienating general-purpose programmers who want to do pointer arithmetic, or else satisfy the latter while alienating the core demographic for multidimensional numeric arrays.
Personally, I’m fine with (or even slightly prefer) one-based for my own scientific computing, despite starting with C, since it really is more elegant for linear algebra, and I have never found myself needing or wanting to do pointer arithmetic in a language that does have good multidimensional arrays — but clearly it is still a major turn-off to many others.
And thus specialized data types, operators and syntax look like overhead. And thus language designers leave them out.
Most languages that prioritize Fortran code interop also adopt column-major order, but most other languages that support multidimensional arrays do row-major order. I'm not sure why Fortran went column-major but because it did, a lot of libraries designed for Fortran callers (such as LAPACK and all BLAS implementations) need to be told that input arrays have been transposed when they come from languages like C++.
[1] https://en.wikipedia.org/wiki/Row-_and_column-major_order
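For anyone who hasn't run into this before: the two orders are just two ways of linearizing the same 2-D index, which is exactly why a C caller has to tell LAPACK/BLAS its arrays look transposed. A pure-Python sketch (illustrative only, no numpy):

```python
# The same logical 2x3 matrix A = [[1, 2, 3], [4, 5, 6]] stored both ways.
def row_major_index(i, j, nrows, ncols):
    return i * ncols + j      # C-style: each row is contiguous

def col_major_index(i, j, nrows, ncols):
    return j * nrows + i      # Fortran-style: each column is contiguous

row_major = [1, 2, 3, 4, 5, 6]    # rows laid out back to back
col_major = [1, 4, 2, 5, 3, 6]    # columns laid out back to back

# Both storage schemes recover the same logical element A(1,2) = 6:
assert row_major[row_major_index(1, 2, 2, 3)] == 6
assert col_major[col_major_index(1, 2, 2, 3)] == 6
```

Hand a row-major buffer to a routine that assumes column-major indexing and it will silently read the transpose, which is why the transpose flags exist.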
The number of woeful ad-hoc solutions I've seen to people handling matrix data in Java/C# for what should've been otherwise very basic analysis ...
Something with pandas-like capability in a lower level language would be amazing.
2. Despite the sneers and derision that Fortran has been subjected to from non-Fortran programmers over recent decades, Fortran is an excellent language to do intensive scientific and mathematics work. Compilers are optimized for mathematical speed and large, intensive calculations. From the outset, it could handle complex numbers, double precision, etc., etc. natively, without having to resort to calling libraries/special routines as other languages had to do back then.
3. The scientific enterprise, alongside mainframes and supercomputers, has well-established and stable ways of working, including program and data exchange. Essentially, a well-established computing ecosystem/infrastructure surrounds scientific computing that researchers know and understand well. There is no need to change it as it works well. Moreover, it's a stable and reliable environment when compared to other computing environments. Fortran was introduced long before the cowboys entered the computing field; back then the programming/computing environment was more formal and structured, and this contributed to that stability.
For the record, Fortran was the first language I learned, and my programming back then was done on an IBM-360 using KP-26 and KP-29 keypunches and 80-column Hollerith cards.
[1] https://en.wikipedia.org/wiki/Monte_Carlo_N-Particle_Transpo...
At least until sometime in the '80s another advantage of Fortran over C was that it could handle single precision floating point. C always promoted float to double when you did arithmetic with it.
Arithmetic was faster on floats than doubles, and that could make quite a difference in a big simulation that was going to take a long time to run.
The high energy physics group at Caltech back then had a locally modified version of the C compiler that would use single precision when the code did arithmetic on floats. Some of the physicists used that for programs that otherwise would have been in Fortran.
I seem to recall needing 'i' for vector stuff, Maxwell's equations etc. (If I'm wrong it must have been on the VAX FORTRAN-IV several years later.)
Correct me if I'm wrong.
Use it wrong, break the compiler assumptions and the bug hunting fun starts.
I read that complex numbers came in with FORTRAN IV, so mid-'60s.
I never used anything older than FORTRAN 77, and it certainly had complex numbers, and also ways for the “inventive” programmer to make use of pointer like functionality. You could e.g. pass functions to functions, if you were so inclined.
Somehow your mention of cowboys entering the field feels irritating, considering Kazushige Goto and his contributions to BLAS while being in Austin, TX.
Banzai!
But this article isn't about fundamental algorithms being correctly-implemented in an endowment of legacy code, it's about defending a siloed language choice, which seems like an antiquated concern to me.
I applaud anyone using a tool that works for them, but if it's good, then its users have accomplished things which transcend an individual tool.
1. The first comment I would make is that the headline premise is wrong, or at least deliberately misleading. Let's start with a correction. I would very much doubt that climate models are written in programming languages from the 1950s. The Fortran code of the 1950s was not that much like the Fortran code that I learned in the late 1960s, and that late-1960s code bears little resemblance to the modern Fortran code of today. Furthermore, the Fortran standard is certainly not dead; it is being continually updated: http://fortranwiki.org/fortran/show/Standards.
2. When I made the comment about libraries going back 60 or so years, this would imply that libraries written in Fortran II ca. 1956 would pose a problem today. I would suggest that it is not so, because the process of updating libraries is formal and strict; thus an updated Fortran II subroutine would work perfectly well with today's modern Fortran. This 'upgrade' process is not like converting code from, say, Pascal to C or whatever, for here we are still within the confines of the Fortran framework, and that conversion process is well understood, straightforward and procedural.
3. This isn’t my idea, nor am I defending something that I learned decades ago and don't want to give up. Frankly, I do little Fortran programming these days, so it's essentially irrelevant to me. The point I made, and that I make again, is that there is a sophisticated scientific computing environment in existence and it is used by thousands of current researchers and scientists around the world. Scientists would not use antiquated software on cutting-edge science if it did not work. The fact is modern Fortran is a modern programming language that delivers the goods par excellence, likely much better than any other language, especially given its long and mature infrastructure. For example, here are the first two references I came across on the net; there are thousands more just like this:
https://arxiv.org/abs/hep-ph/0612134
https://www.sciencedirect.com/science/article/abs/pii/S00104...
4. Now let's look at the current situation: 'modern' software. To begin, you should read this 27-year-old Scientific American article titled 'Software's Chronic Crisis' from September 1994:
https://www.researchgate.net/publication/247573088_Software'...
I would contend that this article is just as relevant today as it was 27 years ago, if not more so. In summary it says:
• Programmers work more like undisciplined artists than professional engineers (this problem remains unresolved).
• Essentially programming is not true engineering (since the time of the article, computer science has progressed somewhat, but on the ground we still have a multitude of unresolved problems).
• If programming is to be likened to engineering then it is in a highly unevolved state somewhat akin to the chemical industry ca 1800. Its practical function or operation is a mismatch with the everyday world or we wouldn't have the proliferation of problems that we currently have.
When one examines the current situation, with literally hundreds of different computer languages in existence, it is clear that there isn't enough human time and effort to rationalise them all and develop a coherent set of tools; in essence, almost everything around us is only half done. We stand in the midst of an unholy Tower of Babel and it's an unmitigated mess (I could spend hours telling you what's wrong, but you'll know that already).
The crux of the problem is that programmers spend much time and resources learning one or more computer languages, and it's dead easy to poke fun at mature languages such as Fortran as old-fashioned and out of date. The fact is they either do not adequately understand them, or the reasons why they are used, or both.
The fact is it is this very maturity of Fortran that makes it so valuable to scientists and engineers. Those who are not conversant with or do not program in Fortran have simply not understood the reasons for its success.
Scientists and engineers have found the most reliable, stable and best fit available, and that is to use a modern version of Fortran, simply because it's reliable and it works.
This article only shows the author's lack of understanding of the problem.
Oh, BTW, let me add that I have no contention with theoretical computer science models. It's just the divide between theoretical computer science and what happens in practice is as wide as it ever was.
I think the assumption that "old is bad" is the cause of many, many, many foolish decisions. Useless code rewrites, company reorganizations that are not significant improvements, and many other bad ideas hinge on this Worship Of The New. Why are we using an alphabetic system originally developed c. 1800BC? It's old, we should switch to new writing systems every 10 years because they're new, right :-)?
Older is not better. Newer is not better. Better is better. There's no point in switching something if the destination isn't better, and even if it's clearly better, it needs to be so much better that it's worth the switching cost.
The audio plug is over 100 years old, and modern TTYs date back to what, 80 years? If it works, it works.
Also, damn Calculus is over 200 years old. Or maybe 2000, depending if you compare it to the method of exhaustion or not.
What practical problems does Fortran cause when used for numerical computing?
Excellent idea to help filter out those having the lesser number of decades experience.
In recent years I've adopted the following nomenclature, and you'll note I've done so here in my earlier posts: I treat the name of each specific version as a proper name. As FORTRAN IV was originally called that, Roman numerals and all, I use that form out of respect for those who originally named it, in the same way I'd always write John and not john. Nowadays, when I refer to Fortran in its generic sense, I use its new default name rather than its old acronym form.
Fortran is a domain-specific language for scientists, and excels at array arithmetic (for graph-based problems though, maybe look elsewhere). Even badly-written code can run reasonably fast, which is not the same for C/C++. There is also the decades of concerted hardware and compiler optimizations that make Fortran hard to beat on HPC systems.
It's not as readable as Python, but it's more readable than C/C++ written by a professional programmer.
The premise of the article is that Fortran, 70 years later, is still an appropriate tool to use for crunching numbers, which it absolutely is, but it neglects one major problem.
Like the COBOL issue that was all the rage 20 years ago, it is difficult to hire younger generation programmers that want to and are excited to develop in Fortran.
> ...it is difficult to hire younger generation programmers that want to and are excited to develop in Fortran.
How much are you paying? Most often, when I see this kind of reasoning, digging deeper shows that the salaries are not competitive. There's a large number of us who just want to work on interesting problems for adequate money and don't care what the toolset is. I'm fully on board with the idea of being paid to write Fortran.
Also, COBOL's problem isn't so much that younger generations aren't excited about it, but that the problems in the domain solved by COBOL all require highly specialized domain knowledge about an obtuse set of systems said code runs on (with most of their documentation paywalled, at least until recently). The barriers to entry are much, much higher and few companies are willing to train at the rates the language demands.
There are some other (i.e, “embarrassingly parallel”) scientific computing problems where a higher-latency distributed setup would be fine, but in climate models, as in any finite-element model, each grid cell needs to be able to “talk to” its neighbors at each timestep, leading to quite a lot of inter-process communication.
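A minimal sketch of that neighbor dependence: a 1-D periodic diffusion stencil where every cell's update reads both adjacent cells. (Pure Python, illustrative only; real climate models are 3-D and far more elaborate, but the communication pattern is the same: on a distributed grid, boundary cells must be exchanged between processes before every step.)

```python
# One timestep of explicit 1-D diffusion on a periodic domain.
# Each cell reads its left and right neighbors -- in a distributed run
# those neighbor values would arrive via a "halo exchange" over MPI.
def step(u, alpha=0.1):
    n = len(u)
    return [u[i] + alpha * (u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n])
            for i in range(n)]

u = [0.0, 0.0, 1.0, 0.0, 0.0]
u = step(u)   # the spike spreads to its two neighbors
```

Because the update at cell i depends on cells i-1 and i+1, no process can advance a timestep until it has heard from its neighbors, hence the sensitivity to interconnect latency.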
[1] www.simpack.com
Seems like a sector with high population and low barrier to entry is prone to illusory superiority that lowers the quality of the system.
Some excerpts from https://en.wikipedia.org/wiki/Fortran
Fortran 90:
- Ability to operate on arrays (or array sections) as a whole, thus greatly simplifying math and engineering computations.
- whole, partial and masked array assignment statements and array expressions, such as X(1:N)=R(1:N)*COS(A(1:N))
Fortran 2003:
- Object-oriented programming support: type extension and inheritance, polymorphism, dynamic type allocation, and type-bound procedures, providing complete support for abstract data types
Fortran 2008:
- Sub-modules—additional structuring facilities for modules; supersedes ISO/IEC TR 19767:2005
- Coarray Fortran—a parallel execution model
- The DO CONCURRENT construct—for loop iterations with no interdependencies
- The BLOCK construct—can contain declarations of objects with construct scope
Fortran 2018:
- Further interoperability with C
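For readers who don't know Fortran 90's array syntax, the quoted X(1:N)=R(1:N)*COS(A(1:N)) is a single elementwise operation over whole arrays. A rough pure-Python equivalent of what it computes (the Fortran version needs no explicit loop and can be vectorized by the compiler):

```python
import math

# Elementwise equivalent of the Fortran 90 whole-array expression
# X(1:N) = R(1:N) * COS(A(1:N)), spelled out as a comprehension.
N = 3
R = [1.0, 2.0, 3.0]
A = [0.0, math.pi, 2 * math.pi]
X = [r * math.cos(a) for r, a in zip(R, A)]   # approx [1.0, -2.0, 3.0]
```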
Are modern supercomputers faster than a cluster of consumer-grade GPU cards?
There is support for CUDA in Fortran. In fact, Nvidia purchased one of the main Fortran compiler vendors (PGI) and is open sourcing their compiler as flang.
CUDA is the predominant GPU programming model in the HPC space. There are open standards, but they are nowhere nearly as widely used.
> Are modern supercomputers faster than a cluster of consumer-grade GPU cards?
Fundamentally, supercomputers use the same processors and GPUs that you find in consumer hardware. The differences tend to lie in A) the sheer quantity of hardware used (think millions of cores for Top 10 systems), B) high-bandwidth, low-latency interconnects, and C) some market segmentation by hardware vendors (e.g. Nvidia deliberately limits the double-precision floating-point performance of consumer hardware).
On the top500 list, #1 does 400,000 TFlop/s, #500 does 1000 TFlop/s. How much would the kind of GPU cluster you're thinking of do?
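Just to put rough numbers on it: the consumer-GPU figure below is an assumption for illustration (and FP64 throughput, which HPC codes usually need, is far lower on consumer cards), but even on raw FLOPs the gap is large:

```python
# Back-of-envelope using the top500 figures quoted above, plus one loudly
# assumed number: a consumer GPU sustaining ~30 TFLOP/s in FP32.
consumer_gpu_tflops = 30        # assumed, FP32; FP64 would be far lower
top1_tflops = 400_000           # #1 system, from the comment above
top500_tflops = 1_000           # #500 system, from the comment above

gpus_to_match_top1 = top1_tflops / consumer_gpu_tflops      # ~13,000 GPUs
gpus_to_match_top500 = top500_tflops / consumer_gpu_tflops  # ~33 GPUs
```

And that ignores the interconnect entirely, which is where a loose cluster of consumer cards really falls behind on tightly coupled workloads.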
So if you're a Prof with a large code-base that you want a stream of grad students, undergrad research assistants, assoc. Profs etc. to contribute to before they move on, having a language that doesn't require squandering half a semester on learning to code before you can start doing actual science is a big bonus.
A nuclear reactor simulator I ported from UNIX to Win32 in 1998 was several million lines of code written by nuclear engineers (not software engineers) and physicists. It's over 60 years old now.
    def fibonacci(n):
        if n < 2:
            return n
        else:
            return fibonacci(n-1) + fibonacci(n-2)

I haven't written Fortran in a while, but I was pretty sure that for illustrative examples like this, you could dispense with the entire MODULE declaration, the use of END FUNCTION Fibonacci instead of just END, and the usually-optional :: separator between the variable's type and name.
Something like this? Again, no recent experience:
    recursive function fibonacci(n) result (fib)
      implicit none
      integer n
      integer fib
      if (n < 2) then
        fib = n
      else
        fib = fibonacci(n - 1) + fibonacci(n - 2)
      end if
    end
(The IMPLICIT NONE has to stay because of the now-regrettable Fortran convention that, without it, the type of a variable is determined by the first character of its name: n would be an integer, because variables starting with i through n are integer, while fib would be floating point.)

I don't entirely agree with the overall assertion of this article. The author has some valid points, but I think it misses the forest for the trees.
TLDR: I think Fortran tooling and HPC clusters are a self-reinforcing local maximum. They are heavily optimized for each other, but at the cost of innovation and extensibility.
For example, we'll never get a fully differentiable climate model in Fortran. The tooling does not exist, and there are not enough Fortran developers to make a serious dent in the tooling progress made outside of the HPC world. The MPI stacks these codes rely on are not great for hardware outside of a supercomputer, and Fortran codes basically are built around full interconnect. I have many PFLOPs at my disposal that I cannot use because these codes are too brittle without being entirely rewritten.
At the end of the day, everything is a Turing machine, so you can technically do whatever you want in Fortran or any other language (or mix and match), but strategically staying in Fortran leaves a lot of resources on the table.
[1] https://doi.org/10.1145/2450153.2450158
[2] http://www-tapenade.inria.fr:8080/tapenade/index.jsp
[3] http://www-sop.inria.fr/ecuador/tapenade/distrib/README.html
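For readers unsure what "fully differentiable" means here: automatic differentiation propagates derivatives through the arithmetic of the model itself. A toy forward-mode sketch using dual numbers (this is an illustration of the concept only, not how Tapenade or any production AD tool works; Tapenade is a source-to-source transformer):

```python
# Toy forward-mode automatic differentiation: each value carries its
# derivative, and every arithmetic op propagates both. Retrofitting this
# kind of capability onto a large legacy Fortran code is the hard part.
class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def f(x):
    return x * x + x          # f(x) = x^2 + x, so f'(x) = 2x + 1

x = Dual(3.0, 1.0)            # seed dx/dx = 1
y = f(x)                      # y.val = f(3) = 12, y.der = f'(3) = 7
```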
I have not seen anyone use Nim or Zig yet. There are also some special-purpose languages like Fortress (apparently now defunct), Coarray Fortran, and Chapel, though none seems to have achieved much market share.
Personally I have almost entirely switched to Julia (from mostly C), which lets me do my everyday plotting / analysis / interpretation and my HPC (via MPI.jl) in the same language. Fortran definitely still has some appeal as well though.
I'd be tempted to say that Rust could be used as well, but the equivalent of MPI and OpenMP for Rust is still not as fast as in C++/C/Fortran.[1] That's easy to understand: there are decades of investment in MPI/OpenMP for C/C++/Fortran, and Rust is not there yet.
Also, in some cases where high throughput is needed, languages with a garbage collector are not suited. In such scenarios, deterministic execution time and deterministic latency are very important. Not directly related to HPC, but Discord migrated from Go to Rust for this reason.[2]
[1] https://github.com/trsupradeep/15618-project
[2] https://blog.discord.com/why-discord-is-switching-from-go-to...
But there are a ton of more specialist libraries, e.g. ARPACK that people probably aren't going to rewrite.
That said, there's a FORTRAN to C transpiler that works pretty well. I used it when I needed ARPACK and didn't want to deal with FORTRAN.
Good luck calling that slow.
Another clueless JS hipster, maybe.