OpenBLAS is incompatible with application threads. Most Linux distributions ship a multi-threaded OpenBLAS that burns in a fire if you use it in multi-threaded applications. Even though OpenBLAS's performance is great, I'd be careful about giving a general recommendation to rely on OpenBLAS. As with this MKL example, you have to be aware of its threading issues, read the documentation, and compile it with the right flags (for use in a multi-threaded application: single-threaded, but with locking).
it's worth noting that OpenBLAS is as fast as MKL
This depends highly on the application. E.g. MKL provides batch GEMM, which is used by libraries like PyTorch. So if you use PyTorch for machine learning, performance is still much better with MKL. Of course, that is if you do not have an AMD CPU. If you have an AMD CPU, you have to override Intel CPU detection if you do not want abysmal performance:
https://danieldk.eu/Posts/2020-08-31-MKL-Zen.html
https://www.agner.org/optimize/blog/read.php?i=49
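The override described in those links was, at least for a while, a one-line environment variable; a minimal sketch, assuming an MKL version before 2020.1 (newer releases removed the flag, and the first link covers an LD_PRELOAD-based alternative):

```shell
# Sketch of the AMD/Zen workaround from the links above.
# Pre-2020.1 MKL honours this undocumented variable and takes the fast
# AVX2 code path on AMD CPUs instead of the slow generic one.
export MKL_DEBUG_CPU_TYPE=5
echo "MKL_DEBUG_CPU_TYPE=$MKL_DEBUG_CPU_TYPE"
```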
The BLAS/LAPACK ecosystem is a mess. I wish that Intel would just open source MKL and properly support AMD CPUs.
Can you explain what you mean by this? Are you saying there's a correctness issue here? I only recall running into issues with MPI, where you (typically) run one MPI rank (process) per CPU core. Then if you combine that with a multi-threaded BLAS library you'll suddenly have N^2 BLAS threads fighting over the CPU's and performance goes down the drain. The solution to this is, like you say, to use a single-threaded OpenBLAS, or then the OpenMP OpenBLAS and set OMP_NUM_THREADS=1
I guess with threads you'll have the same issue if you launch N cpu-bound threads and all those call BLAS, resulting in the same N^2 issue as you see with MPI.
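A minimal sketch of the fix described above (the solver binary name is a placeholder, and which variable applies depends on how your OpenBLAS was built):

```shell
# Pin each MPI rank's BLAS to one thread so N ranks don't spawn N^2 threads.
export OMP_NUM_THREADS=1         # OpenMP-built OpenBLAS (also respected by MKL)
export OPENBLAS_NUM_THREADS=1    # pthreads-built OpenBLAS
# mpirun -np "$(nproc)" ./my_solver   # ./my_solver is a placeholder binary
echo "BLAS threads per rank: $OMP_NUM_THREADS"
```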
Given that their latest compilers are based on LLVM, that seems like a fair trade between the closed- and open-source worlds.
I’ve never had any issue when using it in OpenMP codes (either compiling it myself or using the libopenblas_omp.so present in some distros), what do you mean by “burn in a fire”?
R is single-threaded.
Also, in a previous life, I recall running into distro openblas packages that were not compiled with DYNAMIC_ARCH=1 (which enables OpenBLAS's runtime CPU target architecture selection, similar to e.g. MKL) but were instead compiled for some lowest-common-denominator x86_64 arch. I filed some bug(s?), and IIRC this problem has since been fixed.
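For reference, a sketch of the build flags in question, to be run inside an OpenBLAS source tree (the OpenMP flag is my assumption; distros choose their own threading model):

```shell
# DYNAMIC_ARCH=1 compiles kernels for many CPU targets and picks one at
# runtime, instead of baking in a lowest-common-denominator x86_64 target.
MAKE_FLAGS="DYNAMIC_ARCH=1 USE_OPENMP=1"
# make $MAKE_FLAGS    # uncomment when run inside the OpenBLAS tree
echo "$MAKE_FLAGS"
```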
Huh? The bug tracker is here:
Yes, for filing a bug you need to request an account because they apparently were overwhelmed with spam, as documented here:
It turned out that the way the second R package determined the required precision of floats in sparse arrays depended on which compiled linear algebra libraries were available. It took us a week to debug, and ultimately it was easier to just rewrite the whole thing in Python.
renv has made things easier, but I don't think packrat/renv lets you lock C/C++ libraries as well as R ones.
gcc -shared -Wl,-soname=libgomp.so.1 -o libgomp.so.1 empty.c -liomp5
where empty.c is an empty file, and put the result on LD_LIBRARY_PATH ahead of the real libgomp. Alternatively, preload the compatible libiomp5. On Debian 11 there's already such a libgomp in the LLVM packaging. Dynamic linking assumed, as is right and fitting.

That is, instead of checking after doing the x[i] *= SCALE bit with cblas, I would check both before and after the scaling.
I want to add another dimension to this argument: what if we could keep the existing language ecosystem (libraries, community, etc.) but modernize the engine that runs and compiles the R language? This new engine could avoid the dreaded global locking limitation, provide native multi-threading, and interface seamlessly with non-native R libraries in C/C++. Interestingly, someone has tried this, with sponsorship from Oracle no less, and presented this largely futile effort in last year's R flagship conference keynote [2].
IMHO he would have been more successful in his endeavour had he used the D language. What's so special about D, you may ask? I'd point to the fact that most languages do not provide a Ruby on Rails (RoR)-like tool, except D, but that's a story for another time (see ref [1]). There's also the fact that D has a working alternative library to OpenBLAS and MKL, and it was even faster than both of them five years back [3]! D also supports open methods as an alternative to the multiple dispatch much touted by the Julia community. D is also gaining native support for the borrow checker feature that's always mentioned in the same sentence as Rust. In addition, D has second-to-none FFI support for C and C++; heck, the latest D compiler even has a standard C compiler built in. I could go on, but I think you've probably got the picture.
My not-so-humble proposal to the R and D language communities is to compile R on top of D. Essentially you'd have the dynamic R language compiled at runtime (CTFE) on top of the static D language. This approach is becoming more popular now, as posted recently for the new Val and Valet language combination [4]. Just think of CTFE as the new JVM, but one that provides truly static and native compilation for R.
[1] What makes a programming language productive? “Stop designing languages. Write libraries instead.”:
https://jaxenter.com/stop-designing-languages-write-librarie...
[2] Why R? 2020 Keynote - Jan Vitek - How I Learned to Love Failing at Compiling R:
https://www.youtube.com/watch?v=VdD0nHbcyk4
[3] Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen:
http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...
[4] Show HN: Val - A powerful static and dynamic programming language (val-lang.org):
R is good for exploratory data analysis but useless for everything else.
I hear a lot of "R is bad, Python is Enterprise Production Quality (TM)" blather at my work. It's always because the people involved don't understand computers, don't read documentation, don't debug, don't do root cause analysis, and want to quickly pass off responsibility for their laziness and incompetence. Meanwhile I and my team are happily chugging away, producing millions of dollars of reliable value for my company in R year after year.
Python lags far behind R in wide swaths of data science. Pandas is inferior to both dplyr and data.table, and R's modeling capabilities blow Python's out of the water in breadth and depth. You only use Python when you have to, e.g. for unstructured data and deep learning type stuff.
If your colleagues make you deal with their bad R code, that's too bad, but don't blame the language. It's designed to be easy to use, so a lot of bad coders use it. Go train your bad coders or hire better ones.
Now some people say this can be solved with a good IDE. Which might (or might not) be true if you can reliably identify, by manually reviewing the code, the ends of the functions, loops, etc which got munged in the paste.
But interestingly enough, Jupyter notebooks (which seem to be the go-to tool these days) aren't IDEs, which makes it incredibly easy to fubar otherwise perfectly working code by pasting it from your local IDE into, say, an AWS SageMaker instance, to pick one widely used Jupyter implementation. So even if the problem could be fixed by a good IDE, there is no guarantee that such an IDE is (easily) accessible for production code.
I just have a hard time seeing how such a fundamental flaw in a language can lead to "good software engineering"
A language that's more flexible than your favorite "encourages bad habits", while a language that's less flexible than yours is "bureaucratic".