> Note that no combination of any of the faster implementations of Python + Numpy libraries has ever been used at the most demanding level of scientific computation. That has always been Fortran, with some C and C++, and now Julia.
This still seems like an overstatement, but maybe it depends on what you mean by "most demanding level." I work on systems for the Rubin Observatory, which will be the largest astronomical survey by a wide margin. There's certainly a bunch of C++, but also heaps of Python. For example, catalog simulation (https://www.lsst.org/scientists/simulations/catsim) is pretty much entirely in Python.
Take a look at `lsst/imsim`, for example, from the Dark Energy collaboration at LSST: https://github.com/LSSTDESC/imSim.
Maybe this doesn't count as the "most demanding," but if so, I don't really see why not.
> But if A and B are numpy arrays, then A + B will calculate the elementwise sum on a single core only, correct? It will vectorize, but not parallelize.
That's correct, but numba can parallelize that computation for you (https://numba.pydata.org/numba-doc/latest/user/parallel.html). It's pretty common to reach for numba's parallelization when it's relevant.