These "accelerated sub-languages" are still driven by, well, Python glue. That's why we need free-threading and a faster Python: the glue matters because it's currently the most accessible glue available to the community.
In fact, Sam Gross, the man behind free-threading, works on PyTorch. From my understanding, he decided to explore nogil because the GIL was holding back deep-learning training written in PyTorch. Namely, the PyTorch DataLoader code itself, and almost all data-loading pipelines in real training codebases, are a hopeless bloody mess precisely because of all the IPC/SHM nonsense that process-based parallelism forces on them.
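To make the IPC cost concrete, here is a minimal stdlib-only sketch (not the actual DataLoader code; `load_sample` is a hypothetical stand-in for a real decode step) of the pattern a process-based loader is forced into: every sample a worker produces must be pickled and shipped back to the parent over a pipe, because CPU-bound threads would just contend on the GIL.

```python
import multiprocessing as mp

def load_sample(i):
    # Stand-in for reading and decoding one sample. The returned object
    # must be pickled in the worker and unpickled in the parent -- that
    # serialization round-trip is the IPC overhead the GIL imposes on
    # process-based data loading.
    return [i] * 4

if __name__ == "__main__":
    # Workers are separate processes rather than threads; each result
    # crosses a process boundary via pickle and a pipe.
    with mp.Pool(processes=2) as pool:
        samples = pool.map(load_sample, range(4))
    print(samples)
```

With free-threading, the same pipeline could run `load_sample` in plain threads inside one process, sharing tensors directly instead of serializing them.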