
almostgotcaught 11mo ago

it's funny - people around here really do not have a clue about the GPU ecosystem even though everyone is always talking about AI:

> The article is about the next wave of Python-oriented JIT toolchains

the article is content marketing (for whatever) but the actual product literally has nothing to do with kernels or jitting or anything

https://github.com/NVIDIA/cuda-python

literally just cython bindings to CUDA runtime and CUB.

for once CUDA is aping ROCm:

https://github.com/ROCm/hip-python


dragonwriter 11mo ago

The mistake you seem to be making is confusing the existing product (which has been available for many years) with the upcoming new features for that product just announced at GTC, which are not addressed at all on the page for the existing product, but are addressed in the article about the GTC announcement.

almostgotcaught (OP) 11mo ago

> The mistake you seem to be making is confusing the existing product

i'm not making any such mistake - i'm just able to actually read and comprehend what i'm reading rather than perform hype:

> Over the last year, NVIDIA made CUDA Core, which Jones said is a “Pythonic reimagining of the CUDA runtime to be naturally and natively Python.”

so the article is about cuda-core, not whatever you think it's about - so i'm responding directly to what the article is about.

> CUDA Core has the execution flow of Python, which is fully in process and leans heavily into JIT compilation.

this is bullshit/hype about Python's new JIT which womp womp womp isn't all that great (yet). this has absolutely nothing to do with any other JIT e.g., the cutile kernel driver JIT (which also has absolutely nothing to do with what you think it does).

dragonwriter 11mo ago

> i'm just able to actually read and comprehend what i'm reading rather than perform hype:

The evidence of that is lacking.

> so the article is about cuda-core, not whatever you think it's about

cuda.core (a relatively new, rapidly developing, library whose entire API is experimental) is one of several things (NVMath is another) mentioned in the article, but the newer and as yet unreleased piece mentioned in the article and the GTC announcement, and a key part of the “Native Python” in the headline, is the CuTile model [0]:

“The new programming model, called CuTile interface, is being developed first for Pythonic CUDA with an extension for C++ CUDA coming later.”

> this is bullshit/hype about Python's new JIT

No, as is fairly explicit in the next line after the one you quote, it is about the Nvidia CUDA Python toolchain using in-process compilation rather than relying on shelling out to out-of-process command-line compilers for CUDA code.

[0] The article only has a fairly vague qualitative description of what CuTile is, but (without having to watch the whole talk from GTC), one could look at this tweet for a preview of what the Python code using the model is expected to look like when it is released: https://x.com/blelbach/status/1902113767066103949?t=uihk0M8V...


squeaky-clean 11mo ago

Isn't the main announcement of the article CuTile? Which has not been released yet.

Also the cuda-core JIT stuff has nothing to do with Python's new JIT, it's referring to integrating nvJitLink with python, which you can see an example of in cuda_core/examples/jit_lto_fractal.py

ashvardanian 11mo ago

In case someone is looking for some performance examples & testimonials, even on RTX 3090 vs a 64-core AMD Epyc/Threadripper, even a couple of years ago, CuPy was a blast. I have a couple of recorded sessions with roughly identical slides/numbers:

  - San Francisco Python meetup in 2023: https://youtu.be/L9ELuU3GeNc?si=TOp8lARr7rP4cYaw
  - Yerevan PyData meetup in 2022: https://youtu.be/OxAKSVuW2Yk?si=5s_G0hm7FvFHXx0u

Of the more remarkable results:

  - 1000x sorting speedup switching from NumPy to CuPy.
  - 50x performance improvements switching from Pandas to CuDF on the New York Taxi Rides queries.
  - 20x GEMM speedup switching from NumPy to CuPy.

CuGraph is also definitely worth checking out. At that time, Intel wasn't in as bad of a position as they are now and was trying to push Modin, but the difference in performance and quality of implementation was mind-boggling.
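
For anyone who wants to sanity-check a figure like the NumPy-to-CuPy sorting speedup quoted above on their own hardware, a minimal timing sketch might look like the following. It is not the benchmark from the talks: the helper names `best_time` and `compare_sort`, the array size, and the repeat count are all assumptions made for illustration. If CuPy or a usable GPU is missing, it just reports the NumPy number.

```python
import time

import numpy as np

# CuPy is optional here: it needs an NVIDIA GPU and a working CUDA install.
try:
    import cupy as cp
    if cp.cuda.runtime.getDeviceCount() == 0:
        cp = None
except Exception:
    cp = None


def best_time(fn, repeats=3):
    """Hypothetical helper: best wall-clock time of fn() over a few runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare_sort(n=1_000_000, seed=0):
    """Hypothetical helper: time np.sort and, if available, cp.sort."""
    rng = np.random.default_rng(seed)
    host = rng.random(n, dtype=np.float32)

    results = {"numpy": best_time(lambda: np.sort(host))}

    if cp is not None:
        dev = cp.asarray(host)  # copy to the GPU once, outside the timed region

        def gpu_sort():
            cp.sort(dev)
            # GPU work is asynchronous, so wait for it before stopping the clock.
            cp.cuda.Stream.null.synchronize()

        gpu_sort()  # warm-up run so one-time setup cost is not measured
        results["cupy"] = best_time(gpu_sort)

    return results


if __name__ == "__main__":
    for name, seconds in compare_sort().items():
        print(f"{name}: {seconds * 1e3:.2f} ms")
```

The explicit synchronize and the warm-up run matter: without them the GPU number mostly measures launch overhead or one-time initialization, which makes any speedup ratio meaningless. Whether host-to-device copies belong inside the timed region depends on the workload; this sketch leaves them out.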

ladberg 11mo ago

The main release highlighted by the article is cuTile which is certainly about jitting kernels from Python code

almostgotcaught (OP) 11mo ago

> main release

there is no release of cutile (yet). so the only substantive thing that the article can be describing is cuda-core - which it does describe and is a recent/new addition to the ecosystem.

man i can't fathom glazing a random blog this hard just because it's tangentially related to some other thing (NV GPUs) that clearly people only vaguely understand.

throwaway314155 11mo ago

christ man lighten the fuck up. there's zero need to be _so_ god damn patronizing and disrespectful.

yieldcrv 11mo ago

I just want to see benchmarks. is this new one faster than CuPy or not
