> The article is about the next wave of Python-oriented JIT toolchains
the article is content marketing (for whatever) but the actual product has literally has nothing to do with kernels or jitting or anything
https://github.com/NVIDIA/cuda-python
literally just cython bindings to CUDA runtime and CUB.
for once CUDA is aping ROCm:
i'm not making any such mistake - i'm just able to actually read and comprehend what i'm reading rather than perform hype:
> Over the last year, NVIDIA made CUDA Core, which Jones said is a “Pythonic reimagining of the CUDA runtime to be naturally and natively Python.”
so the article is about cuda-core, not whatever you think it's about - so i'm responding directly to what the article is about.
> CUDA Core has the execution flow of Python, which is fully in process and leans heavily into JIT compilation.
this is bullshit/hype about Python's new JIT which womp womp womp isn't all that great (yet). this has absolutely nothing to do with any other JIT e.g., the cutile kernel driver JIT (which also has absolutely nothing to do with what you think it does).
The evidence of that is lacking.
> so the article is about cuda-core, not whatever you think it's about
cuda.core (a relatively new, rapidly developing, library whose entire API is experimental) is one of several things (NVMath is another) mentioned in the article, but the newer and as yet unreleased piece mentioned in the article and the GTC announcement, and a key part of the “Native Python” in the headline, is the CuTile model [0]:
“The new programming model, called CuTile interface, is being developed first for Pythonic CUDA with an extension for C++ CUDA coming later.”
> this is bullshit/hype about Python's new JIT
No, as is is fairly explicit in the next line after the one you quote, it is about the Nvidia CUDA Python toolchain using in-process compilation rather than relying on shelling out to out-of-process command-line compilers for CUDA code.
[0] The article only has fairly vague qualitative description of what CuTile is, but (without having to watch the whole talk from GTC), one could look at this tweet for a preview of what the Python code using the model is expected to look like when it is released: https://x.com/blelbach/status/1902113767066103949?t=uihk0M8V...
Also the cuda-core JIT stuff has nothing to do with Python's new JIT, it's referring to integrating nvJitLink with python, which you can see an example of in cuda_core/examples/jit_lto_fractal.py
- San Francisco Python meetup in 2023: https://youtu.be/L9ELuU3GeNc?si=TOp8lARr7rP4cYaw
- Yerevan PyData meetup in 2022: https://youtu.be/OxAKSVuW2Yk?si=5s_G0hm7FvFHXx0u
Of the more remarkable results: - 1000x sorting speedup switching from NumPy to CuPy.
- 50x performance improvements switching from Pandas to CuDF on the New York Taxi Rides queries.
- 20x GEMM speedup switching from NumPy to CuPy.
CuGraph is also definitely worth checking out. At that time, Intel wasn't in as bad of a position as they are now and was trying to push Modin, but the difference in performance and quality of implementation was mind-boggling.there is no release of cutile (yet). so the only substantive thing that the article can be describing is cuda-core - which it does describe and is a recent/new addition to the ecosystem.
man i can't fathom glazing a random blog this hard just because it's tangentially related to some other thing (NV GPUs) that clearly people only vaguely understand.