albertzeyer 2y ago

Can you explain that?

My understanding is that Triton is an alternative to CUDA: you write it directly in Python, at a slightly higher level, and it does a lot of optimizations automatically. So basically: Python -> Triton-IR -> LLVM-IR -> PTX.

https://openai.com/research/triton

chillee 2y ago

It's confusing: there's OpenAI Triton (what you're thinking of) and Nvidia Triton Inference Server (a different thing).

jerrygenser 2y ago

The original comment is referring to Nvidia Triton Inference Server.
