But isn't the link shared by that comment doing exactly that
https://sulbhajain.medium.com/why-llms-arent-truly-determini...
>The Thinking Machines research team showed it’s possible to fix this. They built batch-invariant kernels for RMSNorm, matrix multiplication, and attention, integrating them into the open-source inference engine vLLM.
>The outcome: 1,000 identical prompts, 1,000 identical outputs. Perfect reproducibility.
??