With greedy sampling, LLMs are basically "deterministic": the only randomness we deliberately add is the sampling/temperature step at the very end. In practice, two things break this. One is MoE-related shenanigans, where expert routing can depend on how requests get batched together (this is what historically prevented determinism in ChatGPT). The other is floating-point issues on GPUs: float arithmetic is not associative, so a different reduction order (different batch size, different kernel) can change the logits in the last bits, and when the top two logits are nearly tied, even the greedy argmax can flip.
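
The floating-point point is easy to demonstrate. A minimal sketch (the specific numbers are just an illustration, not taken from any model): in float32, the same three values summed in a different order give different results, because adding a small number to a large one can silently absorb it.

```python
import numpy as np

a = np.float32(1e8)
b = np.float32(1.0)

# Same mathematical expression, two evaluation orders.
# Near 1e8, adjacent float32 values are 8.0 apart, so a + b
# rounds back to a and the 1.0 is lost entirely.
left = (a + b) - a    # -> 0.0
right = (a - a) + b   # -> 1.0

print(left, right)  # prints: 0.0 1.0
```

A GPU kernel that reduces in a different order (say, because the batch size changed) can produce exactly this kind of last-bit (or worse) discrepancy in the logits; if two candidate tokens are close enough, the "deterministic" greedy choice differs between runs.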