Do you mean you cannot guarantee the result based on a task request with a random query? Or something else? I was under the impression that LLMs are very deterministic if you provide a fixed seed for the samplers, fixed model weights, and fixed context. In cloud providers you can't guarantee this because of how they implement this (batching unrelated requests together and doing math). Now you can't guarantee the quality of the result from that and changing the seed or context can result in drastically different quality. But maybe you really mean non-deterministic but I'm curious where this non-determinism would come from.