The issue you’re describing is more to do with correctness and performance (two critical elements of a good compiler), not nondeterminism.
If a natural language compiler can output correct, performant code, nondeterminism shouldn’t matter.
For example, take a script that randomly invokes either gcc or clang, and maybe randomly sets the optimization level too. Multiple invocations will produce vastly different output, but we can be confident the output is correct and to some degree performant.
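
To make that concrete, here is a rough sketch of such a script in Python. The names `pick_command` and `compile_randomly` are made up for illustration, and it assumes both `gcc` and `clang` are on your PATH; the `-O0` through `-O3` flags are accepted by both.

```python
import random
import subprocess

COMPILERS = ["gcc", "clang"]
OPT_LEVELS = ["-O0", "-O1", "-O2", "-O3"]


def pick_command(source, output, rng=random):
    """Build a compile command with a random compiler and optimization level."""
    compiler = rng.choice(COMPILERS)
    opt_level = rng.choice(OPT_LEVELS)
    return [compiler, opt_level, source, "-o", output]


def compile_randomly(source, output):
    """Run the randomly chosen compiler and return its exit code."""
    cmd = pick_command(source, output)
    print("running:", " ".join(cmd))
    # Assumes gcc and clang are both installed; raises FileNotFoundError otherwise.
    return subprocess.run(cmd).returncode
```

Calling `compile_randomly("hello.c", "hello")` a few times will usually give you different binaries, yet each one should behave the same when you run it, which is the property that actually matters.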