Do LLMs generate words on the fly, or can they sort of "go back" and correct themselves? stackghost brought up a good point I hadn't thought about before.
Beam search keeps multiple candidate completions alive in parallel, scoring each by cumulative token likelihood, then picks the most likely one after some length or stopping threshold. Lower-scoring branches get pruned along the way, which is close to a "go back and try again".
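A minimal sketch of the idea, using a hypothetical toy "model" (a hand-written table of next-token probabilities, not a real LLM) just to show the keep-top-k-and-prune loop:

```python
import heapq
from math import log

# Toy "language model": next-token probabilities given the last token.
# These numbers are made up purely for illustration.
NEXT = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "end": 0.2},
    "a":   {"cat": 0.2, "dog": 0.7, "end": 0.1},
    "cat": {"sat": 0.8, "end": 0.2},
    "dog": {"ran": 0.6, "end": 0.4},
    "sat": {"end": 1.0},
    "ran": {"end": 1.0},
}

def beam_search(beam_width=2, max_len=5):
    # Each beam entry is (cumulative log-probability, token sequence).
    beams = [(0.0, ["<s>"])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for tok, p in NEXT[seq[-1]].items():
                entry = (score + log(p), seq + [tok])
                (finished if tok == "end" else candidates).append(entry)
        if not candidates:
            break
        # Keep only the top-k partial sequences; weaker branches are
        # dropped here -- the "go back and try another path" step.
        beams = heapq.nlargest(beam_width, candidates)
    finished.extend(beams)  # fall back to unfinished beams if needed
    return max(finished)    # highest-scoring sequence overall
```

With a beam width of 2, "a dog ..." stays alive for a while even though "the ..." looked better at the first step, so a greedy mistake at one position doesn't lock in the whole output.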