I imagine closing the loop (using the TS compiler to restrict token output weights) is in the works, though it's probably not totally trivial. You'd need:
* An incremental TS compiler that could report "valid" or "valid prefix" (i.e., valid as long as the next token is not EOF)
* The ability to backtrack the model
Idk how hard either piece is.
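
To make the two pieces concrete, here's a rough sketch of how they might fit together. Everything in it is made up for illustration: `checkPrefix` is a toy stand-in that only checks bracket balance (it is not the TS compiler API), and `Model` is a hypothetical interface that returns candidate tokens ranked most-likely first. A real version would presumably mask output weights instead of enumerating tokens one by one.

```typescript
// The three answers the checker can give; "valid_prefix" means
// "fine so far, as long as the next token is not EOF".
type Verdict = "valid" | "valid_prefix" | "invalid";

// Toy stand-in for an incremental TS compiler: bracket balance only.
function checkPrefix(text: string): Verdict {
  const open = "([{";
  const close = ")]}";
  const stack: string[] = [];
  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    const o = open.indexOf(ch);
    if (o >= 0) {
      stack.push(close.charAt(o)); // remember which closer we expect
    } else if (close.indexOf(ch) >= 0 && stack.pop() !== ch) {
      return "invalid"; // wrong or unexpected closer
    }
  }
  return stack.length === 0 ? "valid" : "valid_prefix";
}

// Hypothetical model interface: candidate next tokens, most likely first.
type Model = (prefix: string) => string[];

// Depth-first decode: try tokens in ranked order, back up on dead ends.
function decode(model: Model, prefix: string, maxTokens: number): string | null {
  const verdict = checkPrefix(prefix);
  if (verdict === "invalid") return null;
  if (maxTokens === 0) return verdict === "valid" ? prefix : null;
  for (const token of model(prefix)) {
    if (token === "<eof>") {
      if (verdict === "valid") return prefix;
      continue; // EOF is not allowed on a merely valid prefix
    }
    const result = decode(model, prefix + token, maxTokens - 1);
    if (result !== null) return result; // otherwise backtrack and try the next token
  }
  return null;
}

// Toy model that "prefers" to stop early or close with the wrong bracket.
const toyModel: Model = (prefix) =>
  prefix.length === 0 ? ["f(", "f["] : ["<eof>", "]", ")"];

console.log(decode(toyModel, "", 4)); // prints "f()"
```

Here the toy model's first choices after `f(` are EOF (rejected, since `f(` is only a valid prefix) and `]` (rejected as invalid), so the search falls through to `)` and then accepts EOF. The recursion is the cheap way to get backtracking in a sketch; with a real model you'd also need to rewind whatever state it carries.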