Training a large model to guess when it doesn't know the answer produces fiction. To get nonfiction, the labs need a different training objective.
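To see why guessing wins under the usual setup, here's a minimal sketch. The grading scheme is an assumption for illustration (correct = 1, wrong = 0, abstain = 0), not any lab's actual objective:

```python
# Expected reward under a simplified grading scheme (an assumption,
# not a real training objective): correct = 1, wrong = 0,
# abstaining ("I don't know") = 0.

def expected_reward(p_correct: float, guess: bool) -> float:
    """Expected reward for one question.

    p_correct: model's chance of guessing right.
    guess: True to answer, False to abstain.
    """
    if not guess:
        return 0.0          # abstaining earns nothing
    return p_correct * 1.0  # right answers score 1, wrong answers 0

for p in (0.0, 0.1, 0.5):
    print(f"p={p}: guess={expected_reward(p, True):.2f}, "
          f"abstain={expected_reward(p, False):.2f}")
```

Even at p = 0.1, guessing is worth 0.10 versus 0.00 for abstaining, so "always guess when unsure" is the reward-maximizing policy. That's where the fiction comes from: nothing in the payoffs distinguishes a wrong answer from an honest "I don't know."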
By contrast, the Go model was trained not to make illegal moves, because checking legality as part of training is easy and cheap.
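One common way to realize that cheap check (AlphaGo-style engines do something similar) is to mask illegal moves out of the policy before sampling. A minimal sketch; `legal` is assumed to come from a rules engine, which is hypothetical here:

```python
import math

def masked_policy(logits: list[float], legal: list[bool]) -> list[float]:
    """Zero out illegal moves before sampling.

    Go's rules make legality cheap to check, so the check can run
    inside the training/search loop itself.
    """
    # Send illegal moves to -inf so softmax assigns them probability 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, legal)]
    z = max(m for m in masked if m != -math.inf)
    exps = [math.exp(m - z) if m != -math.inf else 0.0 for m in masked]
    total = sum(exps)
    return [e / total for e in exps]

print(masked_policy([2.0, 1.0, 0.5], [True, False, True]))
# -> illegal move gets exactly 0 probability
```

There's no cheap equivalent of that legality check for "is this sentence true," which is exactly the problem.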
Anyway, I already told you the answer. The AI will need a series of trainable belief systems to verify whether statements are internally consistent. The strange part is that the AI would need a way to validate those beliefs, and each prompt would have to derive an updated belief system that carries into the next prompt.
In other words, the model must be able to learn continuously, and that is something these single-shot models are not capable of.
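A toy sketch of what that loop might look like. This is speculative; the helpers `extract_claims` and `consistent` are hypothetical stand-ins, not a real API. Each turn checks new statements against the belief store built from earlier turns, then carries the updated store forward:

```python
BeliefStore = set[str]

def extract_claims(text: str) -> set[str]:
    """Hypothetical claim extractor; here, one claim per sentence."""
    return {s.strip() for s in text.split(".") if s.strip()}

def consistent(claim: str, beliefs: BeliefStore) -> bool:
    """Hypothetical checker: reject a claim whose negation is held."""
    if claim.startswith("not "):
        return claim[4:] not in beliefs
    return f"not {claim}" not in beliefs

def step(output: str, beliefs: BeliefStore) -> BeliefStore:
    """Verify a response against current beliefs, then update them."""
    accepted = {c for c in extract_claims(output) if consistent(c, beliefs)}
    return beliefs | accepted  # the updated store feeds the next prompt

beliefs: BeliefStore = set()
beliefs = step("the sky is blue. water is wet", beliefs)
beliefs = step("not the sky is blue. grass is green", beliefs)
print(beliefs)  # the contradiction is rejected; consistent claims persist
```

The point of the sketch is the shape, not the implementation: state accumulates across prompts, which is precisely the continual learning that single-shot models lack.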
Problem is, they didn't do that.
The AI doesn't know the best move. It just knows a good move.
That doesn't mean transportation is solved.