I find this perspective both scary and exciting. I'm curious: how do you validate the LLM's output? If you have a way to do this and it's working, then that's amazing. If you don't, how are you gauging "work best"?
I can gauge what works best only if I can already do what I'm asking it to do, and that ability comes from years of study and trial-and-error experience without LLMs. I have no way of verifying what's a hallucination unless I'm an expert.