Asking LLMs about things they learned during training mostly results in hallucinations, and you generally can't tell how much they are hallucinating: these models are unable to reflect on their own output, and average output token probability is a lousy proxy for confidence-scoring their results.
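To make the "average token probability" point concrete, here's a minimal sketch (using Hugging Face transformers; the model name is just an example) of how that number is usually computed. The key issue is that it measures how fluent the model found its own wording, not whether the answer is true, so a confidently worded hallucination can still score high.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works the same way
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Q: Who wrote the novel Solaris?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=20,
        do_sample=False,
        return_dict_in_generate=True,
        output_scores=True,
    )

# Probability the model assigned to each token it actually emitted.
gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
probs = [
    torch.softmax(score[0], dim=-1)[tok].item()
    for score, tok in zip(out.scores, gen_tokens)
]
avg_prob = sum(probs) / len(probs)

print(tokenizer.decode(gen_tokens, skip_special_tokens=True))
print(f"average token probability: {avg_prob:.3f}")  # can be high even when the answer is wrong
```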
On the other hand, no amount of prompt engineering seems to make these LLMs able to do question answering over source documents, which is the only realistic way to get factual information out of them (see the sketch below for the kind of setup I mean).
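For clarity, this is roughly the document-grounded setup in question: the source text goes into the prompt and the model is told to answer only from it. The `complete` function is a placeholder for whatever LLM call you use; whether this kind of prompting actually makes the answers trustworthy is exactly the point in dispute.

```python
def build_grounded_prompt(document: str, question: str) -> str:
    # Instruct the model to answer strictly from the supplied document,
    # and to abstain when the answer isn't there.
    return (
        "Answer the question using ONLY the source document below. "
        "If the answer is not in the document, reply exactly: NOT IN DOCUMENT.\n\n"
        f"--- SOURCE DOCUMENT ---\n{document}\n--- END DOCUMENT ---\n\n"
        f"Question: {question}\nAnswer:"
    )

def answer_from_document(document: str, question: str, complete) -> str:
    # `complete` is a placeholder: any str -> str function backed by an LLM.
    return complete(build_grounded_prompt(document, question)).strip()
```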
You're welcome to bring counterexamples, though, if you're so confident.