I try to explain a subject to the AI as if I were teaching a friend who has no knowledge of it, and then I judge how well I can communicate the idea. In many cases, as soon as I start explaining, I realize my own knowledge isn't deep. So I am judging myself, not the quality of the AI.
An analogy is the Rubber Duck method of debugging. Most programmers have been in the situation where, partway through explaining a problem to another programmer, the answer suddenly pops into their head. The value isn't the knowledge in the other programmer's head; it's in the act of trying to explain. LLMs are patient and unopinionated, which makes them a good recipient.