Again, this is not about whether the prompt can be written in a way that lets GPT find the answer. I’m not doubting its ability to do so. It’s that a human can reason through why the answer should be different, despite any common priors, and arrive at the correct judgment.
It suggests that there’s still something a human does that the machine doesn’t, even if we can’t quite place what it is. This is neither an argument for nor against progress towards AGI, just an observation. It’s interesting to me regardless.