> The question is not underspecified, because the point of it is to demonstrate how the llm will never tell you it doesn't know
I don't see how "is not underspecified" follows from the point you were trying to demonstrate. Yes, you wanted it to be well specified, because otherwise the point doesn't work. But you actually failed to come up with such a question, because both people and AI interpret it in different ways.