I think the question that adds a random cat factoid at the end is going to trip up a lot fewer humans than you think. At the very least, they could attempt to tell you after the fact why they thought it was relevant.
And ignoring that, obviously we should be holding these LLMs to a higher standard than “human with extraordinary intelligence and encyclopedic knowledge that can get tripped up by a few irrelevant words in a prompt.” Like, that should _never_ happen if these tools are what they’re claimed to be.
A human would probably note it as a trick in their reply.
The way LLMs work, it could bias their replies in weird, unexpected ways, beyond just being seen as a trick.