1. A thought is a representation of a situation
2. A representation generates entailments of that situation
3. Language translates these representations into symbols; the mapping from sentences back to representations is many-to-one (many sentences express the same representation)
4. Understanding language is reversing this translation: recovering the thought (i.e., the representation) from the symbols
So,
5. If agent A understands sentence X, then A forms the relevant representation S of the situation X describes.
6. If an agent has the representation S, it can state entailments of S (e.g., counterfactuals) -- a toy rendering of 1-6 is sketched below.
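For illustration only, premises 1-6 can be rendered as a toy model. Every name below (Representation, verbalize, understand, entailments) is my own labelling for the sketch, not something the argument depends on:

```python
# Toy rendering of premises 1-6. Purely illustrative; the names are invented.
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """A representation of a situation (premise 1)."""
    situation: str


def verbalize(r: Representation) -> set[str]:
    """Language: one representation can surface as many sentences (premise 3)."""
    s = r.situation
    return {f"{s}.", f"It is the case that {s}.", f"{s}, as it happens."}


def understand(sentence: str, known: set[Representation]) -> Representation | None:
    """Understanding: map a sentence back to the representation behind it (premises 4-5)."""
    for r in known:
        if sentence in verbalize(r):
            return r
    return None


def entailments(r: Representation) -> set[str]:
    """An agent holding a representation can state its entailments,
    e.g. counterfactuals (premises 2 and 6)."""
    return {f"If not ({r.situation}), things would be otherwise."}


# Key property: every verbalization recovers the same representation,
# so the statable entailments do not depend on which sentence was used.
r = Representation("the cup is on the table")
assert all(understand(x, {r}) == r for x in verbalize(r))
```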
Now, split X into Xc = the canonical descriptions of S, and Xp = trivial permutations of those descriptions
(such that Xc and Xp overlap little as distributions, but the tokens of Xp are individually common).
Form the entailments of X, call them Y -- sentences that are canonically implied by the truth of X.
7. If the LLM understood that X entails Y, it would be via constructing the representation S -- which entails Y regardless of which sentence in X was used.
8. Train an LLM only on Xc, and its accuracy at judging whether Y is entailed by Xp sentences is at chance (this train/test probe is sketched after the argument below).
9. Since using Xp sentences causes it to fail, it does not predict Y via S.
QED.
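The probe in 8-9 can be written down as a protocol without committing to any particular model. A minimal sketch, assuming a hypothetical train/judges_entailment interface (those names, and the balanced true/false test items, are my assumptions, not part of the argument):

```python
# Sketch of the probe in premises 8-9: train only on canonical descriptions Xc,
# then test entailment judgments when the premise comes from the permuted set Xp.
# The EntailmentModel interface is hypothetical; any fine-tune-then-prompt setup
# could stand in for it.
import random
from typing import Protocol


class EntailmentModel(Protocol):
    def train(self, sentences: list[str]) -> None: ...
    def judges_entailment(self, premise: str, hypothesis: str) -> bool: ...


def run_probe(model: EntailmentModel,
              xc: list[str],       # Xc: canonical descriptions of S
              xp: list[str],       # Xp: trivial permutations of those descriptions
              y_true: list[str],   # Y: sentences canonically entailed by S
              y_false: list[str],  # distractors that are not entailed
              ) -> float:
    """Train on Xc only, then score entailment judgments with Xp premises.
    Chance-level accuracy here is exactly what premise 8 reports."""
    model.train(xc)
    items = ([(p, h, True) for p in xp for h in y_true]
             + [(p, h, False) for p in xp for h in y_false])
    random.shuffle(items)
    correct = sum(model.judges_entailment(p, h) == label for p, h, label in items)
    return correct / len(items)
```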
And we can say,
1. Appearing to judge that Y is entailed by X is possible via simple sampling of (X, Y) pairs from historical cases (a toy version of such a sampler is sketched after this list).
2. LLMs are just such a sampling.
so,
3. Apply inference to the best explanation:
4. LLMs sample historical cases rather than form representations.
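To make point 1 concrete: here is a deliberately representation-free "sampler of historical cases". The class, its corpus format, and its threshold are all mine and purely illustrative; it never models the situation S, yet it can look like it judges entailment, and on premises it has never seen (the Xp case) its answers are blind, which lands it at chance on a balanced test set:

```python
# A representation-free baseline: judge that X entails Y whenever that exact
# (X, Y) pairing was common in the historical corpus. No situation is modelled.
from collections import Counter


class HistoricalSampler:
    def __init__(self, corpus: list[tuple[str, str]]):
        # corpus: (premise, hypothesis) pairs that were observed together
        self.pair_counts = Counter(corpus)
        self.premise_counts = Counter(p for p, _ in corpus)

    def judges_entailment(self, premise: str, hypothesis: str) -> bool:
        # "Yes" iff this exact pairing is frequent relative to the premise.
        # A premise never seen before (e.g. an Xp permutation) always gets "no",
        # which is chance-level behaviour on a balanced true/false test set.
        seen = self.premise_counts[premise]
        if seen == 0:
            return False
        return self.pair_counts[(premise, hypothesis)] / seen > 0.5
```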
Incidentally, "sampling of historical cases" is already something we knew -- so this entire argument is basically unnecessary, or rather only necessary because PhDs have been turned into start-up hype men.