undefined | Better HN

0 pointsdauhak1y ago0 comments

And people with short term memory loss nevertheless have theory of mind just fine. Nothing about LLM's dropping context over big enough windows implies they don't have theory of mind, it just shows they have limitations - just like humans even with "normal" memory will lose track over a huge context window.

Like there are plenty of shortcomings of LLMs but it feels like people are comparing them to some platonic ideal human when writing them off

0 comments

Arkhaine_kupo1y ago

> Nothing about LLM's dropping context over big enough windows implies they don't have theory of mind

ToM is a large topic, but most people, when talking about an entity X, they have a state in memory about that entity, almost like an Object in a programming language. Thta Object has attributes, and conditions etc that exist beyond the context window of the observer.

If you have a friend Steve, who is a doctor. And you don't see him for 5 years, you can predict he will still be working at the hospital, because you have an understanding of what Steve is.

For an LLM you can define a concept of Steve, and his profession and it will adequately mimic replies about him. But in 5 years that LLMs would not be able to talk about Steve. It would recreate a different conversation, possibly even a convincing simulacrum of remembering Steve. But internally, there is no Steve, nowhere in the nodes of the LLM does Steve exist or have ever existed.

That inability to have a world model means that an LLM can replicate the results of a theory of mind but not posses one.

Humans lose track of information, but we have a state to keep track of elements that are ontologicaly distinct. LLMs do not, and treat them as equal.

For a human, the sentence Alice and bob go to the market, when will they be back? is different than Bob and Alice went to the market, when will they be back?

Because Alice and Bob are real humans, you can imagine them, you might have even met them. But to an LLM those are the same sentence. Even outside of the argument about The Red Room/ Mary's room there simply are enough gaps in the way a LLM is constructed to be considered a valid owner of a ToM

dauhakOP1y ago

ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours. You're basically bringing up a particular implementation of long-term memory as a necessary component of it, which I've never once seen? If someone has severe memory issues, they could forget who Steve is every few minutes, but still be able to look at Steve doing something and model what Steve must want and believe given his actions

I don't think we have any strong evidence on whether LLMs have world-models one way or another - it feels like a bit of a fuzzy concept and I'm not sure what experiments you'd try here.

I disagree with your last point, I think those are functionally the same sentence

Arkhaine_kupo1y ago

> ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours.

In that sentence you are implying that you have the "ability to model ... another". An LLM cannot do that, it can't have an internal model that is consistent beyond its conversational scope. Its not meant to. Its a statistics guesser, its probabilistic, holds no model, and its anthropomorphised by our brains because the output is incredibly realistic not because it actually has that ability

The ability to mimic the replies of someone with that ability, is the same of Mary being able to describe all the qualities of Red. She still cannot see red, despite her ability to pass any question in relation to its characteristics.

> I don't think we have any strong evidence on whether LLMs have world-models one way or another

They simply cannot by their architecture. Its a statistical language sampler, anything beyond the scope of that fails. Local coherance is why they pick the next right token not because they can actually model anything.

> I think those are functionally the same sentence

Functionally and literally are not the same thing though. Its why we can run studies as to why some people might say Bob and Alice (putting the man first) or Alice and Bob (alphabetical naming) and what human societies and biases affect the order we put them on.

You could not run that study on an LLM because you will find that statistically speaking the ordering will be almost identical to the training data. If the training data overwhelmingly puts male names first or whether the training data orders list alphabetically you will see that reproduced on the output of the llm because Bob and Alice are not people, they are statistical probably letters in order.

LLM seem to trigger borderline mysticism in people who are otherwise insanely smart, but the kind of "we cant know its internal mind" sounds like reading tea leaves, or horoscopes by people with enough Phds to have their number retired on their university like Michael Jordan.

1 more reply

fragmede1y ago

With computer use, you can get Claude to read and write files and have some persistence outside of the static LLM model. If it writes a file Steve.txt, that it can pull up later, does it now have ToM?

j / k navigate · click thread line to collapse

0 comments

Arkhaine_kupo1y ago

> Nothing about LLM's dropping context over big enough windows implies they don't have theory of mind

If you have a friend Steve, who is a doctor. And you don't see him for 5 years, you can predict he will still be working at the hospital, because you have an understanding of what Steve is.

That inability to have a world model means that an LLM can replicate the results of a theory of mind but not posses one.

Humans lose track of information, but we have a state to keep track of elements that are ontologicaly distinct. LLMs do not, and treat them as equal.

For a human, the sentence Alice and bob go to the market, when will they be back? is different than Bob and Alice went to the market, when will they be back?

dauhakOP1y ago

I don't think we have any strong evidence on whether LLMs have world-models one way or another - it feels like a bit of a fuzzy concept and I'm not sure what experiments you'd try here.

I disagree with your last point, I think those are functionally the same sentence

Arkhaine_kupo1y ago

> ToM is about being able to model the internal beliefs/desires etc of another person as being entirely distinct from yours.

> I don't think we have any strong evidence on whether LLMs have world-models one way or another

> I think those are functionally the same sentence

1 more reply

fragmede1y ago

j / k navigate · click thread line to collapse