undefined | Better HN

0 pointsvundercind1y ago0 comments

I thought maybe they were on the right track until I read Attention Is All You Need.

Nah, at best we found a way to make one part of a collection of systems that will, together, do something like thinking. Thinking isn’t part of what this current approach does.

What’s most surprising about modern LLMs is that it turns out there is so much information statistically encoded in the structure of our writing that we can use only that structural information to build a fancy Plinko machine and not only will the output mimic recognizable grammar rules, but it will also sometimes seem to make actual sense, too—and the system doesn’t need to think or actually “understand” anything for us to, basically, usefully query that information that was always there in our corpus of literature, not in the plain meaning of the words, but in the structure of the writing.

0 comments

SturgeonsLaw1y ago

> at best we found a way to make one part of a collection of systems that will, together, do something like thinking

This seems like the most viable path to me as well (educational background in neuroscience but don't work in the field). The brain is composed of many specialised regions which are tuned for very specific tasks.

LLMs are amazing and they go some way towards mimicking the functionality provided by Broca's and Wernicke's areas, and parts of the cerebrum, in our wetware, however a full brain they do not make.

The work on robots mentioned elsewhere in the thread is a good way to develop cerebellum like capabilities (movement/motor control), and computer vision can mimic the lateral geniculate nucleus and other parts of the visual cortex.

In nature it takes all these parts working together to create a cohesive mind, and it's likely that an artificial brain would also need to be composed of multiple agents, instead of just trying to scale LLMs indefinitely.

youoy1y ago

Don't get caught in the superficial analysis. They "understand" things. It is a fact that LLMs experience a phase transition during training, from positional information to semantic understanding. It may well be the case that with scale there is another phase transition from semantic to something more abstract that we identify more closely with reasoning. It would be an emergent property of a sufficiently complex system. At least that is the whole argument around AGI.

Jensson1y ago

They understand sentences but not words.

youoy1y ago

What do you mean by that? We have the monosemanticity results [0]

[0] https://transformer-circuits.pub/2024/scaling-monosemanticit...

foxglacier1y ago

> think or actually “understand” anything

It doesn't matter if that's happening or not. That's the whole point of the Chinese room - if it can look like it's understanding, it's indistinguishable from actually understanding. This applies to humans too. I'd say most of our regular social communication is done in a habitual intuitive way without understanding what or why we're communicating. Especially the subtle information conveyed in body language, tone of voice, etc. That stuff's pretty automatic to the point that people have trouble controlling it if they try. People get into conflicts where neither person understands where they disagree but they have emotions telling them "other person is being bad". Maybe we have a second consciousness we can't experience and which truly understands what it's doing while our conscious mind just uses the results from that, but maybe we don't and it still works anyway.

Educators have figured this out. They don't test students' understanding of concepts, but rather their ability to apply or communicate them. You see this in school curricula with wording like "use concept X" rather than "understand concept X".

vundercindOP1y ago

There’s a distinction in behavior of a human and a Chinese room when things go wrong—when the rule book doesn’t cover the case at hand.

I agree that a hypothetical perfectly-functioning Chinese room is, tautologically, impossible to distinguish from a real person who speaks Chinese, but that’s a thought experiment, not something that can actually exist. There’ll remain places where the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.

That, in fact, is exactly where the difference lies: the LLM is basically always not actually “paying attention” or “thinking” (those aren’t things it does) but giving automatic responses, so you see failures of a sort that a human might also exhibit when following a social script (yes, we do that, you’re right), but not in the same kind of apparently-highly-engaged context unless the person just had a stroke mid-conversation or something—because the LLM isn’t engaged, because being-engaged isn’t a thing it does. When it’s getting things right and seeming to be paying a lot of attention to the conversation, it’s not for the same reason people give that impression, and the mimicking of present-ness works until the rule book goes haywire and the ever-gibbering player-piano behind it is exposed.

foxglacier1y ago

> the “behavior” breaks down in ways that would be surprising from a human who’s actually paying as much attention as they’d need to be to have been interacting the way they had been until things went wrong.

That's an interesting angle. Though of course we're not surprised by human behavior because that's where our expectations of understanding come from. If we were used to dealing with perfectly-correctly-understanding super-intelligences, then normal humans would look like we don't understand much and our deliberate thinking might be no more accurate than the super-intelligence's absent-minded automatic responses. Thus we would conclude that humans are never really thinking or understanding anything.

I agree that default LLM output makes them look like they're thinking like a human more than they really are. I think mistakes are shocking more because our expectation of someone who talks confidently is that they're not constantly revealing themselves to be an obvious liar. But if you take away the social cues and just look at the factual claims they provide, they're not obviously not-understanding vs humans are-understanding.

nuancebydefault1y ago

I would argue maybe people also are not thinking but simply processing. It is known that most of what we do and feel goes automatically (subconsciously).

But even more, maybe consciousness is an invention of our 'explaining self', maybe everything is automatic. I'm convinced this discussion is and will stay philosophical and will never get any conclusion.

1 more reply

kenjackson1y ago

> but it will also sometimes seem to make actual sense, too

When I read stuff like this it makes me wonder if people are actually using any of the LLMs...

disgruntledphd21y ago

The RLHF is super important in generating useful responses, and that's relatively new. Does anyone remember gpt3? It could make sense for a paragraph or two at most.

hackinthebochs1y ago

I see takes like this all the time and its so confusing. Why does knowing how things work under the hood make you think its not on the path towards AGI? What was lacking in the Attention paper that tells you AGI won't be built on LLMs? If its the supposed statistical nature of LLMs (itself a questionable claim), why does statistics seem so deflating to you?

fullstackchris1y ago

Comments like these are so prevalent and yet illustrate very well the lack of understanding of the underlying technology. Neural nets, once trained, are static! You'll never get dynamic "through-time" reasoning like you can with a human-like mind. It's simply the WRONG tool. I say human-like because I still think AGI could be acheived in some digital format, but I can assure you it wont be packaged in a static neural net.

Now, neural nets that have a copy of themselves, can look back at what nodes were hit, and change through time... then maybe we are getting somewhere

hackinthebochs1y ago

The context window of LLMs gives something like "through time reasoning". Chain of thought goes even further in this direction.

vundercindOP1y ago

> Why does knowing how things work under the hood make you think its not on the path towards AGI?

Because I had no idea how these were built until I read the paper, so couldn’t really tell what sort of tree they’re barking up. The failure-modes of LLMs and ways prompts affect output made a ton more sense after I updated my mental model with that information.

hackinthebochs1y ago

Right, but its behavior didn't change after you learned more about it. Why should that cause you to update in the negative? Why does learning how it work not update you in the direction of "so that's how thinking works!" rather than, "clearly its not doing any thinking"? Why do you have a preconception of how thinking works such that learning about the internals of LLMs updates you against it thinking?

1 more reply

fragmede1y ago

But we don't know how human thinking works. Suppose for a second that it could be represented as a series of matrix math. What series of operations are missing from the process that would make you think it was doing some fascimile of thinking?

chongli1y ago

Because it can't apply any reasoning that hasn't already been done and written into its training set. As soon as you ask it novel questions it falls apart. The big LLM vendors like OpenAI are playing whack-a-mole on these novel questions when they go viral on social media, all in a desperate bid to hide this fatal flaw.

The Emperor has no clothes.

hackinthebochs1y ago

>As soon as you ask it novel questions it falls apart.

What do you mean by novel? Almost all sentences it is prompted on are brand new and it mostly responds sensibly. Surely there's some generalization going on.

1 more reply

alexashka1y ago

Because AGI is magic and LLMs are magicians.

But how do you know a magician that knows how to do card tricks isn't going to arrive at real magic? Shakes head.

j / k navigate · click thread line to collapse

0 comments

SturgeonsLaw1y ago

> at best we found a way to make one part of a collection of systems that will, together, do something like thinking

LLMs are amazing and they go some way towards mimicking the functionality provided by Broca's and Wernicke's areas, and parts of the cerebrum, in our wetware, however a full brain they do not make.

youoy1y ago

Jensson1y ago

They understand sentences but not words.

youoy1y ago

What do you mean by that? We have the monosemanticity results [0]

[0] https://transformer-circuits.pub/2024/scaling-monosemanticit...

foxglacier1y ago

> think or actually “understand” anything

vundercindOP1y ago

There’s a distinction in behavior of a human and a Chinese room when things go wrong—when the rule book doesn’t cover the case at hand.

foxglacier1y ago

nuancebydefault1y ago

I would argue maybe people also are not thinking but simply processing. It is known that most of what we do and feel goes automatically (subconsciously).

1 more reply

kenjackson1y ago

> but it will also sometimes seem to make actual sense, too

When I read stuff like this it makes me wonder if people are actually using any of the LLMs...

disgruntledphd21y ago

The RLHF is super important in generating useful responses, and that's relatively new. Does anyone remember gpt3? It could make sense for a paragraph or two at most.

hackinthebochs1y ago

fullstackchris1y ago

Now, neural nets that have a copy of themselves, can look back at what nodes were hit, and change through time... then maybe we are getting somewhere

hackinthebochs1y ago

The context window of LLMs gives something like "through time reasoning". Chain of thought goes even further in this direction.

vundercindOP1y ago

> Why does knowing how things work under the hood make you think its not on the path towards AGI?

hackinthebochs1y ago

1 more reply

fragmede1y ago

chongli1y ago

The Emperor has no clothes.

hackinthebochs1y ago

>As soon as you ask it novel questions it falls apart.

What do you mean by novel? Almost all sentences it is prompted on are brand new and it mostly responds sensibly. Surely there's some generalization going on.

1 more reply

alexashka1y ago

Because AGI is magic and LLMs are magicians.

But how do you know a magician that knows how to do card tricks isn't going to arrive at real magic? Shakes head.

j / k navigate · click thread line to collapse