Basically there is this innate idea that if the basic building blocks are simple systems with deterministic behavior, then the greater system can never be more than that. I've seen this in spades within the AI community: "It's just matrix multiplication! It's not capable of thinking or feeling!"
Which to me always felt more like a hopeful statement than a factual one. These guys have no idea what consciousness is (nobody does), nor do they have any reference point for what exactly "thinking" or "feeling" is. They can't prove I'm not a stochastic parrot any more than they can prove whatever cutting-edge LLM isn't.
So while yes, present LLMs likely are just stochastic parrots, the same technology scaled up might bring us a model for which there actually is "something it is like to be" it, and we'll have everyone treating it with reckless carelessness because "it's just a stochastic parrot".
Where do people get off saying no one has any idea what consciousness is? I agree that there is a significant sliver of a philosophical problem which remains stubborn (how precisely does physical activity produce qualia), but neuroscience knows quite a bit about what physical processes underlie our behavior from the behavior of individual neurons to the activity of the entire brain.
I object to the wholesale dismissal of neuroscience because thinking about the brain relative to LLMs is genuinely informative about what sorts of things you could expect to be going on in an LLM. And, to my mind, a real appraisal of the differences between brains and LLMs makes the case pretty strongly that LLMs experience nothing and are, furthermore, fairly well characterized as stochastic parrots.
"They can't prove I'm not a stochastic parrot anymore than they can prove whatever cutting edge LLM isn't." Prove is a very strong word, but I think its actually quite possible to demonstrate via scientific observation that you differ in many, significant, and relevant to the question of "being a stochastic parrot", ways, from LLMs. It astounds me that people routinely suggest that human brains and LLMs are somehow indistinguishable.
But that IS the definition of consciousness! This is like saying "We understand practically everything about airplanes, except how they stay in the air."
Nothing discussed in neuroscience is relevant to understanding what consciousness IS (which is the question posed above). Finding out that stimulating such-and-such a region makes us sad, or that this bundle of nerves activates before we're consciously aware of a decision, doesn't tell us anything about consciousness itself. We've known for hundreds of years that there is a relationship between the brain and consciousness; finding out more details doesn't answer the question.
(Now, whether consciousness is necessary for AGI is a separate question.)
I'm not "getting off" saying that, but I do say it often.
For me, it's important to know:
If we think an artificial neural network can have consciousness and we're wrong, then there is a risk that all the people who want to have their minds uploaded end up with a continued existence no better than that of TV stars reproduced on VHS tape. There is also a risk of this being done as a standard treatment for lesser injuries, especially if it's cheaper.
If we think an artificial neural network can't have consciousness and we're wrong, then there is a risk of creating a new slave class that makes real the fears of the Haitian slaves in the form of the Vodou concept of a zombie: not even death will free them from eternal slavery.
But that "sliver" of a problem is known as the hard problem of consciousness for a reason [1], which is exactly the sort of problem neuroscience can only address in a limited capacity. Understanding how nerves propagate a signal to produce a sensory input (an "easy" problem of consciousness) doesn't inform us as to why certain physical mechanisms result in conscious experience (or more fundamentally what it even means to have a conscious experience).
To return to the topic at hand, a stochastic parrot generates grammatical, sensible language without understanding its underlying meaning. Of course, you can debate what it means to understand something; but for a person to vocalize an idea they understand, they must first somehow consciously process that idea. This is firmly a hard problem to which neuroscience offers limited guidance.
Of course, I'd agree that human beings aren't stochastic parrots -- if human beings were stochastic parrots, then what would it even mean to understand something? But I doubt you could use neuroscience to ascertain whether large language models are or aren't stochastic parrots. Indeed, depending on your definition of "understanding", consciousness might not even be a prerequisite, making the comparison to neuroscience moot.
---
[1] https://en.wikipedia.org/wiki/Hard_problem_of_consciousness
There is no evidence that processing power = mind. None. There is no evidence that the human condition is in any way related to some kind of terra firma of logic. In fact, there's considerable evidence that feelings are so entangled in the experience of humanness that the idea of divorce or separation is a false one. "Being human" is primarily a feeling experience that drives narratives and motivations: it underlies every single activity we engage in.
This is why people like Eliezer Yudkowsky and his ilk are so totally off the mark: it's no coincidence that the Less Wrong community and AI doomsayers can often be found on the same side of the aisle. Both camps believe in and idealize a distinct logic mind that can be attained. Funnily enough, it's still fear, a very human feeling, that is the basis for all these proclamations.
My worry is this camp garners enough influence to convince someone an AI doomsday is right around the corner unless immediate action is taken.
By analogy, I've been married for just shy of 20 years. I know my wife very well. I certainly do not know everything there is to know about her, but I do know her.
So, with AI, is it fair to say that anyone really cares whether it will develop qualities that make it seem as though it is an emergent consciousness? Why would we treat digital consciousness any better than we treat organic consciousness? What is the point of pontificating whether or not the type of thinking an AI does crosses an arbitrary threshold when that threshold only exists as a tool for creating useful outgroups?
However sophisticated the thing that our thinking is, it exists on a scale and we sit at an arbitrary spot. We treat thinking that occurs further down the scale as functionally irrelevant not because of any real distinction but because doing so has a high utility for our species.
So, the question of how we will treat a "truly conscious and sentient" AI has already been answered. Look at how we treat pigs. Good luck out there, HAL.
So I ask you a follow-up question: what are some easy-to-understand ways in which a human's thought process would differ from an LLM's behavior?
also for anyone else wondering: https://www.merriam-webster.com/dictionary/qualia
> an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.
Next token prediction is a function of tokens/words, but that doesn't preclude the prediction from depending on meaning, and the best predictions obviously do depend on meaning. It is not clear, at least to me, that next token prediction imposes any kind of upper bound on intelligence. It is always possible to incorporate more of the descriptions of the world obtained through the training data into your predictions to improve them.
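For concreteness, here is a minimal sketch of what next token prediction amounts to mechanically, using the Hugging Face transformers API with "gpt2" as an illustrative stand-in (any causal LM works the same way). The model emits a probability distribution over the whole vocabulary conditioned on all prior context; nothing in that setup forbids the distribution from reflecting meaning.

    # Minimal sketch: next-token prediction is a distribution over the
    # vocabulary, conditioned on the entire preceding context.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tokenizer("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)

    # Top candidate continuations; assigning good probabilities here
    # requires modeling more than surface word statistics.
    top = torch.topk(probs, k=5)
    for p, idx in zip(top.values, top.indices):
        print(f"{tokenizer.decode([idx.item()])!r}: {p.item():.3f}")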
But I think you've missed an important distinction. The stochastic parrot claim can be false, not because LLMs can or will ever feel or be conscious, but because they can (today) reason and solve novel problems (the capability is there, but it is unreliable). LLMs are not probabilistically regurgitating their training sets; they're applying the learning they took away from those training sets.
I think GPT-4 can reason today, but I don't think it can feel or is conscious, and I don't expect it to be capable of those things in its current architecture.
Consciousness is orthogonal to the discussion (the term doesn't appear in the linked article).
Emergent behavior... emerges. It's hard to predict or explain from constituents. Scale changes everything.
If that happens, then "stochastic parrot" as an argument for why a machine isn't thinking can be made pretty useless if one chooses to drag the argument further into philosophy.
But we're already getting past this with multi-modal models! Some really great work is being done which ties language processing with visual perception and in some cases robot action planning. A model can know how we talk about apples, can see where an apple is in a scene, can navigate to and retrieve an apple, etc. This lets us get at truth ("Is the claim 'the apple is on the book' true of this scene?") in a way which text-only models fundamentally cannot have. The point is, the way you get past the "stochastic parrot" phase requires qualitative structural changes to incorporate different kinds of information -- not just scaling up text-only models.
> They can't prove I'm not a stochastic parrot any more than they can prove whatever cutting-edge LLM isn't.
I can't prove you're not a stochastic parrot by only talking to you via text. But in person I can toss you an object and you can catch it which shows that you understand how to interact with a dynamic 3D environment. I can ask you a question about something in our shared environment, and you can give an answer which is _true_, rather than which is a plausible-sounding sentence. This is the difference between knowing what English texts or English conversations look like, versus knowing what states of the world are referred to by statements.
I'm not saying that the current LLMs have derived human-level world models (they haven't). It's just that, to me, the theory that textual data is categorically not enough to do so is necessarily an empirical one. To back up the assertion, you'd need to construct metrics on which present text-only LLMs fail, and then show that multi-modal LLMs succeed on those same metrics. So far, I don't think adding multi-modality to LLMs actually has improved their general-purpose reasoning ability, which I consider evidence against this theory. But then I read people online just asserting it as though it's an obvious truth derivable from philosophical first principles. It's odd to me.
Right. People think the stochastic parrot description is about the Chinese Room thought experiment, but it's not. It's about the Thai Library thought experiment: https://medium.com/@emilymenonbender/thought-experiment-in-t...
As far as your statement regarding consciousness goes, it's glib to say that no one has any reference point for what consciousness, thinking, or feeling are. We all have our own lived experience to draw on for intuition and guidance to inform our thinking, which is invaluable. We can relate our qualitative perception of these phenomena to other things in the world, where a reasonable person can form the hypothesis that "matrix multiplication" is unlikely to be conscious, to think, or to feel, by dint of its being an abstract mathematical concept, since there is no precedent for an abstract mathematical concept exhibiting any of these qualities. Indeed, the only things in our lived experience which can plausibly be said to be conscious, to think, or to feel are biological organisms, which a computer is not.
Tell me what your view is on the ability of LLM's to become AGI, and I'll tell you whether you believe in an immortal soul.
I believe machines could be imbued with consciousness, and I do not rule out that there could be supernatural elements to consciousness.
Or at least things that fall outside the realm of strictly testable science.
https://www.amazon.com/Emperors-New-Mind-Concerning-Computer...
It's pretty clear the whole point is to minimize the difference between us and AI, but it does feel like you are undermining your argument by trying to work it from both sides. It reminds me of someone accused of a crime who says both "I didn't do it!" and "If I did do it, it wasn't wrong!".
Humans aren't stochastic parrots. You can't "prove" this because it's not a mathematical fact, but there is plenty of evidence from studying how the brain works to show this. Hell, it's even readily apparent from introspection if you'd bother to check. LLMs, on the other hand, basically are stochastic parrots, because they are just autoregressive token predictors. They might become less so due to architectural changes made by the companies working on them, but it isn't going to just creep up on us like some goddamn emergence boogeyman.
On that, there is a great "In Our Time" episode: https://open.spotify.com/episode/5oln4RwbhsKwjlZuxPuYYB?si=5...
Unless it is able to feel pain, it remains a stochastic parrot; I wouldn't call it conscious or alive in any philosophical sense, nor can one say it is capable of "feeling".
Under the view that we are all just complexity arising out of an unfathomably large universe, we can accept that LLMs are just that, like us, but weaker, and that is fine.
They will improve, we can leverage them, we can live with them. It's almost as if we have created a new species that exists only abstractly and arises out of silicon and electrons.
I'm very surprised by this, because in essence, it's a flat-out denial of the emergence concept, no different from denying that atoms can ultimately lead to biological entities.
There is also a problem, to me, that "stochastic parrot" is too clever. It is too good a name and evokes such a strong mental image. It is a great name for branding purposes, but because of that it is a terrible name if we are actually trying to discover truth. It can't help but become a blunt, unthinking intellectual weapon and rhetorical device.
But you almost never see the logic applied the other way around: maybe we are just a bunch of simple mechanisms convinced we are something way more complex.
There are quite a few hints that the second option is the actual reality.
> Optimist: AI has achieved human-level performance!
> Realist: “AI” is a collection of brittle hacks that, under very specific circumstances, mimic the surface appearance of intelligence.
> Pessimist: AI has achieved human-level performance.
This might be the first time the term appeared in an 'official' context, but is it really the origin? It feels like the term had been floating around for longer, and even Google Trends shows significant search volume well before 2021.
For instance, there's this ecology paper from 2014: Influence of stochastic processes and catastrophic events on the reproductive dynamics of the endangered Maroon‐fronted Parrot Rhynchopsitta terrisi
Not sure what happens under the hood, but it wouldn't surprise me if people searching for this paper would show up under "stochastic parrot" in google trends even if that's not what they literally searched for.
I was initially thinking "well, yes, Nobel Prize for Stating the Obvious there", but looks like the paper was written in the far distant past of 2021, when LLMs were largely still in their babbling obvious nonsense stage, rather than the current state of the art, where they babble dangerously convincing nonsense, so, well, fair enough I suppose.
Amazing how fast progress has been there, though it's progress in an arguably rather worrying direction, of course.
At that point, OpenAI was still fairly clearly at the babbling obvious nonsense phase; I would wonder was Google's stuff much better.
I also wonder if the original authors would have been surprised to learn that, by 2023, lawyers would be citing fake precedent made up by a machine. The progression to "dangerous nonsense" really does seem to have been worryingly fast.
The term in general seems unfortunate because the models seem to do more than parroting. LLMs are more like the central pattern generators of the nervous system, able to flexibly create well-coordinated patterns when guided appropriately.
Actually, transformers do not require randomness at all, so no, not at all.
Training alone relies hugely on many factors (e.g. initialization of parameters, order of training data, hyperparameters, etc.).
In evaluation (afaik this applies to recent models as well) you pick the continuation based on chance rather than always taking the "best" one. And since evaluation runs on the result of the training process, all the randomness from training factors in as well.
*substantial as in nontrivial, not substantial as in massive
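To make the distinction above concrete, here is a toy sketch (plain PyTorch, with made-up logits standing in for a real model's output) of where the randomness does and doesn't live: the forward pass that produces the logits is deterministic, and run-to-run variation appears only when you sample at decoding time instead of taking the argmax.

    import torch

    logits = torch.tensor([2.0, 1.0, 0.5, -1.0])  # toy next-token logits

    # Greedy decoding: fully deterministic, always the same choice.
    greedy_choice = torch.argmax(logits).item()

    # Temperature sampling: the usual source of run-to-run variation.
    temperature = 0.8
    probs = torch.softmax(logits / temperature, dim=-1)
    sampled_choice = torch.multinomial(probs, num_samples=1).item()

    print(greedy_choice, sampled_choice)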
At what point does a stochastic parrot fake it till it makes it? Does it even matter? We can imagine that, within 10 years, we'll have a fully synthetic virtual human simulator: a generative AI combined with a knowledge base, language parsing, and audio and video recognition, basically a talking head that could join your next technical meeting and look like a full contributor. If that happens, will the Timnits and the Benders of the world admit that, perhaps, systems which are indistinguishable from a human may not just be parrots, or perhaps, that we are just sufficiently advanced parrots?
Seen from that perspective, the promoters of stochastic parrots would seem to be luddites and close-minded, as well as discouraging legitimate, important, and valuable scientific research.
The organizations that listened to these people for even some amount of time got hosed in this situation. Google managed to oust this flock from within, but not before their AIs were so lobotomized that they are widely renowned for being the village idiot.
Ultimately, this paper is a triumph of branding over science. Read it if you'd like. But if you let these kinds of people into your organization, they'll cripple it. It costs a lot to get them out. Instead, simply never let them in.
This is the same as curation and picking out the dataset, except as post-processing. The reason why RLHF has to happen (and traumatize the people <https://www.bigtechnology.com/p/he-helped-train-chatgpt-it-t...>) is to address the problems by censoring the model.
If you read a book that you disagree with, or one that contains falsehoods and bad reasoning as far as you can tell, would that make you believe those things?
Everything we revile about online recipe websites that spend 1000 words about the history of cooking before getting to the point, will be part and parcel of AI-written anything. It won't be properly proofread or edited by a human, because that would defeat the purpose.
https://arxiv.org/abs/2306.03341
> Our findings suggest that LLMs may have an internal representation of the likelihood of something being true, even as they produce falsehoods on the surface.
The problem here is that there is currently no reliable way to extract information from this hypothetical world model. Language models do not always say what they "believe", they might instead say what is politically correct, what sounds good etc. Researchers try to optimize (fine-tune) language models to be helpful, honest, and harmless, but honesty ("truthfulness") can't be easily optimized for.
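For a sense of how researchers look for such internal representations: the linked paper works with linear probes on model activations. Below is a hedged, self-contained sketch of that general idea, with random arrays standing in for real activations and labels (not the paper's data, model, or exact method).

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_statements, hidden_dim = 400, 768

    # Stand-ins for per-statement hidden activations and truth labels.
    activations = rng.normal(size=(n_statements, hidden_dim))
    is_true = rng.integers(0, 2, size=n_statements)  # 1 = statement is true

    X_train, X_test, y_train, y_test = train_test_split(
        activations, is_true, test_size=0.25, random_state=0)
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # With random stand-in data this hovers near chance (~0.5); the paper's
    # finding is that real activations probe well above chance.
    print(probe.score(X_test, y_test))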
What these LLMs and diffusion models and such actually are is a lossy compression method that permits structural queries. The fact that they can learn structure as well as content allows them to reason as well, but only to the extent that the rules they’re following existed somewhere in the training data and its structure.
If one were given access to senses and memory and feedback mechanisms and learned language that way, it might be considered actually intelligent or even sentient if it exhibited autonomy and value judgments.
I do not think that this would really change much in itself. If you tell the model that crimson is a shade of green, it will learn something wrong whether it has a body or not. What you need is feedback on whether a response is correct or not, factually correct, not grammatically correct. Alternatively you have to teach the model to perform its own fact checking and apply it to its responses.
And so if the topic of grass comes up, I have some firsthand knowledge to draw on - less than a botanist, but not nothing. I have some sense impressions that correlate to other sense impressions and to the word "grass". GPT, on the other hand, has some words that correlate to other words, and nothing more.
So it seems fair to say that I understand grass on a level that GPT does not, and cannot. Therefore it seems fair to say that GPT is at least closer to being a stochastic parrot than humans are.
Obviously we don't know for certain if other humans are sentient, but it seems necessary to establish the premise that they are in order to get anywhere in an argument for the sentience of AIs. In this case, we need an argument about the sentience of AIs that coincides with our experience of the sentience of humans, which this argument doesn't seem to do.
Even if we limit ourselves to thinking about people with all of their senses, there's still information that we cannot tie back to the physical world with our senses. Take someone who sits at a computer all day. They read news and talk about it online, without ever interacting with the news physically. Take someone who theoretically has never done anything outside of read and type on a computer all day. Are they not sentient because they've never physically interacted with the world outside of their computer?
They still interact with an external world. An LLM doesn't, at all, not even a little bit. That's the crucial difference. A person will know when things didn't go as predicted, as the real world will provide feedback they can sense. An LLM in contrast has no idea what is going on, its past actions don't exist for it. There is only the prompt and the unchanging base model.
That said, this is not to disparage the abilities of LLMs, they simply were never designed to be sentient. If one wants an LLM that is sentient, one has to build some feedback into the system that allows it to change and evolve depending on its past actions.
Who wants this from ML systems? I want them to be useful, not to have autonomy and value judgments.
BingChat sort of tries that, but it doesn't really have any autonomy either, so it just summarizes the first Bing search result it gets. It would be far more useful if it could search around two or three layers depth into the search results to actually find what you are looking for.
In general, current AI systems have the problem that you have to babysit them far too much. If you want specific answers, it's you who has to provide all the necessary context to make it happen; the AI can't figure out by itself what you want from past conversations.
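The "two or three layers deep" behavior could be prototyped as a plain breadth-first traversal over search results. A rough sketch, where fetch_links is a hypothetical helper (a real version would issue HTTP requests and parse HTML):

    from collections import deque

    def fetch_links(url: str) -> list[str]:
        """Hypothetical helper: return outgoing links from the page at url."""
        raise NotImplementedError

    def explore(start_urls: list[str], max_depth: int = 2) -> list[str]:
        # Breadth-first traversal, capped at max_depth hops from the results.
        seen = set(start_urls)
        queue = deque((u, 0) for u in start_urls)
        visited = []
        while queue:
            url, depth = queue.popleft()
            visited.append(url)
            if depth >= max_depth:
                continue
            for link in fetch_links(url):
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
        return visited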
This can lead to very detailed articles written by very enthusiastic people. In other cases the people who are very pro/against the subject will be the ones who put in the most effort, especially on smaller/controversial subjects.
I have seen Wikipedia pages which basically read like ads for small companies.
"Meaning without reference in large language models"
"we argue that LLM likely capture important aspects of meaning, and moreover work in a way that approximates a compelling account of human cognition in which meaning arises from con- ceptual role"
https://arxiv.org/pdf/2208.02957.pdf
I remember Quine's meaning holism; it seems to be related.
Because such accounts are both accurate and deeply misleading.
This is a description, but it is neither predictive nor explanatory.
It implies a false model, rather than providing one.
Evergreen:
Ximm's Law: every critique of AI assumes to some degree that contemporary implementations will not, or cannot, be improved upon. Lemma: any statement about AI which uses the word "never" to preclude some feature from future realization is false.
It seems to me that the great success transformers are now enjoying is precisely due to the fact that 'probabilistic information about how they combine' _is_ meaning.
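A toy illustration of that claim: even raw co-occurrence counts, with no reference to the outside world, already place related words closer together than unrelated ones. A minimal sketch over an invented five-sentence corpus:

    import numpy as np

    # Tiny invented corpus; real distributional semantics uses billions of tokens.
    sentences = [
        "the cat chased the mouse",
        "the dog chased the cat",
        "the dog ate the bone",
        "the car drove down the road",
        "the truck drove down the road",
    ]

    vocab = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))

    # Count how often each pair of words shares a sentence.
    for s in sentences:
        words = s.split()
        for i, w in enumerate(words):
            for v in words[:i] + words[i + 1:]:
                counts[index[w], index[v]] += 1

    def similarity(a, b):
        x, y = counts[index[a]], counts[index[b]]
        return float(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)))

    # "dog" ends up closer to "cat" than to "car" from statistics alone.
    print(similarity("dog", "cat"), similarity("dog", "car"))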
It's obvious nonsense. I can describe a new concept to you using only words and letters and you can understand it. Therefore you can build up knowledge using only syntax.
Nobody is saying that LLMs understand the layout of a bus or the feel of leather, but they understand that buses are vehicles with four wheels that transport people etc.
Face-slappingly poor philosophy.
Human perception can be ambiguous, but minimal changes never cause drastic category errors.