The LLM will absolutely lie if it doesn't know and you haven't made it perfectly clear that you'd rather it did not do that.
LLMs seem to be trying to give answers that make you happy. A good lie will make you happy. Unless it understands that you will not be happy with a lie.
Is this anthropomorphizing? Yep. But that's the best way I've found to reason about them.
How LLMs are able to give convincing wrong answers: they “can predict the correct ‘shape’ of an answer” (parent).
Why LLMs are able to give convincing wrong answers is a little more complicated, but basically it’s because the model is tuned by human feedback. The reinforcement learning from human feedback (RLHF) used to tune LLM products like ChatGPT is based on humans ranking candidate outputs. It’s a matter of getting exactly what you ask for.
If you tune a model by having humans rank the outputs, despite your best efforts to instruct the humans to be dispassionate and select which outputs are most convincing/best/most informative, I think what you’ll get is a bias towards answers humans like. Not every human will know every answer, so sometimes they’ll select one that’s wrong but likable. And that’s what’s used to tune the model.
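To make that concrete, here’s a minimal sketch of what that ranking signal typically turns into, assuming a Bradley–Terry-style pairwise reward model (the names and shapes are illustrative, not how any particular vendor actually implements it):

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Sketch of a reward model trained on human rankings."""

    def __init__(self, hidden_size: int):
        super().__init__()
        # Stand-in for "take the LLM's final hidden state and score it".
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, hidden_state: torch.Tensor) -> torch.Tensor:
        return self.score(hidden_state).squeeze(-1)

def preference_loss(score_preferred: torch.Tensor, score_rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise (Bradley-Terry style) loss: push the score of the answer the
    # human picked above the score of the answer they rejected.
    return -torch.nn.functional.logsigmoid(score_preferred - score_rejected).mean()
```

The point is that the only signal in that loss is "which answer the human preferred", so anything humans reliably prefer, including confident-sounding wrong answers, gets rewarded.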
You might be able to improve this with curated training data (maybe something a little more robust than having graders grade each other). I don’t know if it’s entirely fixable though.
The brilliant thing about the parent’s comment about the “shape” of the answer is that it reveals how much humans have (uh, historically, now, I guess) relied on the shape of information to convey its trustworthiness. Expand the notion of “shape” a bit to include the medium. If somebody bothered to take the time to correctly shape an answer, we take that as a sign of trustworthiness, like how you might trust something written in a carefully-typeset book more than this comment.
“Surely no one would take the time to write a whole book on a topic they know nothing about,” the reasoning goes, which implies books are trustworthy. Look at all the effort that went in: proof of effort. When perfectly-shaped answers in exactly the form you expected are presented in a friendly, commercial context, they certainly read as trustworthy as Campbell’s soup cans. But LLMs can generate books’ worth of nonsense in exactly the right shapes without effort, so we as readers can no longer use the shape of an answer as a hint at its trustworthiness.
So maybe the answer is just to train on books only, because they are the highest-quality source of training data, and to carefully select and accredit the tuning data, so the model only knows the truth. It’s a data problem, not a model problem.
> The brilliant thing about the parent’s comment about the “shape” of the answer is that it reveals how much humans have (uh, historically, now, I guess) relied on the shape of information to convey its trustworthiness.
This is the basis of rumor. If you tell a story about someone that is entirely false but sounds like something they're already suspected of or known to do, people will generally believe it without verification, since the "shape" of the story fits people's expectations of the subject.
To date I've decried the choice of "hallucination" instead of "lies" for false LLM output, but it now seems clear to me that LLMs are a literal rumor mill.
Even if LLMs never get any more reliable than your average human, they're still valuable because they know much more than any single human ever could, run faster, only eat electricity, and can be scaled up without all kinds of nasty social and political problems. That's huge on its own.
Or, put another way, LLMs are kind of a concentrated digital extract of human cognitive capacity, without consciousness or personhood.
Generally, you want some external way of verifying that you have something useful. Sometimes that happens naturally. Ask a chatbot to recommend a paper to read and then search for it, and you’ll find out pretty quick if it doesn’t exist.
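If you wanted to automate that kind of check, a rough sketch might query a public index like Crossref (illustrative only; the substring match is crude and will miss real papers whose titles differ slightly):

```python
import requests

def citation_exists(title: str) -> bool:
    """Rough check: does a paper with a similar title show up in Crossref?"""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query": title, "rows": 3},
        timeout=10,
    )
    resp.raise_for_status()
    items = resp.json()["message"]["items"]
    # Crude match: is the claimed title contained in any returned title?
    return any(
        title.lower() in " ".join(item.get("title", [])).lower()
        for item in items
    )
```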
The purpose is to serve as a component of a larger system, one that also includes features, such as the prompt structure upthread, that mitigate the undesired behavior while keeping the useful behaviors.
By telling it not to lie to you, you're biasing it toward a particular output in the event that its confidence is low. Otherwise, low-confidence results just fall out somewhere mostly random.
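Concretely, that kind of instruction usually lives in the system prompt. Here's a rough sketch against the OpenAI Python client; the wording and the model name are placeholders, not the exact prompt structure from upthread:

```python
from openai import OpenAI

client = OpenAI()

# Give the model an explicit "out" for low confidence, so low-confidence
# cases get steered toward "I don't know" instead of a plausible guess.
SYSTEM_PROMPT = (
    "Answer the user's question. If you are not confident the answer is "
    "correct, say 'I don't know' instead of guessing. Do not invent "
    "citations, names, or numbers."
)

def ask(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # any chat model; the exact choice is incidental
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```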
This is something I really don't understand about LLMs. I think I understand how the generative side of them works, but "asking" it not to lie baffles me. LLMs require a massive corpus of text to train the model; how much of that text contains tokens that translate to "don't lie to me" and scores well enough to make its way into the output?
My take? It's like a high-schooler being asked a question by the teacher and having to answer on the spot. If they studied the material well, they'll give a good and correct answer. If they (like me, more often than I'd care to admit) only half-listened to the lectures and maaaaybe skimmed some Cliffs Notes before class, they will give an answer too: one strung together out of a few remembered (or misremembered) facts and an overall feel for the problem space (e.g. writing style, historical period, how people behave), with lots and lots of interpolation in between. Delivered confidently, it has a better chance of avoiding a bad mark (or even scoring a good one) than flat-out saying, "I don't know".
Add to that the usual mistakes out of carelessness and... whatever it is that makes you forget a minus sign and only realize it half a page of equations later, and you get GPT-4. It's giving answers like a person who just blurts out whatever thoughts pop into their head, without making a conscious attempt at shaping or interrogating them.
I think it might be more accurate to say, "LLMs are writing a novel in which a very smart AI answers everyone's questions." If you were writing a sci fi novel with a brilliant AI, and you knew the answer to some question or other, you'd put in the right answer. But if you didn't know, you'd just make up something that sounded plausible.
Alternately, you can think of the problem as the AI taking an exam. If you get an exam question you're a bit fuzzy on, you don't just write "I don't know". You come up with the best answer you can given the scraps of information you do know. Maybe you'll guess right, and in any case you'll get some partial credit.
The first one ("writing a novel") is useful I think in contextualizing emotions expressed by LLMs. If you're writing a novel where some character expresses an emotion, you aren't experiencing that emotion. Nor is the LLM when they express emotions: they're just trying to complete the text -- i.e., write a good novel.