No need. Just add one more correction to the system prompt.
It's amusing to see hardcore believers in this tech doing mental gymnastics and attacking people whenever evidence of there being no intelligence in these tools is brought forth. Then the tool is "just" a statistical model, and clearly the user is holding it wrong, doesn't understand how it works, etc.
And why should a "superintelligent" tool need to be optimized for riddles to begin with? Do humans need to be trained on specific riddles to answer them correctly?
If you don't recognise the problem and actively engage your "System 2" brain, it's very easy to leap to the obvious (but wrong) answer. That doesn't mean you're not intelligent or can't work it out if someone points out the problem. It's just that the heuristics you've been trained to adopt betray you here, and that's really not so different a problem from what's tricking these LLMs.
It may trigger a particularly ambiguous path in the model's token weights, or whatever the technical explanation for this behavior is, and that can certainly be addressed in future versions. But what it exposes is that there's no real intelligence here. For all its "thinking" and "reasoning", the tool is incapable of arriving at the logically correct answer unless it was specifically trained for that scenario, or happens to arrive at it by chance. This is not how intelligence works in living beings. Humans don't need to be trained on specific cognitive tasks in order to perform well at them, and our performance is not random.
But I'm sure this is "moving the goalposts", right?