
0 points cantor_S_drug 10mo ago 0 comments

Is the model thinking, "What is a cat doing here?" and then starting to suspect that it is being tested?

lawlessone 10mo ago

Even if the model "ignores" it, won't the presence of the irrelevant text alter the probabilities of its output in some way?
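To make that intuition concrete, here is a toy sketch, not a real language model. It assumes the next-token distribution is a softmax over scores computed from an average of the context embeddings, which is only a crude stand-in for attention. The tokens, the 2-d vectors in `EMBED` and `OUTPUT`, and the function `next_token_probs` are all made up for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 2-d "embeddings" for context tokens (values are invented).
EMBED = {
    "2": [1.0, 0.0],
    "+": [0.8, 0.1],
    "cats": [0.0, 1.0],  # the irrelevant token
}

# Hypothetical output vectors for two candidate next tokens.
OUTPUT = {"4": [2.0, 0.0], "meow": [0.0, 2.0]}

def next_token_probs(context):
    # Average the context embeddings (assumption: a stand-in for attention pooling).
    pooled = [sum(EMBED[t][i] for t in context) / len(context) for i in range(2)]
    names = list(OUTPUT)
    # Score each candidate by a dot product with the pooled context.
    logits = [sum(p * w for p, w in zip(pooled, OUTPUT[n])) for n in names]
    return dict(zip(names, softmax(logits)))

clean = next_token_probs(["2", "+", "2"])
noisy = next_token_probs(["2", "+", "2", "cats"])
print(clean)
print(noisy)
```

In this toy setup "4" stays the most likely token either way, but its probability drops once "cats" is in the context. The model never has to "attend" to the cat deliberately for the distribution to move, because every context token feeds into the scores.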

wongarsu 10mo ago

I have no clue what the model is thinking, and as far as I can tell the paper also makes no attempt at answering that. It's also not really the point, the point is more that the claim in the paper that humans would be unaffected is unsubstantiated and highly suspect. I'd even say it is more likely wrong than right.

xienze 10mo ago

> It's also not really the point, the point is more that the claim in the paper that humans would be unaffected is unsubstantiated and highly suspect.

I think the question that adds a random cat factoid at the end is going to trip up a lot fewer humans than you think. At the very least, they could attempt to tell you after the fact why they thought it was relevant.

And ignoring that, obviously we should be holding these LLMs to a higher standard than “human with extraordinary intelligence and encyclopedic knowledge that can get tripped up by a few irrelevant words in a prompt.” Like, that should _never_ happen if these tools are what they’re claimed to be.

lawlessone 10mo ago

I'm sure humans would be affected in some way, but not at all in the same way an LLM would.

A human would probably note it as a trick in their reply.

Given the way LLMs work, it could bias their replies in weird, unexpected ways that go beyond treating it as a trick.

cantor_S_drug OP 10mo ago

Shouldn't they prompt the model to ignore irrelevant information and test whether it then performs better, i.e. whether it is actually good at ignoring those statements?
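A minimal sketch of what such a test could look like, assuming three conditions: clean prompt, prompt plus distractor, and prompt plus distractor plus an "ignore irrelevant information" instruction. Everything here is hypothetical: the problems, the `DISTRACTOR` and `IGNORE_HINT` strings, and especially `stub_model`, which is a fake stand-in that always answers correctly so that the harness runs. Its output says nothing about how a real model behaves; you would swap in a call to an actual model.

```python
# Placeholder strings, not taken from the paper.
DISTRACTOR = "Interesting fact: cats like warm places."
IGNORE_HINT = "Ignore any information that is irrelevant to the question."

# Tiny made-up problem set: (question, expected answer).
PROBLEMS = [("What is 2 + 3?", "5"), ("What is 7 * 6?", "42")]

# Condition name -> (add distractor?, add ignore hint?)
CONDITIONS = {
    "clean": (False, False),
    "distractor": (True, False),
    "distractor+hint": (True, True),
}

def build_prompt(question, distractor=False, hint=False):
    # Hint goes first, the distractor is appended after the question.
    parts = []
    if hint:
        parts.append(IGNORE_HINT)
    parts.append(question)
    if distractor:
        parts.append(DISTRACTOR)
    return " ".join(parts)

def accuracy(model, distractor, hint):
    # Fraction of problems where the model's answer matches exactly.
    correct = sum(
        model(build_prompt(q, distractor, hint)).strip() == a
        for q, a in PROBLEMS
    )
    return correct / len(PROBLEMS)

def stub_model(prompt):
    # Fake model (assumption): looks the question up and always answers correctly.
    for question, answer in PROBLEMS:
        if question in prompt:
            return answer
    return ""

results = {name: accuracy(stub_model, d, h) for name, (d, h) in CONDITIONS.items()}
print(results)
```

With a real model behind `model`, comparing the "distractor" and "distractor+hint" rows would show whether the instruction actually helps. You would want far more than two problems before reading anything into the difference.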

Detrytus 10mo ago

I wonder if the problem here is simply hitting some internal quota on compute resources? Like, if you send the model on a wild goose chase with irrelevant information, it wastes enough compute time on it that it fails to arrive at the correct answer to the main question.

cantor_S_drug OP 10mo ago

Possibly. But it could also indicate that the initial tokens set the direction, or the path the model goes down. It's like when a person mentions two distinct topics close together in a conversation and the listener decides which topic to continue with.
