tripletao 1y ago

> AFAICT the participants' goal was just to judge the humanness of a single witness, not to maximize their long-term likelihood of judging correctly over many trials

Why do you think this matters? Even in a single trial, I would judge very differently if I knew the population to be 99% human vs. 1% human. Wouldn't you? If you were judging whether a single mushroom was poisonous or not, then would you not care whether it was found in a forest (mostly poisonous) or a supermarket (mostly not)?
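A minimal sketch of the same point in code, using Bayes' rule in odds form. The 3:1 likelihood ratio is an illustrative assumption, not a number from the paper; the 99% and 1% base rates are the ones mentioned above.

```python
def posterior_human(prior_human, likelihood_ratio):
    """P(human | evidence) from Bayes' rule in odds form.

    likelihood_ratio = P(evidence | human) / P(evidence | machine).
    """
    prior_odds = prior_human / (1.0 - prior_human)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# One trial, identical evidence (assumed 3:1 in favor of "human"),
# judged under two different base rates.
mostly_humans = posterior_human(0.99, 3.0)
mostly_machines = posterior_human(0.01, 3.0)
print(f"99% human population: P(human) = {mostly_humans:.3f}")
print(f" 1% human population: P(human) = {mostly_machines:.3f}")
```

With the same evidence, the posterior is about 0.997 in the first case and about 0.029 in the second, so the best single-trial verdict flips even though nothing about the witness changed.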

The question of whether probabilities are meaningful for non-repeated events was controversial in the eighteenth century, but I thought it was pretty settled by now. Bookmakers manage to estimate a probability that a given team will win the Super Bowl, with no requirement for the same pair of teams to play multiple times.

> If they had wanted "indistinguishable" as a threshold, then obviously their pass criteria would have been for the machine and human pass rates to be equal within an error bar, right?

The title of the paper is literally "People cannot distinguish GPT-4 from a human in a Turing test". They're very clear that they think that's because 50% means indistinguishable:

> A baseline of 50% is better justified since it indicates that interrogators are not better than chance at identifying machines [French, 2000].

That statement is true for a Turing test with a binary choice, but false for theirs. I agree that "for the machine and human pass rates to be equal within an error bar" would be closer to a correct criterion, and the two rates weren't equal:

> humans’ pass rate was significantly higher than GPT-4’s (z = 2.42, p = 0.017)
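To illustrate why 50% is the chance baseline only in the binary-choice design, here is a rough simulation of an interrogator who cannot tell humans from machines at all. Everything in it is a toy assumption (the function names, the 70% tendency to say "human", the 90% tendency to pick the left witness); none of it comes from the paper.

```python
import random


def simulate_single_witness(p_say_human, n_trials=100_000, seed=0):
    """One witness per trial; the verdict ignores who the witness is."""
    rng = random.Random(seed)
    seen = {"human": 0, "machine": 0}
    judged_human = {"human": 0, "machine": 0}
    for _ in range(n_trials):
        witness = rng.choice(["human", "machine"])
        seen[witness] += 1
        if rng.random() < p_say_human:  # clueless: same bias for everyone
            judged_human[witness] += 1
    return {kind: judged_human[kind] / seen[kind] for kind in seen}


def simulate_binary_choice(p_pick_left, n_trials=100_000, seed=0):
    """One human and one machine per trial; pick which one is human."""
    rng = random.Random(seed)
    machine_passes = 0
    for _ in range(n_trials):
        machine_side = rng.choice(["left", "right"])
        pick = "left" if rng.random() < p_pick_left else "right"
        if pick == machine_side:  # machine was chosen as "the human"
            machine_passes += 1
    return machine_passes / n_trials


single = simulate_single_witness(p_say_human=0.7)
binary = simulate_binary_choice(p_pick_left=0.9)
print(single, binary)
```

In the single-witness design the clueless interrogator gives machines a pass rate near 70%, matching the human pass rate, so the signature of indistinguishability is equal rates rather than 50%. In the binary-choice design the machine pass rate stays near 50% whatever the interrogator's bias.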

So do you think their paper is correctly titled?

fenomas 1y ago

> Why do you think this matters?

I said that in passing, maybe I should have omitted it - the main point with the priors is that the respondents didn't know them. It's normal for a study to compare N things to a control by testing N+1 similarly-sized groups, because subjects are not biased by priors they don't know about.

> So do you think their paper is correctly titled?

I didn't say anything about that, and have no strong opinion. (I'm not here to defend every aspect of the paper!)

tripletao (OP) 1y ago

Can you clarify what you think the paper says that's correct? If you're unwilling to provide an opinion on the paper's literal headline claim, then I don't know what we're discussing here.

> because subjects are not biased by priors they don't know about

It feels like you want the subjects to be "unbiased", "without any prior"? That concept doesn't exist, though. If no prior is supplied, then the subjects will make their best guess based on general past experience; but that's still a prior, just an idiosyncratic personal one. Very few people would put numbers to the forest vs. supermarket mushroom, but it's the same general thought process.

If that prior matches the actual distribution, then good. If the actual distribution contains more machines than the interrogator expects, then the interrogator is more likely to misjudge a machine as a human. By analogy, if I mistakenly think that a mushroom came from a supermarket when it actually came from a forest, then I'm more likely to misjudge it as non-poisonous.
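A small sketch of that effect under a toy model, which is an assumption for illustration and not the paper's setup: the interrogator sees a noisy score that is Gaussian around `+separation` for humans and `-separation` for machines, and says "human" whenever the posterior exceeds 50% under whatever prior they assume.

```python
from math import log
from statistics import NormalDist


def machine_pass_rate(assumed_prior_human, separation=1.0):
    """P(judged human | machine) for a Bayesian interrogator.

    Toy model: score x ~ N(+separation, 1) for humans and
    N(-separation, 1) for machines.
    """
    prior_log_odds = log(assumed_prior_human / (1.0 - assumed_prior_human))
    # The log-likelihood ratio is 2 * separation * x, so the verdict is
    # "human" exactly when x exceeds this threshold.
    threshold = -prior_log_odds / (2.0 * separation)
    machine_scores = NormalDist(mu=-separation, sigma=1.0)
    return 1.0 - machine_scores.cdf(threshold)


for assumed in (0.5, 0.7, 0.9):
    rate = machine_pass_rate(assumed)
    print(f"assumed P(human) = {assumed:.1f} -> machine pass rate = {rate:.3f}")
```

The more humans the interrogator expects, the lower the bar for a "human" verdict and the more often a machine passes, even though the machine's behavior is unchanged.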

The binary choice version makes it obvious that the prior is 50%, and forces the interrogator to respect that. The paper's version has sent us into this epistemological tarpit, which seems strictly worse to me.
