Let's say there are 10 subtasks that need to be done.
Let's say a human has a 99% chance of getting each one right, by doing proper testing etc. And let's say the AI has a 95% chance (being very generous here).
0.99^10 ≈ 0.90, so the human has about a 90% chance of getting the whole thing to work. 0.95^10 ≈ 0.60, only a 60% chance. Almost a coin toss.
Even with a 98% per-subtask success rate, the compounded success rate still drops to about 81%.
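The compounding math above is easy to check yourself. Here's a minimal sketch, assuming the subtasks succeed or fail independently (which the whole argument relies on):

```python
# Probability that ALL subtasks succeed, assuming independence:
# it's just the per-task rate raised to the number of subtasks.
def overall_success(per_task_rate: float, subtasks: int = 10) -> float:
    return per_task_rate ** subtasks

for rate in (0.99, 0.98, 0.95):
    print(f"{rate:.0%} per task -> {overall_success(rate):.1%} overall")
# prints roughly:
# 99% per task -> 90.4% overall
# 98% per task -> 81.7% overall
# 95% per task -> 59.9% overall
```

Note how a seemingly small gap in per-task reliability (99% vs 95%) turns into a 30-point gap over just ten steps.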
The thing is that LLMs aren't just "a little bit" worse than humans. In comparison they're cavemen.