You're comparing LLMs to a hypothetical alternative where a human reviews all 30k documents in detail. But the real alternative is often just a worse quality sieve where more errors blunder their way through the existing flawed processes. LLMs can improve on that.
You're right, I am comparing it to that alternative. There are fields and applications where that level of review is necessary. I do not know if drilling reports are one of them. If you can tolerate a large false-negative rate then great. But if you need to be catching 99.99% of problems then IMO you should at least be able to show your work. Taking black-box output and throwing it over the wall sounds so sketchy in engineering contexts.
So if my ass were on the line for the correctness of an AI-written program across 30k cases of parsing unstructured or mixed data, I would be extremely careful. That is my point.
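To make the 99.99% point concrete, here's a back-of-envelope sketch. The 1% defect rate and the false-negative rates are made-up assumptions purely for illustration, not numbers from any real drilling-report dataset:

```python
# Back-of-envelope: expected missed defects when screening 30k documents
# with an imperfect sieve. All rates below are illustrative assumptions.

def expected_misses(n_docs: int, defect_rate: float, false_negative_rate: float) -> float:
    """Expected number of real problems the sieve lets through."""
    return n_docs * defect_rate * false_negative_rate

N = 30_000
DEFECT_RATE = 0.01  # assume 1% of documents contain a real problem (hypothetical)

for fnr in (0.10, 0.01, 0.0001):
    print(f"FNR {fnr:>7.2%}: ~{expected_misses(N, DEFECT_RATE, fnr):.2f} problems missed")
```

Under these assumptions there are ~300 real problems in the pile; a 10% false-negative rate lets ~30 through, while hitting the 99.99% bar (FNR of 0.01%) means missing essentially none. The gap between those outcomes is the whole argument about tolerable error rates.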