
NewsaHackO · 4mo ago · 0 points

>establish benchmarks that make sense and are reliable

How aren't current LLM coding benchmarks reliable?

Papazsazsa · 4mo ago

They're manipulated.

NewsaHackO (OP) · 4mo ago

Unless you are going to be more specific, that criticism applies to all benchmarks that are connected to a positive gain, not just AI coding benchmarks.
