skybrian 2y ago

How do you make an objective statement about how well GPT-4 does logical reasoning?

Running benchmarks seems like a reasonable way to do it. The objective statements are the benchmark results. They are there. That's the main result of the paper.

Closi 2y ago

You can make objective statements by benchmarking, but by the nature of benchmarking you need a point of comparison before you can conclude that something is performing poorly.

Benchmarking is comparative - that’s the whole point - so the conclusions aren’t actually backed up by the paper.
