gordonhart · 2mo ago

Modern reasoning models are actually pretty good at arithmetic and almost certainly would have caught this error if asked.

Source: we benchmark this sort of thing at my company, and for the past year or so frontier models with a modest reasoning budget have typically succeeded at arithmetic problems (except for multiplication/division problems with many decimal places, which this isn't).

RobotToaster · 2mo ago

Interesting. How have you found they perform on more complex things like calculus and analysis?

speedgoose · 2mo ago

It’s on the front page of HN once in a while.
