Jensson 1y ago

> What they’ve proven here is that it can be done.

No, they haven't. These results do not generalize, as mentioned in the article:

"Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute"

Meaning they haven't solved AGI, and the tasks themselves do not represent programming well. These models do not perform that well on engineering benchmarks.

whynotminot 1y ago

Sure, AGI hasn’t been solved today.

But what they’ve done is show that progress isn’t slowing down. In fact, it looks like things are accelerating.

So sure, we’ll be splitting hairs for a while about when we reach AGI. But the point is that just yesterday people were still talking about a plateau.

peepeepoopoo97 1y ago

About 10,000 times the cost for twice the performance sure looks like progress is slowing to me.

whynotminot 1y ago

Just to be clear — your position is that the cost of inference for o3 will not go down over time (which would be the first time that has happened for any of these models).
