undefined | Better HN

0 pointsprng20211y ago0 comments

I’m confused about the excitement. Are people just flat out ignoring the sentences below? I don’t see any breakthrough towards AGI here. I see a model doing great in another AI test but about to abysmally fail a variation of it that will come out soon. Also, aren’t these comparisons completely nonsense considering it’s o3 tuned vs other non-tuned?

> Note on "tuned": OpenAI shared they trained the o3 we tested on 75% of the Public Training set. They have not shared more details. We have not yet tested the ARC-untrained model to understand how much of the performance is due to ARC-AGI data.

> Furthermore, early data points suggest that the upcoming ARC-AGI-2 benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training).

0 comments

oakpond1y ago

Me too. This looks to me like a holiday PR stunt. Get everybody to talk about AI during the Christmas parties.

j / k navigate · click thread line to collapse