>Sutskever, recently ex-OpenAI, one of the first to believe in scaling, now says it is plateauing.
Blind scaling, sure (for whatever reason)*, but this is the same Sutskever who believes in ASI within a decade off the back of what we have today.
* Not like anyone is telling us any details. After all, OpenAI and Microsoft are still trying to build a $100B data center.
In my opinion, there's a difference between scaling not working and scaling becoming increasingly infeasible. GPT-4 is something like 100x the compute of GPT-3 (same with 2→3).
All the drips we've had about GPT-5 point to ~10x the compute of GPT-4. Not small, but very modest in comparison.
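To make the comparison concrete, here's the back-of-the-envelope arithmetic with the multipliers above taken at face value (the ~100x per generation and rumored ~10x for GPT-5 are claims from this thread, not official figures):

```python
# Compound per-generation compute multipliers onto a base compute budget.
# The multipliers are the rumored/claimed figures from the thread, not official numbers.
def total_compute(base: float, multipliers: list[float]) -> float:
    """Multiply a base compute budget by each generation-over-generation factor."""
    total = base
    for m in multipliers:
        total *= m
    return total

# Normalize GPT-2's training compute to 1 (arbitrary unit).
gpt3 = total_compute(1, [100])            # ~100x GPT-2 (claimed)
gpt4 = total_compute(1, [100, 100])       # another ~100x on top (claimed)
gpt5 = total_compute(1, [100, 100, 10])   # rumored ~10x GPT-4

print(gpt3, gpt4, gpt5)  # → 100 10000 100000
```

The point being: keeping the 100x-per-generation trend going for GPT-5 would mean ~1,000,000x GPT-2's compute instead of ~100,000x, which is where "increasingly infeasible" bites.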
>FWIW, GPT-2 and GPT-3 were about a year apart (2019 "Language Models are Unsupervised Multitask Learners" to 2020 "Language Models are Few-Shot Learners").
Ah sorry, I meant 3 and 4.
>Dario Amodei recently said that with current gen models pre-training itself only takes a few months (then followed by post-training, etc). These are not year+ training runs.
You don't have to be training models the entire time. GPT-4 was done training in August 2022 according to OpenAI and wouldn't be released for another 8 months. Why? Who knows.