I see no evidence of this in biology or in ML. I've read those scaling papers. I've worked on scaling myself. I'll bet the farm that scale isn't all you need. But I won't be surprised if people keep saying it is.
If you really think it's all scale, train a 7T-parameter ResNet-style MLP model for NLP. If scale is all you need, make an LLM without DPO or RLHF. If scale is all you need, make SD3 with a GAN. Or with a VAE, a normalizing flow, an HMM? Do it with different optimizers. Do it with gradient-free methods. Do it with different loss functions.
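To make that first challenge concrete, here's roughly what I mean by a "ResNet-style MLP" language model: residual feed-forward blocks over token embeddings, no attention, no token mixing. This is just an illustrative sketch, assuming PyTorch; the class names, widths, and depth are made up for the example, not taken from any paper.

```python
import torch
import torch.nn as nn

class ResidualMLPBlock(nn.Module):
    """Pre-norm MLP block with a skip connection; operates on each token independently."""
    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection, ResNet-style, but no mixing across positions.
        return x + self.ff(self.norm(x))

class MLPLanguageModel(nn.Module):
    """Token embeddings -> stacked residual MLP blocks -> vocab logits."""
    def __init__(self, vocab: int, dim: int = 512, depth: int = 12):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.blocks = nn.Sequential(*[ResidualMLPBlock(dim, 4 * dim) for _ in range(depth)])
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq) of token ids -> (batch, seq, vocab) logits.
        return self.head(self.blocks(self.embed(tokens)))
```

Scale this thing up all you want; the architecture, objective, and optimization choices still matter, which is the whole point.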
The bitter lesson wasn't "scale is all you need." That's just a misinterpretation.
Edit: It's fine to disagree. We can compete on the ideas and methods. That's good for our community. So keep going down your path, and I'll keep going down mine. All I ask is that, since your camp is more popular, you don't block those of us on the other side. If you're right, we'll get to AGI soon. If we're right, we still might. But if we're right and you block us, we'll get another AI winter in between. If we're right and you don't block us, we can keep progressing without skipping a beat. Just don't put all your eggs in one basket. That isn't good for anyone.