And yes, by this definition, LLMs pass with flying colours.
Firstly, humans have not been evolving for “billions” of years.
Homo sapiens have been around for maybe 300,000 years, and the Homo genus for 2-3 million years. Before that, our lineage split from that of chimpanzees roughly 6-7 million years ago.
If you want to look at the entire arc of brain development, i.e. from mouse-like creatures through to apes and then humans, that's about 200 million years.
If you want to think in generations, that's only 50-75 million generations, i.e. “training loops”.
That’s really not very many.
Also, the bigger point is this: for 99.9999% of that time we had no writing, and no need for any kind of complex abstract reasoning.
So our ability to reason about maths, writing, science etc. only goes back 2,000-2,500 years, i.e. roughly 100 generations.
Our brain was not “evolved” to do science, maths etc.
Most of our evolution was spent just running around killing things, eating, and having sex. We've only been working on maths, science, literature, and philosophy for a tiny, tiny sliver of that time.
So actually, these models have had a massive, massive amount more training than humans did to do roughly the same things, and they used insane amounts of computing power and energy to get there.
Our brains evolved for a completely different world, environment, and daily life than the one we lead now.
So yes, LLMs are good, but they have been exposed to more data and training time than any human could be unless we lived for 100,000 years, and they still perform worse than we do on most problems!
I realise upon reading the OP's comment again that they may have been referring to "extrapolation", which is hugely problematic from the statistical viewpoint when you actually try to break things down.
My argument for compression asserts that LLMs see a lot of knowledge, but are actually quite small themselves. To output a vast amount of information in such a small space requires a large amount of pattern matching and underlying learned algorithms. I was arguing that humans are actually incredible compressors because we have many years of history in our composition. It's a moot point though, because it is the ratio of output to capacity that matters.
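The "ratio of output to capacity" point can be made concrete with some back-of-the-envelope arithmetic. All of the figures below are rough illustrative assumptions (token counts, parameter counts, synapse counts), not measurements:

```python
# Back-of-the-envelope input/capacity ratios. Every number here is an
# order-of-magnitude assumption, not a disclosed or measured figure.

# Hypothetical frontier LLM: ~15 trillion training tokens at ~4 bytes
# each, compressed into ~1 trillion parameters at 2 bytes (fp16).
llm_training_bytes = 15e12 * 4
llm_capacity_bytes = 1e12 * 2
llm_ratio = llm_training_bytes / llm_capacity_bytes

# Hypothetical human: a lifetime of raw sensory input (~10^15 bytes is
# a common rough estimate) against ~10^14 synapses at ~1 byte each.
human_input_bytes = 1e15
human_capacity_bytes = 1e14
human_ratio = human_input_bytes / human_capacity_bytes

print(f"LLM   input/capacity ratio: {llm_ratio:.0f}x")
print(f"Human input/capacity ratio: {human_ratio:.0f}x")
```

Under these particular assumptions the two ratios land within an order of magnitude of each other, which is why the comparison is a moot point: the interesting question is the ratio, and the ratio is sensitive to numbers nobody knows precisely.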
They can attempt to mimic the results for small instances of the problem, where there are a lot of worked examples in the dataset, but they will never ever be able to generalize and actually give the correct output for arbitrary sized instances of the problem. Not with current architectures. Some algorithms simply can't be expressed as a fixed-size matrix multiplication.
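One concrete example of such an algorithm (my illustration, not from the thread): checking whether a bracket string is balanced. It needs a counter that can grow without bound, i.e. a number of sequential steps proportional to the input length, whereas a fixed stack of matrix multiplications does a constant amount of sequential computation per forward pass, so it can memorize small cases but can't express the loop for arbitrary n:

```python
# Balanced-bracket check: requires O(n) sequential steps and an
# unbounded counter, which a fixed-size, fixed-depth computation
# cannot provide for arbitrary input lengths.

def balanced(s: str) -> bool:
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closed a bracket that was never opened
                return False
    return depth == 0

print(balanced("(()())"))                      # True
print(balanced("(" * 10_000 + ")" * 10_000))   # True, at any size
print(balanced("())("))                        # False
```

A model with enough memorized small examples can fake the first case; it's the 10,000-deep case (and the 10-million-deep one) that separates pattern matching from actually running the algorithm.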
Tell Boston Dynamics how to do that.
Mice inherited their brains from their ancestors. You might think you don't need a working brain to reason about maths, but that's only because you don't know how thinking works; it's an argument from ignorance.
People argue that humans have had the equivalent of training a frontier LLM for billions of years.
But training a frontier LLM involves taking multiple petabytes of data, effectively all of recorded human knowledge and experience: every book ever written, every scientific publication, all of known maths and science, encyclopedias, podcasts, etc. And then training on that for millions of years' worth of GPU-core time.
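The "millions of years of GPU-core time" figure is easy to sanity-check with rough arithmetic. The fleet size, run length, and core count below are illustrative assumptions, not disclosed figures for any real training run:

```python
# Rough sanity check of the GPU-time claim; all inputs are assumptions.

gpus = 100_000          # accelerators in a hypothetical frontier run
days = 90               # wall-clock duration of the run
gpu_years = gpus * days / 365
print(f"{gpu_years:,.0f} device-years")   # tens of thousands of device-years

cores_per_gpu = 16_000  # CUDA-core count per device, order of magnitude
core_years = gpu_years * cores_per_gpu
print(f"{core_years:.2e} core-years")     # hundreds of millions of core-years
```

So counted per device it's tens of thousands of years, and counted per core (as the comment does) it genuinely reaches hundreds of millions of years, under these assumptions.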
You cannot possibly equate human evolution with LLM training, it's ridiculous.
Our "training" time didn't involve any books, maths, science, or reading; 99.9999% of it was spent just in the physical world. So you can quite rationally argue that our brain's ability to learn without that kind of training is radically better and more efficient than the training we do for LLMs.
Us running around in the jungle wasn't training our brain to write poetry or compose music.
Were mammals the first thing? No. Earth was a ball of ice for a billion years - all life at that point existed solely around thermal vents at the bottom of the oceans... that's inside of you, too.
Evolution doesn't forget. Everything that all life has ever been "taught" (violently had programmed into it over incredible timescales), everything that has ever been learned in the chain of DNA from single cells to human beings: it's ALL still there.