LLMs are way past us at languages, for instance. Calculators passed us at calculating, etc.
A calculator is extremely useful, but it is not intelligent.
A computer is extremely useful, but it is not intelligent.
Airplanes don't flap their wings, but they're damn sure useful, and also not intelligent.
If LLMs cannot learn to beat not-especially-difficult games better than young teens can, they are not intelligent.
They are extremely useful. But they are not AGI.
Words matter.
I agree, with unresolved questions. Does it count if the LLM writes code which trains a neural network to play the game, and that neural network plays the game better than people do? Does that only count if the LLM tries that solution without a human prompting it to do so?
The idea that we haven't taught LLMs to come up with new answers... That doesn't even sound plausible. Just crank up the temperature, and an LLM will throw out so many ideas you'll exhaust yourself trying to sort through them.
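(A minimal sketch of what "cranking up the temperature" means mechanically, in case it helps: the logits are divided by T before the softmax, so a higher T flattens the distribution and the samples get more varied. The logit values below are made up for illustration.)

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0):
    """Pick a token index from logits after temperature scaling.

    Higher temperature flattens the distribution (more varied picks);
    lower temperature sharpens it (near-greedy picks).
    """
    scaled = [l / temperature for l in logits]
    # Softmax with max subtraction for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

# Made-up logits for four candidate tokens.
logits = [2.0, 1.0, 0.5, 0.1]
print(sample_with_temperature(logits, temperature=0.2))  # almost always index 0
print(sample_with_temperature(logits, temperature=2.0))  # much more varied
```

Generating more varied ideas is the easy half; the sorting-through is the hard half, which is what the list below gets at.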
So what haven't we taught LLMs?
- Have we not taught them to "filter"? We just haven't equipped them with experience and intuition, because we only feed them either "absolute fakes" or "verified facts." We don't feed them the actual path of problem-solving and research; those datasets simply don't exist.
- Have we not taught them to "double-check"? They are already excellent at verifying the credibility of our work.
- Have we not taught them to "defend" their ideas? They can justify ironclad logic and spot potentially "flaky" logic better than any human.
- Have we not taught them to "publish" and "present to the scientific community"? It's just that the previous steps aren't fully polished yet.
And if you look at the question of "creating completely new ideas" from this angle and at this level of detail... To me personally, it doesn't seem at all like LLMs are incapable of this kind of work.
We simply haven't taught them how to do it yet, purely because we don't have a sufficient volume of the right training materials.
Just to drive that thought further.
What are you suggesting? Should we rename it? To me, the fundamental question is this:
Do we still have tasks that humans can do better than AIs?
I like the question. I think another good test is "make money". There are humans who can generate money from just a laptop; I don't think an AI would come out net positive.
I've tried to create a Polymarket trading bot with Opus 4.6. The ideas were full of logical fallacies and many, many mistakes.
But I'm also not sure how they would compare against an average human with no statistics background.
I think we really need to establish whether by AGI we mean better than the average human or better than the best human.
The "things that currently make money" definition is interesting. Bc they are the things that automation can't currently do, because could be automated, then price would tend to 0 and and couldn't make money at it.
Let's be honest: we are giving LLMs and humans the exact same tasks, but are we putting them on a level playing field? Specifically, do they have access to the same resources and behavioral strategies?
- LLMs don't have spatial reasoning.
- LLMs don't have a lifetime of video game experience starting from childhood.
- LLMs don't have working memory or the ability to actually "memorize" key parameters on the fly.
- LLMs don't have an internal "world model" (one that actively adapts to real-world context and the actual process of playing a game).
... I could go on, but I've outlined the core requirements for beating these tests above.
So, are we putting LLMs and humans in the same position? My answer is "no." We give them the same tasks, but their approach to solving them—let alone their available resources—is fundamentally different. Even Einstein wouldn't necessarily pass these tests on the first try. He’d first have to figure out how to use a keyboard, and then frantically start "building up new experience."
P.S. To quickly address the idea that LLMs and calculators are just "useful tools" that will never become AGI: I have some bad news there too. We differ from calculators architecturally; we run on entirely different "processors." But with LLMs, we share the same basic architecture: a neural network that processes information and makes decisions. This means our only real advantage over them is our baseline configuration and the list of "tools" connected to our neural network (senses, motor functions, etc.). To me, this means LLMs face no fundamental "architectural" roadblocks. We just have a head start, and their speed of evolution is significantly faster.