Chess was a game for humans.
It was very briefly a game for humans and machines (Kasparov had a go at getting "Advanced Chess" off the ground as a competitive sport), but soon enough having a human in the team made the program worse.
But at least the evaluation functions were designed by humans, right? That lasted a remarkably long time: first Stockfish became the strongest engine in the world by using distributed hyperparameter search to tweak its piece-square tables, then AlphaZero came along and used a policy/value network plus MCTS instead of alpha-beta search, then (with an assist from the Shogi community) Stockfish struck back with a completely learned evaluation function via NNUE.
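For a sense of what a hand-designed evaluation function looks like, here's a minimal sketch in Python: material values plus a piece-square table. The knight table below is the classic "simplified evaluation" teaching example, not Stockfish's actual tuned values, and the position encoding is invented for the sketch.

```python
# Illustrative material values in centipawns.
PIECE_VALUE = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900, "K": 0}

# Bonus for a white knight on each of the 64 squares (rank 8 first);
# central squares score higher, rim squares lower. Toy numbers.
KNIGHT_PST = [
    -50, -40, -30, -30, -30, -30, -40, -50,
    -40, -20,   0,   0,   0,   0, -20, -40,
    -30,   0,  10,  15,  15,  10,   0, -30,
    -30,   5,  15,  20,  20,  15,   5, -30,
    -30,   0,  15,  20,  20,  15,   0, -30,
    -30,   5,  10,  15,  15,  10,   5, -30,
    -40, -20,   0,   5,   5,   0, -20, -40,
    -50, -40, -30, -30, -30, -30, -40, -50,
]

def evaluate(position):
    """position: list of (piece_letter, square 0-63, is_white).
    Returns a score in centipawns, positive favouring White."""
    score = 0
    for piece, sq, is_white in position:
        v = PIECE_VALUE[piece]
        if piece == "N":
            # Mirror the table vertically for Black.
            v += KNIGHT_PST[sq if is_white else 63 - sq]
        score += v if is_white else -v
    return score
```

These are exactly the kinds of numbers distributed tuning frameworks twiddled before NNUE replaced the whole table with a learned function.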
So the last frontier of human expertise in chess is search heuristics, and that's going to fall too: https://arxiv.org/abs/2402.04494.
The common theme in all of this is that the techniques we used before are, fundamentally, hacks to get around _not having enough compute_: inductive biases that make the system worse once you no longer have to make that tradeoff. Empirical evidence suggests that raw scaling has a long way to run yet.
AI greatly reminds me of the Library of Babel thought experiment. If we can imagine a library with every book that could possibly be written in any language, would it contain all human knowledge, lost in a sea of noise? Is there merit or value in creating a system that sifts through such a library to surface hidden truths, or are we dooming ourselves to finding meaning in nothingness?
In a certain sense, there's immense value to developing concepts and ideas through intuition and thought. In another sense, a rose by any other name smells just as sweet; if an AI creates a perpetual motion device before a human does, that's not nothing. I don't expect AI to speed past human capability like some people do, but it's certainly displaced a lot of traditional computer-vision and text generation applications.
The work such a system would be required to do to find those "hidden truths" is equivalent to re-deriving those truths from scratch.
Similar argument: an image is just a number; if you take e.g. a 800x600 24bpp picture, that's a number 1 440 000 bytes long; you could hypothetically start from 0 and generate every 1 440 000-byte number, thus generating every possible 800x600 24bit image. In that set, you'd find every historical event, photographed at every moment from every angle, and even photos of every fragment of every book from the Library of Babel. But good luck finding anything particular in there.
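A back-of-the-envelope check of those numbers, using nothing beyond the 800x600, 24 bpp framing above (we only count the images rather than materialize the astronomically large total):

```python
import math

width, height, bytes_per_pixel = 800, 600, 3   # 24 bpp = 3 bytes per pixel
num_bytes = width * height * bytes_per_pixel   # 1_440_000 bytes per image
num_bits = num_bytes * 8

# The number of distinct images is 2**num_bits. Rather than build that
# integer, compute how many decimal digits it has: floor(bits*log10(2)) + 1.
decimal_digits = math.floor(num_bits * math.log10(2)) + 1
print(num_bytes, decimal_digits)
```

The count comes out to a number roughly 3.5 million decimal digits long, which is the "good luck finding anything" part in quantitative form.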
Similar argument 2: any movie or song is contained somewhere within the decimal expansion of the number Pi (assuming Pi is normal, which is widely believed but unproven). But again, that's worthless unless you know how to find such works, which basically requires you to have them in the first place.
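A small sketch of the point, computing Pi's digits with Machin's formula in scaled integer arithmetic and then "finding" a pattern in them. The pattern searched for here is the famous run of six 9s (the "Feynman point") at decimal place 762; note that to locate any given work this way you must already possess its bytes, which is the whole problem.

```python
def pi_digits(n):
    """First n decimal digits of Pi (including the leading 3),
    via Machin's formula with scaled integers and 10 guard digits."""
    scale = 10 ** (n + 10)

    def arctan_inv(x):
        # arctan(1/x) * scale = sum_k (-1)^k / ((2k+1) * x^(2k+1)) * scale
        total, power, k = 0, scale // x, 0
        while power:
            term = power // (2 * k + 1)
            total += term if k % 2 == 0 else -term
            power //= x * x
            k += 1
        return total

    pi = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    return str(pi)[:n]

digits = pi_digits(1000)
# Index 0 is the leading '3', so string index k == decimal place k.
pos = digits.find("999999")
print(pos)
```

Finding a six-digit pattern took about a thousand digits; the expected search depth grows exponentially with the length of what you're looking for, which is why the library is useless without the book.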
Maybe. The statistical models are definitely better at natural language processing now, but they still fail on analytical tasks.
Of course, human brains are statistical models, so there's an existence proof that a sufficiently large statistical model is, well, sufficient. But that doesn't mean that you couldn't do better with an intelligently designed co-processor. Even humans do better with a pocket calculator, or even a sheet of paper, than they do with their unaided brains.
Edit: btw, same for probabilistic inference, same for logical inference, and same for every other thing anyone's tried as the one true path to AI since the 1950s. Humans have consistently proven bad at everything computers are good at, and that tells us nothing about why humans are good at anything (if, indeed, we are). Let's not assume too much about brains until we find the blueprint, eh?
That depends on what you mean by being "bad at statistics." What brains do on a conscious level is very different from what they do at a neurobiological level. Brains are "bad at statistics" on the conscious level, but at the level of neurobiology that's all they do.
As an analogy, consider a professional tennis or baseball player. At the neurobiological level those people are extremely good at finding solutions to kinematic equations, but that doesn't mean that they would ace a physics test.
If CPUs are made of silicon, why are they so bad at simulating semiconductors? Or why are CPUs so bad at emulating other CPUs?
If JavaScript runs on a CPU, why is it so bad at doing bitwise stuff?
Etc.
What the runtime is made of is entirely separate from what's running on it. The same goes for the human brain (substrate) and human consciousness (software), or humans (substrate), bureaucracy (runtime), and corporations (software).
Being good at statistics is more about having a knowledge graph of understood concepts than about being a statistical model, I think.
Just like understanding a car engine.
There are only two problems with this: One, statistical machine learning systems have an extremely limited ability to encode expert knowledge. The language of continuous functions is alien to most humans and it's very difficult to encode one's intuitive, common sense knowledge into a system using that language [1]. That's what I mean when I say "sour grapes". Statistical machine learning folks can't use expert knowledge very well, so they pretend it's not needed.
Two, all the loud successes of statistical machine learning in the last couple of decades are closely tied to minutely specialised neural net architectures: CNNs for image classification, LSTMs for translation, Transformers for language, diffusion models and GANs for image generation. If that's not encoding knowledge of a domain, what is?
Three, because of course three, despite point number two, performance keeps increasing only as data and compute increases. That's because the minutely specialised architectures in point number two are inefficient as all hell; the result of not having a good way to encode expert knowledge. Statistical machine learning folk make a virtue out of necessity and pretend that only being able to increase performance by increasing resources is some kind of achievement, whereas it's exactly the opposite: it is a clear demonstration that the capabilities of systems are not improving [2]. If capabilities were improving, we should see the number of examples required to train a state-of-the-art system either staying the same, or going down. Well, it ain't.
Of course the neural net [community] will complain that their systems have reached heights never before seen in classical AI, but that's an argument that can only be sustained by the ignorance of the continued progress in all the classical AI subjects such as planning and scheduling, SAT solving, verification, automated theorem proving and so on.
For example, and since planning is high on my priorities these days, see this video where the latest achievements in planning are discussed (from 2017).
https://youtu.be/g3lc8BxTPiU?si=LjoFITSI5sfRFjZI
See particularly around this point where he starts talking about the Rollout IW(1) symbolic planning algorithm that plays Atari from screen pixels with performance comparable to Deep-RL; except it does so online (i.e. no training, just reasoning on the fly):
https://youtu.be/g3lc8BxTPiU?si=33XSM6yK9hOlZJnf&t=1387
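For the curious, the core of IW(1) fits in a few lines: breadth-first search that prunes any generated state which doesn't make at least one atom true for the first time anywhere in the search. Here's a toy sketch; the counter domain and the atom encoding are invented for illustration and are not from the talk.

```python
from collections import deque

def iw1_plan(initial, goal_atom, successors):
    """BFS with IW(1) novelty pruning: a state is kept only if it
    makes at least one atom true for the first time in the search."""
    seen_atoms = set(initial)
    queue = deque([(initial, [])])
    while queue:
        state, plan = queue.popleft()
        if goal_atom in state:
            return plan
        for action, nxt in successors(state):
            novel = set(nxt) - seen_atoms
            if novel:                      # prune states with no novel atom
                seen_atoms |= novel
                queue.append((nxt, plan + [action]))
    return None

# Hypothetical toy domain: a counter at positions 0..5, one atom ("at", i).
def succ(state):
    (_, i), = state
    for action, j in (("inc", i + 1), ("dec", i - 1)):
        if 0 <= j <= 5:
            yield action, frozenset({("at", j)})

plan = iw1_plan(frozenset({("at", 0)}), ("at", 3), succ)
print(plan)
```

No training loop anywhere: the pruning rule alone keeps the search tractable, which is what "reasoning on the fly" means above.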
Bitter lesson my sweet little ass.
____________
[1] Gotta find where this paper was but none other than Vladimir Vapnik basically demonstrated this by trying the maddest experiment I've ever seen in machine learning: using poetry to improve a vision classifier. It didn't work. He's spent the last 20 years trying to find a good way to encode human knowledge into continuous functions. It doesn't work.
[2] In particular their capability for inductive generalisation which remains absolutely crap.
The main paper: https://gwern.net/doc/reinforcement-learning/exploration/act...
It sounds kinda crazy (is there really that much far transfer?), but you know, I think it would work... He just needed to use LLMs instead: https://arxiv.org/abs/2309.10668#deepmind
If I remember correctly, Vapnik's point is: we know that Big Data Deep Learning works; now, try to do the same thing with small data. Very much like my point that the capabilities of models are not improving, only the scale is.
In other words: machine-learned models are octopus brains (https://www.scientificamerican.com/article/the-mind-of-an-oc...) and that creeps you out. Fair enough, it creeps me out too, and we should honour our emotions (I'm no rationalist), but we should also be aware of the risks of confusing our emotional responses with reality.
Transformers and diffusion models for vision and image generation are really odd examples here. None of those architectures or training processes were designed with vision in mind, lol. It was what, 3 years after Attention (2017) before the famous ViT paper? CNNs have lost a lot of favor to ViTs, and LSTMs are not the best-performing translators today.
The bitter lesson is that less encoding of "expert" knowledge results in better performance and this has absolutely held up. The "encoding of knowledge" you call these architectures is nowhere near that of the GOFAI kind and even more than that, less biased NN architectures seem to be winning out.
>That's because the minutely specialised architectures in point number two are inefficient as all hell; the result of not having a good way to encode expert knowledge.
Inefficient is a whole lot better than can't even play the game, the story of GOFAI for the last few decades.
>If capabilities were improving, we should see the number of examples required to train a state-of-the-art system either staying the same, or going down. Well, it ain't.
The capabilities of models are certainly increasing. Even your example is blatantly wrong. Do you realize how much more data and compute it would take to train a vanilla RNN to, say, GPT-3-level performance?
See e.g. my link above where GOFAI plays the game (Atari) very well indeed.
Also see Watson winning Jeopardy (a hybrid system, but mainly GOFAI - using frames and Prolog for knowledge extraction, encoding and retrieval).
And Deep Blue beating Kasparov. And MCTS still the SOTA search algo in Go etc.
And EURISKO playing Traveller as above.
And Pluribus playing Poker with expert game-playing knowledge.
And the recent neuro-symbolic DeepMind thingy that solves geometry problems from the maths olympiad.
etc. etc. [Gonna stop editing and adding more as they come to my mind here.]
And that's just playing games. As I say in my comment above, planning and scheduling, SAT, constraints, verification, theorem proving: those are still dominated by classical systems, and neural nets suck at them. Ask Yann LeCun: "Machine learning sucks". He means it sucks at all the things that classical AI does best, and he means he wants to do them with neural nets, and of course he'll fail.
This is evidence _for_ the Bitter Lesson, not against it.