> So one fundamental difference is that AGI would not need some absurdly massive data dump to become intelligent.
The first 22 years of life for a “western professional adult” are literally dedicated to a giant bootstrapping info dump.
The zero-training version not only ended up dramatically outperforming the 'expert' version, it reached higher levels of competence exponentially faster. And that should be entirely expected. There were obviously tremendous flaws in our understanding of the game, and training on those flaws resulted in software that seemingly handicapped itself permanently.
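The contrast can be shown with a toy sketch (this is an illustration, not AlphaZero itself; the game, agent names, and the "flawed expert" are all stand-ins I've invented). A tabular Q-learning agent learns the 21-stone Nim game purely by self-play from the rules, while an imitation agent copies an expert whose theory of the game is wrong. The self-play agent, starting from zero, discovers the winning strategy the expert never knew:

```python
import random

N, ACTIONS = 21, (1, 2, 3)  # 21 stones, take 1-3 per turn, taking the last stone wins

def learn_by_self_play(episodes=20000, seed=0):
    """Tabular Q-learning from zero knowledge: the agent only ever
    sees the rules of the game, and learns by playing itself."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(1, N + 1) for a in ACTIONS if a <= s}
    alpha, eps = 0.5, 0.2
    for _ in range(episodes):
        s = N
        while s > 0:
            legal = [a for a in ACTIONS if a <= s]
            a = rng.choice(legal) if rng.random() < eps else max(legal, key=lambda x: Q[(s, x)])
            s2 = s - a
            if s2 == 0:
                target = 1.0  # we took the last stone: win
            else:
                # the opponent is our mirror: their best outcome is our worst
                target = -max(Q[(s2, b)] for b in ACTIONS if b <= s2)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2
    return lambda s: max((a for a in ACTIONS if a <= s), key=lambda x: Q[(s, x)])

def flawed_expert(s):
    """An 'expert' trained on a wrong theory of the game: always take one.
    An imitation learner capped by this expert inherits the flaw."""
    return 1

def play(first, second):
    """Play one game; return 0 if `first` wins, 1 if `second` wins."""
    s, players, turn = N, (first, second), 0
    while True:
        s -= players[turn](s)
        if s == 0:
            return turn
        turn = 1 - turn

agent = learn_by_self_play()
print(play(agent, flawed_expert))  # expected: 0 (the self-play agent wins)
```

The point the toy makes: the self-play agent converges on the optimal "leave a multiple of 4" strategy that no expert told it about, while the expert-trained policy is permanently bounded by the flaws in its training data.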
Minimal expert training also has other benefits. The obvious one is that you don't require anywhere near as much material, and it also makes it easier to verify you're on the right track. Seeing software 'invent' fundamental arithmetic is somewhat easier to verify and follow than having it produce a hundred-page proof advancing, in a novel way, some esoteric edge of mathematical theory. Presumably it would also require orders of magnitude less operational time to achieve such breakthroughs, especially given the reduction in preexisting state.
The moment after birth, the human agent starts a massive information-gathering process - one that no other system really expects much coherent output from - for 5-10 years. Call it a “data dump”: some of that data is good, and some of it is bad. This in turn leads to biases and to poor thinking models; everything that you described is also applicable to every intelligent system, including humans. So again, you're presupposing that there's some kind of perfect information benchmark that couldn't exist.
When that system comes out of the birth canal, it already has embedded in it millions of years of encoded expectations, predictability systems, and functional capabilities that are going to grow independent of what the environment does (though they will certainly be shaped in their interactions by the environment).
So no matter what, you have a structured system of interaction that must be loaded with previously encoded data (experience, transfer learning, etc.), and it doesn't matter what type of intelligent system you're talking about: there are foundational assumptions at the physical interaction layer that encode all previous time steps of evolution.
Said more simply: a lobster, because of the encoded DNA that created it, will never have the same capabilities as a human, because it is structured to process information completely differently and its actuators don't have the same type and level of granularity as human actuators.
Now assume that you are the lobster compared to a theoretical AGI in sensor-effector combination. Most likely it would be structured entirely differently than you are as a biological thing - but the design itself carries with it an encoding of structural information from all the previous systems that made it possible.
So by your definition you’re describing something that has never been seen in any system and includes a lot of assumptions about how alternative intelligent systems could work - which is fair because I asked your opinion.
The next time you're in the wilds, it's quite amazing to consider that your ancestors - millennia past - would have looked at more or less these exact same wilds, but with so much less knowledge. Yet nonetheless they would discover such knowledge - teaching themselves, and ourselves, to build rockets, put a man on the Moon, unlock the secrets of the atom, and so much more. All from zero.
---
What your example and elaboration focus on is the nature of intelligence, and the difficulty of replicating it. And I agree. This is precisely why we want to avoid making the problem infinitely more difficult, costly, and time consuming by dumping endless amounts of knowledge into the equation.