People want us to be at "Her" levels of AI, but we're at a far earlier stage. We can fake certain aspects of that (using TTS), but blindly trusting an AI to run everything is going to be a big mistake in the short term. And for the inevitability you describe to take place, its predecessors have to work in a way that doesn't scare people and businesses away.
The plowing of money and hype into the current forms of AI (not to mention the gaslighting about their abilities) makes me think the real inevitability is a meltdown in the next 5-10 years which leads to AI-hesitancy on a mass scale.
The problem with your "close to the metal" assertion is that the same claim has been parroted about every iteration of LLMs thus far. They've certainly gotten better (impressively so), but again, it doesn't matter. By their very nature (whether today or ten years from now), they're a big risk at the business level, which is ultimately where the rubber has to hit the road.
So obviously completely full of shit.
> If you can find a way to make the context window on the scale of the human brain, you may be able to mostly mitigate this.
Human brains have a much smaller context window than AIs do. We can't pay attention to the last 128,000 concepts that filtered past our sensory systems — our conscious working memory holds only about seven items at a time.
There's a lot of stuff that we don't yet understand well enough to reproduce with AI, but context length is the wrong criticism for these models.
You're right. What I'm getting at is the overall speed, efficiency, and accuracy with which the human brain stores, retrieves, and processes information.