Considering the sum total of data and computation that goes into creating an intelligent human mind, including the work natural selection did in shaping our innate structure and dispositions, it's not obvious that any conclusions can be drawn from the fact that so much data and compute goes into training these models.
Has this transfer of knowledge from one domain to another actually been demonstrated by these models and learning processes? I know transfer learning is a thing (I have a couple of books on it on my shelf), but it seems far from what you're describing.
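To be concrete, the transfer learning I'm familiar with is the standard fine-tuning recipe, roughly this sketch in PyTorch (the pretrained model, class count, and layer choices here are just placeholders, not anyone's actual setup):

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a model pretrained on the "source" domain (ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the pretrained feature extractor so its knowledge carries over.
for param in model.parameters():
    param.requires_grad = False

# Swap in a fresh head for the "target" domain (say, 10 new classes).
model.fc = nn.Linear(model.fc.in_features, 10)

# Only the new head gets trained on target-domain data.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```

That reuses learned features across related tasks, which is a much narrower claim than knowledge transferring across genuinely different domains.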
The AlphaZero algorithm switched between chess, shogi, and Go pretty easily (the same algorithm and architecture, though each game was trained from scratch). OpenAI may also have been gesturing at this when they titled the GPT-3 paper "Language Models are Few-Shot Learners".
They mention in the demo video that the inspiration for Codex came from GPT-3 users prompting it with examples so it would respond to queries with code. I saw some pretty impressive demos of the original model turning plain-English questions into SQL queries. I'm not sure that counts as switching domains, but it's something?
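For reference, those SQL demos were few-shot prompting rather than fine-tuning: a couple of worked examples in the prompt, then the new question, and the model continues the pattern. Something like this (the table names, schema, and example queries here are all made up for illustration):

```python
# Hypothetical few-shot prompt for English-to-SQL. No weights are updated;
# the "learning" happens entirely in-context from the two examples.
prompt = """Translate English questions into SQL.

Question: How many users signed up last week?
SQL: SELECT COUNT(*) FROM users WHERE signup_date >= DATE('now', '-7 days');

Question: What are the ten most expensive products?
SQL: SELECT name, price FROM products ORDER BY price DESC LIMIT 10;

Question: Which customers placed more than five orders?
SQL:"""

# Sent to a completions endpoint, this would typically yield something like:
# SELECT customer_id FROM orders GROUP BY customer_id HAVING COUNT(*) > 5;
```

Whether that's "switching domains" or just pattern completion over text that happens to include SQL is, I guess, the crux of the disagreement.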