Alternative perspective: the science may not have been ready, so instead we brute-forced the problem by training LLMs. Consider what the overall goal function of LLM training is: predicting tokens that continue a given input in a way that makes sense to humans - in the fully general meaning of that statement.
It's a single training process that gives LLMs the ability to parse plain language - even if riddled with 1337-5p34k, typos, grammar errors, or mixed languages - and extract information from it, or act on it; it's the same single process that makes them equally good at writing code and poetry, at finding bugs in programs, inconsistencies in data, corruptions in images, possibly all at once. It's what makes LLMs good at lying and spotting lies, even if the input is a tree of numbers.
(It's also why "hallucinations" and "prompt injection" are not bugs, but fundamental facets of what makes LLMs useful. They cannot and will not be "fixed", any more than you can "fix" humans to be immune to confabulation and manipulation. It's just the nature of fully general systems.)
All of that, and more, is encoded in this simple goal function: if a human looks at the output, will they say it's okay or nonsense? We just took that and threw a ton of compute at it.