> How do you calculate the "most likely character to appear next", if not by memorizing lots and lots of existing sentences?
Well, that's how languages work, right? Words are just the most common sequences of letters.
But that doesn't mean it's regurgitating parts of sentences it has previously seen, any more than I'm regurgitating when I type this.
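To make that concrete, here's a toy sketch of my own. It's a simple count-based bigram model, which is an assumption for illustration and nowhere near how a real neural language model is built, and the names (`train_bigram`, `next_char_probs`, `can_generate`) are made up. It only stores how often one character follows another, yet it can still produce words that never appeared in its training data.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each character follows another ('^' = start, '$' = end)."""
    counts = defaultdict(Counter)
    for word in corpus:
        padded = "^" + word + "$"
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return counts

def next_char_probs(counts, prev):
    """Turn the raw counts after `prev` into a probability distribution."""
    following = counts.get(prev, Counter())
    total = sum(following.values())
    return {ch: n / total for ch, n in following.items()}

def can_generate(counts, word):
    """True if every character transition in `word` was seen during training."""
    padded = "^" + word + "$"
    return all(counts.get(p, Counter())[n] > 0
               for p, n in zip(padded, padded[1:]))

# Toy training data (hypothetical): only two words.
corpus = ["cat", "man"]
counts = train_bigram(corpus)

print(next_char_probs(counts, "a"))   # {'t': 0.5, 'n': 0.5}
print(can_generate(counts, "can"))    # True, although "can" was never in the corpus
print(can_generate(counts, "dog"))    # False, none of these transitions were seen
```

The model holds no stored sentences or words, just transition statistics, and "can" and "mat" both fall out of it despite never being seen. A real model learns far richer structure than character pairs, but the same distinction between learned statistics and stored text applies.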
Mechanically, it has learnt both the syntax of the language and how concepts relate to each other. So when it starts generating, it produces sentences that are syntactically valid and also make sense conceptually.
That's really different to just combining bits of sentences, and it gives rise to abilities you wouldn't expect from something that just cuts and pastes bits of sentences. For example, few-shot learning (picking up a new task from a handful of examples in the prompt) is mostly driven by its conceptual understanding, and can't be done by something with no way to relate concepts.