LLMs are trained to predict the next token (GenAI produces text) from all the previously seen tokens, based on the data they were trained on. They don't understand anything about spatial relationships like lines, objects, etc. in an image. "Not even the people who built them" - we have no real understanding of how LLMs work yet. Traditional ML theory (classification/regression/clustering) largely does not apply to LLMs' emergent capabilities like coding, arithmetic, and reasoning. No such theory exists today. People are trying.
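To make the "predict the next token from all previous tokens" bit concrete, here's a toy greedy-decoding loop using the Hugging Face transformers library with GPT-2. This is just an illustrative sketch I'm adding (the model choice and greedy sampling are my assumptions, not something from the comment above), but it shows that generation really is just repeated next-token prediction:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Small, well-known causal LM; any autoregressive model works the same way.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("The cat sat on the", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(10):
        # Logits over the whole vocabulary, conditioned on ALL tokens so far.
        logits = model(ids).logits[:, -1, :]
        # Greedy choice for simplicity; real sampling uses temperature/top-p.
        next_id = torch.argmax(logits, dim=-1, keepdim=True)
        # Append the prediction and feed it back in; that's the entire "loop".
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
```

Note there's nothing in that loop that represents lines, objects, or any spatial structure; it's a conditional distribution over token IDs, full stop.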
Yup, it's emergent behavior. This has been going on for a while in ML, I believe. To be fair, we know how brains work, but we don't understand consciousness either.
To be truly fair, we barely know how flatworm and fruit fly brains work… we haven’t the slightest clue how human brains work. Understanding consciousness is a long way off.