Simply memorizing sequences of steps is not how mathematics learning works, otherwise we would not see so much variation in outcomes. Me and Terence Tao on the same exact math training data would not yield two mathematicians of similar skill.
While it's true that memorization of properties, structure, operations and what should be applied when and where is involved, there is a much deeper component of knowing how these all relate to each other. Grasping their fundamental meaning and structure, and some people seem to be wired to be better at thinking about and picking out these subtle mathematical relations using just the description or based off of only a few examples (or be able to at all, where everyone else struggles).
> I think you're wrong. The research on grokking shows that LLMs transition from memorization to generalized circuits
It's worth noting that for composition, key to abstract reasoning, LLMs failed to generalize to out of domain examples on simple synthetic data.
From: https://arxiv.org/abs/2405.15071
> The levels of generalization also vary across reasoning types: when faced with out-of-distribution examples, transformers fail to systematically generalize for composition but succeed for comparison.