Supposedly at the 10T-parameter scale. Literally the next big thing. A bit like what OpenAI tried with GPT-4.5, but Anthropic actually made it work with MoE, reasoning, tool use, RLVR, etc.
It matters because the "g factor" of today's LLMs is at least in part a function of raw scale. Larger models are just smarter - assuming you can handle the training and inference at this increased scale.