I agree that Opus almost definitely isn't anywhere near that big, but AWS throughput might not be a great way to measure model size.
According to OpenRouter, AWS serves the latest Opus and Sonnet at roughly the same speed. It's likely that they simply allocate hardware differently per model.
My understanding is that for a top-K MoE architecture, total model size doesn't really matter for per-token compute: whether you have 10 32GB experts or a thousand, if only 2-3 of them are active per token, your inference workload is identical; only your storage footprint and disk traffic increase.
Which seems consistent with how hungry the industry has lately been for hard drives.
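The routing point above can be sketched with a toy top-K MoE layer (all sizes and names here are hypothetical, purely to illustrate that per-token FLOPs depend on K, the number of active experts, not on the total expert count):

```python
import numpy as np

# Toy top-K MoE routing sketch (illustrative, not any real model's code).
# Per-token compute scales with top_k, not num_experts: raising
# num_experts from 16 to 1000 grows parameter storage, but the forward
# pass still runs only top_k expert MLPs per token.

rng = np.random.default_rng(0)

d_model, d_ff = 64, 256
num_experts, top_k = 16, 2  # hypothetical sizes

# Each expert is a small 2-layer MLP; together they dominate storage.
experts = [
    (rng.standard_normal((d_model, d_ff)) * 0.02,
     rng.standard_normal((d_ff, d_model)) * 0.02)
    for _ in range(num_experts)
]
router = rng.standard_normal((d_model, num_experts)) * 0.02

def moe_forward(x):
    """Route one token vector through its top-K experts only."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]            # pick the K best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                     # softmax over just those K
    out = np.zeros_like(x)
    for w, idx in zip(weights, top):
        w1, w2 = experts[idx]
        out += w * (np.maximum(x @ w1, 0) @ w2)  # only K experts execute
    return out

token = rng.standard_normal(d_model)
y = moe_forward(token)
print(y.shape)  # (64,)
```

The loop body runs exactly `top_k` times regardless of `num_experts`, which is why serving speed alone says little about total parameter count.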