It doesn’t make any sense to think you need the whole server to run one model. It’s much more likely that each server runs 10 instances of the model, one per chip:
1. It doesn’t make sense in terms of architecture. The model is hardwired into one chip. You can’t split one model over 10 identical hardwired chips.
2. It doesn’t add up with their claims of better power efficiency. 2.4 kW for a single model instance would be really bad.
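
Rough numbers for point 2, as a quick sketch. This assumes the 2.4 kW figure is the draw of the whole server and that it splits evenly across the 10 chips. Both are simplifications on my part, and the constant and function names are just made up for illustration:

```python
# Back-of-the-envelope power per model instance.
# Assumptions (not confirmed figures): 2.4 kW is the total server draw,
# and it is shared evenly by 10 chips.
SERVER_POWER_WATTS = 2400
CHIPS_PER_SERVER = 10


def watts_per_instance(server_watts: float, instances: int) -> float:
    """Average power attributable to each model instance."""
    return server_watts / instances


# Reading A: the whole server is needed to run a single model.
whole_server = watts_per_instance(SERVER_POWER_WATTS, 1)

# Reading B: every chip runs its own copy of the model.
per_chip = watts_per_instance(SERVER_POWER_WATTS, CHIPS_PER_SERVER)

print(f"one model per server: {whole_server:.0f} W per instance")
print(f"one model per chip:   {per_chip:.0f} W per instance")
```

Under those assumptions the per-instance figure differs by a factor of 10 between the two readings, which is why only the one-model-per-chip reading seems compatible with an efficiency claim.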