Ah, gotcha, so the fact that its 10,000 chips for one dedicated cluster that makes it large, as opposed to Azure which has an order of magnitude more GPUS but rents many of those out.
High performance on a single task requires simultaneous computation and communication between nodes. If there's high latency between nodes, such as between nodes in different data centers, the communication costs can't be masked by computation.