The GH/GB compute has LPDDR5X: one GPU (GH) or two GPUs (GB) share 480GB of it over NVLink-C2C, in addition to their HBM. It's not bad!
Essentially, the Grace CPU is a memory and IO expander that happens to have a bunch of Arm CPU cores filling in the interior of the die, while the perimeter is all PHYs for LPDDR5X, NVLink, and PCIe.
Sure, but 72x Neoverse V2 (approximately Cortex-X3) is a choice that seems more driven by convenience than by any real need for an AI server to have tons of somewhat slow CPU cores.