Frankly, it sounds to me like they're having severe yield and reliability problems with the TPUv4s that wafer-level testing isn't catching, and that they've binned the flakiest chips for use by outsiders.
A lot of yield issues show up as spontaneous resets/crashes.