But on the Ryzen, VRAM can be allocated entirely dynamically. I saw a review showing excellent full GPU usage during inference on a very large model, with the BIOS VRAM allocation set to the minimum level. So it's not as simple as you describe (I used to think this was the case too).
Beyond that, it seems like the 395 in practice smashes the DGX Spark in inference speed for most models. I haven't seen NVFP4 comparisons yet and would be very interested to see them.