
0 points · blurbleblurble · 5mo ago · 0 comments

But on the Ryzen, VRAM can be allocated entirely dynamically. I saw a review showing excellent full GPU usage during inference on a very large model, with the BIOS VRAM allocation set to the minimum level. So it's not as simple as you describe (I used to think this was the case too).

Beyond that, it seems the 395 in practice smashes the DGX Spark in inference speed for most models. I haven't seen NVFP4 comparisons yet and would be very interested in them.


justincormack · 5mo ago

Yes, you can set it, but in the BIOS, not dynamically as you need it.

I don't think there are any models supporting NVFP4 yet, but we'll probably start seeing them.

blurbleblurble (OP) · 5mo ago

That's what I'm saying: in the review video I saw, they allocated as little memory as possible to the GPU in the BIOS, then used some kind of kernel-level dynamic control.
