
0 points · jurgenburgen · 16d ago · 0 comments

Newer process nodes are not the main avenue of improvement. What those transistors are used for matters more, and it’s plausible that improvements between generations can multiply performance on a specific task. Not all of the improvements are necessarily in the chip itself, either.

E.g. the next gen might have hardware support for lower-bit inference, more memory bandwidth, etc.


spwa4 · 15d ago

You could just give the TLDR: by far the biggest improvement across the different generations of Nvidia chips is calculating faster at half the precision. For Blackwell vs Hopper it was "double performance", by which they mean Blackwell can calculate with NVFP4 at twice the rate Hopper can calculate at FP8. Then go back through the generations all the way until you arrive at FP64, where we started. They even made a slight detour to "FP128".

Decide for yourself if this is a real improvement. You should probably consider that Nvidia did not just ship the new chips, but also demonstrated training a neural net with NVFP4.

It's not the only improvement, but it is by far the biggest.

As for the future: nobody's gotten FP2 to work satisfactorily yet. But hey, maybe at Nvidia's next conference. Then again, even NVFP4 is not actually 4 bits (meaning various parts of the computation don't actually happen at 4 bits), and neither was FP8 (you could use it like that, but people didn't).
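One way to see why a "4-bit" format is not strictly 4 bits: block-scaled FP4 formats store each element as a 4-bit E2M1 value but attach a shared scale factor to every small block of elements, and that scale is kept at higher precision. The sketch below is a rough illustration, not Nvidia's implementation. The function names are made up for this example, the scale is a plain Python float rather than an 8-bit value, and the block sizes used in the usage note (16 elements with an 8-bit scale for NVFP4, 32 elements with an 8-bit scale for MXFP4) reflect my understanding of the published formats.

```python
# Magnitudes representable by a 4-bit E2M1 float
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def quantize_block(values):
    """Quantize one block to E2M1 values plus a single shared scale.

    The scale is chosen so the largest magnitude in the block maps to 6.0,
    the largest E2M1 value. Here it stays a Python float (an assumption made
    for simplicity; real formats store it in 8 bits).
    """
    amax = max(abs(v) for v in values)
    scale = amax / 6.0 if amax > 0 else 1.0
    quantized = []
    for v in values:
        target = abs(v) / scale
        # Round to the nearest representable magnitude.
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        quantized.append(nearest if v >= 0 else -nearest)
    return scale, quantized


def dequantize_block(scale, quantized):
    """Recover approximate values by multiplying by the shared scale."""
    return [scale * q for q in quantized]


def effective_bits_per_element(element_bits, scale_bits, block_size):
    """Storage cost per element once the per-block scale is amortized."""
    return element_bits + scale_bits / block_size
```

Under those assumptions, `effective_bits_per_element(4, 8, 16)` gives 4.5 bits and `effective_bits_per_element(4, 8, 32)` gives 4.25 bits, before counting accumulations that run at higher precision.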
