> Just look at the result Google gets with BERT on V100s vs the result NVIDIA gets with V100s.
These benchmarks measure the combination of hardware+software to solve a problem.
Google and NVIDIA are using the same hardware, but their software implementations are different.
---
The reason mlperf.org exists is to have a meaningful set of relevant practical ML problems that can be used to compare and improve hardware and software for ML.
For any piece of hardware, you can create an ML benchmark that's irrelevant in practice but runs much better on that hardware than on the competition's. That's what we used to have before mlperf.org was a thing.
We shouldn't go back there.