This is obviously a lie. If this were true, every inference provider company would go to zero. I support open source as much as the next guy here, but it's obvious that the local version will be slower or break more often. Like, come on guys. Be real.
To illustrate: the M4 Max's Neural Engine is rated at 38 TOPS (and that's INT8, not FP8). An NVIDIA H100 does roughly 4,000 TFLOPS in FP8 (with sparsity). That's about a 100x gap.
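A quick sanity check on that gap, assuming Apple's advertised 38 TOPS (Neural Engine, INT8) figure and NVIDIA's ~3,958 TFLOPS FP8-with-sparsity spec for the H100:

```python
# Rough throughput ratio between the two advertised peak figures.
# Note these aren't apples-to-apples: different precisions (INT8 vs FP8)
# and the M4 Max GPU adds compute beyond the Neural Engine.
m4_max_tops = 38      # Apple's advertised Neural Engine figure
h100_tops = 3958      # NVIDIA's FP8 w/ sparsity figure
print(round(h100_tops / m4_max_tops))  # ~104x
```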
Prakash, if you're going to bot our replies, at least make it believable.