I don't think this is correct. For inference, the bottleneck is memory bandwidth, so if you can hook up an FPGA with better memory, it has an outside shot at beating GPUs, at least in the short term.
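The bandwidth argument is easy to sanity-check: at batch size 1, every generated token requires streaming the full weight set from memory, so bandwidth alone caps decode speed. Here's a back-of-envelope sketch (the 70B size, 8-bit weights, and ~4.8 TB/s figure are illustrative assumptions, not measured numbers):

```python
# Back-of-envelope: single-stream decode throughput is bounded by how
# fast the weights can be streamed from memory, not by compute.
# All numbers below are assumptions for illustration.

def max_tokens_per_sec(model_params_b: float, bytes_per_param: float,
                       mem_bandwidth_tb_s: float) -> float:
    """Upper bound on batch-1 decode speed: one full pass over the
    weights per generated token."""
    weight_bytes = model_params_b * 1e9 * bytes_per_param
    return mem_bandwidth_tb_s * 1e12 / weight_bytes

# A 70B model with 8-bit weights on ~4.8 TB/s HBM (roughly H200-class):
print(round(max_tokens_per_sec(70, 1.0, 4.8)))  # ~69 tokens/s ceiling
```

So whichever device wins on effective memory bandwidth wins this regime, regardless of FLOPS on paper.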
I mean, I've worked with FPGAs that outperformed H200s on Llama3-class models, and that was a good while ago now.