While FP32 non-tensor FLOPS at least look comparable, FP16/BF16 with tensor cores (nowadays the default for any neural network, including LLMs) at 330 TFLOPS blows the M2 away.
Not on mobile, which is where Apple GPUs come from. FP32 is not necessary for many computations related to graphics; it is just simpler to deal with a single data type.
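A quick way to see why FP16 often suffices for graphics work: a typical fragment-shader computation (here a linear blend of two colors, sketched in numpy rather than actual shader code) ends up quantized to 8-bit channels, and half precision carries roughly three decimal digits, which is comfortably more than 1/255. The specific color values and blend factor below are made up for illustration.

```python
import numpy as np

# Illustrative values, not from any real workload.
a = np.array([0.21, 0.48, 0.93], dtype=np.float32)  # color A, normalized [0, 1]
b = np.array([0.77, 0.12, 0.36], dtype=np.float32)  # color B
t = 0.375                                           # blend factor (exact in FP16)

# Same blend computed in FP32 and in FP16 (numpy rounds each FP16 op
# to half precision, mimicking a half-precision shader ALU).
blend32 = a * (1 - t) + b * t
blend16 = a.astype(np.float16) * np.float16(1 - t) \
        + b.astype(np.float16) * np.float16(t)

# Quantize both results to 8-bit color channels, as a framebuffer would.
q32 = np.round(blend32 * 255).astype(np.uint8)
q16 = np.round(blend16.astype(np.float32) * 255).astype(np.uint8)
print(q32, q16)  # the two quantized results agree
```

After 8-bit quantization the half-precision result is indistinguishable from the FP32 one here, which is why mobile shading languages expose reduced-precision types (GLSL's `mediump`, Metal's `half`) and why "just use FP32 everywhere" is a convenience, not a requirement.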
Are you claiming that people aren't using FP32 on mobile, or are you claiming that people are using FP32 on mobile but could technically have gotten away with FP16?
If it's the latter, it's still correct to say that FP32 is king in mobile graphics.