Once again, it was just more work to pipeline your data. That the same code worked well across both isn't my point. I worked on an engine team supporting both of those platforms, so I know. This kind of approach is pretty rare for desktop and mobile apps, which is what the M1 is used for.
Edit: The exceptions are typical DSP realms, such as video/image/sound processing, rendering packages, and AI, all of which already target GPUs. Note that Final Cut Pro runs faster on Intel setups with traditional (non-integrated) GPUs than on the M1.