But CUDA -> PTX intermediate code has allowed for significantly more flexibility. For crying out loud, the entire machine code (aka SASS) of NVidia GPUs has been cycled out at least 4 times in the past decade (128-bit bundles, changes to instruction formats, acquire/release semantics, etc etc)
It's amazing what backwards compatibility NVidia has achieved in the past 15 years thanks to this architecture. SASS changes so dramatically from generation to generation but the PTX intermediate code has stayed highly competitive.