That's a fun idea. QEMU decodes guest machine code into something very like a compiler IR, optimises it a bit, then emits machine code for the same or another target, JIT-style. So that sort of thing can be built. Apple's Rosetta is functionally similar, and I expect it does the same sort of thing under the hood. Valgrind is another tool built on the same architecture.
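To make the lift / optimise / emit shape concrete, here's a toy sketch in Python. Everything in it is made up for illustration: the two-byte guest encoding, the IR tuples and the textual host mnemonics are all hypothetical, and it has nothing to do with QEMU's actual internals or any real ISA.

```python
# Hypothetical guest ISA: every instruction is 2 bytes, opcode then operand.
GUEST_PUSH, GUEST_ADD, GUEST_MUL = 0x01, 0x02, 0x03


def lift(guest_code: bytes) -> list:
    """Decode guest bytes into a tiny stack-machine IR of (op, arg) tuples."""
    names = {GUEST_PUSH: "push", GUEST_ADD: "add", GUEST_MUL: "mul"}
    ir = []
    for i in range(0, len(guest_code), 2):
        opcode, operand = guest_code[i], guest_code[i + 1]
        ir.append((names[opcode], operand))
    return ir


def optimise(ir: list) -> list:
    """Constant-fold 'push a; push b; add|mul' into a single push."""
    out = []
    for op, arg in ir:
        foldable = (
            op in ("add", "mul")
            and len(out) >= 2
            and out[-1][0] == "push"
            and out[-2][0] == "push"
        )
        if foldable:
            b = out.pop()[1]
            a = out.pop()[1]
            out.append(("push", a + b if op == "add" else a * b))
        else:
            out.append((op, arg))
    return out


def emit(ir: list) -> list:
    """Write the IR back out for a different (also hypothetical) host ISA."""
    host = {"push": "PUSHI {}", "add": "ADD", "mul": "MUL"}
    return [host[op].format(arg) for op, arg in ir]


# Guest program: push 2; push 3; add; push 4; mul
guest = bytes([0x01, 2, 0x01, 3, 0x02, 0, 0x01, 4, 0x03, 0])
translated = emit(optimise(lift(guest)))
print(translated)  # ['PUSHI 20']
```

A real translator has the same three stages, but each is much larger, and the decoder is only as good as your knowledge of the guest encoding.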
It would be a painful reverse-engineering process. The CUDA binary format is sort of like ELF, but with undocumented bonus constraints. You'd either have to reverse the instruction encoding to get SASS, which isn't documented, or try to take it directly to PTX, which is somewhat documented, and then convert that onward.
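The ELF-like part is the easy bit. As a rough sketch, this reads only the standard ELF identification bytes and the e_type / e_machine fields from a blob. The function name `read_elf_ident` is my own, and whether a given CUDA binary passes this check unmodified is an assumption on my part. It says nothing about the undocumented constraints or the instruction encoding, which is where the real work would be.

```python
import struct


def read_elf_ident(data: bytes) -> dict:
    """Read the generic ELF header fields that sit at fixed offsets."""
    if len(data) < 20 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    bits = {1: 32, 2: 64}.get(data[4])   # EI_CLASS
    little_endian = data[5] == 1         # EI_DATA
    fmt = "<HH" if little_endian else ">HH"
    # e_type and e_machine are two 16-bit fields right after e_ident.
    e_type, e_machine = struct.unpack_from(fmt, data, 16)
    return {
        "bits": bits,
        "little_endian": little_endian,
        "type": e_type,
        "machine": e_machine,
    }


# Synthetic header for demonstration: 64-bit, little-endian,
# e_type 2 (ET_EXEC), e_machine 62 (EM_X86_64).
fake_header = (
    b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<HH", 2, 62)
)
print(read_elf_ident(fake_header))
```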
It would be far more difficult than compiling CUDA source directly. I'm not sure anyone would pay for a CUDA->amdgpu conversion tool, and it's hard to imagine AMD making one as part of ROCm.