Did you misunderstand the issue entirely?
The context here is the implementation of one of the inner loops of GMP, a high-performance arbitrary-precision arithmetic library. On RISC-V, that loop has 3x the instruction count it has on competing architectures.
“The compiler” is not relevant: by design, this is code the compiler is not supposed to touch, because it’s unlikely to have the understanding needed to make it as tight and efficient as possible.
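
To make the point concrete, here is a rough sketch in portable C of the kind of loop I mean, assuming the loop in question is a multi-limb addition with carry propagation. The names `limb_t` and `add_n` are made up for illustration; this is not GMP's actual code, which is hand-written assembly. On architectures with a carry flag, the two comparisons per limb can collapse into a single add-with-carry instruction (e.g. `adc` on x86). RISC-V has no flags register, so the carry has to be recovered with explicit compares along these lines:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for illustration; GMP's real type is mp_limb_t. */
typedef uint64_t limb_t;

/* Hypothetical helper: rp[0..n) = ap[0..n) + bp[0..n), returns the carry out.
 * Without a carry flag, each limb needs two adds and two unsigned compares. */
static limb_t add_n(limb_t *rp, const limb_t *ap, const limb_t *bp, size_t n)
{
    limb_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        limb_t a  = ap[i];
        limb_t s  = a + bp[i];   /* first add, may wrap around        */
        limb_t c1 = s < a;       /* 1 if the first add wrapped        */
        limb_t r  = s + carry;   /* add the incoming carry, may wrap  */
        limb_t c2 = r < s;       /* 1 if the second add wrapped       */
        rp[i] = r;
        carry = c1 | c2;         /* both cannot be 1 at the same time */
    }
    return carry;
}
```

The per-limb work here is what matters, not what any particular compiler does with this C.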