It's actually much easier to "virtualize another OS" than to virtualize a hardware platform (and associated OS).
If the guest code matches your host architecture, QEMU can run it natively (given a hardware accelerator such as KVM). If the guest code targets a foreign architecture (e.g., ARM on an Intel host), QEMU has to dynamically recompile it, which hurts performance.
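To make the distinction concrete, here is a minimal sketch of that decision. The function `execution_mode` and its `accel_available` flag are hypothetical names for illustration, not part of QEMU, and the rule is simplified (for example, it ignores 32-bit guests on 64-bit hosts).

```python
import platform

# Hypothetical helper, not a QEMU API: models the native-vs-translated choice.
_ALIASES = {"amd64": "x86_64", "x64": "x86_64", "arm64": "aarch64"}


def execution_mode(guest_arch: str, host_arch: str, accel_available: bool = True) -> str:
    """Return 'native' if guest and host architectures match and an
    accelerator is available; otherwise return 'translated'."""
    guest = _ALIASES.get(guest_arch.lower(), guest_arch.lower())
    host = _ALIASES.get(host_arch.lower(), host_arch.lower())
    if guest == host and accel_available:
        return "native"
    return "translated"


if __name__ == "__main__":
    host = platform.machine()  # architecture name reported by the OS
    for guest in ("x86_64", "aarch64"):
        print(f"{guest} guest on {host} host -> {execution_mode(guest, host)}")
```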
Actually, my research involved simulating a different hardware architecture, and the patch I have is for hardware-assisted virtualization under arbitrary hardware architectures. It does have to recompile the code, but TCG (QEMU's Tiny Code Generator) does a pretty good job of that. The virtualization target can do a lot better by keeping TCG in mind while structuring its binary execution, which may be one thing that they don't try to do in Android.
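One reason the recompilation cost can be tolerable is that TCG caches translated blocks and reuses them, so code that keeps executing the same blocks pays for translation only once. The toy model below sketches that idea only. The `ToyTranslator` class and its two-instruction "ISA" are invented for illustration and bear no relation to QEMU's actual data structures.

```python
from typing import Callable, Dict, List, Tuple

# Toy guest instruction: (opcode, operand). Purely illustrative.
Instr = Tuple[str, int]


class ToyTranslator:
    """Translate each guest block once, then reuse the cached result."""

    def __init__(self, program: Dict[int, List[Instr]]) -> None:
        self.program = program  # block address -> guest instructions
        self.cache: Dict[int, Callable[[int], int]] = {}
        self.translations = 0   # counts how often translation actually ran

    def _translate(self, addr: int) -> Callable[[int], int]:
        self.translations += 1
        ops = list(self.program[addr])

        def run(acc: int) -> int:
            for op, operand in ops:
                if op == "add":
                    acc += operand
                elif op == "mul":
                    acc *= operand
                else:
                    raise ValueError(f"unknown opcode: {op}")
            return acc

        return run

    def execute(self, addr: int, acc: int) -> int:
        block = self.cache.get(addr)
        if block is None:        # cache miss: pay the translation cost
            block = self._translate(addr)
            self.cache[addr] = block
        return block(acc)        # cache hit: run the stored translation


if __name__ == "__main__":
    vm = ToyTranslator({0: [("add", 2), ("mul", 3)]})
    acc = 1
    for _ in range(3):
        acc = vm.execute(0, acc)
    print(acc, vm.translations)
```

In this model, guest code that revisits the same blocks benefits from the cache, while code that constantly produces new blocks forces repeated translation.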