Context: Ghidra is used for reverse engineering binary executables, complementing the usual disassembly view with function decompilation. Each supported architecture has a SLEIGH specification, which provides semantics for parsing and emulating instructions, not unlike the dispatch handlers you would find in interpreters written for console emulators.
Emulator devs have long had extensive test ROMs for popular consoles, but Ghidra only provides CPU emulation, so it can't run them without additional setup. What I did here is bridge the gap: by modifying a console emulator to instead delegate CPU execution to Ghidra, we can now use these same ROMs to validate Ghidra processor modules.
Previously [1], I went with a trace log diffing approach, where any hardware specific behaviour that affected CPU execution was also encoded in trace logs. However, it required writing hardware specific logic, and is still not complete. With the delegation approach, most of this effort is avoided, since it's easier to hook and delegate memory accesses.
I plan on continuing research in this space and generalizing my approaches, since it shows potencial for complementing existing test coverage provided by pcodetest. If a simple architecture like 6502 had a few bugs, who knows how many are in more complex architectures! I wasn't able to find similar attempts (outside of diffing and coverage analysis from trace logs), please let me know if I missed something, and any suggestions for improvements.
[1]: https://github.com/nevesnunes/ghidra-tlcs900h#emulation