The current state is _very_ fast in simulation, to the point where writing a behavioral model of the '181/'182 is uninteresting (there are other things to figure out).
~100 microcode instructions take about 0.1 seconds to run.
I was thinking more of a behavioral model of the whole ALU, just so that the FPGA tools can map it onto a collection of the smaller ALUs built into each slice.
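To make the distinction concrete, here is a minimal sketch of what "behavioral model of the whole ALU" means, written in Python rather than an HDL. The 32-bit width, the `Op` names, and the reduced set of operations are assumptions for illustration only; this is not the '181 function table or the CADR's actual ALU. The point is that each operation is written as one word-wide expression instead of a netlist of 4-bit slices plus lookahead-carry units, which is the form synthesis tools can usually map onto native carry chains.

```python
from enum import Enum

WIDTH = 32                    # assumed word width, for illustration only
MASK = (1 << WIDTH) - 1


class Op(Enum):
    # Hypothetical operation names, not the '181 select codes.
    ADD = "add"
    SUB = "sub"
    AND = "and"
    OR = "or"
    XOR = "xor"


def alu(op, a, b, carry_in=0):
    """Return (result, carry_out) for one WIDTH-bit ALU operation."""
    a &= MASK
    b &= MASK
    if op is Op.ADD:
        total = a + b + carry_in
    elif op is Op.SUB:
        # Two's-complement subtract: a + (not b) + 1.
        # carry_out == 1 means "no borrow".
        total = a + (~b & MASK) + 1
    elif op is Op.AND:
        total = a & b
    elif op is Op.OR:
        total = a | b
    elif op is Op.XOR:
        total = a ^ b
    else:
        raise ValueError(f"unsupported op: {op}")
    # Logic ops never exceed WIDTH bits, so their carry_out is always 0.
    return total & MASK, total >> WIDTH
```

In an HDL the same idea would be a single case statement over the operation select, with the adder written as a plain `+`.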
What clock speed does your latest design synthesize at?
There was already a design of CADR for FPGAs [1] that does synthesize (and boot). I don't know why amszmidt needed to start again from scratch, or whether his design is a modification of the earlier one.
A similar comment applies to lm-3. Maybe it is built on a fork of the previous repo; it is hard to tell.