They were cool. That processor is fairly maligned in my experience, but it had some solid ideas. As an experiment, I once built a CPU in an FPGA that used a construction similar to a cache line to hold its registers. One line (64 bytes) was 'reserved' for 16 registers (8 GP, 4 index, and 4 process-specific: SP, PC, STATUS, MODE); a context switch reloaded the registers as it reloaded the cache. That made context switching a bit faster than a full-on push/pop stream, but not as fast as having dedicated register banks à la SPARC.
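In C terms the layout would look roughly like the sketch below. This is only an illustration, not the original FPGA design: the names `reg_line_t` and `context_switch` are made up here, and the 32-bit register width is simply what 64 bytes divided by 16 registers works out to. The context switch is modeled as one line-sized write-back followed by one line-sized fill.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: 16 registers of 32 bits each, packed into one
 * 64-byte line (8 GP + 4 index + 4 process-specific). */
typedef struct {
    uint32_t gp[8];     /* general-purpose registers */
    uint32_t index[4];  /* index registers */
    uint32_t sp;        /* process-specific: stack pointer */
    uint32_t pc;        /* process-specific: program counter */
    uint32_t status;    /* process-specific: status word */
    uint32_t mode;      /* process-specific: mode word */
} reg_line_t;

/* The whole register file must fit exactly one line. */
static_assert(sizeof(reg_line_t) == 64, "register file must be one 64-byte line");

/* Model of a context switch: write the live line back to the outgoing
 * process's save slot, then fill it from the incoming process's slot.
 * Two block moves replace sixteen individual pushes and pops. */
static void context_switch(reg_line_t *live,
                           reg_line_t *save_to,
                           const reg_line_t *load_from)
{
    memcpy(save_to, live, sizeof *live);
    memcpy(live, load_from, sizeof *live);
}
```

A software copy like this obviously does not capture the timing; the point is only that the whole register set moves as a single line-sized unit.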
These last two posts by Colin, though, really got me thinking about the whole push for a 'trusted computing base' that folks had back in the 90s. Basically the same argument was used then to justify specific hardware as part of the system for implementing the crypto bits. At the time I thought it was overkill, but I can see now how such a system can confine implementation faults to more detectable domains.