Hmm, I guess you're right: if the load changes nothing except its destination register (unlike, say, a load with a postincrement addressing mode), and the delay-slot instruction can neither change the effective address being loaded from before the fault is taken nor depend on the old value, then it wouldn't need any special fault-handling support. I'd never tried to think this through before, but it makes sense. I appreciate it.
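To check my own understanding, here's a rough Python sketch of those conditions as a static check. Everything in it is made up for illustration: `Load`, `Instr`, and `restart_is_trivial` are hypothetical names, the register model is not any real ISA, and I'm assuming "the old value" means the load destination's pre-load contents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """Hypothetical, simplified load (not modeled on any real ISA)."""
    dest: str                   # destination register
    addr_regs: frozenset        # registers used to form the effective address
    updates_addr: bool = False  # True for e.g. a postincrement addressing mode


@dataclass(frozen=True)
class Instr:
    """Hypothetical description of the delay-slot instruction."""
    reads: frozenset = frozenset()   # registers it reads
    writes: frozenset = frozenset()  # registers it writes


def restart_is_trivial(load: Load, slot: Instr) -> bool:
    """Return True if simply re-running the load after a fault looks safe.

    Encodes only the three conditions from the paragraph above.
    """
    # 1. The load must change nothing except its destination register.
    if load.updates_addr:
        return False
    # 2. The delay-slot instruction must not change the effective address.
    if slot.writes & load.addr_regs:
        return False
    # 3. Assumption: "old value" = old contents of the load's destination,
    #    so the delay-slot instruction must not read that register.
    if load.dest in slot.reads:
        return False
    return True


if __name__ == "__main__":
    plain = Load(dest="r1", addr_regs=frozenset({"r2"}))
    postinc = Load(dest="r1", addr_regs=frozenset({"r2"}), updates_addr=True)
    harmless = Instr(reads=frozenset({"r3"}), writes=frozenset({"r4"}))
    clobbers_base = Instr(reads=frozenset({"r2"}), writes=frozenset({"r2"}))

    print(restart_is_trivial(plain, harmless))
    print(restart_is_trivial(postinc, harmless))
    print(restart_is_trivial(plain, clobbers_base))
```

If any of the three checks fails, that's the case where I'd expect the hardware or the fault handler to need extra state to restart correctly.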
As for SH2, ouch! So SH2 got pretty badly screwed by delay slots, eh?