I think they just worded that poorly. I suspect their suggestion is not that you run with the MMU _off_ (doing so would trash your perf anyway, since everything becomes uncacheable!) but rather that you don't need to context switch the page tables, which can lead to some pretty decent performance gains given that you can (on some platforms) avoid TLB flushes. Nowadays, though, I seriously would not consider the page table switch to be a significant cost, since (in ARMv8 anyway) you have ASIDs, so switching tables is just a single msr+isb.
Nope, I meant what I wrote.
You don't need an MMU, because you have no need for virtual memory or similar.
Checking that an integer is within some bounds is a much simpler problem, and in some cases the check can be elided through analysis of the code.
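
To make the bounds-check point concrete, here is a minimal sketch in C, not taken from any real system. The names `heap`, `HEAP_SIZE`, `checked_load`, and `sum_prefix` are all hypothetical. The first function pays for one compare per access; the second shows the kind of case where analysis can prove every index is in range, so a single check hoisted out of the loop is enough.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sandbox: one flat region that untrusted code may touch. */
#define HEAP_SIZE 4096u

static uint8_t heap[HEAP_SIZE];

/* Checked load: one integer compare per access, no page tables involved.
 * Returns false instead of reading outside the sandbox. */
static bool checked_load(uint32_t offset, uint8_t *out) {
    if (offset >= HEAP_SIZE) {
        return false;
    }
    *out = heap[offset];
    return true;
}

/* Elided check: n is clamped once, and the loop condition then proves
 * i < n <= HEAP_SIZE, so no per-access check is needed inside the loop. */
static uint32_t sum_prefix(uint32_t n) {
    if (n > HEAP_SIZE) {
        n = HEAP_SIZE; /* the single hoisted check */
    }
    uint32_t total = 0;
    for (uint32_t i = 0; i < n; i++) {
        total += heap[i];
    }
    return total;
}
```

In a real design the compiler or verifier would do the hoisting automatically; it is written out by hand here only to show what "elided" means.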
Unfortunately, then, you'll need new CPUs. You don't _need_ memory protection, but CPUs today are built on the assumption that you are using it, and have thus stapled many critical memory attributes (such as cacheability and shareability) to it.