I've always viewed RAM as a kind of register cache, necessitated because registers are expensive to build and RAM, though expensive, is cheaper. I've heard registers these days are just a small bit of SRAM, but reaching into my way-back machine to my college days, I seem to remember them being a different kind of memory element.
But RAM and all the caches these days leading up to the registers all require the same routine: fetch from somewhere, store in the register, do the work, then write the result back somewhere (even if the instruction set obfuscates that). If you had enough registers, the fetch and store parts of that work are pretty much gone, turning something like
mov 0xaddressh-1 RegA
mov 0xaddressh-2 RegB
add RegA RegB RegC
mov RegC 0xaddressh-3
into
add 0xReg-1 0xReg-2 0xReg-3
where each mov we do today introduces a cascade down the cache and memory stack (perhaps even dipping into on-disk VM) just to copy a few bytes into a register. And we have to do that 3 times here. The number of adds we could do in the time it takes to do a mov is probably pretty high, but we simply can't do them because we're waiting on bits moving from one place to another.
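To put rough numbers on that, here is a toy cost model in Python. It is only a sketch: the cycle counts (1 cycle for an add, 100 cycles for a mov that goes all the way to main memory) are illustrative assumptions I picked, not measurements, and the function names are made up for this example.

```python
# Toy cost model. Every number here is an illustrative assumption,
# not a measurement of any real CPU.
ASSUMED_CYCLES = {
    "add": 1,           # register-to-register add
    "mem_access": 100,  # a mov that misses the caches and hits main memory
}


def load_store_cost(cycles=ASSUMED_CYCLES):
    """Cost of the four-instruction version: two loads, one add, one store."""
    return 3 * cycles["mem_access"] + cycles["add"]


def register_only_cost(cycles=ASSUMED_CYCLES):
    """Cost of a single add whose operands already live in registers."""
    return cycles["add"]


def adds_per_mov(cycles=ASSUMED_CYCLES):
    """How many adds fit in the time one memory-bound mov takes."""
    return cycles["mem_access"] // cycles["add"]


if __name__ == "__main__":
    print("load/store version:", load_store_cost(), "cycles")
    print("register-only version:", register_only_cost(), "cycles")
    print("adds per mov:", adds_per_mov())
```

Under those assumed numbers the mov-based version costs 301 cycles against 1 for the register-only add. If you instead assume every mov hits a fast cache (say 4 cycles), the gap shrinks to 13 against 1, so the real answer depends heavily on how often you actually miss.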
So suppose money, power, etc. weren't issues and the engineering effort went into a register-only approach: how much faster would that be? (One of the reasons the Von Neumann architecture became "the" way to do things was that registers were considered expensive to build, but what if we didn't care about money?)
I'd bet a general-purpose system built this way would be an order of magnitude faster than anything we have today. But you're right, it would be an enormous resource hog and as expensive as a medium-sized mega yacht.