It's not even that bad. Core i7-12700K with DDR5 gives me ~1 word per second on llama-30b - that is fast enough for real-time chat, with some patience. And things are even better on M1/M2 Macs.
The critical factor seems to be the ability to fit the whole model in RAM (--mlock option in oobabooga). With Apple's RAM prices most M1/M2 owners probably don't have the 32 GB RAM required to fit a 4bit 30B model.