I mean, I am the strongest local LLM advocate you will find. I have my GPU loaded with a model pretty much all day, for recreation and work. My job, my livelihood, involves running local LLMs.
But it's intense, even with a finely tuned, efficient runtime on a strong desktop. Local LLM hosting is not something you want to impose on users unless they are acutely aware of it, or unless it's a full-stack hardware/software platform (like the Google Pixel) where the vendor can "hide" the undesirable effects on system performance.
I think that's a reasonable generalization to make.