What people want is something they can run on their own hardware without sending their queries to some third-party service that is doing who knows what with them.
This is already possible if you're willing to mess around with green code that isn't in system repositories yet and to buy expensive hardware to make it fast, but you can imagine why some people don't have the time or money for that.
I'm waiting for Intel or AMD to realize there would be a line out the door if they'd make a CPU with an iGPU that could use system memory and run these models at even a quarter of the speed of typical discrete GPUs.