You can run (much) smaller LLMs on consumer-grade GPUs, though. A single Nvidia GPU with 8 GB of VRAM is enough to get started with models like Zephyr, Mistral, or Llama 2 in their smallest versions (7B parameters), provided you use a quantized build. But it will be both slower and lower quality than anything OpenAI currently offers.
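For anyone who wants to try it, here's a minimal sketch using llama-cpp-python with a 4-bit GGUF quant of Mistral 7B (the file name below is illustrative; grab any Q4 quant from Hugging Face). A 4-bit 7B model is roughly 4 GB of weights, so it fits comfortably in 8 GB of VRAM:

```python
# pip install llama-cpp-python (built with CUDA support for GPU offload)
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # ~4 GB 4-bit quant; path is illustrative
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,       # context window; raise it if VRAM allows
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```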
It will definitely not be slower. Local inference with a 7B model on a 3090/4090 will outpace GPT-3.5 Turbo and smoke GPT-4 Turbo.
As for me, I’ve got other uses for $45k.