They've made a hardware LLM that reaches over 14k tokens per second (TPS) on Llama 3.1 8B, and you can try it here: https://chatjimmy.ai/
So clearly hardware LLMs are the future, and inference costs will drop drastically. But I know that all the AI labs want to keep up the perception of high prices forever.