I read through some of the sources in your link, but they don't paint as clear a picture as you claim. Yes, the cost of inference appears to be coming down, but so far we don't really know why that is or what the largest contributing factors are. With other costs going up (e.g. the rising cost of training, the cost of inference scaling with number of parameters, and reasoning models using more and more tokens), we can't yet make any certain claims about long-term economic viability. There just isn't enough data yet.
Of the sources in your link, MIRI's "Observations About LLM Inference Pricing" report [0] seems like one of the least biased ones (forgive me if I don't believe everything a16z has to say about the economics of AI).
Some choice quotes from the report:
"Imagine you went to the gas station and the price was $4.00, and you look across the street at another gas station and the price is $40.00 — that’s basically the situation we currently see with LLM inference."
"Overall, LLMs do not appear to be priced like other commodities."
"It's possible that some providers are slightly modifying a model that they are serving for inference, for example by quantizing some of the computation"
"Unfortunately, it is difficult to make strong conclusions about the underlying costs of LLM inference because prices range substantially across providers. The data used in this analysis is narrow, so I recommend against coming to strong conclusions solely on its basis."
Another source you linked, Don't Panic Labs [1], seems to agree with Zitron:
"It is a little unclear as to why the price per token is dropping, and I am still a little worried that the price per token will, at some point, go up."
"According to another graph at Epoch AI, the cost to train a model doubles every eight months. This tends to align with the common wisdom that we are getting a really good deal right now, while everyone is fighting to build market share."
If your inference costs come down due to quantization, that doesn't count, since you're cutting costs by offering a worse service, and there's only so much of that you can do before your customers walk away. If your inference costs come down due to subsidization, that doesn't count either, since that obviously won't last forever. If your inference costs come down but your training costs double every eight months, that poses a significant problem for your business. And if your counterargument is that "training costs won't continue to increase at this rate forever", well, inference costs won't continue to come down at this rate forever, either.
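For a sense of scale, here's a quick back-of-the-envelope sketch that takes the eight-month doubling figure quoted above at face value. The function name and the time horizons are my own picks for illustration, not something from either source, and it says nothing about whether the trend will actually hold.

```python
def training_cost_multiplier(months: float, doubling_period_months: float = 8.0) -> float:
    """Growth factor for a cost that doubles every `doubling_period_months`.

    The 8-month default is the figure quoted from the Don't Panic Labs post;
    treating it as a constant rate is an assumption, not a forecast.
    """
    return 2.0 ** (months / doubling_period_months)


# Illustrative horizons (my choice): 1, 2, and 3 years out.
for months in (12, 24, 36):
    factor = training_cost_multiplier(months)
    print(f"after {months} months: {factor:.1f}x today's training cost")
# after 12 months: 2.8x today's training cost
# after 24 months: 8.0x today's training cost
# after 36 months: 22.6x today's training cost
```

So if that rate held for just two years, a training run would cost about 8x what it does today, and inference savings would have to outrun that for the overall economics to improve.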
From what I can tell, there still isn't enough data to draw a strong conclusion either way.
[0]: https://techgov.intelligence.org/blog/observations-about-llm...
[1]: https://dontpaniclabs.com/blog/post/2025/12/02/the-price-per...