Anthropic told the Department of War (née Defence) that they'd made $5bln total, which is a LOT less than what they're spending.
We'll see what's in OpenAI's IPO later this year, I guess. I'll be very surprised if they're losing less than $100bln a year.
You'll realize real quick it's not profitable. You can't just say things you don't like to hear are unsubstantiated without verifying.
Not to mention subscriptions: $2mm in GPUs being given out for 5 hrs a day at a cost of $200 a month.
I could easily say that everyone who says it's profitable is making unsubstantiated claims lol.
Yes, once you have modeled the problem correctly and you know all the input parameters. This is not that: Session# * tps * 86400 (secs in a day) * 30 days.
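For reference, here is the naive formula from the comment above written out as a minimal sketch. The function names, the `utilization` knob, and every number in the example are hypothetical placeholders, not figures from any provider. With `utilization=1.0` it assumes every session generates tokens every second of the month, which is exactly the kind of input nobody outside these companies knows.

```python
SECONDS_PER_DAY = 86_400


def naive_monthly_tokens(sessions, tps, days=30, utilization=1.0):
    """Naive estimate: sessions * tps * 86400 * days, scaled by utilization.

    utilization is a hypothetical 0..1 factor for how busy the sessions
    actually are; the formula in the thread implicitly uses 1.0.
    """
    return sessions * tps * SECONDS_PER_DAY * days * utilization


def naive_monthly_revenue(sessions, tps, price_per_million, days=30, utilization=1.0):
    """Revenue if every one of those tokens is billed at a flat price per 1M."""
    tokens = naive_monthly_tokens(sessions, tps, days, utilization)
    return tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Made-up inputs: 100 concurrent sessions at 20 tokens/s, $3.00 per 1M tokens.
    for u in (1.0, 0.5, 0.1):
        revenue = naive_monthly_revenue(100, 20, 3.00, utilization=u)
        print(f"utilization={u:.1f}: ${revenue:,.0f} per month")
```

The result scales linearly with every input, so a wrong guess on concurrency or utilization moves the answer by the same factor.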
I don't think there is enough public information to check Anthropic's claims regarding inference profitability. It depends not just on unknown technical factors but also on agreements they have with other companies.
It's possible you could pay off hardware for Kimi 2.6 after maybe 2-3 yrs (by providing low tps / high concurrency), but by then you're out of warranty and have been running your machines full throttle 24/7 for 2-3 years.
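A rough sketch of the payback arithmetic behind that kind of estimate. Everything here is an assumption for illustration: the function name, the flat monthly revenue and opex, and the example numbers. It also ignores hardware degradation and the warranty running out, which is the point of the comment above.

```python
import math


def payback_months(hardware_cost, monthly_revenue, monthly_opex):
    """Months until cumulative margin covers the hardware purchase.

    Returns math.inf when the monthly margin is zero or negative,
    i.e. the hardware never pays for itself.
    """
    margin = monthly_revenue - monthly_opex
    if margin <= 0:
        return math.inf
    return hardware_cost / margin


if __name__ == "__main__":
    # Hypothetical: $360k of hardware, $15k/month revenue, $5k/month power and hosting.
    months = payback_months(360_000, 15_000, 5_000)
    print(f"payback after {months:.0f} months ({months / 12:.1f} years)")
```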
This is why Moonshot attempted to double the price when they released 2.6, but then it got driven down by North American capital subsidies.
Shouldn't we compare the API pricing, where we pay per token? The whole point of local inference is that we don't have any restrictions regarding product use or time limits, so it would only be fair if we compare it to a plan that offers the same. And even that is only a first approximation, because the commercial models are usually much more capable than the open weight models.
And people who don't understand the difference between capex and opex are making uneducated claims. It's not basic math.
Running an inference data center is a mix of variable and fixed costs. The fixed costs are currently in the billions of dollars for pretty much any investment in this space. Many of those fixed costs have (currently) unknown refresh cycles. So, unless you have access to the financial books of these companies, it's currently just speculation whether inference is profitable.
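To make the refresh-cycle point concrete, here is a minimal sketch that spreads a fixed hardware cost evenly over an assumed refresh cycle (straight-line amortization, which is my simplification, not anyone's actual accounting). The function name and all numbers are hypothetical.

```python
def monthly_profit(revenue, variable_cost, capex, refresh_years):
    """Monthly profit after spreading capex evenly over the refresh cycle."""
    amortized_capex = capex / (refresh_years * 12)
    return revenue - variable_cost - amortized_capex


if __name__ == "__main__":
    # Hypothetical: $100M/month revenue, $40M/month variable cost, $2.4B of hardware.
    for years in (2, 3, 4, 6):
        profit = monthly_profit(100e6, 40e6, 2.4e9, years)
        print(f"refresh every {years} yrs: {profit / 1e6:+.1f}M per month")
```

With the same revenue and variable cost, the sign of the result flips depending only on how long the hardware is assumed to last, which is why the answer is speculation without the books.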
Whether the companies as a whole are destined to be profitable, or worth their valuations, is a very different question. The only people who can truly answer that have time machines.
These 1T param models running at <$3.00 per 1mm tokens are certainly not profitable.
As long as the power users are paying per token, everything is good.