surajkumar5050 1mo ago

I think two things are getting conflated in this discussion.

First: marginal inference cost vs total business profitability. It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis, especially given how cheap equivalent open-weight inference has become. Third-party providers are effectively price-discovering the floor for inference.

Second: model lifecycle economics. Training costs are lumpy, front-loaded, and hard to amortize cleanly. Even if inference margins are positive today, the question is whether those margins are sufficient to pay off the training run before the model is obsoleted by the next release. That’s a very different problem than “are they losing money per request”.

Both sides here can be right at the same time: inference can be profitable, while the overall model program is still underwater. Benchmarks and pricing debates don’t really settle that, because they ignore cadence and depreciation.

IMO the interesting question isn’t “are they subsidizing inference?” but “how long does a frontier model need to stay competitive for the economics to close?”
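One way to make that question concrete is a simple payback calculation: given a one-off training cost, a monthly inference revenue, and a gross margin on inference, how many months does the model need to keep earning before the training run is paid off? A minimal sketch follows. The function name and every number in it are made up for illustration, not figures from any lab, and it assumes flat revenue over the model's life.

```python
import math

def months_to_break_even(training_cost, monthly_revenue, gross_margin):
    """Months of inference needed to recover a one-off training cost.

    Assumes revenue and margin stay flat, which is a simplification:
    real revenue ramps up and then decays once a newer model ships.
    """
    monthly_gross_profit = monthly_revenue * gross_margin
    if monthly_gross_profit <= 0:
        return math.inf  # zero or negative margin never pays back
    return training_cost / monthly_gross_profit

# Hypothetical inputs: $1B training run, $200M/month revenue, 50% gross margin.
payback = months_to_break_even(1_000_000_000, 200_000_000, 0.5)
print(payback)  # 10.0
```

In this toy case, inference is comfortably profitable per request, yet the program only closes if the model keeps earning at that rate for 10 months.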

jmalicki 1mo ago

I suspect they're marginally profitable on API cost plans.

But I am more skeptical of the max 20x usage plans. When we're getting used to costs of $200 or $400 per developer per month for aggressive AI-assisted coding, what happens when those costs go up 20x? What is now $5k/yr to keep a Codex and a Claude super busy and do efficient engineering suddenly becomes $100k/yr... Will the costs come down before then? Is the current "vibe-coding renaissance" sustainable in that regime?
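For anyone checking the arithmetic, here is a quick sketch. It assumes "a Codex and a Claude" means two $200/mo plans (so $400/mo per developer); that reading is an assumption, and the helper name is just for illustration.

```python
MONTHS_PER_YEAR = 12

def annual_cost(monthly_price_usd, price_multiplier=1):
    """Yearly spend for one developer at a given monthly price."""
    return monthly_price_usd * MONTHS_PER_YEAR * price_multiplier

# Assumption: two $200/mo plans per developer, i.e. $400/mo in total.
today = annual_cost(400)
after_20x = annual_cost(400, price_multiplier=20)

print(today)      # 4800
print(after_20x)  # 96000
```

So the rounded figures are roughly $5k/yr today and roughly $100k/yr after a 20x increase.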

slopusila 1mo ago

After the models get good enough to replace coders, they will be able to start raising subscription prices again.

jmalicki 1mo ago

At $100k/yr the joke that AI means "actual Indians" starts to make a lot more sense... it is cheaper than the typical US SWE, but more than a lot of global SWEs.

raincole 1mo ago

> the interesting question isn’t “are they subsidizing inference?”

The interesting question is if they are subsidizing the $200/mo plan. That's what is supporting the whole vibecoding/agentic coding thing atm. I don't believe Claude Code would have taken off if it were token-by-token from day 1.

(My baseless bet is that they are, but not by much, and that the price will eventually rise by perhaps 2x but not 10x.)

BosunoB 1mo ago

Dario said this in a podcast somewhere. The models themselves have so far been profitable if you look at their lifetime costs and revenue. Annual profitability just isn't a very good lens for AI companies because costs all land in one year and the revenue all comes in the next. Prolific AI haters like Ed Zitron make this mistake all the time.
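A toy illustration of why the two lenses can disagree. The numbers are invented, not taken from the podcast: each model is trained in one year and earns all its revenue the next, and training spend is assumed to triple each generation.

```python
# Toy numbers, invented for illustration. Model k is trained in year k
# and earns all of its revenue in year k + 1. The 3x growth in training
# spend per generation is an assumption, not a reported figure.
train_cost = [100, 300, 900]
revenue = [150, 450, 1350]

# Per-model view: revenue minus training cost over each model's lifetime.
lifetime_profit = [r - c for r, c in zip(revenue, train_cost)]

# Calendar-year view: costs land in the training year, revenue a year later.
annual_pnl = [0] * (len(train_cost) + 1)
for k, (c, r) in enumerate(zip(train_cost, revenue)):
    annual_pnl[k] -= c
    annual_pnl[k + 1] += r

print(lifetime_profit)  # [50, 150, 450]
print(annual_pnl)       # [-100, -150, -450, 1350]
```

Every model here is profitable over its lifetime, yet each calendar year shows a growing loss for as long as the next, larger run is being funded. The final year only looks good because the toy stops training.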

jmalicki 1mo ago

Do you have a specific reference? I'm curious to see hard data and models.... I think this makes sense, but I haven't figured out how to see the numbers or think about it.

BosunoB 1mo ago

I was able to find the podcast. Question is at 33:30. He doesn't give hard data but he explains his reasoning.

https://youtu.be/mYDSSRS-B5U

lilytweed 1mo ago

In his recent appearance on NYT Dealbook, he definitely made it seem like inference was sustainable, if not flat-out profitable.

https://www.youtube.com/live/FEj7wAjwQIk

rstuart4133 1mo ago

> It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis

There are many places that will not use models running on hardware provided by OpenAI / Anthropic. That is true of my (the Australian) government at all levels. They will only use models running in Australia.

Consequently AWS (and I presume others) will run models supplied by the AI companies for you in their data centres. They won't be doing that at a loss, so the price will cover marginal cost of the compute plus renting the model. I know from devs using and deploying the service that demand outstrips supply. Ergo, I don't think there is much doubt that they are making money from inference.

deaux 1mo ago

> Consequently AWS (and I presume others) will run models supplied by the AI companies for you in their data centres. They won't be doing that at a loss, so the price will cover marginal cost of the compute plus renting the model.

This says absolutely nothing.

Extremely simplified example: let's say Sonnet 4.5 really costs $17/1M output for AWS to run yet it's priced at $15. Anthropic will simply have a contract with AWS that compensates them. That, or AWS is happy to take the loss. You said "they won't be doing that at a loss" but in this case it's not at all out of the question.

Whatever the case, that it costs the same on AWS as directly from Anthropic is not an indicator of unit economics.
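To spell out the example in code: the list price is the only number an outsider sees, and it is the same in both calls below. The $17 cost and $15 price are the hypothetical figures from above; the $3 side payment is another made-up number, and the function is just a sketch.

```python
def margin_per_million(list_price, serving_cost, side_payment=0.0):
    """Host's margin in USD per 1M output tokens.

    side_payment stands in for any compensation that is not visible
    in the public price (a purely hypothetical contract term).
    """
    return list_price + side_payment - serving_cost

print(margin_per_million(15, 17))       # -2.0
print(margin_per_million(15, 17, 3.0))  # 1.0
```

Same public price, opposite sign on the margin, which is why price parity alone tells you nothing about unit economics.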

waffletower 1mo ago

In the case of Anthropic -- they host on AWS, and their models are accessible via AWS APIs as well, so the infrastructure between the two is likely to be considerably shared, particularly as caching configuration and API limitations are near identical between the Anthropic and Bedrock APIs invoking Anthropic models. It is likely a mutually beneficial arrangement which does not necessarily hinder Anthropic revenue.

freakynit 1mo ago

Genuine question: Given Anthropic's current scale and valuation, why not invest in owning data centers in major markets rather than relying on cloud providers?

Is the bottleneck primarily capex, long lead times on power and GPUs, or the strategic risk of locking into fixed infrastructure in such a fast-moving space?

barrell 1mo ago

> It’s very plausible (and increasingly likely) that OpenAI/Anthropic are profitable on a per-token marginal basis

Can you provide some numbers/sources please? Any reporting I’ve seen shows that frontier labs are spending ~2x as much on inference as they are making.

Also, making the same query on a smaller provider (e.g. Mistral) will cost the same amount as on a larger provider (e.g. OpenAI with gpt-5-mini), despite the query taking 10-100x longer on OpenAI.

I can only imagine that is OpenAI subsidizing the spend. GPU time for inference is paid for by the second. Either that or OpenAI hasn’t figured out how to scale, but I find that much less likely.

w10-1 1mo ago

"how long does a frontier model need to stay competitive"

Remember "worse is better". The model doesn't have to be the best; it just has to be mostly good enough, and used by everyone -- i.e., where switching costs would be higher than any increase in quality. Enterprises would still be on Java if the operating costs of native containers weren't so much cheaper.

So it can make sense to be ok with losing money with each training generation initially, particularly when they are being driven by specific use-cases (like coding). To the extent they are specific, there will be more switching costs.
