0 points · ai_fry_ur_brain · 6d ago · 0 comments

Deepseek V4 (not Flash) tripled in price too, by the way (from Deepseek). Get used to this pattern.

This is what you get for relying on the generosity of billionaires. Keep offshoring your thinking ability to a machine and let me know how competitive you are. Hint: you won't be. There's nothing special about being able to use an LLM.

npn · 6d ago

Unlike other providers, Deepseek does promise that they will lower the price when their Huawei cards arrive in a few more months.

flakiness · 6d ago

Give me a link. Cannot wait. One PSA is that they have a 75% discount right now, so it is already cheaper than the full price.

npn · 6d ago

Weird, last time I checked it was right on the pricing page.

But even when it happens I doubt it would be as cheap as it is right now. Enjoy it while it lasts!

ls612 · 6d ago

Anyone can host Deepseek V4 on rented GPUs and sell inference on it. Price will very quickly converge to the marginal cost of inference. This is as close to a pure commodity as it gets in the AI space so competitive market economics will put in work. Same is true for any open-weights model.

ai_fry_ur_brain · OP · 6d ago

You don't understand the costs involved in running inference at scale.

Please go run some numbers. The hardware needed to run Deepseek V4 Flash at 20 tps for a single session is nowhere close to what is required to run it at 50 tps for 5,000 concurrent sessions.

Imagine what it takes to be profitable when running at 150 tps for 30 cents per 1M tokens. You make less than $1k per month, and the hardware required to run that costs $10k a month to rent, with hardly any concurrent-session capability.
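The revenue side of that claim can be checked with a rough sketch. It assumes a single stream generating 150 tokens per second nonstop for a 30-day month, billed at $0.30 per 1M tokens; the full utilization and the 30-day month are assumptions for illustration, not figures from the comment.

```python
# Rough monthly revenue for one always-busy output stream.
# Assumptions (not from the thread): 100% utilization, 30-day month.
SECONDS_PER_MONTH = 60 * 60 * 24 * 30


def monthly_revenue(tokens_per_second: float, price_per_million: float) -> float:
    """Revenue in dollars from selling every generated token for one month."""
    tokens = tokens_per_second * SECONDS_PER_MONTH
    return tokens / 1_000_000 * price_per_million


revenue = monthly_revenue(tokens_per_second=150, price_per_million=0.30)
print(f"${revenue:,.2f} per month")
```

Under these assumptions a single stream earns well under $1k per month, so whether hosting pays off depends almost entirely on how many concurrent sessions the same hardware can serve.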

gpugreg · 6d ago

> Please go run some numbers.

- DeepSeek serves DeepSeek V4 Pro at 27 tps: https://openrouter.ai/deepseek/deepseek-v4-pro

- At 27 tps per user, a B300 GPU will give you around 800 tokens per second (serving 30 users): https://developer-blogs.nvidia.com/wp-content/uploads/2026/0...

- That's 800 * 60 * 60 generated tokens per hour, at a cost of $0.87 per 1M tokens, or $2.50 per hour.

- For input and output tokens, the math is a bit more complicated because we have to make assumptions about their ratio. Using the published values from OpenCode, we get another $2.50 for cached tokens (which are almost free for DeepSeek) and another $3.40 for input tokens (which are a lot cheaper to compute than output tokens), which gives us a total of $8.50 per hour per B300 GPU.

- B300 GPUs can be rented for as low as $3.40 per hour, which is less than $8.50, so hosting DeepSeek V4 Pro is profitable.

You could also host it at fewer tps per user to raise the efficiency and therefore the profit even higher.
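The arithmetic in the list above can be reproduced with a short sketch. All inputs are the figures quoted in this comment (800 tokens per second, $0.87 per 1M output tokens, $2.50 and $3.40 per hour for cached and input tokens, $3.40 per hour rental); the function and variable names are made up for illustration.

```python
# Hourly revenue vs. rental cost for one B300 GPU, using the comment's figures.
def output_revenue_per_hour(tokens_per_second: float, price_per_million: float) -> float:
    """Dollars per hour earned from generated (output) tokens."""
    tokens_per_hour = tokens_per_second * 60 * 60
    return tokens_per_hour / 1_000_000 * price_per_million


output_usd = output_revenue_per_hour(800, 0.87)  # output tokens
cached_usd = 2.50   # cached-token revenue per hour, as quoted above
input_usd = 3.40    # input-token revenue per hour, as quoted above
rental_usd = 3.40   # lowest quoted B300 rental price per hour

total_usd = output_usd + cached_usd + input_usd
margin_usd = total_usd - rental_usd

print(f"output: ${output_usd:.2f}/h, total: ${total_usd:.2f}/h, margin: ${margin_usd:.2f}/h")
```

The three terms sum to about $8.40 per hour, slightly under the rounded $8.50 quoted above, and the conclusion is the same: revenue exceeds the rental price as long as the GPU stays fully loaded.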

ls612 · 6d ago

Even not assuming Blackwell inference, the $3.50/hr price is likely close to the marginal cost. The Deepseek R1 model is a little more than a third of the size of V4 and cost around $1/Mtok to serve at scale, based on Deepseek's blogs last year and Hopper rental prices.

ls612 · 6d ago

Yes, it is more efficient in $/tok to run at scale than to run just for yourself. Everyone selling Deepseek V4 inference is selling an undifferentiated good. They have run the numbers on how much it costs and are competing against a dozen other outfits also selling undifferentiated open-weights tokens. Whatever dollar cost they face to rent those GPUs will be what they are able to charge in the competitive market. That is great for you and me because we can buy tokens at pretty much exactly what it costs to produce them.

ai_fry_ur_brain · OP · 5d ago

They are selling it below cost and training on your tool calling, and potentially all your data. They're selling it cheap to get your data, dumbass.

drob518 · 5d ago

Whoever purchased their RAM last month vs this month has the advantage, I suspect.

barrell · 6d ago

Actually, Deepseek V4 was at a promotional price of 1/3 the full price for the first month or so. This was pretty clearly communicated. The promotion window just ended, is all.

greenchair · 5d ago

Thus proving OP's point.

barrell · 5d ago

If you run out of 50% coupons to your local pizza joint, did they double their prices? Does every company double or triple their prices after Black Friday?

There’s a pretty significant difference between saying someone tripled their prices, and a temporary promotion ended. It’s even more so the case if someone is using it as an example for raising prices as a trend.

I’m 100% in the camp that prices are going up and quality is going down; companies are retiring models and requiring you to use more expensive ones. This has happened to me and there are dozens of examples that one can point to.

But a promotion ending is a strawman argument and does the point a disservice.

breezybottom · 5d ago

Essentially yes. Perpetual "discounts" are common in some industries, like fast fashion, so you could consider that the normal price.

wraptile · 5d ago

> If you run out of 50% coupons to your local pizza joint, did they double their prices?

Yes. Did they double their MSRP? No. They did double their effective price relative to me, which is all that matters unless you're doing economic math or something.

zaptrem · 6d ago

V4-Pro is about 2.4× total params and 1.3× active params of V3.2.

creationcomplex · 6d ago

You're typing as your handwriting and letter-sending abilities deteriorate to dust. Writing down information as your memory capacity decays. Remembering instead of living at the pure leading edge of perception, dulling your reactions.

Smh, it's all downhill from the first unadulterated neuron.

dpoloncsak · 6d ago

Mate, why are you so mad at people upset that the price tripled? It's a fair complaint that people built services using the cheaper ones with the expectation future models would be similarly priced. You can avoid 'offloading thinking' while still building on top of these models.

kuschku · 5d ago

> It's a fair complaint that people built services using the cheaper ones with the expectation future models would be similarly priced

Everyone could see this coming from miles away, everyone warned that this would happen again and again and again, and it always got dismissed.

dpoloncsak · 5d ago

I still think it's reasonable to foresee this possibility and be upset when it comes to fruition.

aurareturn · 6d ago

I think demand is too great and compute is not enough. Nothing to do with billionaires colluding to increase prices by 3x.

boutell · 6d ago

Actually, why should Google collude on pricing? They have deep pockets and could starve out the competition while keeping prices low, if they really wanted.

I think it is priced high because it's basically their smartest model as well as their fastest, so why shouldn't they?

You can still use earlier generations of Flash at a lower cost if you want "fast and cheap and just OK," which often makes sense. (Just checked)

I would predict they will lower this price when 3.5 High appears, but perhaps not all the way.
