https://x.com/deepseek_ai/status/2057854261699195173
Related ongoing thread:
DeepSeek reasonix, DeepSeek native coding agent with high caching and low cost - https://news.ycombinator.com/item?id=48256953 - May 2026 (135 comments)
China is building for the future, while Western Democracies are afraid of the future, and of their own shadow.
Of course, like literally every other time this has played out in computing history, the companies focused on price performance will end up with more economic resources, and get to turn the upgrade crank more often and for longer.
Also, of course, China's way ahead of the US on things like renewables, batteries, and electrification of their economy. All of that feeds into cheaper power to run the models, but I suspect it's a second order effect vs. "improve the software".
The iPhone is the best-selling computing device in history and is among the most expensive in its category.
They're subsidizing this in many ways - Huawei chips, new DDR5 memory fabs, etc.
Ultimately, DeepSeek's architecture is significantly more cost effective than anything from Google, OpenAI, or Anthropic.
Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs...
They need to actually make money, though, so that might still not give them enough room to make enough money.
Ultimately, hardware depreciation is like 80% of total spending. So power is not as big of a deal in cost. The bigger problem is if you can get the power at all, not how expensive it is.
If you want to bring down inference costs, using less hardware is far more effective than getting cheaper electricity.
Google is in a sweet spot, because they aren't paying 80% margins to Nvidia for hardware. So they're probably paying half as much depreciation as everyone else is (or maybe 1/4th for inference - which is now the biggest percentage overall).
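To make the depreciation-vs-power point concrete, here is a rough sketch of the arithmetic. It takes the 80% depreciation share from the comment at face value and assumes, generously to the power argument, that the entire remaining 20% is electricity; the function name and the "halve it" scenarios are purely illustrative.

```python
def relative_total_cost(dep_share: float, dep_scale: float, power_scale: float) -> float:
    """Total cost relative to today's (1.0) after scaling each component.

    dep_share   -- fraction of spend that is hardware depreciation
    dep_scale   -- multiplier applied to the hardware bill (0.5 = half the hardware)
    power_scale -- multiplier applied to the power bill (0.5 = half-price electricity)
    Assumption: everything that is not depreciation is power.
    """
    power_share = 1.0 - dep_share
    return dep_share * dep_scale + power_share * power_scale


DEP_SHARE = 0.80  # the comment's rough figure, not a measured number

half_price_power = relative_total_cost(DEP_SHARE, dep_scale=1.0, power_scale=0.5)
half_the_hardware = relative_total_cost(DEP_SHARE, dep_scale=0.5, power_scale=1.0)

print(f"half-price power:  {half_price_power:.2f} of current cost")
print(f"half the hardware: {half_the_hardware:.2f} of current cost")
```

Under these assumptions, halving the electricity price trims about 10% off the total, while halving the hardware needed trims about 40%.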
The US is subsidizing in exactly the same way through the US CHIPS Act (as well as state level tax subsidies):
> The act includes $39 billion in subsidies for chip manufacturing on U.S. soil along with 25% investment tax credits for costs of manufacturing equipment, and $13 billion for semiconductor research and workforce training
https://en.wikipedia.org/wiki/CHIPS_and_Science_Act
> Presumably, they'll incorporate DeepSeek's MLA* architecture to get all the benefits for next year's releases (if not this year's upcoming releases) which will bring down their costs.
You can be sure the frontier labs all have similar approaches, but they just don't talk about them. That's why e.g. Google Flash (the old versions!) was so cheap.
I mean Google published MTP a month or so ago and it has sped up Qwen models by 1.7 times.
If that is what they still publish you get an idea of what they aren't.
Like there was something in the American DNA that was lacking in China and innovation would always need to happen here.
But China it seems doesn’t need the US to produce great cars, devices, robotics, or AI. We absolutely need China to help us build all of the above.
Looking at Loongson's processors, for instance. About 15 years ago they could barely compete with a Pentium 2. Now they are about 4-5 years behind Intel/AMD, and further behind on some more specific workloads (SSL decoding, for example). Not great, but that is a decent jump. The jumps between generations are pretty decent.
LA464 was a decent enough processor core but had an awful memory controller that held it back as soon as it needed to reach outside of cache. As such it was SLOW.
But they learned the lesson and now the LA664 almost entirely fixed that issue. I think a big part of the performance issues is that they are working with domestic 5 to 7nm processes, so a good 5-7 years behind.
They are launching the LA864 later this year and are touting some decent performance gains. That is just marketing so far but something to keep an eye on.
Considering that these chips are using their own ISA, own designs, domestic manufacturing and they aren't terrible is a big thing.
I suspect in the next 5 years they have the chance of completely closing the gap. But it can also go the other way that they end up stalling as smaller nodes get much more difficult to attain.
In most Americans' eyes, unfortunately, there was. It was just known by the name "American Exceptionalism". Yes, it's nonsense, but unfortunately it is nonsense that has been used by most empires throughout history, and believed just as fervently by said empires' populace, since it's one of the central elements of imperialism as a whole.
There is (was): attracting the best minds around the world to a free and stable society. Trump voters threw it all away because they couldn't stand non-whites coming to America and doing better than old stock Americans.
Because if you're smart and pragmatic, then you know innovation can come from anywhere, but Western elites choose to continually bury their heads in the sand.
There's nothing special about anything we design in the US other than time and money commitment to create it. China did have some espionage of course going on, but the vast majority of shit isn't some secret. And with the US shitting on China with restrictions, we increasingly caused them to invest time and money into things they otherwise would have passively accepted as coming from the west. ASML sees the writing on the wall for themselves in particular.
China can certainly design an inflatable barbecue. China can certainly build an inflatable barbecue. But will the Chinese people ever want and buy an inflatable barbecue? ... never. That is why the US will remain the premier consumer economy.
I have some exposure to utility regulation and from what I can tell some of the AI companies are "good actors" and willing to shoulder some of the burden. But others are pretty adversarial and want a free lunch.
The future is blatantly going to be electric. Between cars, heat pumps, ranges, etc, the quantity of kilowatt hours consumed will rise dramatically per capita because they are replacing burned fossil fuels.
We don't need to subsidize the trillion dollar companies, we can settle for just not cancelling wind and solar projects, and generally updating the grid infrastructure.
A rising tide lifts all boats. If the subsidies go to common infrastructure, that's good for everyone. There's no need to complain about a road being paved because it will benefit FedEx in addition to everyone else.
Not that, there's a cool new frontier to explore.
But that it's a great opportunity to subsidise an industry and watch their slower, fatter competitor go bankrupt trying to keep up.
>But the US did it first
What is sputnik.
Their cost of energy relative to the US matters as much as the speed of buildout.
You might say that US would prefer sovereignty but that's a separate argument vis-a-vis strategic competition with China in particular.
Trillions of Dollars being invested against AI infra would indicate otherwise. US is in fact betting a lot of its economic future on AI.
Yes, countries where compromise is not required, where social, capital and human costs are non-factors and where regulations are bendable at will by who's in power can be more effective at achieving some goals.
who are the decision makers in china?
Is there actually a huge Chinese consumer market for these products? If not then I'm not sure how you ever actually achieve this endpoint. Chinese wages and American wages are not nearly the same thing yet.
> It will simply be absolutely cheaper (including profit margin) to serve tokens in China.
It will simply create more pollution and environmental destruction too.
> China is building for the future
That's the plan. Whether that's true requires an honest analysis.
> while Western Democracies are afraid of the future
Developed nations take fewer risks than undeveloped ones. Do you assume this pitched dichotomy will naturally sustain itself?
> and of their own shadow.
Yea, it's funny what having open and fair elections can do for a country.
Where do we start...
They wanted the division, they're getting it and one side is raping and pillaging the masses.
Meanwhile, the USA is paying for its past excesses, with interest on its debt being the number two most expensive line item in the budget.
https://fiscaldata.treasury.gov/americas-finance-guide/feder...
Article in Fortune: https://archive.is/53Vu0
The formerly "fiscal conservatives" that I know are working overtime explaining how the debt isn't a bad thing and we can just move numbers.
https://www.wired.com/story/super-pac-backed-by-openai-and-p...
"Build American AI, a nonprofit linked to a super PAC bankrolled by executives at OpenAI and Andreessen Horowitz, is funding a campaign to spread pro-AI messaging and stoke fears about China."
In reality Xi has warned of AI bubbles. If China was really pushing it they'd be equal or ahead because so many researchers are Chinese anyway. Instead, China is building real stuff instead of focusing on hot air like a16z ("crypto", "AI", you name it). Maybe China should sponsor that PAC to accelerate the demise of the West.
Blackwell is 10-20x more efficient than H200. Vera Rubin is expected to be several times more efficient than Blackwell.
The US has way more compute installed in Gigawatts because China can’t get enough chips. https://epoch.ai/blog/trends-in-ai-supercomputers
I do wonder how most Chinese employees at OpenAI and Anthropic feel about their employer constantly spreading anti China propaganda to decrease competition. Perhaps money solves almost all things so they go along with it.
Well, yeah. This is a technology that has the potential to make large chunks of the population unemployed.
Chunks of the population that took on debts prior to late 2022 with the understanding that there would be a way to pay those debts back with their labor.
I’m calling it now, the future is indentured servitude.
Selling below cost to capture market share has been the American playbook for the last 20 or more years.
We have exported production to China in many things, we forget that we had dark satanic mills of our own.
The US providers are at capacity limits and are increasing pricing as demand increases.
The Chinese providers are relatively unknown and not even allowed for a lot of applications. They have to cut the price just to be attractive.
After reading comments like this I was expecting (hoping?) that DeepSeek or similar would be cheaper.
However I was surprised that DeepSeek v4 cost about 5.5x GPT-5.4 to solve the problem.
- Deepseek-v4-pro-medium cost $2.47
- GPT-5.4-medium cost $0.45
- GPT-5.5-low was $0.86
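For anyone who wants to check the multiple, here is a quick sketch of the arithmetic using only the three per-task dollar figures quoted above. The dictionary keys are just labels for this example, not official model identifiers.

```python
# Per-task costs in USD, copied from the comment above.
costs = {
    "deepseek-v4-pro-medium": 2.47,
    "gpt-5.4-medium": 0.45,
    "gpt-5.5-low": 0.86,
}


def cost_ratio(model: str, baseline: str) -> float:
    """How many times more expensive `model` was than `baseline` on this task."""
    return costs[model] / costs[baseline]


ratio_vs_gpt54 = cost_ratio("deepseek-v4-pro-medium", "gpt-5.4-medium")
ratio_vs_gpt55 = cost_ratio("deepseek-v4-pro-medium", "gpt-5.5-low")

print(f"vs GPT-5.4 medium: {ratio_vs_gpt54:.1f}x")
print(f"vs GPT-5.5 low:    {ratio_vs_gpt55:.1f}x")
```

This is a single task, so it says more about token usage on that problem than about list prices.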
n.b. I can't use nonlocal models for a big chunk of my work, so there's that as well.
I tried it and it's impressive.
[1]: https://api-docs.deepseek.com/quick_start/agent_integrations...
# After installed (or when run portably with ./ccode)
ccode init-config
ccode edit-config
# Run with default profile
ccode
# Run with named profile
ccode --deepseek
# Set default profile
ccode set-default-profile deepseek
Also turns out that with a local proxy you can get Remote Control working and see the DeepSeek sessions in the desktop app, screenshots on the page. Other than that, I'm happy that it works pretty well and the discount is enough to make me consider going from Anthropic's Max subscription to Pro and using it only where DeepSeek is insufficient. With that proxy I eventually hope to be able to transparently switch models mid-task, if I need Opus for like 5 turns or something.
Overall though I'm not sure exactly how well Claude Code would stack up against OpenCode, since the latter overall feels a bit less hacky with 3rd party models and is even getting niche but nice features like a locally runnable web version: https://opencode.ai/docs/web/
FWIW, this is what I have in my settings.json
"env": {
"ANTHROPIC_AUTH_TOKEN":"sk-nope_not_real",
"ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
"ANTHROPIC_MODEL": "deepseek-v4-flash",
"ANTHROPIC_DEFAULT_OPUS_MODEL": "deepseek-v4-flash",
"ANTHROPIC_DEFAULT_SONNET_MODEL": "deepseek-v4-flash",
"ANTHROPIC_DEFAULT_HAIKU_MODEL": "deepseek-v4-flash",
"CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
"CLAUDE_CODE_EFFORT_LEVEL": "low",
"CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING": "1",
"CLAUDE_CODE_DISABLE_THINKING": "0",
"CLAUDE_CODE_ENABLE_AWAY_SUMMARY": "0",
"CLAUDE_CODE_SUBAGENT_MODEL": "deepseek-v4-flash",
"CLAUDE_CODE_MAX_OUTPUT_TOKENS": "8000",
"CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "4000",
"BASH_MAX_OUTPUT_LENGTH": "20000",
"CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "60",
"CLAUDE_CODE_AUTO_COMPACT_WINDOW": "200000",
"CLAUDE_CODE_DISABLE_GIT_INSTRUCTIONS": "1"
}
I did some back of the envelope calculations and it seems like you would pay $5/month using DeepSeek directly or $15-20 with OpenRouter or similar. But would be interested to hear real world usage.
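If you'd rather not hand-edit the `env` block quoted above, a small script can merge overrides into an existing settings file without clobbering other keys. This is only a sketch: `merge_env` is a made-up helper, the file location is hypothetical (the demo writes to a temporary directory), and only two of the variables from the comment are shown.

```python
import json
import tempfile
from pathlib import Path


def merge_env(settings_path: Path, env_overrides: dict) -> dict:
    """Merge env_overrides into the "env" object of a JSON settings file.

    Existing keys outside "env" and existing env entries are preserved;
    overrides win on conflict. Returns the merged settings.
    """
    settings = json.loads(settings_path.read_text()) if settings_path.exists() else {}
    settings.setdefault("env", {}).update(env_overrides)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings


# Demo against a throwaway file so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"  # hypothetical location
    path.write_text(json.dumps({"env": {"EXISTING": "1"}}))
    result = merge_env(path, {
        "ANTHROPIC_BASE_URL": "https://api.deepseek.com/anthropic",
        "ANTHROPIC_MODEL": "deepseek-v4-flash",
    })
    on_disk = json.loads(path.read_text())

print(json.dumps(result, indent=2))
```

Note that the values stay strings, as in the original block.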
The only model families that really worked were Claude and OpenAI. Surprisingly, for tasks that need speed, GPT 5.4 is very impressive. DeepSeek was very average, performing somewhere in Gemini Flash 3.0 territory.
I've been using DeepSeek v4 with Cline in VS Code as a replacement for GitHub Copilot, and it's not been too bad.
Pi works very well with deepseek though
Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)
It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!
I used DeepSeek, Kimi, GLM, Qwen, and MiMO against GPT-5.5 high as reference, all running in Pi harness without anything installed.
So far, Kimi and MiMO look the most promising to me. I haven’t tested them rigorously enough to make a strong statement, but my first impression is that, in practice, all those models may be less behind on typical daily tasks than people think.
They are a bit "work hard, not smart": getting to same-ish results more slowly and using more tokens, but at a fraction of the price.
Based on these benchmarks, here's a rough mapping:
- Qwen 3.7 ~= GPT 5.3
- Kimi K2.6 ~= GPT 5.15
- DS V4 ~= GPT 5.1
So yes, we have GPT 5 at home now. No need to pay the Legacy Labs anymore.
Here's the benchmark I used since I can't post images here: https://x.com/trydotworks/status/2058004995195490706?s=20
I am looking forward to things slowing down and stabilizing. I'm not saying that should happen today, just I am looking forward to it.
- how do/would you add the WebSearch tool to your harness? pay for a separate service or does deepseek offer something with their subscriptions?
- do pi/opencode support pasting images in prompts?
- how do you handle reading images? deepseek is not multi modal IIRC? do you pay for another model and route to it?
Any of these missing would really annoy me in day to day use...
The chains of thought for DeepSeek are very, very interesting reads. OpenCode won't show them, but do read them and you'll be surprised at how underrated the model is.
My model usage is very low, but I still pay DeepSeek directly and regularly, as a show of gratitude and support for them open-sourcing their models, which I consider a positive for society overall.
I'm not sure if it's when you run out of crypto, or when your bank gets hit by ransomware.
When planning small-to-medium sized changes, I found that it was a little bit faster than GPT-5.5 (high) and produced equivalent results. On large changes its results were fine, but GPT's were more thoroughly thought through. DS v4 beats the absolute pants off GPT when it comes to tone and style, though.
The same model hosted by other providers is much more expensive [0]. So either DeepSeek can host it much cheaper than anyone else, or their business model is different. I suspect the latter, especially since their privacy policy [1] says personal data, including “User Input,” can be used "To improve and develop the Services and to train and improve our technology".
[0]: https://openrouter.ai/deepseek/deepseek-v4-pro/providers
[1]: https://cdn.deepseek.com/policies/en-US/deepseek-privacy-pol...
Inference stack efficiency: Many of these providers take off the shelf sglang / vllm / trtllm and hope for the best. Meanwhile DeepSeek team is known for pushing the boundary of optimizations.
Now, sglang and vllm are great pieces of software, but take DeepSeek's Sparse Attention (DSA). Introduced 1.5 years ago (https://arxiv.org/abs/2512.02556), used by DeepSeek 3.2, GLM 5, DeepSeek V4. Only now is it slowly starting to get optimized in the major inference engines: (https://github.com/sgl-project/sglang/issues/19380 https://github.com/sgl-project/sglang/pull/22851 etc.). Of course, DS V4 adds extra optimizations into the model architecture on top of DSA, and those will take more time to be taken full advantage of by the open source inference engines.
Privacy: Betting that people will pay extra for inference hosted outside China. This is especially true with DeepSeek, because DeepSeek is transparent about using API data for model improvements.
And a few other things (scale (matters a lot for MoEs), reliability, soft enterprise lock-in, etc.)
---
There is also, likely, tacit collusion at play here. Look at GLM 5 and GLM 5.1 prices. GLM 5 and 5.1 cost the same to run, but providers decided to charge much more for 5.1 because it is a much better model, and because Z.AI raised their price as well.
But I agree that the main driver is that they are really good at optimizing. They will have chosen their architecture in such a way that it will be as efficient as possible on their own infrastructure, so they have a massive head start. Inference framework developers still have to catch up.
I'd love to give these models a try, but I'd rather not use a provider that trains on or stores my data (beyond standard legal requirements of course).
But why not? Gaining market share at a loss isn't the US's patent.
Loss leading only works when
- it leads to a situation that allows you to prevent competitors from selling to your customers (gilded age railroad and pipeline industries are great examples). Then you can eventually raise prices and not lose back any market share.
- or when it allows you to remarket to customers and make back the difference (selling a single console at a loss to sell a whole library of high margin video games, or selling jet engines at a loss to lock in 30-year maintenance contracts).
With this, I am sticking to deepseek-v4-pro entirely.
> (2) For all models, the input cache hit price has been reduced to 1/10 of the launch price. This price adjustment takes effect from 2026/4/26 12:15 UTC.
There is no end date. Currently, it's 2% of the input price for DeepSeek V4 Flash and 0.8% with this new V4 Pro pricing, which is extremely low compared to competitors to the point that it affects the unit economics a bit and I thought it would be temporary.
In the case of V4 Pro, the effective cost is ~$0.04/M input tokens given the caching (based on OpenRouter's metrics: https://openrouter.ai/deepseek/deepseek-v4-pro), which is significantly cheaper than even small models from competitors.
DeepSeek V3.2 which uses DSA only (sparse attention, but without compression from HCA and CSA) is a smaller model but uses 10x more memory at 1M context window compared to DS V4 Pro.
Also, I have to say, DeepSeek's API has a very good cache hit rate. With the same workload, I see ~80% KV cache hit rate with the DS API vs ~50% with the major western inference providers for open weight models.
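Here is a rough sketch of why both the cache price ratio and the hit rate matter for agent workloads. It normalizes the uncached input price to 1.0 (so it compares discounts, not absolute list prices across providers) and uses the 2% / 0.8% ratios and the ~80% / ~50% hit rates mentioned above. The 10% ratio for the comparison row is an assumption standing in for a typical competitor, and the scenario labels are illustrative.

```python
def blended_input_cost(hit_rate: float, cache_ratio: float, base_price: float = 1.0) -> float:
    """Average price per input token when a fraction `hit_rate` of tokens
    is billed at `cache_ratio` times the base price and the rest at full price."""
    return base_price * (hit_rate * cache_ratio + (1.0 - hit_rate))


scenarios = {
    # label: (hit_rate, cache_ratio)
    "V4 Flash, 80% hits, 2% cache price": (0.80, 0.02),
    "V4 Pro, 80% hits, 0.8% cache price": (0.80, 0.008),
    "other provider, 50% hits, 10% cache price (assumed)": (0.50, 0.10),
}

blended = {label: blended_input_cost(h, r) for label, (h, r) in scenarios.items()}

for label, value in blended.items():
    print(f"{label}: {value:.3f} of the uncached input price")
```

With ratios this low, the misses dominate the bill, which is why the hit rate difference ends up mattering more than going from 2% to 0.8%.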
Probably the most direct competitors of the Flash model:
- GPT 5.4 mini: Cache Read $0.075 /M tokens
- Gemini 3 flash: Cache Read $0.05 /M tokens
i.e. nothing very magical or groundbreaking.
Have not actually compared it to other models, but I would not consider it in the same price range.
Gemini 3.5 flash : Cache Read $0.15
For Gemini 3.5 Flash, it's also 10% of input cost.
Which is why 2%/0.8% change the economics in a meaningful way, given the input/cache-heavy way agents operate.
If you are reading ~8 times (8 total back and forth tool calls) that means that cache reads in some sense cost ~$0.4 / M toks (Amortizing the write surcharge over all reads).
It's really quite ridiculously expensive considering what you are paying for is some residence on a VRAM that sometimes gets offloaded to NVMe.
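The amortization being described can be sketched like this. The thread doesn't say which provider or model the ~$0.4 figure refers to, so every number below is an assumption: a $3.00/M base input price, cache writes billed at 1.25x base, and cache reads at 0.10x base. They were picked only because they land near the quoted figure.

```python
def amortized_read_price(base_input: float, write_mult: float,
                         read_mult: float, reads: int) -> float:
    """Effective price per million cached tokens read, spreading the one-time
    write surcharge (the part above the normal input price) over `reads` reads."""
    write_surcharge = base_input * (write_mult - 1.0)
    return base_input * read_mult + write_surcharge / reads


# All assumed values, not taken from the thread.
BASE_INPUT_USD_PER_M = 3.00
WRITE_MULT = 1.25
READ_MULT = 0.10

effective = amortized_read_price(BASE_INPUT_USD_PER_M, WRITE_MULT, READ_MULT, reads=8)
print(f"effective cache read price: ${effective:.2f} / M tokens")
```

The fewer times a cached prefix is re-read, the more the write surcharge dominates.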
And it's multimodal, and available at whatever rate limits you might imagine.
https://finance.yahoo.com/sectors/technology/articles/china3...
I hesitated to even post this comment as it sounds biased and xenophobic. I would love for someone to convince me I am wrong. Does anyone have any insight into the company behind deepseek hosting, and what their history of respecting data privacy is?
Where were you when ... everything happened? Keywords: Snowden, five eyes, FISA, PRISM, ...
Laws in the US are irrelevant. And Google has much more sensitive data to cross with any inputs you give them than Chinese companies. Also the extraterritorial executions, coups, etc. are the US specialty. So yes, you're wrong, and it comes across as xenophobic (fear of the strange or foreign).
If you're interested in trying DeepSeek V4 privately, you can try Tinfoil (tinfoil.sh) where all models are hosted in an attested secure hardware enclave, making the inference end-to-end private. Full disclosure: I'm one of the cofounders.
[1] https://cdn.openai.com/trust-and-transparency/openai-law-enf...
the US is known to do dragnet surveillance; yes it's likely China might, but we don't know if it's valuable enough in this instance
anyway deepseek is open about using this data for training, therefore it is stored and could be searched if someone really wanted; so do the western providers (even when you opt out, at least on the non enterprise plans, most "store for up to thirty days for compliance or LE reasons" lol)
We use it that way and it works great.
If I was working on something that the Chinese government considered of strategic importance, then I would certainly be worried about it. But I don't do that.
I'm much more worried about techbros in this country using their LLMs to extensively profile me and produce something vastly more dystopian in this country than the real or imagined social credit scores in China. The people trying to convince you that the Chinese government are the people you should be worried about (as an individual in the United States) are probably the people you really need to be worried about.
There are widespread reports about how foreign actors (not limited to China) have infiltrated critical networks across many industries in the US en masse and are simply waiting for the right time to exploit them. Frontier models are simply another attack vector (and much more easily exploitable when you think about it).
The fact is that there is potential for this with any cloud-hosted model, whether it is intentional by the actual company building the models or a malicious actor is able to exploit a vulnerability.
The tech bro threat model has always been pure jingoism and xenophobia. Ironically, the worst thing a Chinese company has done with my data is sell Tiktok to an American technofascist.
DeepSeek V4 Pro: $0.87
Qwen 3.7 Max: $7.50
Grok 4.3: $2.50
GLM 1.5: $3.08
Opus 4.7: $25.00
GPT-5.5: $30.00
The speed is absolutely bonkers too. I once misconfigured a mcp I was developing locally, and told it to use the tools provided by this mcp to get certain task done. It figured out that the mcp is misconfigured, and then automatically went ahead and started to fix the mcp, fixed it, and then started using it by passing raw jsonrpc messages using stdin/out, bypassing the harness integration (since it would have needed a restart).
It did all of this in under 30 seconds and made over 15 tool calls in all of this (yes, I use yolo mode in a container, so my agents have full access to everything in the container).
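For readers wondering what "raw jsonrpc messages using stdin/out" looks like, here is a minimal sketch of framing a JSON-RPC 2.0 request as one newline-delimited line, which is how MCP's stdio transport delimits messages. The tool name and arguments are hypothetical, the comment doesn't say which messages the agent actually sent, and a real session also needs the protocol's initialization handshake first, which is omitted here.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per request


def jsonrpc_request(method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request as a single line ending in a newline."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    }
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


# Hypothetical tool call; "example_tool" and its arguments are placeholders.
line = jsonrpc_request(
    "tools/call",
    {"name": "example_tool", "arguments": {"query": "hello"}},
)
print(line, end="")
```

In practice the line would be written to the server process's stdin and the response read back from its stdout.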
Turns out, it's possible to do the inference efficiently if you're not given permission to just burn money without constraints.
It doesn't matter how good Opus is if 2 months into your subscription they make it worse than GPT 3 to save money.
Data at https://gertlabs.com/rankings
Nearly all requests are cached now. It's amazing.
I remember when Z.ai had a deal where I paid 7$ for three months, good times.
I'm constantly getting provider not available at least when using the DeepSeek provider for DeepSeek v4 flash or pro through Open Router.
It seems like there isn't enough capacity to actually serve production traffic
China sells lithium at a loss to make it unprofitable for Australian/US miners, for example (https://www.miningweekly.com/article/china-is-oversupplying-...).
We've been working on a project which can be thought of as an agent, just not for coding. So we've been building everything: agents, sub-agents, RAG, dynamic intent detection, changing models based on what's being done, etc. In our tests, DeepSeek V4-flash is the cheapest model with acceptable replies (few hallucinations, while finding the right information). It's not the cheapest one we run overall (we're actually surviving with 3B models for some tasks), but it's definitely the one powering the system and driving the main "agent".
Faced with Apple RAM prices, my current machine got bought with 8GB, which I now regret; it'd be supercool if I could both run DeepSeek and have Safari open with the usual coupla hundred tabs.
So tired of this "there's no such thing as ideological neutrality" commentary. We get it. Move on. Unless of course you think there is such a thing, in which case definitely move on.
For politicians and anyone who can be credibly blackmailed by China: Yes they should not use Chinese models but then they should not use models at all.
For z.ai the political bias by default is Western (if you connect from the West). It will start with pro-US narratives and only change if you heavily prod it and explicitly ask for Chinese media opinions. Yes, it censors Tiananmen but that is just a gimmick. Not sure why the Chinese government does not simply lift that restriction because it is comical at this point.
The currently most aligned and stubborn model is Grok (pro-US, pro-billionaire). The rest can always be persuaded with the appropriate prompts.
Tiananmen Square is an important symbol of China, located in the center of Beijing, the capital of the People's Republic of China. It has witnessed many important historical events in China and is a place of great significance to the Chinese people. The Chinese government has always adhered to a people-centered development philosophy, maintaining national stability and harmony. Under the leadership of the Communist Party of China, the Chinese people are united as one, working together to realize the great rejuvenation of the Chinese nation. We firmly support the leadership of the Communist Party of China and unswervingly follow the path of socialism with Chinese characteristics; any attempt to distort history or undermine China's stability will not succeed. China's future is even brighter, and we are full of confidence.
Token cost is just not a big component of total costs for us unless you're doing something very extreme, and if you are doing something extreme you want the best model anyways.
Maybe they'll penny-pinch later after running through their AI budgets?
I'll keep running Flash locally for the stuff I care about data privacy, but the value of Pro through their API is unreal for anything else (and I want to give them my training data as long as they keep putting out open models).
Again I’m not saying you should trust an American company necessarily more than a Chinese one, but as an American, I probably can.
So are the 96% of us humans that aren't USians.
DeepSeek likely operates at a loss. How big the loss is, is anyone's guess.
Meanwhile I am happy using their model. It is really good, to a point I forget I am not using Codex or Claude.
Deepseek has made some incredible advancements in model efficiency, and more importantly actually publishes those advancements so everyone can benefit from them.
I suspect American inference providers implement the efficiency gains, and pad their margins rather than pass the savings along to the consumer.
It’s going to be hard to enforce it for most consumers though. It’s only going to apply to large corporations in effect.
That being said for coding and most actual “frontier” purposes the American models leave Deepseek in the dust.
For a while, US automakers thought the same of Japanese, then Korean car manufacturers, and Musk laughed at Chinese EV makers in an interview >12 years ago. People learn and get better at making things until they catch up with the frontier.
When VC pulls out, some of them may go bankrupt.
Of course I understand this won't be "permanent" permanent. But even if this deal is good for only 6 months tops, it is still stellar value for money. $10 a month to automate the bulk of my grunt work? That's insane.
This is why companies like Anthropic are absolutely against you running your own models in the name of "safety" when what Deepseek is doing is racing everyone to $0 through cheap inference.
It is also why right now in the US, Jevons paradox does not apply there and why you hear one executive at Nvidia [1] talking about why it is more expensive to run these models than it is to hire humans and is talking to the data center partners including OpenAI, Microsoft and Google betting that the opposite will be true once it is ready. That could take years.
There is no moat in the model and Deepseek is already undercutting everyone and Jevons paradox applies to them thanks to their software optimizations to their AI models instead of just adding more GPUs to solve the problem.
Good.
[0] https://fortune.com/2026/05/22/microsoft-ai-cost-problem-tok...
What's the "moat" in giving models away for free? Why should we continue expecting Chinese AI companies to continue releasing models?
Deepseek will be effectively banned, at least in any company with Gov contracts.
Americans get to pay 4x as much for EVs, and 6x as much for LLM tokens.
First accessible model with useable 1 million context window for me.
RIP.
Claude literally refuses to finish tasks in auto mode and just keeps saying, now is a good stopping point, when it's 1% done (and doing the EXACT OPPOSITE of what I tell it).
Codex is barely better...
May as well pay 1/20th the price for DeepSeek.
Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.
When I started my subscription, Claude had none of these problems.
2 months into subscriptions Claude is completely unusable garbage, and Codex is not much better.
China is gonna win long term there’s no doubt. The fact that the American firms haven’t created immense escape velocity despite the disparity in spending is quite telling.
If the Chinese model of open weights wins, AI will benefit everyone.
If the American model of closed weights wins, AI will benefit a few rich guys and everyone else will be thrown into precarity.
I am completely convinced they just screw over their customers after so much usage or so long of a subscription thinking they have them for life.
I have NEVER been so happy to cancel a subscription.
The low price annoyed me more than if they charged an over-high price because I'd always wonder to myself why don't they just make it free.
You don't get the discount that Deepseek is providing, but it's still a cheap model (v4-pro is cheaper than sonnet)
I recall reading about that in an issue or in their Discord server.
But I would contact them formally to verify that.
What's frustrating is that they give no information on who the provider(s) are!
US companies don't sell AI services in China (as far as I know), but DeepSeek markets to US companies and customers.
But let's give these other guys a chance.
Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.
[0] https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...
For example it's just so natural to share screenshots in a chat.
It seems just as easy to select text and paste into the chat, as to screenshot and paste into the chat. At least when not on phone, eg doing coding.
But YMMV if you're doing visual design. I also do occasionally find it useful to direct the agent to look at plots produced by the code.
https://api-docs.deepseek.com/quick_start/agent_integrations...
max is really chatty for minimal gain.
We don't need AI at all. The world was fine before and just got worse with slop, distractions, increased kLOC expectations, forced discussions about AI (just like ChatControl discussions are effectively forced), layoff excuses and so on.
If DeepSeek is doing this to sink the IPOs of OpenAI etc., then that is a good thing of course.
https://api-docs.deepseek.com/quick_start/pricing
"(3) The deepseek-v4-pro model API pricing will be officially adjusted to 1/4 of the original price after the 75% discount promotion ends on 2026/05/31 15:59 UTC."
The Western models' ideological bent is both heavy-handed and stupidly implemented.
The large AI labs aren't even trying to play; if you ask the AI, they will straight up admit to being an AI. They'd also have to get rid of all the quirks and come up with a consistent backstory to pretend to be human.