I understand ToS violations can lead to a ban. OpenAI is free to ban DeepSeek from using their APIs.
Yes, there is the question of how much ChatGPT data DeepSeek has ingested. Certainly not zero! But if DeepSeek has achieved iterative self-improvement, that would be huge too!
Even if o1 specifically was used (which is itself doubtful), that does not mean it was the main reason r1 succeeded, or that r1 could not have happened without it. o1's outputs hide the CoT part, which is the most important piece here. Also, it's 2025; building truly from scratch no longer exists. Creating better technology on top of previous (widely available) technology has never been a controversial issue.
who cares. even if the claim is true, does that make the open source model less attractive?
in fact, it implies that there is no moat in this game. openai can no longer maintain its stupid valuation, as other companies can just scrape its output and build better models at much lower costs.
everything points to the exact same end result - DeepSeek democratized AI, OpenAI's old business model is dead.
If your own API can leak your secret sauce without any malicious penetration, well, that's on you.
DDoSing websites and grabbing content without anyone's consent is not hard-earned at all. They did spend billions on their thing, but nothing was earned, as they could never have done that legally.
But let's keep the eye on the ball for a second. None of that changes the fact that what was built was a capability to reflect that knowledge in dynamic and deep ways in conversation, as well as image and audio recognition.
And did Deepseek also build that? From scratch? Because they might not have.
So say DS had simply published a paper outlining the RL technique they used, and one of Meta, Google, or even OpenAI themselves had used it to train a new model. Don't you think they'd have shouted it from the rooftops as a new breakthrough? The fact that the data's provenance is a rival's model does not negate the value of the research, IMHO.
One way or another, they were able to create something that has WAY cheaper inference costs than o1 at the same level of intelligence. I was paying Anthropic $15/1M tokens to make myself 10x faster at writing software, which was coming out to $10/day. o1 is $60/1M tokens, which for my level of usage would mean it costs as much as a whole junior software engineer. DeepSeek is able to do it for $2.50/1M tokens.
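The price gap above is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch, using the per-1M-token prices quoted in the comment; the daily token volume is inferred from the "$10/day at $15/1M" figure, not a real measurement:

```python
# Rough daily API cost comparison. Prices (USD per 1M tokens) are
# the ones quoted in the comment above; the token volume is a
# hypothetical assumption derived from the "$10/day" figure.

PRICES_PER_M = {
    "claude": 15.00,
    "o1": 60.00,
    "deepseek-r1": 2.50,
}

def daily_cost(model: str, tokens_per_day: int) -> float:
    """Cost in USD for a given daily token volume."""
    return PRICES_PER_M[model] * tokens_per_day / 1_000_000

# $10/day at $15/1M tokens implies roughly 667k tokens/day.
tokens_per_day = int(10 / 15 * 1_000_000)

for model in PRICES_PER_M:
    print(f"{model}: ${daily_cost(model, tokens_per_day):.2f}/day")
```

At that usage, the same workload would run about $40/day on o1 but under $2/day on DeepSeek, which is the commenter's point in miniature.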
Either OpenAI was taking a profit margin that would make the US Healthcare industry weep, or DeepSeek made an engineering breakthrough that increases inference efficiency by orders of magnitude.
It's been known for a while that competitors used OpenAI to improve their models, that's why they changed the TOS to forbid it.
That doesn't mean DeepSeek's technical achievements are any less valid.
Well, that's literally exactly what it would mean. If DeepSeek relied on OpenAI’s API, their main achievement is in efficiency and cost reduction as opposed to fundamental AI breakthroughs.
In a way, this is something most companies have been doing with their smaller models; DeepSeek just supposedly did it better.
Eventually all future AIs will be trained on synthetic input; the amount of (quality) data we humans can produce is quite limited.
The fact that the input of one AI has been used in the training of another one seems irrelevant.
The deeper question is whether Deepseek has achieved real autonomy or if it’s just a derivative work. If the latter, then OpenAI still holds the keys to future advances. If Deepseek truly found a way to be independent while achieving similar performance, then OpenAI has a problem.
The details of how they trained matter more than the inevitability of synthetic data down the line.
This question is malformed, imo. Every lab is doing derivative work. OpenAI didn't invent transformers, Google did. Google didn't invent neural networks or backpropagation.
If you mean whether OAI could have prevented DS from succeeding by cutting off their API access, probably not. Maybe they used OAI for supervised fine tuning in certain domains, like creative writing, which are difficult to formally verify (although they claim to have used one of their own models). Or perhaps during human preference tuning at the end. But either way, there are many roads to Rome, and OAI wasn’t the only game in town.
Point is, those future advances are worthless. Eventually everyone will be able to feed off each other's outputs for training.
There's no moat here. LLMs are commodities.
Also, if you read their papers it’s quite clear there are several important engineering achievements which enabled this. For example multi head latent attention.
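The win from multi-head latent attention is mostly in KV-cache size: instead of caching full per-head K and V vectors for every token, you cache one low-rank latent per token and re-project at attention time. A toy sketch of the memory arithmetic; all dimensions here are illustrative assumptions, not DeepSeek's actual configuration:

```python
# Toy KV-cache size comparison: standard multi-head attention vs an
# MLA-style compressed latent cache. Dimensions are illustrative
# assumptions, not DeepSeek's real hyperparameters.

n_heads  = 32
head_dim = 128
d_latent = 512     # compressed latent width (assumption)
seq_len  = 4096
bytes_el = 2       # fp16

# Standard MHA: cache K and V for every head and token.
mha_cache = seq_len * n_heads * head_dim * 2 * bytes_el

# MLA-style: cache a single shared latent vector per token.
mla_cache = seq_len * d_latent * bytes_el

print(f"MHA cache: {mha_cache / 2**20:.1f} MiB per layer")
print(f"MLA cache: {mla_cache / 2**20:.1f} MiB per layer")
print(f"reduction: {mha_cache // mla_cache}x")
```

With these toy numbers the cache shrinks 16x per layer, which is the kind of engineering change that directly cuts inference cost regardless of where the training data came from.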
It’s the same problem with pharmaceuticals and generics. It’s great when the price of drugs is low, but without perverse financial incentives no company is going to burn billions of dollars in a risky search for new medicines.
They had to be cheating.
https://news.ycombinator.com/newsguidelines.html
p.s. yes, that goes both ways - that is, if people are slamming a different country from an opposite direction, we say the same thing (provided we see the post in the first place)