I_am_tiberius 1mo ago

Open weight!

alecco 1mo ago

Please don't slander the most open AI company in the world. They are even more open than some non-profit university labs. DeepSeek is famous for publishing everything. They might take a bit to publish source code, but it's almost always there. And their papers are extremely pro-social, written to help the broader open AI community. This is why they struggle to get funded: investors hate openness. And in China they struggle against the political and hiring power of the big tech companies.

Just this week they published a serious foundational library for LLMs: https://github.com/deepseek-ai/TileKernels

Others worth mentioning:

https://github.com/deepseek-ai/DeepGEMM a competitive foundational library

https://github.com/deepseek-ai/Engram

https://github.com/deepseek-ai/DeepSeek-V3

https://github.com/deepseek-ai/DeepSeek-R1

https://github.com/deepseek-ai/DeepSeek-OCR-2

They have 33 repos and counting: https://github.com/orgs/deepseek-ai/repositories?type=all

And DeepSeek often comes up with very cool new approaches to AI that the rest then copy. Some of those who copied their tech have 10x or 100x the GPU training budget, and that's their moat to stay competitive.

The models from Chinese Big Tech and some of the small ones are open weights only (and allegedly benchmaxxed; see https://xcancel.com/N8Programs/status/2044408755790508113). Not the same.

patshead 1mo ago

DeepSeek's models are indeed open weight. Why do you feel that pointing this out would be considered slander?

culi 1mo ago

I think they were reading GP's comment as a correction, like "not open-source, just open weight". I'm not sure if their reading was accurate, but I enjoyed their high-effort comment nonetheless.

alecco 1mo ago

X is full of "open weights!" corrections as a dog whistle by the anti-China crowd. And they are right about models from the Chinese Big Tech, but completely wrong about DeepSeek.

alecco 1mo ago

>> Truly open source coming from China.

> Open weight!

They clearly were implying it's not open source.

patshead 1mo ago

Correct. We have open-weight models from OpenAI, Facebook, Mistral, DeepSeek, Z.ai, MiniMax, and all sorts of other companies. Most of them have fantastic and open licensing terms.

If we can't build the weights, then we don't have the source. I'm not entirely sure what an open-source model would even look like, but I am confident that these binary blobs that we are loading into llama.cpp and vllm aren't the equivalent of source code. We have absolutely no idea what sort of data went into them.

This is fine. It isn't slanderous. It is what we have, and it is awesome. Just because it is awesome doesn't make it open source.

kortilla 1mo ago

It’s not slander to say something true. These are open weights, not open source. They don’t provide the training data or the methodology required to reproduce these weights.

So you can’t see what facts are pruned out, what biases were applied, etc. Even more importantly, you can’t make a slightly improved version.

This model is as open source as a Windows XP installation ISO.

alecco 1mo ago

> These are open weights, not open source.

Did you even read my comment?

jatora 1mo ago

I did. Show me the source code.

0-_-0 1mo ago

Weights are the source, training data is the compiler.

crazylogger 1mo ago

Training data == source code, training algorithm == compiler, model weights == compiled binary.

0-_-0 1mo ago

Training algorithm is the programmer, weights are the code that you run in an interpreter.

ngruhn 1mo ago

Isn't it more like the data is the source, the training process is the compiler, and the weights are the binary output?
