
0 points · wizzwizz4 · 0y ago · 0 comments

I know someone from OpenAI claimed this, but is there any evidence that DeepSeek actually trained their models on the output of OpenAI's models?


binarymax · 0y ago

They talk about some examples in their research.

> “Specifically, we initialized the DeepSeek-Prover using the DeepSeekMath-Base 7B model (Shao et al., 2024). Initially, the model struggled to convert informal math problems into formal statements. To address this, we fine-tuned the DeepSeek-Prover model using the MMA dataset (Jiang et al., 2023), which comprises formal statements from Lean 4’s mathlib that were back-translated into natural language problem descriptions by GPT-4. We then instructed the model to translate these natural language problems into formal statements in Lean 4 using a structured approach.”

Section 3.1 in https://arxiv.org/html/2405.14333v1
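
To make the quoted setup a bit more concrete, here is a rough sketch of how back-translated pairs could be turned into fine-tuning examples. None of this is from the paper's code: `back_translate`, `PROMPT_TEMPLATE`, `build_finetuning_pairs`, and the single toy statement are all made up for illustration, and the back-translation step (done by GPT-4 according to the quote) is replaced by a canned lookup so the snippet runs offline.

```python
from dataclasses import dataclass


@dataclass
class FormalStatement:
    name: str
    lean_source: str  # a Lean 4 statement, stored as plain text


# Toy example only; NOT taken from the MMA dataset.
STATEMENTS = [
    FormalStatement(
        name="add_comm_example",
        lean_source="theorem add_comm_example (a b : Nat) : a + b = b + a",
    ),
]

# Hypothetical prompt wording; the paper's actual prompt is not shown in the quote.
PROMPT_TEMPLATE = (
    "Translate the following problem into a Lean 4 statement.\n"
    "Problem: {problem}\n"
)


def back_translate(statement: FormalStatement) -> str:
    """Stand-in for the formal -> natural-language step.

    In the quoted description this direction was produced by GPT-4.
    Here it is a hard-coded lookup so the sketch needs no API access.
    """
    canned = {
        "add_comm_example": (
            "Show that for all natural numbers a and b, a + b = b + a."
        ),
    }
    return canned[statement.name]


def build_finetuning_pairs(statements):
    """Build (prompt, completion) records for informal -> formal training.

    The model is trained in the opposite direction to the back-translation:
    the natural-language text is the input, the Lean statement is the target.
    """
    pairs = []
    for statement in statements:
        informal = back_translate(statement)
        pairs.append(
            {
                "prompt": PROMPT_TEMPLATE.format(problem=informal),
                "completion": statement.lean_source,
            }
        )
    return pairs


if __name__ == "__main__":
    for pair in build_finetuning_pairs(STATEMENTS):
        print(pair["prompt"])
        print(pair["completion"])
```

The point relevant to this thread is the direction of the data flow: in this setup the natural-language side of each pair is model-generated, while the Lean side comes from an existing library.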

wizzwizz4 (OP) · 0y ago

I was thinking of their general-purpose models, like DeepSeek-R1 and DeepSeek-V3, for which I haven't found evidence that OpenAI models were used to generate synthetic training data. But I didn't find this, so clearly my searching skills aren't great.
