The InstructGPT paper also showed that RLHF made hallucination worse (though with more user data rejecting common hallucinations, instruction tuning and RLHF may reduce the specific hallucinations users reject).
Some mention of that here: https://huyenchip.com/2023/05/02/rlhf.html#rlhf_and_hallucin...
Not specifically showing catastrophic forgetting, but hallucination for o3:
> From the results of this evaluation, o3's hallucination rate is 33 percent, and o4-mini's hallucination rate is 48 percent — almost half of the time. By comparison, o1's hallucination rate is 16 percent, meaning o3 hallucinated about twice as often.
https://mashable.com/article/openai-o3-o4-mini-hallucinate-h...

DeepSeek R1 handles some of this by re-distilling "factual Q&A" generated from the original V3 model back into a new V3. The V3 paper mentions it incorporated an R1 pass too, so the sequence appears to be: V3 base model, RL pass, distilling the RL checkpoint back into V3 and retraining for the final V3 release, then an additional RL pass for the final R1 release.
V3 Paper
> During the post-training stage, we distill the reasoning capability from the DeepSeekR1 series of models [I think that refers to the earlier checkpoint R1 after the first pass below]
R1 Paper:
> To address these issues and further enhance reasoning performance, we introduce DeepSeek-R1, which incorporates a small amount of cold-start data and a multi-stage training pipeline. Specifically, we begin by collecting thousands of cold-start data to fine-tune the DeepSeek-V3-Base model. Following this, we perform reasoning-oriented RL like DeepSeek-R1-Zero. Upon nearing convergence in the RL process, we create new SFT data through rejection sampling on the RL checkpoint, combined with supervised data from DeepSeek-V3 in domains such as writing, factual QA, and self-cognition, and then retrain the DeepSeek-V3-Base model. After fine-tuning with the new data, the checkpoint undergoes an additional RL process, taking into account prompts from all scenarios. After these steps, we obtained a checkpoint referred to as DeepSeek-R1, which achieves performance on par with OpenAI-o1-1217.
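The rejection-sampling step in that quote can be sketched roughly as follows. This is a hypothetical illustration, not the paper's actual code: `generate` and `is_correct` are stand-ins for a model call and an answer verifier, and keeping one accepted sample per prompt is an assumption.

```python
# Hypothetical sketch of rejection-sampling SFT data creation: sample
# several completions from the RL checkpoint and keep only those a
# verifier accepts. Both helpers below are placeholders, not real APIs.

def generate(prompt, n):
    # Placeholder: pretend the model returns n candidate completions.
    return [f"{prompt}-answer-{i}" for i in range(n)]

def is_correct(prompt, completion):
    # Placeholder verifier (e.g. exact-match against a reference answer);
    # here it arbitrarily accepts only the "-0" candidate.
    return completion.endswith("-0")

def rejection_sample(prompts, n=4):
    sft_data = []
    for p in prompts:
        for c in generate(p, n):
            if is_correct(p, c):
                sft_data.append({"prompt": p, "completion": c})
                break  # keep at most one accepted sample per prompt
    return sft_data

data = rejection_sample(["q1", "q2"])
```

The accepted samples then get combined with supervised data from other domains before retraining the base model, per the quote above.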
In general with fine-tuning you can avoid catastrophic forgetting by mixing the original data into later fine-tuning steps, and from this it seems the same is true of the RL phases, though they are also doing some amount of augmentation and selection on the data involved.
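That mixing strategy (often called replay) can be sketched like this. A minimal illustration, assuming a fixed replay fraction per batch; the 25% ratio is made up for the example and not from either paper.

```python
# Sketch of replay mixing to reduce catastrophic forgetting: each
# fine-tuning batch draws a fixed fraction of examples from the original
# data alongside the new task data.
import random

def mixed_batches(new_data, original_data, batch_size=8, replay_frac=0.25, seed=0):
    rng = random.Random(seed)
    n_replay = int(batch_size * replay_frac)  # replayed originals per batch
    n_new = batch_size - n_replay             # new task examples per batch
    batches = []
    for start in range(0, len(new_data), n_new):
        batch = new_data[start:start + n_new]
        batch += rng.sample(original_data, min(n_replay, len(original_data)))
        batches.append(batch)
    return batches

batches = mixed_batches([f"new{i}" for i in range(12)],
                        [f"orig{i}" for i in range(100)])
```

The augmentation-and-selection part (regenerating "factual Q&A" from the original model rather than replaying it verbatim) would replace the raw `original_data` here with freshly generated samples.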