hypron 3mo ago

My issue with this is that the LLM could just be roleplaying that it doesn't know.

Of course it is. It's not capable of actually forgetting or suppressing its training data. It's just double checking rather than assuming because of the prompt. Roleplaying is exactly what it's doing. At any point, it may stop doing that and spit out an answer solely based on training data.

It's a big part of why search overview summaries are so awful. Many times the answers are not grounded in the material.

wavemode 3mo ago

It may actually have the opposite effect - the instruction to not use prior knowledge may have been what caused Gemini 3 to assume incorrect details about how certain puzzles worked and get itself stuck for hours. It knew the right answer (from some game walkthrough in its training data), but intentionally went in a different direction in order to pretend that it didn't know. So, paradoxically, the results of the test end up worse than if the model truly didn't know.

stavros 3mo ago

Doesn't know what? This isn't about the model forgetting the training data. Of course it can't do that, any more than I can say "press the red button. Actually, forget that, press whatever you want" and have you actually forget what I said.

Instead, what can happen is that, like a human, the model (hopefully) disregards the instruction, making it carry (close to) zero weight.

brianwawok 3mo ago

To test this, you would just need to edit the ROM and switch around the solution. Not sure how complicated that is; it likely depends on the ROM system.
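As a rough sketch of what "switching around the solution" could look like at the byte level, assuming the puzzle data sits in two contiguous, equal-length regions of the ROM image: the function name `swap_regions`, the toy image, and the offsets are all made up for illustration and do not come from any real game. A real ROM may store puzzle state differently, and some ROM formats carry checksums that would also need updating.

```python
def swap_regions(rom: bytes, offset_a: int, offset_b: int, length: int) -> bytes:
    """Return a copy of `rom` with two equal-length, non-overlapping regions exchanged."""
    if length <= 0:
        raise ValueError("length must be positive")
    lo, hi = sorted((offset_a, offset_b))
    if lo < 0 or hi + length > len(rom):
        raise ValueError("region falls outside the ROM image")
    if lo + length > hi:
        raise ValueError("regions overlap")
    patched = bytearray(rom)  # work on a copy; the input is left untouched
    patched[lo:lo + length], patched[hi:hi + length] = (
        rom[hi:hi + length],
        rom[lo:lo + length],
    )
    return bytes(patched)


# Toy stand-in for a ROM image; the offsets are hypothetical.
toy_rom = bytes(range(16))
patched = swap_regions(toy_rom, offset_a=2, offset_b=10, length=3)
```

If the model still follows the walkthrough solution on the patched image, that points to recall rather than reasoning.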

Workaccount2 3mo ago

I don't know why people still get wrapped around the axle of "training data".

Basically every benchmark worth its salt uses bespoke problems purposely tuned to force the models to reason and generalize. It's the whole point of ARC-AGI tests.

Unsurprisingly, Gemini 3 Pro performs way better on ARC-AGI than 2.5 Pro, and unsurprisingly it did much better in Pokémon.

The benchmarks, by design, indicate you can mix up the switch puzzle pattern and it will still solve it.
