TeMPOraL 8mo ago

I'm curious how you ended up in such a conversation in the first place. Hallucinations are one thing, but I can't remember the last time a model claimed that it actually ran something somewhere that wasn't a tool use call, or that it owns a laptop, or such - except when role-playing.

I wonder if the advice on prompting models to role-play isn't backfiring now, especially in conversational settings. There might even be a difference between "you are an AI assistant that's an expert programmer" vs. "you are an expert programmer" in the prompt, the latter pushing it towards the "role-playing a human" region of the latent space.

(But also yeah, o3. Search access is the key to cutting down on the amount of guessing at answers, and o3 is using it judiciously. It's the only model I use for "chat" when the topic requires any kind of knowledge that's niche or current, because it's the only model I've seen that can reliably figure out when and what to search for, and do it iteratively.)

westoncb 8mo ago

I've seen that specific kind of role-playing glitch here and there with the o[X] models from OpenAI. The models do kinda seem to just think of themselves as being developers with their own machines. I think it usually just doesn't come up, but they can easily be tilted into it.

bradly 8mo ago

What is really interesting is that in the "thinking" section it said "I need to reassure the user..." so my intuition is that it thought it was right, but did not think I would believe it was right, and that if it just gave me the confidence, I would try the code and unblock myself. Maybe it thought this gave the best chance that I would listen to it, and so it was the correct response?

TeMPOraL (OP) 8mo ago

Maybe? Depends on what followed that thought process.

I've noticed this a couple of times with o3, too - early on, I'd catch a glimpse of something like "The user is asking X... I should reassure them that Y is correct" or such, which raised an eyebrow because I already knew Y was bullshit and WTF with the whole reassuring business... but then the model would continue actually exploring the question and the final answer showed no trace of Y, or any kind of measurement. I really wish OpenAI gave us the whole thought process verbatim, as I'm kind of curious where those "thoughts" come from and what happens to them.

ben_w 8mo ago

Not saying this to defend the models, as your point is fundamentally sound, but IIRC the user-visible "thoughts" are produced by another LLM summarising the real chain-of-thought, so weird inversions of what it's "really" "thinking" may well slip in at the user-facing level. The real CoT often uses completely illegible shorthand of its own, some of which is Chinese even when the prompt is in English, but even the parts in the users' own languages can be hard-to-impossible to interpret.

To agree with your point: even with the real CoT, researchers have shown that models' CoT workspaces don't accurately reflect behaviour: https://www.anthropic.com/research/reasoning-models-dont-say...

andrepd 8mo ago

Okay. And the fact that LLMs routinely make up crap that doesn't exist but sounds plausible, and the fact that this appears to be a fundamental problem with LLMs, doesn't give you any pause on your hype train? Genuine question, how do you reconcile this?

> I really wish OpenAI gave us the whole thought process verbatim, as I'm kind of curious where those "thoughts" come from and what happens to them.

Don't see what you mean by this; there's no such thing as "thoughts" of an LLM, and if you mean the feature marketers called chain-of-thought, it's yet another instance of LLMs making shit up, so.

bradly 8mo ago

Ehh... I did ask it if it would be able to figure this out or if I should try another model :|

agos 8mo ago

A friend recently had a similar interaction where ChatGPT told them that it had just sent them an email or a WeTransfer with the requested file.
