
AlexCoventry · 10mo ago · 0 points

It depends on the AI. ChatGPT's higher-end models (o1-pro/o3/o4-mini-high) have some limited ability to detect errors in the user's thinking, and they hallucinate relatively rarely.


energy123 · 10mo ago

o3 has twice the hallucinations of o1, according to OpenAI's own hallucination benchmark.

UltraSane · 10mo ago

I've had fun debates about things like p-zombies with Gemini 2.5 Pro

