
0 points | blitz_skull | 5mo ago | 0 comments

Again I just tap the sign.

All of your benchmarks mean nothing to me until you include Claude Sonnet on them.

In my experience, GPT hasn’t been able to compete with Claude in years for the daily “economically valuable” tasks I work on.


jstummbillig | 5mo ago

Since, as per Anthropic's own benchmarks, Sonnet 4.5 is beaten by Opus 4.5, would it not suffice to infer the rest?

https://x.com/OpenAI/status/1999182104362668275

nextworddev | 5mo ago

Claude is pretty trash for anything besides coding

wyre | 5mo ago

What are you basing that on? Between Sonnet and Opus I don't think I'm reaching for Gemini 3 at all.

romanovcode | 5mo ago

Yeah, but that is the whole point of Claude. And that's why we are interested in the comparison.

timmg | 5mo ago

That hasn't been my experience at all. I've always wondered if we just get used to how to prompt a given model, and that makes it hard to transition to another.
