0 points
oof-baroomf
7mo ago
74.9 on SWE-bench. This increases the SOTA by a whole 0.4%. Although the pricing is great, it doesn't seem like OpenAI has found a giant breakthrough yet, like o1 or Claude 3.5 Sonnet.
Workaccount2
7mo ago
I'm pretty sure 3.5 Sonnet always benchmarked poorly, despite being the clear programming winner of its time.
iLoveOncall
7mo ago
That would assume there is a giant breakthrough to be found.