hitradostava on Hacker News

1

GPT5 is worse than 4.1-mini for text and worse than Sonnet 4 for coding

It seems that OpenAI has its PR machine working amazingly well. The Cursor CEO said it's the best, as did Simon Willison (https://simonwillison.net/2025/Aug/7/gpt-5/).

But I've found it terrible. For coding (in Cursor), it's slow, often fails on tool calls (no MCP, just stock Cursor tools), and it stored some new application state in globalThis, something that no model has ever attempted to do in over a year of very heavy Cursor / Claude Code use.

For a summarization/insights API that I work on, it was way worse than gpt-4.1-mini. I tried both the mini and full GPT-5 models, with different reasoning settings. It didn't follow instructions, and the output was worse across all my evals, even after heavy prompt adjustment. I did a lot of sampling and the results were objectively bad.

Am I the only one? Has anyone seen actual real-world benefits of GPT-5 vs other models?

10 points | hitradostava | 9mo ago | 16 comments

2

Show HN: I made an interactive sentiment model comparison site

(addmaple.com)

4 points | hitradostava | 1y ago | 0 comments

3

The Terrible UX of Spreadsheets

(addmaple.com)

2 points | hitradostava | 2y ago | 0 comments

4

Show HN: We made a data tool that pivots all columns/rows into charts locally

(addmaple.com)

1 point | hitradostava | 2y ago | 0 comments

5

Using multipart/x-mixed-replace for multi-modal chat streams

(addmaple.com)

3 points | hitradostava | 2y ago | 1 comment
