https://help.openai.com/en/articles/6825453-chatgpt-release-...
"If you open a conversation that used one of these models, ChatGPT will automatically switch it to the closest GPT-5 equivalent."
- 4o, 4.1, 4.5, 4.1-mini, o4-mini, or o4-mini-high => GPT-5
- o3 => GPT-5-Thinking
- o3-Pro => GPT-5-Pro
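The remapping described in the quote amounts to a lookup table with a fallback. A minimal sketch (the model-name strings come from the quote above; the function and dictionary names are my own, not an OpenAI API):

```python
# Hypothetical sketch of the legacy-model remapping quoted from the
# help article. Nothing here is OpenAI's actual implementation.
LEGACY_TO_GPT5 = {
    "gpt-4o": "gpt-5",
    "gpt-4.1": "gpt-5",
    "gpt-4.5": "gpt-5",
    "gpt-4.1-mini": "gpt-5",
    "o4-mini": "gpt-5",
    "o4-mini-high": "gpt-5",
    "o3": "gpt-5-thinking",
    "o3-pro": "gpt-5-pro",
}

def closest_gpt5_equivalent(model: str) -> str:
    """Return the GPT-5 equivalent for a deprecated model name.

    Unknown (non-deprecated) names pass through unchanged.
    """
    return LEGACY_TO_GPT5.get(model.lower(), model)
```

So `closest_gpt5_equivalent("o3")` yields `"gpt-5-thinking"`, while a name that isn't in the deprecation list is returned untouched.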
Regular users just see incrementing numbers: why would they want to use a 3 or a 4 if there's a 5? This is how people who aren't entrenched in AI think.
Ask some of your friends what the difference is between models: some will have no clue that, right now, some of the "3" models are better than the "4" models, or they won't understand what the "o" means at all. And some will think, why would I ever use a "mini"?
I think people here vastly underestimate how many people just type questions into the chatbox, and that's it. When you think about the product from that perspective, this release is probably a huge jump for many people who have never used anything but the default model. Whereas, if you've been using o3 all along, this is just another nice incremental improvement.
It is frankly ridiculous to assume anyone would think that 4o is in any way worse than o3. I don't understand why these companies are this bad at basic marketing: what is with all these .5s and minis and other shit names? Just increment the fucking number, or, if you're embarrassed about having to bump the number all the time, just use year/month. Then you can have different flavors like "light and fast" or "deep thinker", and of course the regular "GPT X".
Of course, I know that having a line-up of tons of models is quite confusing. Yet I also believe users on the paid plan deserve more options.
As a paying user, I liked the ability to set which models to use each time, in particular switching between o4-mini and o4-mini-high.
Now they’ve deprecated this feature and I’m stuck with their base GPT-5 model or GPT-5 Thinking, which seems akin to o3 and thus has much smaller usage limits. Only God knows whether their routing will work as well as my previous system for selecting models.
I suppose this is probably the point. I’m still not super keen on ponying up 200 bucks a month, but it’s more likely now.
I wouldn't want to be in charge of regression testing an LLM-based enterprise software app when bumping the underlying model.
Maybe, to serve more users, they're planning to shrink the models and have reasoning close the gap... of course, that only really works for verifiable tasks.
And I've seen the claims of a "universal verifier", but that feels like the Philosopher's Stone of AI. Everyone who's tried it has shown limited carryover from verifiable tasks (like code) to tasks judged by subjective preference.
-
To clarify also: I don't think this is nefarious. I think as you serve more users, you need to at least try to rein in the unit economics.
Even OpenAI can only afford to burn so many dollars per user per week once they're trying to serve a billion users a week. At some point there isn't even enough money to be raised to keep up with costs.
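The scale argument is easy to see with back-of-envelope arithmetic. Every number below is hypothetical (not an OpenAI figure); the point is only that per-user cost multiplied by a billion weekly users gets large fast:

```python
# Back-of-envelope unit-economics sketch. All figures are assumed
# for illustration, not real OpenAI costs.
weekly_users = 1_000_000_000          # "a billion users a week", per the comment
cost_per_user_per_week = 0.05         # hypothetical blended inference cost, USD

weekly_burn = weekly_users * cost_per_user_per_week
annual_burn = weekly_burn * 52        # 52 weeks per year

print(f"weekly burn:  ${weekly_burn:,.0f}")   # $50,000,000
print(f"annual burn:  ${annual_burn:,.0f}")   # $2,600,000,000
```

Even at a modest five cents per user per week, the annual bill is in the billions, which is why routing cheap default traffic to smaller models starts to matter.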
The names of GPT models are just terrible. o3 is better than 4o, maybe?