- GPT 3.5: Good for finding reference terms. I could not trust anything it said, but it could help me find some general terms in fields I was unfamiliar with.
- GPT 4: Good for cached, obscure knowledge. I could generally trust the facts it stated to be true, but not its logic or conclusions.
- GPT 4.5: Good for reference proofs/code. I cannot trust its proofs or code, but I can get a decent outline for writing my own.
- GPT 5: Good for directed thinking. I cannot trust it to come up with the best solution on its own, but if I tell it what I'm working on, it's pretty decent at using all the tricks in its repertoire (across many fields) to get me a correct solution. I can trust its proofs or code to be about as correct as my own. My main issue is that I cannot trust it to point out confusion or to ask me, "Is this actually the problem we should be solving here?" My guess is this is mostly a byproduct of shallow human feedback rather than an actual issue with intelligence (after spending a bunch of computation, it will often ask me at the end whether I want to try something mildly different).
For me, GPT 5 is way more useful than the previous models, because I don't have a lot of paper-pushing problems I'm trying to solve. My guess is the wider public may disagree, because it's hard to tell the difference between something better at the task than you and something much better.