All of your benchmarks mean nothing to me until you include Claude Sonnet in them.
In my experience, GPT hasn’t been able to compete with Claude in years for the daily “economically valuable” tasks I work on.
https://x.com/OpenAI/status/1999182104362668275