What an understatement. It has me thinking „man, fuck this“ on the daily.
Just today it spontaneously lost an entire 20-30 minute thread, and it was far from the first time. It basically does it any time you interrupt it in any way. It’s straight up data loss.
It’s kind of a typical Google product in that it feels more like a tech demo than a product.
It has theoretically great tech. I particularly like the idea of voice mode, but it’s noticeably glitchy, often breaks spontaneously, and keeps asking annoying questions that you can’t turn off.
And the UI's lack of polish shows up afresh every time a new feature lands too - the "branch in new chat" feature is still really finicky, getting stuck in an unusable state if you twitch your eyebrows at the wrong moment.
it's like the client, not the server, is responsible for writing to my conversation history or something
works great for kicking off a request and closing the tab or navigating away to another page in my app to do something.
i don't understand why model providers don't build this resilient token streaming into all of their APIs. would be a great feature
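For anyone wondering what "resilient token streaming" could look like, here's a rough sketch of the idea, not any provider's actual API. The assumption is that the server, not the client, stores every token as it's generated, and a client that disconnects can come back with a cursor and catch up. `StreamBuffer`, `read_from` and `simulate` are made-up names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StreamBuffer:
    """Hypothetical server-side record of one streamed response."""
    tokens: list[str] = field(default_factory=list)
    done: bool = False

    def append(self, token: str) -> None:
        # The server stores each token as the model produces it,
        # whether or not any client is connected right now.
        self.tokens.append(token)

    def finish(self) -> None:
        self.done = True

    def read_from(self, cursor: int) -> tuple[list[str], int, bool]:
        # The client sends the index of the next token it needs and gets
        # back the missing tokens, the new cursor, and a completion flag.
        chunk = self.tokens[cursor:]
        return chunk, cursor + len(chunk), self.done


def simulate() -> str:
    buf = StreamBuffer()
    received: list[str] = []

    for tok in ["The ", "answer "]:
        buf.append(tok)
    chunk, cursor, _ = buf.read_from(0)  # client is connected
    received += chunk

    # Client closes the tab here; generation keeps going server-side.
    for tok in ["is ", "42."]:
        buf.append(tok)
    buf.finish()

    # Client reconnects with its saved cursor and catches up.
    chunk, cursor, done = buf.read_from(cursor)
    received += chunk
    assert done
    return "".join(received)
```

The point of the cursor is that an interrupted client never has to write anything to the history itself; it only asks the server for whatever it missed.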
Copilot Chat has been perfect in this respect. It's currently GPT 5.0, moving to 5.1 over the next month or so, but at least I've never lost an (even old) conversation since those reside in an Exchange mailbox.
But voice is not a huge traffic funnel. Text is. And the verdict is more or less unanimous at this time. Gemini 3.0 has outdone ChatGPT. I unsubscribed from GPT plus today. I was a happy camper until the last month when I started noticing deplorable bugs.
1. The conversation contexts are getting intertwined. Two months ago, I could ask multiple random queries in a conversation and get correct responses, but for the last couple of weeks it's been a harrowing experience, having to start a new chat window for almost any change in thread topic.
2. I had once asked ChatGPT to treat me as a co-founder and hash out some ideas. Now for every query I get a 'cofounder type' response. Nothing inherently wrong, but annoying as hell. I can live with the other end of the spectrum, in which Claude doesn't remember most of the context.
Now that Gemini Pro is out, yes, the UI lacks polish and you can lose conversations, but the benefits of low latency search and a one year near free subscription are the clincher. I am out of ChatGPT for now, 5.2 or otherwise. I wish them well.
Codex is decent and seemed to be improving (being written in Rust helps). Claude Code is still the king, but my god they have server and throttling issues.
Mixed bag wherever you go. As model progress slows / flatlines (already has?) I’m sure we’ll see a lot more focus and polish on the interfaces.
That's sometimes me with the CLI. I can't use the Gemini CLI right now on Windows (in the Terminal app), because trying to paste in multiple lines of text for some reason submits them separately and it just breaks the whole thing. OpenCode had the same issue but even worse: it quit after the first line or something and pasted the text line by line into the shell. Thank fuck I didn't have some text that mentions rm -rf or something.
More info: https://github.com/google-gemini/gemini-cli/issues/14735#iss...
At the same time, neither Codex CLI, nor Claude Code had that issue (and both even showed shortened representations of copied in text, instead of just dumping the whole thing into the input directly, so I could easily keep writing my prompt).
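For context, terminals have a standard mechanism for this called bracketed paste mode: once a program enables it by writing `ESC[?2004h`, the terminal wraps pasted text in `ESC[200~` ... `ESC[201~`, so the program can tell a pasted newline from a typed Enter. I don't know how any of the CLIs above actually handle it; this is just a minimal sketch of the parsing side, with made-up function names.

```python
PASTE_START = "\x1b[200~"  # sent by the terminal before pasted text
PASTE_END = "\x1b[201~"    # sent by the terminal after pasted text


def split_input(raw: str) -> list[tuple[str, str]]:
    """Split raw input into ("typed", text) and ("paste", text) parts."""
    parts: list[tuple[str, str]] = []
    pos = 0
    while pos < len(raw):
        start = raw.find(PASTE_START, pos)
        if start == -1:
            parts.append(("typed", raw[pos:]))
            break
        if start > pos:
            parts.append(("typed", raw[pos:start]))
        body_start = start + len(PASTE_START)
        end = raw.find(PASTE_END, body_start)
        if end == -1:
            # Unterminated paste: treat the rest as pasted text.
            parts.append(("paste", raw[body_start:]))
            break
        parts.append(("paste", raw[body_start:end]))
        pos = end + len(PASTE_END)
    return parts


def placeholder(pasted: str) -> str:
    # Shortened representation shown in the prompt instead of the full text.
    return f"[pasted text, {len(pasted.splitlines())} lines]"


def wants_submit(parts: list[tuple[str, str]]) -> bool:
    # Only a typed Enter submits; newlines inside a paste never do.
    return bool(parts) and parts[-1][0] == "typed" and parts[-1][1].endswith(("\r", "\n"))


raw = "fix this: " + PASTE_START + "line1\nline2\nrm -rf /tmp/x" + PASTE_END
parts = split_input(raw)
print([placeholder(t) if kind == "paste" else t for kind, t in parts])
print(wants_submit(parts))  # pasted newlines alone do not submit
```

With this, a pasted block that happens to contain `rm -rf` stays inert text in the prompt until you actually press Enter yourself.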
So right now if I want to use Gemini, I more or less have to use something like KiloCode/RooCode/Cline in VSC, which are nice but might miss out on some more specific tools. Which is a shame, because Gemini is a really nice model, especially when it comes to my language, Latvian, but also for your run-of-the-mill software dev tasks.
In comparison, Codex feels quite slow, whereas Claude Code is what I gravitate towards most of the time, but even Sonnet 4.5 ends up being expensive when you shuffle around millions of tokens: https://news.ycombinator.com/item?id=46216192
Cerebras Code is nice for quick stuff and the sheer amount of tokens, but in KiloCode/... it regularly messes up applying diff based edits.
To posit a scenario: I would expect General Motors to buy some Ford vehicles to test and play around with and use. There's always stuff to learn about what the competition has done (whether right, wrong, or indifferent).
But I also expect the parking lots used by employees at any GM design facility in the world to be mostly full of General Motors products, not Fords.
With Gemini, it will send as soon as I stop to think. No way to disable that.