This has nothing to do with SSE. It's trivial to persist state across disconnects and refreshes with SSE, and you can do all the same pub/sub tricks.
None of these companies are even using brotli on their SSE connections, which can give 40-400x compression.
It's just bad engineering, and it's going to be much worse with WebSockets: you have to rebuild HTTP from scratch, compression is nowhere near as good, bidirectional traffic nukes your mobile battery because of the duplex antenna, etc., etc.
So the only thing you get from WebSockets is bidirectional events (at the cost of all the production challenges WebSockets bring). In practice, most problems don't need that feature.
Go to ChatGPT.com while logged in and start typing right away: eight words in, it clears the text in the form. Why?
Very weird that the foundational LLM companies' own chat pages don't do this.
Dunno, in my Go+HTMX project it was pretty trivial to add SSE streaming.

When you open a new chat tab, we load existing data from the DB, then HTMX initiates SSE streaming with a single tag. When the server receives the SSE request from HTMX, it spawns a goroutine and registers a new Go channel for that tab. The goroutine blocks, waiting for new events on the channel.

When something triggers a new message, a dispatcher saves the event to the DB and then iterates over the registered channels, sending the event to each. On a new event in a tab's channel, that tab's goroutine unblocks and passes the event to the SSE stream; HTMX handles inserting the new data into the DOM.

When a tab closes, the goroutine is notified via the request's context (another Go primitive), deregisters the channel, and exits. If the server restarts, HTMX automatically reopens the SSE stream.

It took probably one evening to implement.
Wasn't that complex!
Or is it that all your tokens go through a DB anyway?
It's fairly easy to keep an agent alive when a client goes away. It's a lot harder to attach the client back to that agent's output when the client returns, without stuffing every token through the database.
CRDT or OT would work great here, and are arguably even overkill. But so many of the edge cases you'd usually need to think about just disappear.
(I've built an agent / chat that used CRDT to represent the chat. You can have an arbitrary number of tabs, closing/opening at any time. All real time, in sync.)
Some are using Google Gemini.
It saves your chats, which are presented in a pane you can expand on the left and search. You can jump back into any chat and continue it, or delete individual chats.
This history is attached to your Google account, not to the chat window. You can pick up an existing chat in another browser on another device where you are authenticated with the same Google identity.
Now, about the specific usage scenario in the article (hitting refresh immediately after submitting a prompt, while the response is streaming): not sure why that would be important?
I just tried it a couple of times. Both times, it initially looked as if the Gemini interface had lost the chats, since they didn't appear in the chat-history section of the left pane. But after another refresh they appeared, so there's just some delay.
Anyway, Gemini seems good in this regard; at least they give a damn.
Yes, of course all chat providers store your chats, and they'll be available eventually, once the response has finished streaming and been dumped to a DB.
This is about live streaming getting lost and not being reconnected (and restreamed) when you refresh the page.
And since chatting with an AI and watching the responses stream in is a major use case, the author was right to question why e.g. Anthropic wouldn't invest some of its $30B in fixing this glaring problem.
Especially since it looks like your initial message wasn't received by the backend server at all!
It may not be super critical, but it's like saying, "My Ferrari sometimes shows the wrong speed. It still drives, but the speedometer is stuck. It does get back to the correct speed eventually, though, so no biggie."