I built Remembrall, a proxy layer for your OpenAI API calls that gives your chat system long-term memory.
How it works: just add a user id to your OpenAI call. When a user stops actively chatting, Remembrall triggers an "autosave": it uses GPT to save or update important details from the conversation in a vector database. When the user continues the conversation, we query the database for relevant info and prepend it to the system prompt.
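The autosave/retrieve loop above can be sketched in a few lines. This is a toy illustration only: the "vector db" is an in-memory dict with naive keyword matching standing in for embedding similarity, and the GPT extraction step is replaced by storing the raw lines. None of these names are Remembrall's actual internals.

```python
# Toy sketch of the autosave -> retrieve -> prepend flow.
# memory_db, autosave, and build_system_prompt are illustrative names,
# not Remembrall's real API.
from collections import defaultdict

memory_db = defaultdict(list)  # user_id -> list of saved facts


def autosave(user_id, conversation):
    # Real system: GPT extracts/updates important details, which are
    # embedded and upserted into a vector db. Stand-in: store raw lines.
    memory_db[user_id].extend(line for line in conversation if line.strip())


def build_system_prompt(user_id, base_prompt, user_message):
    # Real system: vector-similarity search over saved memories.
    # Stand-in: naive keyword overlap with the incoming message.
    words = user_message.lower().split()
    relevant = [f for f in memory_db[user_id]
                if any(w in f.lower() for w in words)]
    if not relevant:
        return base_prompt
    return "\n".join(["Known about this user:"] + relevant + ["", base_prompt])


autosave("user_1234", ["My dog is named Biscuit.", "I live in Austin."])
prompt = build_system_prompt("user_1234",
                             "You are a helpful assistant.",
                             "tell me about my dog")
print(prompt)
```

The chat endpoint itself never changes; only the system prompt it receives is enriched before the request is forwarded to OpenAI.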
All of this happens in under 100 ms on the edge, and integration takes only two lines of code. You also get observability for free (a log of all LLM requests) in a simple dashboard.
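The "two lines of code" integration presumably means pointing your OpenAI requests at the proxy and attaching the user id. Here is a stdlib-only sketch of that shape; the proxy URL and header name are assumptions for illustration, not Remembrall's documented values.

```python
# Sketch: route a standard chat-completions request through a memory proxy.
# PROXY_BASE and the x-user-id header are hypothetical, not Remembrall's API.
import json
import urllib.request

PROXY_BASE = "https://proxy.example.com/v1"  # line 1: swap the base URL
USER_ID_HEADER = {"x-user-id": "user_1234"}  # line 2: identify the user

payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Where did we leave off?"}],
}

req = urllib.request.Request(
    PROXY_BASE + "/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer $OPENAI_API_KEY",  # placeholder key
        **USER_ID_HEADER,
    },
)
# urllib.request.urlopen(req) would send it; the proxy forwards the call to
# OpenAI after prepending this user's saved memories to the system prompt.
print(req.get_header("X-user-id"))
```

Everything else about the request body stays identical to a direct OpenAI call, which is what keeps the integration cost so low.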