Automatic context compression management. I think a killer feature for an LLM provider would be to store the entire context but automatically compress it with internal LLM calls that summarize the big parts: condense large code files down to just their class and function names, summarize earlier requests, and so on.
And even with internal compression, the provider should also be able to automatically re-expand any portion of that context back to full detail when a request is specifically about a certain file.
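To make the idea concrete, here's a minimal sketch of that compress-by-default, expand-on-demand flow. All the names here (ContextManager, SIZE_THRESHOLD, the regex-based summarize stub) are hypothetical; in a real provider the summarizer would be an internal LLM call and the "does this request reference that file" check would be smarter than a substring match.

```python
import re
from dataclasses import dataclass, field

SIZE_THRESHOLD = 2000  # chars; entries larger than this get summarized


def summarize(text: str) -> str:
    """Stub summarizer: keep only class/def signatures from a code file.

    A real system would replace this with an internal LLM summarization call.
    """
    signatures = re.findall(r"^\s*(?:class|def)\s+\w+.*$", text, flags=re.MULTILINE)
    return "\n".join(signatures) or text[:200]


@dataclass
class ContextEntry:
    name: str              # e.g. a file path
    full_text: str         # the original, uncompressed content is always kept
    summary: str | None = None


@dataclass
class ContextManager:
    entries: dict[str, ContextEntry] = field(default_factory=dict)

    def add(self, name: str, text: str) -> None:
        entry = ContextEntry(name, text)
        if len(text) > SIZE_THRESHOLD:
            entry.summary = summarize(text)  # compress big entries up front
        self.entries[name] = entry

    def build_prompt_context(self, request: str) -> str:
        """Use summaries by default, but expand any entry the request names."""
        parts = []
        for entry in self.entries.values():
            if entry.summary is None or entry.name in request:
                parts.append(f"# {entry.name}\n{entry.full_text}")          # expanded
            else:
                parts.append(f"# {entry.name} (summary)\n{entry.summary}")  # compressed
        return "\n\n".join(parts)
```

The key design point is that nothing is thrown away: the full text stays stored, and compression is only a view over it, so any piece can be swapped back in the moment a request actually needs it.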
Right now a lot of the industry is trying to build the best agent, which in turn largely comes down to having the best context-compression algorithms.