I never use the $20 plan; I access everything via the API and spend a couple of dollars per month.
Although lately I have a home server that can run Llama 3.1 8B uncensored, and that actually works amazingly well.
The only thing I see is that it hallucinates a lot when you ask it for knowledge. Which makes sense: 8B parameters is just not a lot of capacity for retaining detailed information. But reciting training knowledge is really a misuse of LLMs, and only a peculiar side effect. I combine it with Google searches (through OpenWebUI and SearXNG) and then it works amazingly well.
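For anyone curious what that search-augmented setup boils down to: the frontend fetches web results and stuffs the top snippets into the prompt, so the small model answers from the sources instead of its thin parametric memory. A minimal sketch of that prompt-assembly step (the function name and result shape are my own illustration, not OpenWebUI's actual internals):

```python
def build_search_prompt(question, results, max_snippets=3):
    """Fold top search snippets into the prompt so a small model
    answers from the sources rather than from memorized facts.

    `results` is assumed to be a list of dicts with 'title' and
    'snippet' keys, e.g. parsed from a SearXNG JSON response.
    """
    context = "\n\n".join(
        f"[{i + 1}] {r['title']}\n{r['snippet']}"
        for i, r in enumerate(results[:max_snippets])
    )
    return (
        "Answer using ONLY the sources below; cite them as [n]. "
        "If the sources don't cover it, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The assembled string then goes to the local model as the user message; the 8B model only has to read and summarize, which it is far better at than recall.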
It works OK, but with a large context it can still run out of memory and also gets a lot slower. With a small context it's super snappy and surprisingly good. What it is bad at is facts/knowledge, but that's not something an LLM is meant to do anyway. OpenWebUI has really good search engine integration, which makes it work like Perplexity. That's a better option for knowledge use cases.
Maybe we can have a service that's an LLM with shared/cached responses, plus training on its own questions/answers for the easy stuff. Currently my usage is largely as a better search engine, since Google has gotten so much worse than it used to be, though I imagine I'll find custom uses for it as well.
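The shared/cached idea is basically memoization on a normalized prompt: if enough people ask the same easy question, only the first one should hit the model. A toy sketch of that layer (the class and normalization scheme are hypothetical, just to show the shape; a real service would also want semantic rather than exact-match keys):

```python
import hashlib

class ResponseCache:
    """Memoize LLM answers keyed on a normalized prompt, so repeated
    'easy' questions skip the model entirely."""

    def __init__(self):
        self._store = {}  # key -> cached answer

    @staticmethod
    def _key(prompt):
        # Crude normalization: lowercase and collapse whitespace, then hash.
        norm = " ".join(prompt.lower().split())
        return hashlib.sha256(norm.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt, llm_call):
        """Return the cached answer, calling `llm_call(prompt)` only on a miss."""
        k = self._key(prompt)
        if k not in self._store:
            self._store[k] = llm_call(prompt)
        return self._store[k]
```

Exact-match caching only helps for genuinely repeated queries, but that's exactly the "easy stuff" the comment is talking about.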