I never use the $20 plan; I access everything via the API and spend a couple of dollars per month.
Although lately I have a home server that can run Llama 3.1 8B uncensored, and that actually works amazingly well.
The only thing I see is that it hallucinates a lot when you ask it for knowledge. Which makes sense: 8B parameters is just not a lot of capacity for retaining detailed information. But reciting training knowledge is really a misuse of LLMs, and only a peculiar side effect. I combine it with Google searches (through OpenWebUI and SearXNG) and then it works amazingly well.
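For anyone curious what that search-augmented setup boils down to: the frontend fetches web results and stuffs the top snippets into the prompt, so the small model answers from the sources instead of its thin parametric memory. A minimal sketch of that prompt-assembly step (the function name and result shape are my own illustration, not OpenWebUI's actual internals):

```python
def build_search_prompt(question, results, max_snippets=3):
    """Fold top search snippets into the prompt so a small model
    answers from the sources rather than from memorized facts.

    `results` is assumed to be a list of dicts with 'title' and
    'snippet' keys, e.g. parsed from a SearXNG JSON response.
    """
    context = "\n\n".join(
        f"[{i + 1}] {r['title']}\n{r['snippet']}"
        for i, r in enumerate(results[:max_snippets])
    )
    return (
        "Answer using ONLY the sources below; cite them as [n]. "
        "If the sources don't cover it, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The assembled string then goes to the local model as the user message; the 8B model only has to read and summarize, which it is far better at than recall.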
It works OK, but with a large context it can still run out of memory and also gets a lot slower. With a small context it's super snappy and surprisingly good. What it is bad at is facts/knowledge, but that's not something an LLM is meant to do anyway. OpenWebUI has really good search engine integration, which makes it work like Perplexity. That's a better option for knowledge use cases.
Maybe we can have a service that's an LLM with shared/cached responses, plus training on its own questions/answers for the easy stuff. Currently my usage is largely as a better search engine, since Google has gotten so much worse than it used to be, though I imagine I'll find custom uses for it as well.
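The shared/cached idea is basically memoization on a normalized prompt: if enough people ask the same easy question, only the first one should hit the model. A toy sketch of that layer (the class and normalization scheme are hypothetical, just to show the shape; a real service would also want semantic rather than exact-match keys):

```python
import hashlib

class ResponseCache:
    """Memoize LLM answers keyed on a normalized prompt, so repeated
    'easy' questions skip the model entirely."""

    def __init__(self):
        self._store = {}  # key -> cached answer

    @staticmethod
    def _key(prompt):
        # Crude normalization: lowercase and collapse whitespace, then hash.
        norm = " ".join(prompt.lower().split())
        return hashlib.sha256(norm.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt, llm_call):
        """Return the cached answer, calling `llm_call(prompt)` only on a miss."""
        k = self._key(prompt)
        if k not in self._store:
            self._store[k] = llm_call(prompt)
        return self._store[k]
```

Exact-match caching only helps for genuinely repeated queries, but that's exactly the "easy stuff" the comment is talking about.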