You can run (much) smaller LLMs on consumer-grade GPUs, though. A single Nvidia GPU with 8 GB of VRAM is enough to get started with models like Zephyr, Mistral, or Llama 2 in their smallest versions (7B parameters), provided you use a quantized build. But it will be both slower and lower quality than anything OpenAI currently offers.
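For anyone who wants to try it, here's a minimal sketch using llama-cpp-python with a 4-bit GGUF quant of Mistral 7B (the file name below is illustrative; grab any Q4 quant from Hugging Face). A 4-bit 7B model is roughly 4 GB of weights, so it fits comfortably in 8 GB of VRAM:

```python
# pip install llama-cpp-python (built with CUDA support for GPU offload)
from llama_cpp import Llama

llm = Llama(
    model_path="mistral-7b-instruct-v0.1.Q4_K_M.gguf",  # ~4 GB 4-bit quant; path is illustrative
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,       # context window; raise it if VRAM allows
)

out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```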
It will definitely not be slower. Local inference with a 7B model on a 3090/4090 will outpace GPT-3.5 Turbo and smoke GPT-4 Turbo.
As for me, I’ve got other uses for $45k.