Are embeddings a hack? Is building out tooling and databases and APIs and companies around embeddings all going to be for naught as soon as there's a solid LLM/API with a big enough context window?
Yes - embeddings are a hack:
No - there won't be anything like a "real API" unless there's a new discovery or a shift in the way LLMs are constructed. It's not theoretically impossible, but there's no clear way to get guaranteed results from present-day LLMs; all they do is output guesses from their input text (the prompt combined with the user's text).
Expanding the context window seems like one approach, but if you're trying to get an answer about your company's documentation, why would you need to feed it through the entirety of GPT-X?
Here's a relevant quote: https://simonwillison.net/2023/Apr/15/ted-sanders-openai/
Analogous, more or less, to a human with general experience (base training), experience with your code base (fine tuning), and the ability to reference the current code base directly (embedding-based search/recall). All three have a role, they are complementary rather than mutually exclusive.
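To make the "embedding-based search/recall" part concrete, here's a minimal sketch of the retrieval step. Note the `embed` function here is a toy bag-of-words stand-in; a real system would call an embedding model to get dense vectors, but the ranking-by-similarity shape is the same:

```python
import math
from collections import Counter

def embed(text):
    # Toy embedding: bag-of-words term counts. A real pipeline would
    # use a model-produced dense vector here instead.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=1):
    # Rank documents by similarity to the query and return the top k;
    # these would then be pasted into the LLM prompt as context.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

docs = [
    "How to deploy the billing service",
    "Vacation policy for new employees",
    "Rotating the API keys for the billing service",
]
print(retrieve("billing deploy steps", docs))
```

The point is that retrieval narrows your documentation down to a few relevant chunks before the LLM ever sees it, which is why a bigger context window doesn't obviously make this step obsolete.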