
sroussey · 2y ago · 0 points · 0 comments

You can use Llama 2 for embeddings, summaries, and chat.

Turning the docs into questions is something I will test (I am just learning and getting a feel for it).

I am intrigued... what makes a good vector index?


deckar01 · 2y ago

My heuristic is how much noise is in the closest vectors. Even if the top k matches seem good, if the following noise has practically identical distance scores, it is going to fail a lot in practice. Ideally you could calculate some constant threshold so that everything closer is relevant and everything further is irrelevant.
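One rough way to put a number on this heuristic is to sort the distances returned for a query and look for the largest jump between consecutive results. This is only a sketch under my own assumptions: the function name `find_relevance_gap` and the sample distance lists are made up for illustration, and the midpoint of the largest jump is just one possible choice of threshold.

```python
def find_relevance_gap(distances):
    """Find the largest jump between consecutive sorted distances.

    Returns (count, gap, threshold):
      count     - how many results sit before the largest jump
      gap       - size of that jump
      threshold - midpoint of the jump, a candidate cutoff distance
    """
    ordered = sorted(distances)
    if len(ordered) < 2:
        raise ValueError("need at least two distances to measure a gap")
    # Differences between each result and the next closest one.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    split = max(range(len(gaps)), key=lambda i: gaps[i])
    threshold = (ordered[split] + ordered[split + 1]) / 2
    return split + 1, gaps[split], threshold


# Hypothetical distances for two queries (smaller means closer).
clean = [0.10, 0.12, 0.15, 0.60, 0.62, 0.65]  # clear break after 3 results
noisy = [0.40, 0.41, 0.42, 0.43, 0.44, 0.45]  # no break, matches blend into noise

for name, distances in (("clean", clean), ("noisy", noisy)):
    count, gap, threshold = find_relevance_gap(distances)
    print(f"{name}: {count} results before the gap, gap={gap:.3f}, cutoff={threshold:.3f}")
```

A large gap suggests the relevant matches are well separated from the rest. A tiny gap is the failure case described above, where the top k are nearly indistinguishable from what follows. If the cutoff lands at about the same distance across many queries, that is a hint a constant threshold could work.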

sroussey (OP) · 2y ago

Apologies for being naive, but how do you calculate noise?
