Yeah, Heaps' law is a bit of a pain in these situations. You'll definitely want to make sure those vectors are 64-bit if you plan on indexing a properly large number of documents.
I'd also advise caution about leaning too heavily on the vector interpretation of these algorithms, as it's largely viewed as a quaint historical artifact that was a bit of a dead end (e.g. as in Croft, Metzler & Strohman, section 7.2.1).
I was looking for the fine print behind their "Try For Free"/"Free Tier Available" claim and was pleasantly surprised by:
Qdrant Vector Search Cloud
Start building now!
A free forever 1GB cluster included for trying out.
No credit card required.
[0] https://qdrant.tech/pricing/
I came across two questions. Perhaps some kind folks with more experience can shed some light on these Qdrant use cases.
1. For embeddings in use cases such as LLM chat bots, I split internal data into chunks. Those chunks are then vectorized and stored. Alongside the entry itself, I stored the original chunk in metadata. That way, a lookup can immediately feed the chunk into the LLM prompt context, without a lookup in a secondary data store by some ID. Feels like a hack. Is that a sensible use case?
2. I resorted to using `fastembed` and generated all embeddings client-side. Why is it that Qdrant queries, in the ordinary case (also showcased a lot in their docs, e.g. [0]), expect a ready-made vector? I thought the point of vector DBs was to vectorize input data, store it, and later vectorize any text queries themselves?
Having to do all that client-side feels beside the point; for example, what if two separate clients use different models (I used [1])? Their vectorizations will differ. I thought the DB was the source of truth here.
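Re: question 1, storing the original chunk next to its vector is the standard pattern, not a hack. A minimal stdlib-only sketch of the idea (a toy in-memory index, no Qdrant-specific API assumed; the texts and 2-d vectors are made up for illustration):

```python
import math

# Toy in-memory "collection": each point keeps its vector AND the
# original chunk as payload, so one search returns prompt-ready text.
points = []

def upsert(vector, payload):
    points.append({"vector": vector, "payload": payload})

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vector, top_k=1):
    # Rank all points by cosine similarity to the query vector.
    ranked = sorted(points, key=lambda p: cosine(p["vector"], query_vector), reverse=True)
    return ranked[:top_k]

upsert([1.0, 0.0], {"text": "Refunds are processed within 14 days."})
upsert([0.0, 1.0], {"text": "Support is available on weekdays."})

hit = search([0.9, 0.1])[0]
print(hit["payload"]["text"])  # the chunk is right there, no second lookup by ID
```

A real vector DB does the same thing at scale: the payload travels with the point, so a single query gives you both the match and the text to stuff into the prompt.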
In any case, fascinating technology. Thanks for putting it together and making it this accessible.
[0]: https://qdrant.tech/documentation/quick-start/#run-a-query
[1]: `sentence-transformers/all-MiniLM-L6-v2`, following https://qdrant.tech/documentation/tutorials/neural-search-fa...
For my applications, I use pgvector since I can also use fulltext indexes and JOINs with the rest of my business logic which is stored in a postgres database. This also makes it easier to implement hybrid search, where the fulltext results and semantic search results are combined and reranked.
I think the main selling-point for standalone vector databases is scale, i.e., when you have a single "corpus" of over 10^7 chunks and embedding vectors that needs to serve hundreds of req/s. In my opinion, the overhead of maintaining a separate database that requires syncing with your primary database did not make sense for my application.
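The "combined and reranked" step above can be sketched without any DB at all. One common recipe is Reciprocal Rank Fusion (RRF); the document IDs and k=60 below are illustrative, not tied to pgvector or any particular engine:

```python
# Toy hybrid-search reranking: merge a fulltext result list and a
# semantic result list with Reciprocal Rank Fusion (RRF).
def rrf(result_lists, k=60):
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results):
            # Each list contributes 1/(k + rank); docs found by
            # both retrievers accumulate a higher fused score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

fulltext = ["doc3", "doc1", "doc7"]   # e.g. from a tsvector index
semantic = ["doc1", "doc5", "doc3"]   # e.g. from a vector index
print(rrf([fulltext, semantic]))  # → ['doc1', 'doc3', 'doc5', 'doc7']
```

The appeal of doing this inside Postgres is that both candidate lists come from the same database in one round trip, then a few lines like these fuse them.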
2. You often can perform the embedding in the DB, but there are a lot of use cases where you want to manage your embedding models outside the DB. That way you aren't dependent on which models the DB supports, and you don't duplicate them throughout your system.
It shows which Vector DBs have a particular feature. "In-built Text Embeddings creation" is a column you can look at.
Does Qdrant look like a winning horse then?
Was about to use Weaviate for a project today, and this gives me pause. Anyone have strong opinions? pg_vector has also been on my radar recently. I know Qdrant vs. Weaviate is partly a Rust-vs-Go topic.
Faiss and Pinecone are at the top (disclosure: I'm from Pinecone). But Faiss isn't really a full-fledged vector DB. Pinecone is a managed option which is out of the question for a company like Twitter and maybe for you (although you should consider it). After that comes Chroma in third, and then Qdrant, and then Weaviate.
Chroma has a big following by virtue of being plugged into the AI ecosystem in SF. Qdrant seems to be doing great work but their location in Europe is probably not helping.
Also, don't believe everything posted on the internet ;)
Local mode: https://github.com/qdrant/qdrant-client#local-mode
Tried a few DBs that didn't work well (e.g. I think it was ChromaDB that didn't support Python 3.12) and ended up picking LanceDB.
Very simple onboarding (just built on top of parquet) but there are a few rough edges.
Curious how it compares with Qdrant for non-crazy problems.
I’ve been building a Hasura Data Connector for Qdrant and it’s been too much fun. Glad to see them getting talked about here.