The main challenge is implementing and managing this many indices within a single database, which is why we built Infinity from scratch. Infinity is essentially an "indexing" database built on a columnar store. The executor also needs careful design to fuse these hybrid search approaches effectively.
1. Performance
pgvector is far slower than Infinity's vector search because of its vector index design. pg_sparse is also slower than Infinity's sparse vector search. pg_search is much slower than Infinity's full-text search: it is based on Tantivy, which is much slower than Infinity's inverted index.
Detailed benchmarks can be found in this article: https://infiniflow.org/blog/fastest-hybrid-search or in the GitHub repo.
2. Infinity has built-in implementations of all three search approaches above. These indices work smoothly with Infinity's executor, so users can combine any of the search approaches with fused ranking algorithms very efficiently.
3. Infinity also has built-in support for tensors, which makes it possible to run a ColBERT reranker inside the database, as opposed to a cross-encoder-based reranker outside it. The ColBERT reranker can bring significant gains in search quality.
4. Infinity is much easier to use: it can be deployed either as a standalone server or as an embedded Python library with just pip install.
5. Infinity is designed from scratch, so it does not carry the burden of PostgreSQL, and it is evolving fast. It will run in the cloud in the very near future, which could reduce costs significantly.
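
To make point 3 a bit more concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring that a ColBERT-style reranker computes over per-token embeddings. This is an illustration only, not Infinity's actual implementation or API: the names `maxsim_score` and `rerank` are hypothetical, and the tiny 2-dimensional vectors stand in for real token embeddings.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens: Sequence[Vector], doc_tokens: Sequence[Vector]) -> float:
    """ColBERT-style late interaction (hypothetical helper).

    For each query token embedding, take its best dot-product match among
    the document token embeddings, then sum those maxima.
    """
    if not doc_tokens:
        return 0.0
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


def rerank(query_tokens: Sequence[Vector],
           candidates: Dict[str, Sequence[Vector]]) -> List[str]:
    """Return candidate ids ordered by descending MaxSim score."""
    scored = [(maxsim_score(query_tokens, tokens), doc_id)
              for doc_id, tokens in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Toy data: each document is a "tensor", i.e. a list of token embeddings.
query = [[1.0, 0.0], [0.0, 1.0]]
docs = {
    "a": [[1.0, 0.0], [0.0, 1.0]],  # matches both query tokens exactly
    "b": [[0.5, 0.5]],              # partial match for both
    "c": [[0.0, 0.5]],              # weak match for the second token only
}
print(rerank(query, docs))  # -> ['a', 'b', 'c']
```

Because the score only needs the stored token embeddings of each candidate, this kind of reranking can run next to the data once tensors are a native column type, whereas a cross encoder has to run a full model forward pass per query-document pair.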
1. pg_sparse is deprecated. pgvector released native sparse vector support with the `sparsevec` datatype, and ParadeDB no longer maintains pg_sparse. It has been this way for several months already.
I'd love to see a benchmark re: Tantivy. You claim that pg_search is much slower, but Tantivy is state-of-the-art for full-text search performance and the ParadeDB performance is robust. You can see our benchmarks in our repository README, where we compare ourselves to Elastic.
4/5. ParadeDB is Postgres by design. If you are adopting Postgres, which many are, then ParadeDB can be installed directly as an extension via logical replication on a read replica. This removes the need for ETL to a non-Postgres system, which drastically reduces operational burden.
Of course, if you're not using Postgres, ParadeDB is not designed for you and a tool like Infinity seems like a viable option alongside other standalone search engines.