> "Hybrid approaches that use vector search for broad matches and rerank using BM25"

Hybrid approaches, e.g. Learning To Rank, normally do it the other way around. The main benefit of hybrid is to mitigate the cost (time) of vector search: you use a non-vector search (e.g. BM25) to get a broadly relevant set of results first (and quickly), and then use the much more computationally expensive vector search to rerank that smaller result set. There are various approaches that try to make vector search more viable across large corpora, e.g. Locality Sensitive Hashing and other Approximate Nearest Neighbour search methods, but if you've implemented one of those then I'm not sure there'd be any benefit in retaining a hybrid approach.
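To make the ordering concrete, here is a rough sketch of "cheap lexical retrieval first, vector rerank second". It is only an illustration under some loud assumptions: `toy_embed` is a hypothetical stand-in for a real embedding model (it just hashes character trigrams), the BM25 is a small hand-rolled version rather than any particular search engine's, and the names `hybrid_search`, `first_stage_k` and `final_k` are made up for this example.

```python
import math
import zlib
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (deliberately simplistic)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with a basic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def toy_embed(text, dim=64):
    """ASSUMPTION: placeholder for a real embedding model.
    Builds a vector of hashed character-trigram counts."""
    vec = [0.0] * dim
    for tok in tokenize(text):
        padded = f"#{tok}#"
        for i in range(len(padded) - 2):
            trigram = padded[i:i + 3]
            vec[zlib.crc32(trigram.encode("utf-8")) % dim] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def hybrid_search(query, docs, first_stage_k=3, final_k=2):
    """Stage 1: BM25 picks a small candidate set cheaply.
    Stage 2: only those candidates are embedded and reranked."""
    scores = bm25_scores(query, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    candidates = [i for i in ranked[:first_stage_k] if scores[i] > 0]
    q_vec = toy_embed(query)
    reranked = [(i, cosine(q_vec, toy_embed(docs[i]))) for i in candidates]
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked[:final_k]


docs = [
    "BM25 is a fast lexical ranking function",
    "vector search compares dense embeddings",
    "approximate nearest neighbour search speeds up vector search",
    "cooking pasta requires boiling water",
]

for idx, sim in hybrid_search("vector search", docs):
    print(idx, round(sim, 3), docs[idx])
```

The point of the design is that the expensive step (embedding plus similarity) only ever touches `first_stage_k` documents, however large the corpus is. The trade-off is that anything BM25 misses in stage 1 can never be recovered by the reranker.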