There's a similar thing happening with RAG, where people think building the chat interaction is the hard part. The hard part is extracting and searching to get relevant context. A lot of founders I talk to suddenly realize this at the last minute, right before shipping, much like search back in the day. It's harder than just throwing chunks into a vector DB: it potentially involves many different backend data sources, and is in many ways harder than a standard search relevance problem (which is hard enough on its own).
It's just going to evolve into recreating the search and ranking pipelines of old, on top of a bit more semantic understanding with some smarter NLG layered in :). It won't be just LLMs; we'll have intent classification, named entity recognition, a personalization layer, reranking, all that fun stuff again.
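To make that pipeline concrete, here is a minimal sketch of the "search stack of old" layered on top of retrieval: intent classification, entity recognition, then reranking. Every component here is a toy stand-in (keyword heuristics instead of trained models), and all function and document names are hypothetical.

```python
import re

def classify_intent(query):
    # Toy intent classifier; a real system would use a trained model
    return "question" if query.endswith("?") else "lookup"

def extract_entities(query):
    # Toy NER: capitalized tokens stand in for named entities
    return re.findall(r"\b[A-Z][a-z]+\b", query)

def rerank(results, entities):
    # Boost retrieved documents that mention recognized entities
    def score(doc):
        bonus = sum(1 for e in entities if e.lower() in doc["text"].lower())
        return doc["score"] + 0.5 * bonus
    return sorted(results, key=score, reverse=True)

# Pretend these came back from a vector index with similarity scores
docs = [
    {"text": "Acme quarterly revenue report", "score": 0.6},
    {"text": "general revenue accounting guide", "score": 0.7},
]

query = "What was Acme revenue?"
entities = extract_entities(query)
top = rerank(docs, entities)[0]
print(classify_intent(query), "->", top["text"])
```

The point is that the raw similarity scores alone would have ranked the generic guide first; the entity-aware rerank step flips the order, which is exactly the kind of classic relevance machinery that ends up getting rebuilt.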
It becomes a very frustrating experience trying to match the inherent chaos of a conversation.
In many ways it makes the chat more Siri-like than ChatGPT-like, which may not be what users actually expect.
They spent so much time on the UI and basically left the actual search to the last minute, and it was a hilarious failure on launch.
Last week I finished building my third RAG stack, for legal document retrieval. Almost-vanilla RAG got me 90-95% of the way there. The only drawback is cost, still 10x-100x above the ideal price point, but that will only improve in the future.
The standard RAG-U uses vector embeddings of chunks, which are fetched from a vector index. An envisioned role of knowledge graphs is to improve standard RAG-U by explicitly linking the chunks through the entities they mention. This is a promising idea, but one that needs to be subjected to rigorous evaluation, as done in prominent IR publications, e.g., SIGIR.
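The entity-linking idea above can be sketched in a few lines: invert a chunk-to-entity map into an entity index, then treat chunks sharing an entity as linked, so a vector hit can be expanded with connected context. The chunk IDs and entities here are made up, and the entity extraction itself (NER or an existing KG) is assumed to have already happened.

```python
from collections import defaultdict

# Hypothetical chunks and the entities each one mentions
chunks = {
    "c1": {"Acme Corp", "Jane Doe"},
    "c2": {"Acme Corp"},
    "c3": {"Widget Ltd"},
}

# Invert to an entity -> chunks index
entity_index = defaultdict(set)
for chunk, entities in chunks.items():
    for entity in entities:
        entity_index[entity].add(chunk)

def linked_chunks(chunk):
    # Chunks reachable from this one through any shared entity
    linked = set()
    for entity in chunks[chunk]:
        linked |= entity_index[entity]
    linked.discard(chunk)
    return linked

# A vector hit on c1 can now pull in entity-linked context
print(linked_chunks("c1"))
```

Here a retrieval hit on `c1` would also surface `c2` (both mention "Acme Corp"), while the unrelated `c3` stays out; the rigorous-evaluation question is whether such expansion actually improves relevance.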
The post then discusses the scenario where an enterprise does not have a knowledge graph, and discusses the idea of automatically extracting knowledge graphs from unstructured PDFs and text documents. It covers the recent work that uses LLMs for this task (they're not yet competitive with specialized models) and highlights many interesting open questions.
Hope this is interesting to people who are interested in the area but intimidated by the flood of activity (don't be; I think the area is easier to digest than it may look).
https://neuml.hashnode.dev/introducing-the-semantic-graph
https://github.com/neuml/txtai
Disclaimer: I'm the primary author of txtai
What's awesome about them is that, in my mind, they essentially form the "extractive" analogue to LLMs' "generative" nature.
Semantic Graphs give every single graph theory algorithm a unique epistemological twist given any particular dataset. In my case, I've built and released pre-trained semantic graphs for my debate evidence. I observe that path traversals form "debate cases", and that graph centrality in this case finds the most "generic/universally applicable" evidence. Given a different dataset, the same algorithms will have different interpretations.
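The two interpretations mentioned above (path traversals as "debate cases", centrality as "generic" evidence) can be illustrated on a tiny hand-built graph. In txtai the edges would come from embedding similarity; here they are hard-coded stand-ins, and the node names are hypothetical.

```python
from collections import deque

# Toy semantic graph: nodes are evidence cards, edges connect
# semantically similar ones (hand-written here for illustration)
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}

def shortest_path(start, goal):
    # BFS traversal: the hops between two cards read like a line
    # of argument connecting them, i.e., a "debate case"
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]] - seen:
            seen.add(nxt)
            queue.append(path + [nxt])
    return None

def most_central(g):
    # Highest degree ~ most "generic/universally applicable" evidence
    return max(g, key=lambda n: len(g[n]))

print(shortest_path("D", "C"))  # ['D', 'B', 'C']
print(most_central(graph))      # 'B'
```

Swap in a different dataset and the same two algorithms yield different readings, which is the "epistemological twist" being described.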
What makes txtai so awesome is that it creates a synchronized interface between an underlying vector DB, SQL DB, and a semantic knowledge graph. The flexibility and power this offers compared to other vector DB solutions is simply unparalleled. I have seen zero meaningful competition from a vector DB industry that is flooded with money despite little product differentiation.
Disclaimer: I wrote an NLP paper with dmezzetti as my co-author about semantic graphs: https://aclanthology.org/2023.newsum-1.10.pdf
Most RAG tools seem to start with the LLM and add Vector building and retrieval around it, while this tool seems like it started with Vector / Graph building and retrieval, then added LLM support later.
According to the article, it is either costly (when using OpenAI) or slow (when using open source models). In both cases, predicting the quality of the KG generated by LLMs is hard.
As some others here have pointed out, information extraction and searching with relevant context are the hardest parts of any search system, and it's clear that simply chunking documents and throwing the vectors into a vector DB has limitations, no matter what the vector DB vendors tell you. Just like this article says, I hope that 2024 is the year where we actually get some papers that perform more rigorous evaluations of systems that use vector DBs, graph DBs, or a combination of them for building RAG.