Is there a deep searcher that can also use local LLMs like those hosted by Ollama and LM Studio?
From a quick glance, this project doesn't seem to use tool/function calling, streaming, structured-output enforcement, or any other "fancy" API features, so chances are it will just work, although I have some reservations about output quality, especially with smaller models.
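Since only plain chat completions are needed, pointing the project at a local server is mostly a matter of swapping the base URL: Ollama and LM Studio both expose OpenAI-compatible endpoints (by default at ports 11434 and 1234 respectively). A minimal stdlib sketch, assuming a hypothetical model name (`llama3.1`) that your local server may or may not have loaded:

```python
import json
import urllib.request

def build_chat_request(prompt, model="llama3.1"):
    """Plain chat-completion payload: no tools, no streaming, no format enforcement."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt, base_url="http://localhost:11434/v1", model="llama3.1"):
    """POST to an OpenAI-compatible endpoint.

    Works the same against Ollama (port 11434) or LM Studio (port 1234),
    since nothing beyond a basic completion is requested.
    """
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_chat_request(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Swapping `base_url` to `http://localhost:1234/v1` targets LM Studio instead; everything else stays the same.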
This version appears to show off a vector store for documents generated from a web crawl (the writer is a vector-store-as-a-service company).
[1] https://github.com/huggingface/smolagents/tree/main/examples...
I think the biggest one is the goal: HF's is to replicate the performance of Deep Research on the GAIA benchmark, whereas ours is to teach agentic concepts and show how to build research agents with open-source tools.
Also, we go into the design in a lot more detail than HF's blog post does. On the design side, HF uses code writing and execution as a tool, whereas we use prompt writing and calling as a tool. We do an explicit breakdown of the query into sub-queries, sub-sub-queries, and so on, whereas HF uses a chain of reasoning to decide what to do next.
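The explicit-breakdown approach can be sketched as a small recursion: ask the model to split a question, then split each sub-question, and search only the leaves. This is a toy illustration, not the actual implementation; `llm` here is a stand-in for whatever model call you use, and the prompt wording and depth limit are assumptions.

```python
def decompose(query, llm, depth=0, max_depth=2):
    """Recursively break a research query into sub-queries.

    `llm` is any callable mapping a prompt string to a newline-separated
    list of sub-questions (a stand-in for a real model call).
    Returns a tree of (query, children) pairs.
    """
    if depth >= max_depth:
        return (query, [])
    prompt = (
        "Break this research question into 2-4 narrower sub-questions, "
        f"one per line:\n{query}"
    )
    subs = [line.strip() for line in llm(prompt).splitlines() if line.strip()]
    return (query, [decompose(s, llm, depth + 1, max_depth) for s in subs])

def leaves(tree):
    """Collect the leaf queries -- these are what actually gets searched."""
    query, children = tree
    if not children:
        return [query]
    return [q for child in children for q in leaves(child)]
```

A chain-of-reasoning agent, by contrast, would decide its next action one step at a time instead of materializing the whole tree up front.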
I think ours is the better approach for producing a detailed report on an open-ended question, whereas HF's is better for answering a specific, challenging question in short form.
For now we’ve just managed to optimize how quickly we download pages, but haven’t found an API that actually caches them. Perhaps companies are concerned that they’ll be sued for it in the age of LLMs?
The Brave API provides ‘additional snippets’, meaning that you at least get multiple slices of the page, but it’s not quite a substitute.
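Absent an API that caches pages for you, the obvious workaround is to cache them yourself in front of whatever downloader you already have. A minimal sketch under assumed names (`PageCache`, an injected `fetch` callable) -- not any particular project's implementation:

```python
import hashlib
from pathlib import Path

class PageCache:
    """Cache fetched pages on disk, keyed by a hash of the URL.

    `fetch` is any callable url -> str (your downloader of choice);
    repeated requests for the same URL read the local copy instead
    of hitting the site again.
    """

    def __init__(self, fetch, cache_dir="page_cache"):
        self.fetch = fetch
        self.dir = Path(cache_dir)
        self.dir.mkdir(exist_ok=True)

    def _path(self, url):
        # Hash the URL so arbitrary characters never leak into filenames.
        return self.dir / hashlib.sha256(url.encode("utf-8")).hexdigest()

    def get(self, url):
        path = self._path(url)
        if path.exists():
            return path.read_text(encoding="utf-8")
        text = self.fetch(url)
        path.write_text(text, encoding="utf-8")
        return text
```

A real version would add expiry and respect robots/no-archive signals, which is exactly where the legal worry above comes in.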
I wrote my own implementation using various web search APIs and a Puppeteer service to download individual documents as needed. It wasn't that hard, but I do get blocked by some sites (Reddit, for example).
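One way to handle the blocking is to detect block responses and route those domains through the headless-browser fetcher instead of plain HTTP. A hedged sketch -- the status codes and block-page markers are illustrative heuristics, and both fetchers are stand-in callables:

```python
from urllib.parse import urlparse

BLOCK_STATUSES = {403, 429}
BLOCK_MARKERS = ("captcha", "access denied", "unusual traffic")  # illustrative

def looks_blocked(status, body):
    """Heuristic: does this response look like a bot block rather than content?"""
    if status in BLOCK_STATUSES:
        return True
    head = body[:2000].lower()
    return any(marker in head for marker in BLOCK_MARKERS)

class FetchRouter:
    """Try the plain HTTP fetcher first; fall back to a headless-browser
    fetcher (e.g. a Puppeteer service) for domains that block plain requests.

    Both fetchers are callables url -> (status_code, body_text).
    """

    def __init__(self, plain_fetch, browser_fetch):
        self.plain_fetch = plain_fetch
        self.browser_fetch = browser_fetch
        self.blocked_domains = set()

    def get(self, url):
        domain = urlparse(url).netloc
        if domain not in self.blocked_domains:
            status, body = self.plain_fetch(url)
            if not looks_blocked(status, body):
                return body
            # Remember the block so we go straight to the browser next time.
            self.blocked_domains.add(domain)
        status, body = self.browser_fetch(url)
        return body
```

Some sites block headless browsers too, so even this fallback isn't a complete answer.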
Google and Bing removed their cache features when LLMs started taking off – as I said in a sibling comment, I wonder if they felt that that regime was finally going to be challenged in court as people tried to protect their data.
That being said, "can't present the full document due to copyright" seems at odds with all of the above examples existing for years.
We started off with arXiv papers to test out the product -- would love to get feedback :)
https://milvus.io/blog/i-built-a-deep-research-with-open-sou...
https://milvus.io/blog/introduce-deepsearcher-a-local-open-s...
https://gist.github.com/zitterbewegung/086dd344d16d4fd4b8931...
The QuickStart had a good response. [1] https://gist.github.com/zitterbewegung/086dd344d16d4fd4b8931...
It could be useful for comparing reports built using DeepSeek R1 vs. GPT-4o and other large models. Since the code is open source, it might surface the limitations of different LLMs much faster and help develop better reasoning loops in future prompts for specific needs. Really interesting stuff.
Search is not the problem. What to search is!
Using a reasoning model, it is much easier to split the task and focus on what to search for.