Of course, there's a ton of note-taking systems out there. Org-Mode [1], Obsidian [2], plain .txt, ...
And it's become quite simple to integrate such systems with LLMs.
Whether it's feeding that data to an LLM [3], using an LLM to format and organize notes, or visualizing it and chatting with it as a personal partner. For the latter, there's also a ton of open-source UIs such as Chatbot-ui [4] and Reor [5].
And that's just the tip of the iceberg.
Personally, I haven't been consistent enough with note-taking over the years.
So, I'm really curious to learn more about those of you who have been, and who have implemented such pipelines.
I'm sure there are a ton of cool interaction experiences out there.
[1] https://orgmode.org/ [2] https://obsidian.md/ [3] https://ollama.com/ [4] https://github.com/mckaywrigley/chatbot-ui [5] https://github.com/reorproject/reor
- The audio is preprocessed (chunked) and sent to Whisper to generate a transcript
- The transcript is sent to GPT-4 to generate a summary, action items, and concepts introduced, with additional information on each
- The next meeting’s date/time is added to my calendar
- A chatbot is created that allows me to chat with each session, including playing the role of the therapist and continuing the conversation (with the entire context of what I actually talked about)
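A pipeline like the steps above can be sketched with the OpenAI Python SDK. This is a minimal sketch under assumptions, not the commenter's actual setup: the chunk length, prompts, and file names are placeholders I've invented.

```python
def chunk_spans(total_seconds: float, chunk_seconds: float = 600.0):
    """Split a recording's duration into contiguous (start, end) spans,
    so each uploaded piece stays small enough for Whisper."""
    spans, start = [], 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        spans.append((start, end))
        start = end
    return spans

def summarize_session(transcript: str) -> str:
    """Ask GPT-4 for a summary, action items, and concepts introduced
    (requires OPENAI_API_KEY in the environment)."""
    from openai import OpenAI  # imported here so the pure helper above works without it
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Summarize this session: give a summary, "
                        "action items, and concepts introduced."},
            {"role": "user", "content": transcript},
        ],
    )
    return resp.choices[0].message.content

# Transcribing one audio chunk would look like:
# with open("session_part1.mp3", "rb") as f:
#     text = client.audio.transcriptions.create(model="whisper-1", file=f).text
```

The chatbot step would then feed the full transcript back in as system context for a follow-up chat.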
It’s been exceedingly helpful to be able to review all my therapy sessions this way.
I'm actually the founder of an AI Meeting Bot company - and we're thinking of open-sourcing it so you could run exactly this set-up locally with perfect diarization/recording while also maintaining privacy [1].
I'm currently creating code examples, and just finished the "chat with each session". Would love to know how you implemented it.
This flow could help me improve fluency in her sessions (e.g. she has an expensive hardware translation device that has significant issues auto-translating, since it's missing context a lot).
E.g. when "grieving" is incorrectly translated between Dutch and Polish, it somewhat defeats the purpose of being fluent in your native language.
Reducing the error rate would help a lot.
At the moment it runs on AWS, and we're thinking of open-sourcing it so you could also run it locally and maintain 100% privacy for such conversations.
You'd get speaker diarization, names on top of the recording [2].
[1] https://aimeetingbot.com [2] https://spoke-1.gitbook.io/ai-meeting-bot
Happy to get in touch and have you run it.
I found GPT4ALL (https://gpt4all.io) to have a nice-enough GUI, it runs reasonably quickly on my M1 MacBook Air with 8 GB of RAM, and it can be set up as a completely local solution - not sending your data to the Goliaths.
GPT4ALL has an option to access local documents via the SBert text embedding model (i.e. RAG).
My specific results have been as follows: using Nous Hermes 2 Mistral DPO and SBert, I indexed 153 days of my daily writing (most days I write between 2 and 3 thousand words).
Asking a simple question like "what are the challenges faced by the author?" provides remarkable, almost spooky results (which I won't share here) - which in my opinion are spot-on regarding my own challenges over that period - and SBert provides references to the documents it used to generate the answer. Options are available to reference an arbitrary number of documents; however, the default is 10. Ideally I'd like to have it reference all 153 documents in the query - I'm not sure if it's a RAM or a token issue, but increasing the number of documents referenced has resulted in machine lock-ups.
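The lock-ups when raising the document count sound like a context-budget problem: every retrieved snippet gets appended to the prompt, so the window fills long before 153 documents. A naive sketch of what a snippet selector under a token budget might look like (the greedy policy and the characters-per-token approximation here are my assumptions, not GPT4All's internals):

```python
def select_snippets(ranked_snippets, max_tokens=2000):
    """Greedily take the highest-ranked snippets until the prompt
    budget is spent; whatever doesn't fit is simply dropped.
    Token count is roughly approximated as 1 token per 4 characters."""
    chosen, used = [], 0
    for snippet in ranked_snippets:  # assumed sorted best-first
        cost = max(1, len(snippet) // 4)
        if used + cost > max_tokens:
            break
        chosen.append(snippet)
        used += cost
    return chosen
```

Under a scheme like this, raising the document count past what the model's context (and your RAM, for a local model) can hold is exactly where things would start to fail.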
Anyhow - that's my experience - hope it's helpful to someone.
This is regular embeddings + LLM.
At the end of the day, you are basically just adding a preprompt to a search. Not to mention, the Mistral models are barely useful for logic.
I'm not really sure what you are getting out of it. I'm wondering if you are reading some mostly generic Mistral output with a few words from your pre-prompt/embedding.
I haven't yet observed it being completely incorrect - I keep the queries simple without negation.
>This might seem impressive because of the subjectiveness.
It's surprising how it can summarise my relationship with another person, for example - if I ask "who is X?" it will deliver quite a succinct summary of the relationship - using my own words at times.
>I'm not really sure what you are getting out of it.
Mostly it's useful for self-reflection; it's helped me to see challenges I was facing from a more generalised perspective - particularly in my relationships with others. I'm also terribly impressed by the technology - being able to query in natural language and receive a sensible, often insightful response feels like the future to me.
When you say Sbert you mean the GPT4All LocalDocs plugin?
One thing I haven’t worked out yet is the agent reliably understanding if it should do a “point retrieval query” or an “aggregation query.”
Point query: embed and do vector lookup with some max N and distance threshold. For example: “Who prepared my 2023 taxes?”
Aggregation query: select a larger collection of documents (1k+) that possibly don’t fit in the context window and reason over the collection. “Summarize all of the correspondence I’ve had with tax preparation agencies over the past 10 years”
The latter may be solved with just a larger max N and larger context window.
Almost like it’s a search lookup vs. a map reduce.
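One crude way to make that routing decision is sketched below. The keyword heuristic is purely a placeholder (in practice you'd probably ask the LLM itself to classify the query), and the map-reduce path shows why 1k+ documents never need to share one context window:

```python
AGG_CUES = ("summarize", "all of", "over the past", "every", "history of")

def classify_query(query: str) -> str:
    """Naive router: aggregation cues -> map-reduce over many docs,
    otherwise a point lookup against the vector index."""
    q = query.lower()
    return "aggregation" if any(cue in q for cue in AGG_CUES) else "point"

def map_reduce_summaries(docs, summarize):
    """Aggregation path: summarize each doc independently (map),
    then summarize the partial summaries (reduce)."""
    partials = [summarize(d) for d in docs]
    return summarize("\n".join(partials))
```

For very large collections the reduce step itself can be batched recursively, which is the "map reduce" framing exactly.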
Mind sharing how you set up your RAG pipeline and which (presumably FOSS) components you incorporated?
I'm not a good photographer, but I have taken tens of thousands of photos of my family. I would love to provide a prompt for a specific day and persons and have it create a photo that I never was able to take. I don't mind that it's not "real" because I find photography to be philosophically unreal as it is. I want it to look good, and inspire my mind to recreate the day however it can imagine.
And I want to do it locally, without giving away my family's data and identity.
However, I find it challenging to achieve on MacBooks, despite all the neural-core horsepower I have. If anyone has achieved this with non-NVIDIA setups I'd love to hear!
Completely unrelated, but I just read a sci-fi story where a technology was developed that could revive dead bodies for a short while so they could pose for family photos that hadn't been taken before the person passed away.
https://clarkesworldmagazine.com/liu_03_23/
Obviously that's going way overboard for something AI can do today, but the author probably wrote the story before that was possible.
That being said, I’m trying to document as much of my life as I can in anticipation of such programs existing in the near future. I’m not going overboard, but for example: I wouldn’t normally keep a personal diary, but now I try to jot down something every day - my thought processes on things, what actions were taken and why.
I’m looking forward to a day where I have an AI assistant (locally hosted and under my control of course) who can help me with decision-making based on my previous actions. Would be neat to compare/contrast how I do things now, compared to the future me.
https://www.lospessore.com/13/07/2023/una-chatbot-per-contin...
Since presumably "all writings" refers to all his writings during his lifetime, I'd hope it can account for those times in his life at which he changed his mind on certain topics?
Khoj will allow you to plug in your Obsidian vault, any plaintext files on your machine, or a Notion workspace. After you share the relevant data, it creates embeddings and uses them for RAG, so you get appropriately contextual responses from your LLM.
This is the best place to start for self-hosting: https://docs.khoj.dev/get-started/setup
Have you looked into the OpenAI APIs? They make it relatively easy to do, assuming you have some limited programming knowledge.
Just playing around for now, but it makes sense to have a runnable example for our users too :) [2].
I mean calling the embeddings API and then having software locally that finds and appends documents to your queries.
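That flow - embed your documents once, embed each query, rank by cosine similarity locally, and prepend the winners to the prompt - can be sketched in a few lines. The commented-out call shows the actual OpenAI embeddings endpoint; the ranking itself is plain Python, and the model name is just one reasonable choice:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_docs(query_vec, doc_vecs, docs, k=3):
    """Rank locally stored documents against the query embedding and
    return the k best, ready to be appended to the prompt."""
    ranked = sorted(zip(doc_vecs, docs),
                    key=lambda pair: cosine(query_vec, pair[0]),
                    reverse=True)
    return [doc for _, doc in ranked[:k]]

# Getting the vectors (requires OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# vec = client.embeddings.create(model="text-embedding-3-small",
#                                input="some note text").data[0].embedding
```

Everything after the embedding call stays on your machine, which is the appeal of this split.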
Your blog post is really neat on top - thanks for sharing
Presumably, I have more than enough messages from me along with responses from others to chat with a version of myself that bears an incredible likeness to how I speak and think. In some cases, I'd expect to be able to chat with an LLM of a given contact to see how they'd respond to various questions as well.
It must run locally and require no network requests. I can run it on an M2 with 24 GB or an M3 with 36 GB.
My email is in my profile here.
Anyone had success RAG-ing a chat history??
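One wrinkle with chat history specifically is that embedding individual messages loses context. A common approach (my assumption, not a claim about any particular tool) is to chunk the log into overlapping sliding windows of messages before embedding:

```python
def window_chunks(messages, window=8, stride=4):
    """Group (speaker, text) messages into overlapping windows so each
    embedded chunk carries conversational context, not lone one-liners."""
    chunks = []
    for start in range(0, len(messages), stride):
        group = messages[start:start + window]
        if not group:
            break
        chunks.append("\n".join(f"{who}: {text}" for who, text in group))
        if start + window >= len(messages):
            break  # last window already covers the tail
    return chunks
```

Each resulting chunk then gets embedded and retrieved like any other document; the overlap (stride < window) keeps exchanges that straddle a boundary from being split.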
It is not "training" a model but works pretty great.
I don't use an org-roam note system but I've been working on a similar and highly opinionated note system that I'm always making tools for. And I'm always interested in seeing people's ideal note systems.
my crude WIP Obsidian / Markdown note RAG tool: https://github.com/bs7280/markdown-embeddings-search
I have a ton of databases in Notion (with all my team's conversation transcripts, meeting to-dos, etc.), and global AI search just isn't there.
I haven't found a way there (but have elsewhere using open source) to create a kick-ass search.