Show HN: Embeddings Solution for Personal Journal

3 points · tony_codes · 2y ago · 4 comments

It occurred to me that a personal diary/journal is one of the most interesting data sets for a vector embedding / chat context product. People express their hopes, dreams, fears and much more in their journals. A psychiatrist, psychologist, or therapist could draw many insights from reading one.

That's what I'm trying to build at Jumble Journal. As a first feature, we let people chat with their past journal entries.

Technology

For vector embeddings and similarity search, we use ChromaDB. It's open source and has great performance. For something small-scale, I didn't want to get locked into a hosted vector DB service like Pinecone.
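To make "similarity search" concrete, here is a minimal sketch of the idea in plain Python. It is not ChromaDB's API and not Jumble's code: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the names `embed`, `cosine_similarity`, `top_k`, and the sample entries are all made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, entries: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k entries most similar to the query, best match first."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, embed(e)), e) for e in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical journal entries, invented for this example.
journal_entries = [
    "I felt anxious about the job interview tomorrow",
    "Went hiking with friends and felt calm and happy",
    "Worried that the interview went badly, could not sleep",
]

results = top_k("how did I feel about the interview", journal_entries, k=2)
for score, entry in results:
    print(f"{score:.2f}  {entry}")
```

A vector DB does the same ranking over learned embeddings, with an index so it doesn't have to score every entry. The retrieved entries are what would get passed to the chat model as context.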

The DB is hosted on an EC2 instance. The backend API is serverless, built on AWS Lambda and API Gateway. We back up all embeddings to S3 in case of failure.
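For anyone curious what the Lambda side of a setup like this could look like, here is a rough sketch, assuming the API Gateway Lambda proxy integration (request JSON arrives as a string in `event["body"]`, and the response is a dict with `statusCode` and `body`). The `question` field, `make_handler`, and `fake_search` are hypothetical names for illustration; in a real deployment the search function would call the vector DB on the EC2 instance.

```python
import json
from typing import Callable


def make_handler(search: Callable[[str, int], list[str]]):
    """Build a Lambda handler around an injected search function (hypothetical)."""

    def handler(event: dict, context: object) -> dict:
        # API Gateway proxy integration delivers the request body as a string.
        try:
            body = json.loads(event.get("body") or "{}")
        except json.JSONDecodeError:
            return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

        question = body.get("question") if isinstance(body, dict) else None
        if not isinstance(question, str) or not question.strip():
            return {"statusCode": 400, "body": json.dumps({"error": "missing 'question'"})}

        # Ask the search backend for the three closest journal entries.
        matches = search(question, 3)
        return {
            "statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps({"matches": matches}),
        }

    return handler


def fake_search(question: str, k: int) -> list[str]:
    """Placeholder for a real vector DB query; returns a canned result."""
    return [f"entry matching: {question}"][:k]


handler = make_handler(fake_search)
response = handler({"body": json.dumps({"question": "When was I happiest?"})}, None)
print(response["statusCode"], response["body"])
```

Injecting the search function keeps the handler testable without a running database, which helps when the DB lives on a separate instance.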

I am really happy to discuss the technology stack and the feature itself.

Links: https://jumblejournal.org and https://www.trychroma.com/

4 comments

tony_codes (OP) · 2y ago

The formatting is a bit off.

The web app is here: https://jumblejournal.org

The DB used is here: https://www.trychroma.com/

rbanffy · 2y ago

The scariest part is putting someone's innermost thoughts in a place where someone else could compute or use their embeddings.

tony_codes (OP) · 2y ago

True, we host the vector DB in our own VPC. Obviously it still requires trust between customers and Jumble, though. It's tricky: these technologies are cheap as a service but very expensive to run as private instances.

enkrateiaLucca · 2y ago

Awesommmeeeeee
