Google's Gemini AI caught scanning Google Drive PDF files without permission: https://news.ycombinator.com/item?id=40965892
Looks like GPT4All[1] and AnythingLLM[2] are worth exploring. There's also the closed-source macOS app RecurseChat[3,4], which appeared on HN a few months ago[5].
[1] https://github.com/nomic-ai/gpt4all
[2] https://github.com/Mintplex-Labs/anything-llm