What I found is that a huge number of custom GPTs have pirated PDFs and ebooks uploaded as knowledge files. For example: https://x.com/TechWithElias/status/1733448828542722102?s=20
Additionally, I found an unbelievable level of sloppiness in some of the knowledge files. Authors literally save whole HTML pages (ugly inline JavaScript and all), 200 KB+ each, and upload them as knowledge files. Even if RAG is employed, you are all but guaranteed terrible performance, because most of each file is markup and script noise rather than content.
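For what it's worth, cleaning a saved page before uploading it is not hard. Here is a minimal sketch using only Python's standard library; the class and function names (TextExtractor, html_to_text) and the sample page are my own made-up illustration, and a real page would likely need more careful handling.

```python
from html.parser import HTMLParser

# Tags whose contents are never useful as knowledge text.
SKIP_TAGS = {"script", "style", "noscript"}
# Tags after which a line break keeps the text readable.
BLOCK_TAGS = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects visible text and drops script/style contents (illustrative)."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.chunks.append("\n")

    def handle_data(self, data):
        # Only keep text that is not inside a skipped tag.
        if self.skip_depth == 0:
            self.chunks.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    # Collapse runs of whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


# Hypothetical saved page with inline JavaScript.
sample = (
    "<html><head><script>var x = 1;</script></head>"
    "<body><h1>Title</h1><p>Hello <b>world</b>.</p></body></html>"
)
clean = html_to_text(sample)
```

Uploading the cleaned text instead of the raw page means whatever retrieval runs on top of it at least has actual content to work with.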
Hear me out:
This practice really reinforces my assumption that the knowledge search in OpenAI's custom GPTs uses no RAG whatsoever and is instead just a basic regular-expression search for keywords from your prompt. The relevant 'pages' are then extracted and fed in as context. I doubt OAI uses vector DBs or any proper RAG pipeline.
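To make that guess concrete, here is a minimal Python sketch of what such a keyword search could look like. To be clear, this is purely my assumption for illustration and not OpenAI's actual implementation; the function name keyword_page_search, the 4-character keyword cutoff, and the sample pages are all hypothetical.

```python
import re


def keyword_page_search(pages, prompt, top_k=2):
    """Rank pages by how often prompt keywords appear in them (hypothetical)."""
    # Assumption: every word of 4+ characters in the prompt is a keyword.
    keywords = set(re.findall(r"[a-z0-9]{4,}", prompt.lower()))
    if not keywords:
        return []
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in sorted(keywords)) + r")\b"
    )
    scored = []
    for index, page in enumerate(pages):
        hits = pattern.findall(page.lower())
        if hits:
            scored.append((len(hits), index))
    # Highest hit count first; ties keep the earlier page first.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [pages[index] for _, index in scored[:top_k]]


# Made-up 'pages' standing in for chunks of a knowledge file.
pages = [
    "Refund policy: refunds are issued within 30 days.",
    "Shipping takes 5 to 7 business days.",
    "To get your money back, contact support.",
]
context = keyword_page_search(pages, "How do refunds work?")
```

Note that the third page is clearly about refunds but shares no keyword with the prompt, so a search like this never returns it. Catching that kind of match is exactly what an embedding-based RAG pipeline is for.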
Finally:
What a new security nightmare this is going to be going forward: managing all those uploaded files, exploitable files, 'secrets' passed through custom GPTs, drops and whatnot.
Food for thought and just my two cents.