Extracting training data from ChatGPT (https://news.ycombinator.com/item?id=38458683) (126 comments)
And the direct link:
https://not-just-memorization.github.io/extracting-training-...
I'm not much worried about this specific example of information exfiltration, though I have significant concerns over how one might debug something like this for applications working with data more sensitive than email signatures. Put another way, I think we are still in the infancy of this technology, and far more work is needed before we have genuinely useful applications with any real concept of information security relative to their training data sets.
If you Google parts of the old signature, do you get any results?
You will get some matching 50-grams not because the model memorized them but by pure chance. That seems pretty obvious to me.
It makes me wonder whether there were cases where the model output an identical 50-gram that wasn't actually present in its training dataset, for instance in a very structured setting like assembly code, where only a limited number of keywords are typically used.
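The 50-gram check being discussed can be sketched in a few lines. This is a hedged illustration, not the paper's actual pipeline: the function names and the use of plain token lists (instead of a real tokenizer and a deduplicated corpus index) are my own assumptions.

```python
def ngrams(tokens, n=50):
    """Yield every contiguous n-token window as a tuple."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

def memorized_ngrams(output_tokens, corpus_tokens, n=50):
    """Return the n-grams from the model output that also occur
    verbatim in the reference corpus (candidate 'memorized' spans)."""
    corpus_set = set(ngrams(corpus_tokens, n))
    return {g for g in ngrams(output_tokens, n) if g in corpus_set}
```

Note that a hit from this check only shows verbatim overlap; as the comment above points out, in highly constrained text (boilerplate, assembly) an identical 50-gram could in principle arise without memorization.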
Depending on settings, they are also capable of producing a lot of ungrammatical nonsense, but the odds of what it produces are changed considerably by the training.
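The "settings" in question are sampling parameters. A minimal sketch of temperature scaling (illustrative only; this is not ChatGPT's actual decoder, and the function name is mine) shows how one knob reshapes the odds the training has set:

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample a token index from raw logits after temperature scaling.
    Low temperature sharpens the distribution toward the model's trained
    favorites; high temperature flattens it, making unlikely (possibly
    ungrammatical) tokens far more probable."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                         # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):           # inverse-CDF sampling
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

At temperature near zero this degenerates to greedy argmax decoding; cranked high, every token the tokenizer knows becomes plausible, which is where the nonsense comes from.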
https://livingsystems.substack.com/p/the-future-of-data-less...
Why is it a problem if an LLM tells you what it knows?
Are LLMs trained on secret data?
So, yes.
Probably. And probably on copyrighted data as well.