Extracting training data from ChatGPT (https://news.ycombinator.com/item?id=38458683) (126 comments)
And the direct link:
https://not-just-memorization.github.io/extracting-training-...
I'm not much worried about this specific example of information exfiltration, though I have significant concerns over how one might debug something like this for applications working with data more sensitive than email signatures. Put another way, I think we are still in the infancy of this technology, and far more work is needed before we have genuinely useful applications with any real concept of information security relative to their training data sets.
If you Google parts of the old signature, do you get any results?
You will get some matching 50-grams not because the model memorized them but by pure chance. That seems pretty obvious to me.
It makes me wonder whether there were cases where the model output an identical 50-gram that wasn't actually present in its training dataset, for instance in a very structured setting like assembly code, where only a limited number of keywords are typically used.
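The 50-gram check being discussed can be sketched in a few lines. This is a hedged illustration, not the paper's actual pipeline: the function names and the use of plain token lists (instead of a real tokenizer and a deduplicated corpus index) are my own assumptions.

```python
def ngrams(tokens, n=50):
    """Yield every contiguous n-token window as a tuple."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

def memorized_ngrams(output_tokens, corpus_tokens, n=50):
    """Return the n-grams from the model output that also occur
    verbatim in the reference corpus (candidate 'memorized' spans)."""
    corpus_set = set(ngrams(corpus_tokens, n))
    return {g for g in ngrams(output_tokens, n) if g in corpus_set}
```

Note that a hit from this check only shows verbatim overlap; as the comment above points out, in highly constrained text (boilerplate, assembly) an identical 50-gram could in principle arise without memorization.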
Depending on settings, they are also capable of producing a lot of ungrammatical nonsense, but the odds of what it produces are changed considerably by the training.
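The "settings" in question are sampling parameters. A minimal sketch of temperature scaling (illustrative only; this is not ChatGPT's actual decoder, and the function name is mine) shows how one knob reshapes the odds the training has set:

```python
import math
import random

def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Sample a token index from raw logits after temperature scaling.
    Low temperature sharpens the distribution toward the model's trained
    favorites; high temperature flattens it, making unlikely (possibly
    ungrammatical) tokens far more probable."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                         # subtract max for stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):           # inverse-CDF sampling
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

At temperature near zero this degenerates to greedy argmax decoding; cranked high, every token the tokenizer knows becomes plausible, which is where the nonsense comes from.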
https://livingsystems.substack.com/p/the-future-of-data-less...
Why is it a problem if an LLM tells you what it knows?
Are LLMs trained on secret data?
So, yes.
Probably. And probably on copyrighted data as well.