>Yes, and who is supposed to run that code?
People who are honest and ethical?
And/or groups that don't want to risk getting sued (your [1], [2], [3])?
>> Name two entities that were asked to stop using a given individual's images that failed to stop using them after the stop request was issued.
>GitHub? OpenAI?[1] Stable Diffusion?[2] LAION?[3] Why do you think there are currently multiple high-profile lawsuits ongoing about exactly that topic?
Because:
a) (Some) American lawyers (AKA "Bar Association Members") are sue-happy?
b) Various governments / deep states (foreign and domestic) / dark-money groups / paid (and highly biased) political activists want to see if they can get new draconian laws passed (whilst believing their actions to be super-patriotic to their respective countries!) -- or at least court precedents that move in that direction?
c) There's big money at stake, all the way around? (https://www.biblegateway.com/passage/?search=1%20Timothy%206...)
d) The alleged "victims" are "playing the victim card"?
(https://tvtropes.org/pmwiki/pmwiki.php/Main/PlayingTheVictim...) (Note that as a theory, this pairs well with (a)!)
(How much revenue would they be losing if their net income from the artwork was $0? Also, wouldn't such high-profile cases give the artists a ton of free advertising? The defendant companies should counter-sue for giving the plaintiff artists what amounts to free publicity for their artwork -- publicity so valuable that they couldn't buy it with all of the Google advertising credits in the world!)
>Besides, that's not how things work. Training a foundation model takes months and currently costs a fortune in hardware and power - and once the model is trained, there is, as of now, no way to remove individual images from the model without retraining.
>"without retraining"...
Meditate on that one for a moment...
>So in practical terms it's impossible to remove an image if it has already been trained on.
In practical terms -- just retrain the model -- sans ("without") the contested images!
The models will need to be updated every couple of months anyway to include new public data from the web!
Create a list of images NOT to include in the next run (see above, "no-ai.txt" -- good suggestion, incidentally!) -- and then simply leave them out of that run! (A rough sketch of that filtering step follows below.)
It's not Rocket Science! :-)
(Also, arguably Elon Musk doesn't think that "Rocket Science" is in fact as hard as "Rocket Science" is purported to be -- but that's a separate debate! <g>)
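A minimal sketch of what that filtering step could look like, in Python -- assuming a hypothetical "no-ai.txt" opt-out file with one URL or URL prefix per line (the filename, format, and function names are my assumptions for illustration, not an existing standard):

    # Hypothetical sketch: drop opted-out images from the next training run.
    # Assumed "no-ai.txt" format: one URL or URL prefix per line, "#" starts a comment.

    def load_optout_entries(path="no-ai.txt"):
        """Read the assumed opt-out file into a list of URL prefixes."""
        entries = []
        with open(path, encoding="utf-8") as f:
            for raw in f:
                line = raw.strip()
                if line and not line.startswith("#"):
                    entries.append(line)
        return entries

    def filter_training_urls(candidate_urls, optout_entries):
        """Keep only the candidate image URLs that match no opt-out entry."""
        kept = []
        for url in candidate_urls:
            if any(url == entry or url.startswith(entry) for entry in optout_entries):
                continue  # owner asked not to be included; leave it out of this run
            kept.append(url)
        return kept

    if __name__ == "__main__":
        candidates = [
            "https://example.com/art/landscape-01.png",
            "https://example.com/art/portrait-02.png",
        ]
        optout = ["https://example.com/art/portrait-02.png"]  # would come from no-ai.txt
        print(filter_training_urls(candidates, optout))
        # -> ['https://example.com/art/landscape-01.png']

The filtering itself is trivial; the expensive part is the retraining run, and the genuinely hard part is where that opt-out list comes from -- which is exactly the question below.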
>So the better question would be, name two entities who have ignored an artist's request to not include their image when they encountered it the first time. It's still a trick question though because the point is that scraping happens in private - we can't know which images were scraped without access to the training data. The one indication that it was probably scraped is if a model manages to reproduce it verbatim - which is the basis for some of the above lawsuits.
Explain to me, from the point of view of an AI company, how that company is supposed to know ahead of time NOT to include a given image from the web. (And thus not break the law -- copyright law, at least -- and thus not incur the lawsuits and all the chaos that apparently follows such an act?)
How is the AI company supposed to know, ahead of time, that a given image on the web is not to be included?
How please?
Because you see, that's the root of the problem you are trying to solve.
In fact, let me ask you a better question...
How can an arbitrary Internet user -- not a big, legally powerful AI company, but an arbitrary small-fry Internet user -- know ahead of time that the artist who created a given image exposed to the public via the Web (or the intellectual/artistic property holder) does NOT want that image to be used for specific purposes?
Because, well, I don't know of any easily parsable, easily understandable standard for that on the Web currently...
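(For the sake of argument, here is a rough Python sketch of what a consumer-side check could look like if such signals existed -- assuming a "noai" directive carried in an X-Robots-Tag HTTP header or a robots meta tag. That directive has been floated by some platforms but is not, as far as I know, a ratified standard, so treat every name here as illustrative.)

    import re

    def ai_use_disallowed(headers, html_text):
        """Return True if the page appears to signal "do not use for AI training".

        Assumes two hypothetical (non-standardized) signals:
          - an HTTP header like:  X-Robots-Tag: noai
          - a meta tag like:      <meta name="robots" content="noai">
        """
        # Header names are case-insensitive in HTTP.
        for name, value in headers.items():
            if name.lower() == "x-robots-tag" and "noai" in value.lower():
                return True

        # A crude regex is enough for a sketch; a real crawler would use an HTML parser.
        meta_pattern = re.compile(
            r'<meta\s+[^>]*name=["\']robots["\'][^>]*content=["\']([^"\']*)["\']',
            re.IGNORECASE,
        )
        for content in meta_pattern.findall(html_text):
            if "noai" in content.lower():
                return True

        return False

    if __name__ == "__main__":
        page = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
        print(ai_use_disallowed({"X-Robots-Tag": "noindex"}, page))  # True, via the meta tag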
So, to recap, the question is:
How is everybody (humans and machines) to know the unambiguous, easily parsable, easily understandable uses that the artist (or intellectual/artistic property holder) of an image wishes to permit for that image?
And how is everybody to easily know the disallowed uses?
That might be a better definition of the problem that people are actually trying to solve...
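(To make "unambiguous, easily parsable, easily understandable" concrete: here is one purely hypothetical shape such a declaration could take -- a small machine-readable policy published alongside the image, plus the check that anyone, human or machine, could run against it. Every field name and value below is an invention for illustration, not an existing standard.)

    import json

    # Hypothetical per-image usage declaration the rights holder could publish,
    # e.g. next to the image as "portrait-02.png.usage.json". Entirely illustrative.
    EXAMPLE_DECLARATION = """
    {
      "image": "https://example.com/art/portrait-02.png",
      "rights_holder": "Jane Example",
      "permitted_uses": ["personal-viewing", "linking", "search-indexing"],
      "prohibited_uses": ["ai-training", "commercial-redistribution"]
    }
    """

    def is_use_permitted(declaration_json, intended_use):
        """Check an intended use against the declared policy.

        Anything not explicitly permitted is treated as prohibited here; that
        default is itself a policy choice a real standard would have to settle.
        """
        policy = json.loads(declaration_json)
        if intended_use in policy.get("prohibited_uses", []):
            return False
        return intended_use in policy.get("permitted_uses", [])

    if __name__ == "__main__":
        print(is_use_permitted(EXAMPLE_DECLARATION, "ai-training"))      # False
        print(is_use_permitted(EXAMPLE_DECLARATION, "search-indexing"))  # True

Parsing something like that is trivial; getting everyone to publish it -- and getting scrapers to honor it -- is the actual problem.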