This is just an assertion that you're making. There's no argument here. I'm aware that this is also an assertion that some judges have made.
My claim is that LLMs are not human, therefore when you apply words like "training" to them, you're only doing it metaphorically. It's no more "training" than copying code to a different hard drive is training that hard drive. And it's no more "transformative" than rar'ing or zipping the code, then unzipping it. I can't sell my jpgs of pngs I downloaded from Getty.
I have no idea how LLMs can be considered transformative work that immunizes me from owing the least bit of respect to the source material, but if I sample 2-6 second snatches from 10 different songs, put them through over 9000 filters and blend them into a new work, I owe money to everyone involved. I might even owe money to the people who wrote the filters, depending on the licensing.
> 98.7% unique.
This doesn't mean anything. This is a meaningless arrangement of words. The way we figure out things are piracy is through provenance, not bizarre ad hoc measurements. If I read a book in Spanish and rewrite it in English, it doesn't suddenly become mine even though it's 96.6492387% unique. Not even if I drop a few chapters, add in a couple of my own, and change the ending.