For one, if you can show that you didn't use the original copyrighted work, then your work is not a derivative, no matter how similar the end results are.
And if the original work was involved, then how it was used and what processes were applied to it are also relevant.
That's why OpenAI employees who did the scraping first-hand are valuable witnesses for those who are suing OpenAI.
Legal processes proceed in a way that is often counter-intuitive to technologists. IMHO you'd gain a better perspective if you actually tried to understand it rather than confidently assuming that what you already know from tech-land applies to law.