I find the entire premise of this announcement absurd. Fraudulent accounts? They're just accounts. They paid for the access the same as any other. They're accessing Claude just like a human (or *claw) would.
There's no argument against their strategy that doesn't make them complete hypocrites with respect to how they got the model training data in the first place.
Instead of vacuuming petabytes of trash from Common Crawl, you can just take high-quality distillate from a SOTA model and get comparable results. Bad news for anyone betting solely on massive compute clusters and closed datasets.
Lots of people think Anthropic training their own LLM is the same but it really isn’t.
I don’t think I’m the only one feeling some schadenfreude at this news. I suppose it’s ok when you’re a hot Silicon Valley scale-up to slurp up the rest of the world’s data for free and then hire hotshot lawyers to defend you against all the creatives you ripped off, but when it’s the “evil” Chinese doing the same to you it’s a dastardly “attack”?
And now the hypocrisy has come full circle, with complaints about others not respecting their rights!
They publish weights and useful research for everyone to benefit.
I mean, this is incredibly tone-deaf for a company facing multiple lawsuits over where they got their training data from.