Data implies factual information. You cannot copyright factual information.
The fact that I use the word "appalling" to describe the practice of doing this results in some vector relationship between the words. That relationship is the data, the fact, not the writing itself.
There are going to be a bunch of interesting court cases where the court is going to have to backtrack on copyrighting facts. Or we're going to have to get some really odd legal interpretations of how LLMs work (and buy into them). Or we're going to have to change the law (giving everyone else first-mover advantage).
Based on how things have been working, I am betting that it's the last one, because it pulls up the ladder.