If it's judged as fair use, then yes. And then it's not flouting anything.
Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.
For example, nonfiction authors already have their work "just taken" when reviews describe the main points of their book without paying them a cent. The justification is that it's for the greater good, and rights are limited.
[1] https://www.404media.co/judge-rules-training-ai-on-authors-b...
That's a rather bastardized and twisted representation of copyright and fair use.
The "whole point" of copyright was to promote the authorship of original creative works by legally protecting the financial income of those authors. The "whole point" of fair use was to make exceptions in cases where it's clear that the usage doesn't create a market substitute or deprive original authors of their income.
The end goal of LLMs is to ingest all of that original content and reproduce it with expert-level accuracy, promising to be the know-it-all, end-all product. If the wildly optimistic predictions of LLM proponents turn out to be correct, then people will never buy a book again; they will have no reason to. And this is precisely what copyright was designed to protect authors against.
And under those circumstances, your opinion is that copyrighted books should continue to exist, with full legal protection?
How could anyone, including the authors, possibly benefit from an obsolete paradigm like that? At that hypothetical point, your attachment to legacy copyright law would arguably hold back human progress as a whole, not just prevent a few greedy corporations from training models on illegally downloaded books.
We should absolutely have a discussion about modernizing copyright (and patent!) protections. But it has to be done through a democratic process; companies shouldn't be allowed to simply ignore laws that are inconvenient to their business model.
> At that hypothetical point, your attachment to legacy copyright law would arguably hold back human progress as a whole
There won't be any progress if nobody is getting paid for their work. Either copyright stands and LLMs aren't allowed to train without compensation, or they get an exemption and there will be nothing left to train on in a few years.
I'll stop you right there - I really don't think that applies at all. Does 'society' really benefit when the whole thing is a funnel for enormous amounts of wealth to go to already-gigantic companies like Microsoft?
If you don't like it, there's a process for changing how it works, but don't expect an easy path to success. Various people will object, and will have to be won over to your way of thinking.
Except the converse is true. Copyright law today governs how fair use works, and likewise how material can be obtained, licensed, etc. To explicitly allow what you're suggesting would require changing copyright law.
How do you think masked language models work?
Your license can only operate with what copyright allows you to withhold initially.
A license that bans AI training cannot be enforced. It is meaningless, in the same way that you can't publish a book under a license forbidding readers from writing reviews of it.
Fair use cannot be restricted by license like that.
(You can enter into individual contracts with people, which is how terms like NDAs work, but those actually have to be signed, and you can't impose them on public information like published writing.)