It isn’t clear at all that there’s any infringement going on at all, except in cases where AI output reproduces copyrighted content or content that is sufficiently close to copyrighted content to constitute a derivative work. For example, if you told an LLM to write a Harry Potter fanfic, that would be infringement - fanfics are actually infringing derivative works that usually get a pass because nobody wants to sue their fanbase.
It’s very unlikely simply training an LLM on “unlicensed” work constitutes infringement. It could possibly be that the model itself, when published, would represent a derivative work, but it’s unlikely that most output would be unless specifically prompted to be.