Remember that, at least in this country and I believe in all countries who signed onto the Berne Convention, copyright is the default.
If your AI is limited to only training on the paucity of explicitly permissibly licensed/public domain content (and as I think about it more, this would only apply to things that are permissibly licensed without an attribution requirement, which is something there is no meaningful way to do in a model like the one we are discussing) your AI will not be very useful. With that in mind, I would argue that yes, it absolutely is an obvious fact about its nature.