> If an LLM knows a song as part of its training data, then it is copyright infringement.
No, it isn't. You can feed whatever you want into your LLM, including copyrighted data. The issues arise when you start reproducing or distributing copyrighted content.
In practice it's mostly the latter: whether the service that Meta/OpenAI offers outputs content that violates copyright. Technically, unauthorized derivative works are a copyright violation too, but if you're not distributing them, you normally have a good fair use argument, and/or nobody ever finds out.