> User agents should identify themselves
There is no rule requiring this, and many user agents exist _specifically to avoid being identified_. See Tor Browser and other privacy-centric user agents.
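To make that concrete, here's a rough sketch of how identification works at the HTTP level: the `User-Agent` header is just a string the client chooses to send, and nothing in the request itself verifies it. The `build_request` helper, the example URL, and the header value are made up for illustration; only the standard-library `urllib.request.Request` is real.

```python
from urllib.request import Request


def build_request(url, user_agent=None):
    """Build (but do not send) a GET request, optionally with a User-Agent.

    Hypothetical helper for illustration only.
    """
    headers = {}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return Request(url, headers=headers)


# No User-Agent set on the request object. Note that urllib's default
# opener would fill in its own default value at send time.
unset = build_request("https://example.com/")

# Any string the client likes; this value is invented for the example.
claimed = build_request("https://example.com/", "Mozilla/5.0 (hypothetical browser)")

# urllib stores header names capitalized as "User-agent".
print(unset.has_header("User-agent"))
print(claimed.get_header("User-agent"))
```

Nothing is sent over the network here. The point is only that the value is whatever the client puts there.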
> A crawler is not a User agent - it's an agent for Brave
You know, I thought "what does Wikipedia have to say on this matter?" and sure enough:
> Examples include all common web browsers, such as Google Chrome, Mozilla Firefox, and Safari, as well as some email readers, command-line utilities like cURL, and arguably headless services that power part of a larger application, such as a web crawler.
I can't even make that up.
> just because something is publicly available doesn't mean anyone can redistribute it
You're confusing reselling content with providing access to it. By your logic, caching proxy servers would be illegal on copyright grounds. The physical act of downloading files necessarily creates copies of the data at every step of the journey from the source server to you. There's a material difference between paying someone for a copy of some content and paying someone to fetch content on your behalf. Nothing in copyright law specifically requires that the person physically acquiring the content be the one who ends up consuming it.