
sangnoir 2y ago

> With relatively few exceptions, I can probably accomplish what I use a search engine for manually, but with much more time and effort. So I'm paying for a tool to assist me with my use of the web. A "user agent", you might even say

1. User agents should identify themselves

2. A crawler is not a User agent - it's an agent for Brave

> I don't think there's any court at this point that would back you up that freely published content annotated with full provenance cannot be scraped and published for a fee.

You can't end-run copyright like this: just because something is publicly available doesn't mean anyone can redistribute it. Look at the legal issues & cases relating to Library Genesis.


bastawhiz 2y ago

> User agents should identify themselves

No rule requires this, and many user agents exist _specifically to not be identified_. See Tor and other privacy-centric user agents.
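To make the point concrete, here is a minimal sketch of how a client decides what goes in its `User-Agent` header. It uses Python's standard `urllib.request`; the helper name `build_request`, the crawler string, and the URLs are hypothetical and chosen only for illustration. No request is actually sent.

```python
from urllib.request import Request


def build_request(url, user_agent=None):
    """Build (but do not send) an HTTP request.

    The User-Agent value is whatever the client chooses to put there;
    nothing in this code forces it to be set or to be truthful.
    """
    headers = {}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return Request(url, headers=headers)


# A client that chooses to identify itself (hypothetical crawler string).
identified = build_request(
    "https://example.com/",
    "ExampleCrawler/1.0 (+https://example.com/bot)",
)

# A client that sets no explicit identity on the request object.
# Note: if this were sent with urlopen(), urllib would add its own
# default "Python-urllib/x.y" value at send time.
anonymous = build_request("https://example.com/")

# urllib stores header names in capitalized form ("User-agent").
print(identified.get_header("User-agent"))
print(anonymous.has_header("User-agent"))
```

The header is self-reported by the client, so a server only ever sees the identity the client decided to send.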

> A crawler is not a User agent - it's an agent for Brave

You know, I thought "what does Wikipedia have to say on this matter?" and sure enough:

> Examples include all common web browsers, such as Google Chrome, Mozilla Firefox, and Safari, as well as some email readers, command-line utilities like cURL, and arguably headless services that power part of a larger application, such as a web crawler.

I can't even make that up.

> just because something is publicly available doesn't mean anyone can redistribute it

You're confusing reselling content with providing access to it. By your logic, caching proxy servers would be illegal on copyright grounds. The physical act of downloading files necessarily creates copies of the data at every step of the journey from the source server to you. There's a material difference between paying someone for a copy of some content and paying someone to fetch content on your behalf. Nothing in copyright law requires that the person physically acquiring the content be the one who ends up consuming it.

memefrog 2y ago

Downloading something isn't redistributing it. It is your website. You provide what is on it to me. I send you an HTTP request. You don't have to respond. You do. I am not copying anything. Copyright simply isn't engaged at any point in this process.
