It already sounds like you're using several IPs to access sites, which seems like a workaround for someone somewhere trying to limit the use of one IP (or just a lack of desire to host and distribute the data yourself to your various hosts).
Just because you can do something doesn't mean everyone must accept and like that you are doing that thing.
GP is absolutely right. If your server is just going to send me traffic when I ask I’m just going to ask and do what I want with the response.
Your server will respond fine if I click through with different IPs and it’s just a menial task to have this distribution of requests to IPs, which is what we made computers for.
Yeah, you’re right of course that no one has to like the “piracy” or “scraping” or whatever other name you’re giving to a completely normal request-response interaction between machines. They can complain. And I can say they’re silly for complaining. No one has to like anything. Heck you could hate ice cream.
A personal website is like a community cupboard or an open access water tap, people put it out there for others to enjoy but when the reseller shows up and takes it all it's no longer sustainable to provide the service.
Of course, it's all a spectrum: from monster corporations that build the loss into their projections and participate in wholesale data collection and selling to open websites with no ads or limited ads as a sort of donation box; from a person using css/js to block ads or software to pirate for cheaper entertainment to an AI scraper using swathes of IPs and servers to request all the data you're hosting non-stop for their own monetary gain. I have different opinions depending on where on the spectrum you are. But I do think piracy and ad blocking are on the same spectrum, and much closer to acceptable than mass AI scraping.
These responses were more about your comments about AI scraping than the piracy vs ad blocking conversation, but in my opinion the gap between them and scraping is quite large.
If blocking ads is permissible because the server cannot control the client but can control itself, then so is “scraping”. Both services ask of their clients something they cannot enforce. And both find that the clients refuse.
If you find the justification valid but decide that the conclusion is nonetheless absurd, you must find which step in the reasoning has a failure. The temptation is epicyclic: corporations vs humans or something of the sort; commercial vs non-commercial.
But on its own there is no justification. It’s just that your principles lead you to absurdity but you refuse to revisit them because you like taking from others but you don’t like when others take from you. A fairly simple answer. Nothing for Occam’s Razor to divide.
Particularly believable because the arrival of AI models trained on the world seems to have coincided with some kind of copyright maximalism that this forum has never seen before. Were the advocates of the RIAA simply not users yet?
Or, more believably, is it just that taking feels good but being taken from feels bad?
I stated that the open internet as a whole is the commons, not any specific person's pet project, and thus, that AI scraping (or any bulk scraping done commonly and wholesale) makes it untenable for most people to keep participating. Twitter for example has gone your preferred way, mostly requiring authentication to access. There are many arguments on HN about whether that's a good move, or even a move that others could take and expect success. And that's a huge platform. Just recently there have been front page posts on HN about bringing back personal blogs, and also posts about how personal blogs not behind the great wall of Cloudflare led to TBs of "false" traffic because of scrapers, which costs real money.
I stated that I consider piracy, ad blocking, and AI scraping to be part of the same spectrum. I think ad blocking carries a much lower burden of justification than AI scraping at the scale where you need multiple IPs and argue that whitelisting is the only way to stop it, because of how much impact you are having.
Much like how bandwidth has different levels of payment if you use less than 100 MB or more than 1 TB, or how delivering a package that weighs 10 lbs is way cheaper than a package that weighs 1000 lbs, or how at some level of effort times repetition it makes sense to automate something programmatically vs just doing it manually. There are of course situations where each makes sense, but the expectations can vary, and the results are not always linear depending on the inputs. This all completely ignores the social aspect of it, which can add a whole new layer of complexity that has its own logic.
Scraping (or access without ads, e.g. ad blocking, or outside sharing of data, e.g. piracy) has always been complained about by those who have data that people want to scrape, e.g. airlines or HBO or Disney. It's just that now all data is being scraped absolutely non-stop, to the detriment of many and the gain of few, so everyone has a reason to complain. It also explains why people have differing opinions.