Ask HN: How to remove ads from a downloaded HTML file and output an ad-free file?
Is there a tool or script that can filter ads out of a page when I download it with curl, similar to what uBlock Origin does in a browser?
Basically, I am downloading snapshots of sites using curl, but the pages contain advertisements that I want to filter out. Is there a tool that does this from the command line, so that the output file has no ads in it?
In short, I want something like uBlock Origin, but for HTML files that I will be converting to PDFs or EPUBs. Something like:
curl https://www.google.com | AdRemover.sh | htmltopdf
Most of the solutions I found require editing the /etc/hosts file to stop the ads from loading, but I would rather avoid that if possible.
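
To make the request more concrete, here is a rough sketch of the kind of stdin-to-stdout filter I have in mind for the middle of that pipeline, using only Python's standard library. It is an illustration, not a working ad blocker: the BLOCKED_HOSTS entries are just two example ad domains, the names strip_ads and adremover.py are made up, and it only looks at src/href attributes on a few tags. A real tool would presumably need proper filter lists.

```python
import sys
from html.parser import HTMLParser
from urllib.parse import urlparse

# Example hosts only (assumption); a real tool would load full filter lists.
BLOCKED_HOSTS = {"doubleclick.net", "googlesyndication.com"}
CHECKED_TAGS = {"script", "iframe", "img", "a"}
VOID_TAGS = {"img"}  # checked tags that never have a closing tag


def is_blocked(url):
    """True if the URL's host is a blocked host or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == b or host.endswith("." + b) for b in BLOCKED_HOSTS)


class AdStripper(HTMLParser):
    """Re-emits the HTML it parses, dropping elements that load blocked hosts."""

    def __init__(self):
        # Keep entities such as &amp; untouched so they can be echoed back.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_tag = None   # name of the element currently being dropped
        self.skip_depth = 0    # nesting depth of that element

    def emit(self, text):
        if self.skip_tag is None:
            self.out.append(text)

    def wants_drop(self, tag, attrs):
        urls = [v for k, v in attrs if k in ("src", "href") and v]
        return tag in CHECKED_TAGS and any(is_blocked(u) for u in urls)

    def handle_starttag(self, tag, attrs):
        if self.skip_tag is not None:
            if tag == self.skip_tag:
                self.skip_depth += 1
            return
        if self.wants_drop(tag, attrs):
            if tag not in VOID_TAGS:
                # Drop everything up to the matching closing tag.
                self.skip_tag, self.skip_depth = tag, 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if self.skip_tag is None and not self.wants_drop(tag, attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_tag is not None:
            if tag == self.skip_tag:
                self.skip_depth -= 1
                if self.skip_depth == 0:
                    self.skip_tag = None
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.emit(data)

    def handle_entityref(self, name):
        self.emit(f"&{name};")

    def handle_charref(self, name):
        self.emit(f"&#{name};")

    def handle_comment(self, data):
        self.emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.emit(f"<!{decl}>")


def strip_ads(html):
    """Return html with elements that reference blocked hosts removed."""
    parser = AdStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def main():
    # Filter mode: read HTML on stdin, write the cleaned HTML to stdout.
    sys.stdout.write(strip_ads(sys.stdin.read()))
```

Saved as adremover.py in the current directory, it could sit in the pipeline as `curl -s <url> | python3 -c 'import adremover; adremover.main()' > clean.html`. Ads injected by inline scripts or served from the site's own domain would pass straight through, which is why I am hoping something more complete already exists.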