However, I should caution that in this era of companies being particularly user-hostile and authoritarian, especially Big Tech, I would be more careful with sharing stuff like this. Being forced to run JS is bad enough; profiling users based on other traits, and essentially determining if they are using "approved" software, is a dystopia we should fight strongly against. Stallman's Right To Read comes to mind as a very relevant warning story.
Like, I get the need for some protective mechanisms for interactive content/posting/etc, but there should be zero cases where a simple HTTP 200 GET requires javascript/client side crap. If they serve me a slightly stale version of the remote resource (5 minutes/whatnot) that's fine.
They've effectively just turned into a Google protection racket. Small or special-purpose search and archive tools are simply stonewalled.
The best you've got is "essentially off", and the wording is that way because even with everything disabled there are still edge cases where their security will enforce a JS challenge or CAPTCHA.
I don't really know how you resolve that absent just like... putting everything behind logins, though.
Not all sites are configured to do this. Some pages are expensive to render and have no cache layer.
Right to Read indeed... fanfiction.net has become really annoying over the last few months. Especially at night, when you have the FFN UI set to dark and a bright white Cloudflare page suddenly appears out of nothing. Or the way the Cloudflare "anti bot" protection leads to an endless loop when the browser is the Android WebView inside a third-party Reddit client.
A browser is becoming a universal agent by itself, but many people (maybe increasingly) use the terminal to access resources, and stonewalling those paths is never OK in my book.
you should really be impersonating an ESR version (e.g. 91). Versions from the release channel are updated every month or so, and everyone has autoupdate enabled, so unless you keep your impersonation up to date, your fingerprint is going to stick out like a sore thumb in a few months. ESR, on the other hand, sticks to one version and shouldn't change significantly during its one-year lifetime. It's still going to stick out to some extent (most people don't use ESR), but at least there are enterprises running ESR that you can blend in with.
Perhaps you may get broken sites with Firefox, because no-one cared. But banning? Seems like a stretch.
As a user of non-browser clients (not curl though) I have not run into this in the wild.^1
Anyone have an example of a site that blocks non-browser clients based on TLS fingerprint?
1. As far as I know. The only site I know of today that is blocking non-browser clients appears to be www.startpage.com. Perhaps this is the heuristic they are using. More likely it is something simpler I have not figured out yet.
In my own country, for critical sites, I will probably have to go to court, since "noscript/basic (x)html" interop was broken in the last few years.
Always thought it was the myriad of cookies and expiry time of said cookies that tend to make non-browser clients more obvious to CF.
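A minimal sketch of that kind of heuristic, as a guess at what CF might be doing (the field names and threshold here are hypothetical, not anything Cloudflare has documented): a client that is repeatedly issued cookies but never sends any back looks automated.

```python
def looks_cookieless(requests_seen):
    """Hypothetical bot heuristic: flag a client that was issued cookies
    but never returned any on later requests.

    requests_seen: list of dicts, one per request from the same client, with
    'set_cookie_issued' (server sent Set-Cookie) and 'cookies_sent'
    (client included a Cookie header).
    """
    issued = any(r["set_cookie_issued"] for r in requests_seen)
    returned = any(r["cookies_sent"] for r in requests_seen)
    return issued and not returned

# A curl-style client that ignores Set-Cookie would trip this check:
print(looks_cookieless([
    {"set_cookie_issued": True, "cookies_sent": False},
    {"set_cookie_issued": True, "cookies_sent": False},
]))
```

A real browser returns the cookie on the second request, so the same function comes back False for it.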
The HTTP requests were exact copies of browser requests (in terms of how the server would've seen them), so it was something below HTTP. I ended up finding a lot of info about Cloudflare and the TLS stuff on StackOverflow, with others having similar issues. Someone even made an API to do the TLS stuff as a service, but it was too expensive for me. https://pixeljets.com/blog/scrape-ninja-bypassing-cloudflare...
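For anyone wondering what "below HTTP" means concretely: the usual scheme is a ClientHello fingerprint like JA3, which joins the TLS version, cipher suites, extensions, curves, and point formats (as decimal values) into one string and MD5s it. A sketch, using made-up field values rather than any real browser's hello:

```python
import hashlib

def ja3_hash(tls_version, ciphers, extensions, curves, point_formats):
    """JA3-style fingerprint of ClientHello fields.

    tls_version is an int; the rest are lists of the decimal values seen in
    the hello. JA3 joins the five fields with commas and the values within a
    field with dashes, then takes the MD5 of that string.
    """
    fields = [
        str(tls_version),
        "-".join(map(str, ciphers)),
        "-".join(map(str, extensions)),
        "-".join(map(str, curves)),
        "-".join(map(str, point_formats)),
    ]
    ja3_string = ",".join(fields)
    return hashlib.md5(ja3_string.encode()).hexdigest()

# Illustrative values only -- not copied from a real client.
print(ja3_hash(771, [4865, 4866, 49195], [0, 23, 65281], [29, 23, 24], [0]))
```

Since the hash covers the order of ciphers and extensions, two clients speaking byte-identical HTTP can still hash differently, which is why copying the headers alone wasn't enough.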
What inspired the project?
There are some industries (virtually all of Wall Street, for example, and certain parts of government) where the company needs to surveil 100% of what their employees do on the web from inside the office. These companies have been running MITM proxies for decades.
Wouldn't any website that rejects a non-browsery TLS client be blocking out these people as well?