> All sites on statichost.eu get a SITE-NAME.statichost.eu domain, and during the weekend there was an influx of phishing sites.
Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged. How else is Google supposed to know that subdomains belong to different users? That's what the PSL is for.
From my reading, Safe Browsing did its job correctly in this case, and they restored the site quickly once the threat was removed.
The new separate domain is pending inclusion in the PSL, yes.
Edit: the "effort" I'm talking about above refers to more real time moderation of content.
That is probably true, but in this case I think most people would think that they used that power for good.
It was inconvenient for you and the legitimate parts of what was hosted on your domain, but it was blocking genuinely phishing content that was also hosted on your domain.
This safety feature saves a nontrivial number of people from life-changing mistakes. Yes we publishers have to take extra care. Hard to see a negative here.
In any social system, there is always someone with most of the power (distributed power is an unstable equilibrium), and it's incumbent upon us, the web developers, to know the current status quo.
Back in the day, if you weren't testing on IE6 you weren't serving a critical mass of your potential users. Nowadays, the nameplates have changed but the same principles hold.
In this case they did use it for good cause. Yes, alternatively you could have prevented the whole thing from happening if you cared about customers.
> Second, they should be using the public suffix list (https://publicsuffix.org/) to avoid having their entire domain tagged.
NO, Google should be "mindful" (I know companies are not people but w/e) of the power it unfortunately has. Also, Cloudflare. All my homies hate Cloudflare.
[1] https://github.com/publicsuffix/list/blob/main/public_suffix...
[2] https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...
Can you elaborate on this? I didn't see anything in either link that would indicate unreasonable challenges. The PSL naturally has a series of validation requirements, but I haven't heard of any undue shenanigans.
Is it great that such vital infrastructure is held together by a ragtag band of unpaid volunteers? No; but that's hardly unique in this space.
How is this kinda not insane? https://publicsuffix.org/list/public_suffix_list.dat
A centralized list, where you have to apply to be included and it's up to someone else to decide whether you will be allowed in? How is this what they went for: "You want to specify some rules around how subdomains should be treated? Sure, name EVERY domain that this applies to."
Why not just something like https://example.com/.well-known/suffixes.dat at the main domain or whatever? Regardless of the particulars, this feels like it should have been an RFC and a standard that avoids such centralization.
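For context on what that flat file actually does: browsers match each hostname against the list to find its "public suffix" (the longest matching rule wins, `*.` is a wildcard label, `!` marks an exception), and everything one label below that is the registrable domain. A toy sketch, using a hypothetical handful of rules rather than the real list:

```python
# Toy sketch of how a PSL consumer matches a hostname against the flat
# rule file (hypothetical rules; the real list and full algorithm live at
# https://publicsuffix.org/). Longest matching rule wins, "*." is a
# wildcard label, and "!" marks an exception to a wildcard.
RULES = {"eu", "com", "statichost.eu", "*.ck", "!www.ck"}

def public_suffix(hostname: str) -> str:
    labels = hostname.lower().split(".")
    best = labels[-1]  # implicit default rule: "*" (the rightmost label)
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if "!" + candidate in RULES:
            # Exception rule: the suffix is everything after this label.
            return ".".join(labels[i + 1:])
        if candidate in RULES or ".".join(["*"] + labels[i + 1:]) in RULES:
            best = max(best, candidate, key=lambda s: s.count("."))
    return best

def registrable_domain(hostname: str) -> str:
    # One label more than the public suffix, e.g. "statichost.eu" under "eu".
    suffix = public_suffix(hostname)
    labels = hostname.lower().split(".")
    return ".".join(labels[-(suffix.count(".") + 2):])
```

With `statichost.eu` in the rule set, `public_suffix("mysite.statichost.eu")` is `statichost.eu`, so each customer subdomain becomes its own registrable domain; without that rule, the whole thing collapses to one registrable domain, `statichost.eu`, which is exactly why reputation systems tag it as a unit.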
That said, there are a number of IT professionals that aren't aware of the PSL as these are largely initiatives that didn't exist prior to 2023 and don't get a lot of advertisement, or even a requirement. They largely just started being used silently by big players which itself presents issues.
There are hundreds if not thousands of industry whitepapers, and afaik there are only one or two places it's mentioned in industry working groups, and those were in blog posts, not whitepapers (at M3AAWG). There's no real documentation of the organization, what it's for, and how it should be used in any of the working group whitepapers. Just that it is being used and needs support; not something professionals would pay attention to, imo.
> Second, they should be using the public suffix list
This is flawed reasoning as is. It's hard to claim this with a basis when professionals don't know about it, a small subset just arbitrarily started doing this, and it seems more like false justification after the fact for throwing the baby out with the bath water.
Security is everyone's responsibility, and Google could have narrowly targeted the offending subdomains instead of blocking the whole domain. They didn't do that. Worse, that behavior could even be automated so that the process includes a notice period to the domain's provider before the block starts hitting everyone's devices. They apparently didn't do that either.
Regardless, no single entity should be able to dictate what other people perceive or see arbitrarily from their devices (without a choice; opt-in) but that is what they've designed these systems to do.
Enumerating badness doesn't work. Worse, say the domain names get reassigned to another unrelated customer.
Those people are different people, but they are still blocked, as happens with small mail servers quite often. Who is responsible when someone who hasn't engaged in phishing is arbitrarily punished without due process? Who is to say that Google isn't doing this purposefully to protect the monopolies of the services it also provides?
It's a perilous, tortuous path where trust cannot be given, because they've violated that trust in the past and have little credibility, with all net incentives pointing towards their own profit at the expense of others. They are even willing to regularly break the law, and have never been held to account for it (e.g. the Street View Wi-Fi wiretapping).
Hanlon's razor was intended as a joke, but there are people who use it literally and inappropriately, to deceitfully take advantage of others.
Gross negligence coupled with some form of loss is sufficient for general intent, which makes the associated actions malicious.
Throwing out the baby with the bath water, without telling anyone or giving any warning, is gross negligence.
The view that professionals in this industry exclusively participate in academic circles runs counter to my experience. Unless you're following the latest AI buzz, most people are not spending their time on arXiv.
The PSL is surely an imperfect solution, but it's solving a problem for the moment. Ideally a more permanent DNS-based solution would be implemented to replace it. Though some system akin to SSL certificates would be necessary to provide an element of third-party trust, as bad actors could otherwise abuse it to segment malicious activity on their own domains.
If you're opposed to Safe Browsing as a whole, both Chromium and Firefox allow you to disable that feature. However, making it an opt-in would essentially turn off an important security feature for billions of users. This would result in a far greater influx of phishing attacks and the spread of malware. I can understand being opposed to such a filter from an idealistic perspective, but practically speaking, it would do far more harm than good.
So good, in fact, that it should have been known to an infrastructure provider in the first place. There's a lot of vitriol here that is ultimately misplaced away from the author's own ignorance.
It's a weird thing, to be honest, a Github repo mentioned nowhere in any standards that browsers use to treat some subdomains differently.
Information like this doesn't just manifest itself into your brain once you start hosting stuff, and if I hadn't known about its existence I wouldn't have thought to look for a project like this either. I certainly wouldn't have expected it to be both open for everyone and built into every modern internet-capable computer or anti malware service.
https://publicsuffix.org/list/public_suffix_list.dat
It even says so in the file itself. If Microsoft goes up in flames, they can switch to another repository provider without affecting the source of truth.
I don't have a lot of sympathy for people who allow phishing sites and then suffer reputational consequences.
I checked it for two popular public suffixes that came to mind: 'livejournal.com' and 'substack.com'. Neither was there.
Maybe I'm mistaken, it's not a bug and these suffixes shouldn't be included, but I can't think of a reason why.
User-uploaded content (which does pose a risk) is all hosted on substackcdn.com.
The PSL is more for "anyone can host anything in a subdomain of any domain on this list" rather than "this domain contains user-generated content". If you're allowing people to host raw HTML and JS then the PSL is the right place to go, but if you're just offering a user post/comment section feature, you're probably better off getting an early alert if someone has managed to breach your security and hacked your system into hosting phishing.
For what it's worth, this makes it sound like you think the vitriol should be aimed at the author's ignorance rather than the circumstances which led to it, presuming you meant the latter.
However, I'm now reflecting on what I said as "be careful what you wish for", because the comments on this HN post have done a complete 180 since I wrote it, to the point of turning into a pile-on in the opposite direction.
The PSL is one of those load-bearing pieces of web infrastructure that is esoteric and thanklessly maintained. Maybe there ought to be a better way, both in the sense of a direct alternative (like DNS), and in the sense of a better security model.
It basically goes: growing user base -> growing amount of malicious content -> ability to submit domain to PSL. In that order, more or less.
In terms of security, for me, there's no issue with being on the same domain as my users. My cookies are scoped to my own subdomain, and HTTPS only. For me, being blocked was the only problem, one that I can honestly admit was way bigger than I thought.
Hence, the PSA. :)
My open source project has some daily users, but not thousands. That's plenty to attract malicious content. I think a lot of people are sending it to themselves, though (e.g. onto a malware analysis VM that is firewalled off, so they look for a public website to do the transfer), but even then the content sits on the site for a few hours. After >10 years of hosting this, someone seems to have fed a page into a virus scanner and now I'm getting blocks left and right with no end in sight. I'd be happy to give every user a unique subdomain instead of short links on the main domain, and then put the root on the PSL, if that's what solves this.
from PSL's GitHub repo's wiki [0].
[0]: https://github.com/publicsuffix/list/wiki/Guidelines#validat...
If you mean with the domain option, that's not really sufficient. You need to use the `__Host-` prefix.
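For reference, per the cookie-prefix rules in RFC 6265bis, a cookie whose name starts with `__Host-` is only accepted if it is `Secure`, has `Path=/`, and carries no `Domain` attribute at all, which is what locks it to exactly one host so a sibling subdomain can never plant or shadow it. A minimal sketch of that check (the cookie names here are made up for illustration):

```python
# Sketch of the __Host- prefix rules from RFC 6265bis: browsers reject a
# Set-Cookie whose name starts with "__Host-" unless it is Secure, has
# Path=/, and carries no Domain attribute (so it can never be scoped to
# *.example.com, and can never be set by a sibling subdomain).
def host_prefix_ok(set_cookie: str) -> bool:
    name, _, rest = set_cookie.partition("=")
    if not name.strip().startswith("__Host-"):
        return True  # the extra rules only apply to the prefix
    attrs = [a.strip().lower() for a in rest.split(";")[1:]]
    return (
        "secure" in attrs
        and "path=/" in attrs
        and not any(a.startswith("domain=") for a in attrs)
    )

# Accepted: a host-locked session cookie.
assert host_prefix_ok("__Host-session=abc; Secure; Path=/")
# Rejected: a Domain attribute would widen the scope to all subdomains.
assert not host_prefix_ok("__Host-session=abc; Secure; Path=/; Domain=statichost.eu")
```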
However, I think the issue is that with great power comes great responsibility.
They are better than most organisations, and working with many constraints that we cannot always imagine.
But several times a week we get a false "this mail is phishing" incident, where mail from a customer or prospect is put in "Spam" with a red security banner saying it contains "dangerous links". Generally it is caused by domain reputation issues that block all mail that uses an e-mail scanning product. These products wrap URLs so they can scan them when the mail is read; thus, when they fail to detect a virus, they become de facto purveyors of viruses, and their entire domain is tagged as dangerous.
I have raised this to Google in May (!) and have been exchanging mail on a nearly daily basis. Pointing out a new security product that has been blacklisted, explaining the situation to a new agent, etc.
Not only does this mean that they are training our staff that security warnings are generally false, but it means we are missing important mail from prospects and customers. Our customers are generally huge corporations, missing a mail for us is not like missing one mail for a B2C outfit.
So far the issue is not resolved (we are in October now!) and recently they have stopped responding. I appreciate our organisation is not the US Government, but still, we pay upwards of $20K/year for "Google Workspace Enterprise" accounts. I guess I was expecting something more.
If someone within Google reads this: you need to fix this.
Half (or more) of security alerts/warnings are false positives. Whether it's the vulnerability scanner complaining about some non-existent issues (based on the version of Apache alone... which was backported by the package maintainer), or an AI report generated by interns at Deloitte fresh out of college, or someone reporting www.example.com to Google Safe Browsing as malicious, etc. At least half of the things they report on are wrong.
You sort of have to have a clue (technically) and know what you are doing to weed through all the bullshit. Tools that block access, based on these things do more harm than good.
In the author’s case, he was at least able to reproduce the issues. In many cases, though, the problem is scoped to a small geographic region, but for large internet services, even small towns still mean thousands of people reaching out to support while the issue can’t be seen on the overall traffic graph.
The easiest set of steps you can do to be able to react to those issues are: 1. Set up NEL logging [1] that goes to completely separate infrastructure, 2. Use RIPE Atlas and similar services in the hope of reproducing the issue and grabbing a traceroute.
I’ve even attempted to create a hosted service for collecting NEL logs, but it seemed to be far too niche.
[1]: https://developer.mozilla.org/en-US/docs/Web/HTTP/Guides/Net...
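To make step 1 concrete, here is a sketch of the two response headers that enable NEL, built as a small helper; the collector URL is a placeholder, and the sampling fractions are just plausible values, not a recommendation:

```python
# Sketch of the response headers that enable Network Error Logging.
# The collector URL is a placeholder; point it at infrastructure that is
# completely separate from the site being monitored, or you will lose
# reports exactly when the site itself is unreachable.
import json

def nel_headers(collector_url: str) -> dict:
    return {
        # Report-To defines the named endpoint group reports are sent to.
        "Report-To": json.dumps({
            "group": "network-errors",
            "max_age": 86400,
            "endpoints": [{"url": collector_url}],
        }),
        # NEL opts the origin into reporting and references that group.
        "NEL": json.dumps({
            "report_to": "network-errors",
            "max_age": 86400,
            # Sample a small fraction of successes, but all failures.
            "success_fraction": 0.01,
            "failure_fraction": 1.0,
        }),
    }
```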
Corporations are a little safer. They have mutually binding contracts with multiple internet service providers and dedicated circuits. They have binding contracts with DNS registrars. Having been on the receiving end of abuse@ they notify over phone and email giving plenty of time to figure out what is going on. I've never seen corporate circuits get nuked for such shenanigans.
Post author is throwing a lot of sand at Google for a process that has (a) been around for, what, over a decade now and (b) works. The fact of the matter is this hosting provider was too open, several users of the provider used it to put up content intended to attack users, and as far as Google (or anyone else on the web) is concerned, the apex domain is where the buck stops for that kind of behavior. This is one of the reasons why you host user-generated content off your main domain, and several providers have gotten the memo; it is unfortunate statichost.eu had not yet.
I'm sorry this domain admin had to learn an industry lesson the hard way, but at least they won't forget it.
What I'm trying to say in the post specifically about Google is that I personally think that they have too much power. They can and will shut down a whole domain for four billion users. That is too much power no matter the intentions, in my opinion. I can agree that the intentions are good and that the net effect is positive on the whole, though.
On the "different aspects" side of things, I'm not sure I agree with the _works_ claim you make. I guess it depends on what your definition of works is, but having a blacklist as you tool to fight bad guys is not something that works very well in my opinion. Yes, specifically my own assets would not have been impacted, had I used a separate domain earlier. But the point still stands.
The fact that it took so long to move user content off the main domain is of course on me. I'm taking some heat here for saying this is more important than one (including me) might think. But nonetheless, let it be a lesson for those of you out there who think that moving that forum / upload functionality / wiki / CMS to its own domain (not subdomain) can be done tomorrow instead of today.
This read like a dark twist in a horror novel - the .page tld is controlled by Google!
The public suffix list (https://publicsuffix.org/) is good and if I were to start from scratch I would do it that way (with a different root domain) but it's not absolutely required, the search engines can and do make exceptions that don't just exclusively use the PSL, but you'll hit a few bumps in the road before that gets established.
Ultimately Google needs to have a search engine that isn't full of crap, so moving user content to a root domain on the PSL that is infested with phishing attacks isn't going to save you. You need to do prolific and active moderation to root out this activity or you'll just be right back on their shit list. Google could certainly improve this process by providing better tooling (a safe browsing report/response API would be extremely helpful) but ultimately the burden is on platforms to weed out malicious activity and prevent it from happening, and it's a 24/7 job.
BTW the PSL is a great example of the XKCD "one critical person doing thankless unpaid work" comic, unless that has changed in recent years. I am a strong advocate of having the PSL management become an annual fee driven structure (https://groups.google.com/g/publicsuffix-discuss/c/xJZHBlyqq...), the maintainer deserves compensation for his work and requiring the fee will allow the many abandoned domains on the list to drop off of it.
Cookies set on the root domain leaking to subdomains in violation of scoping rules would be a major security bug in browsers. It's always been a (valid) theoretical concern, but it's never happened at a scale where I've had to address it. There is likely regression testing on all the major browsers that would catch a situation where this happens.
Get yourself on public suffix list or get better moderation. But of course just moaning about bad google is easier.
In my experience, safe browsing does theoretically allow you to report scams and phishing in terms of user generated content, but it won't apply unless there's an actual interactive web page on the other end of the link.
There is the occasional false positive but many good sites that end up on that list are there because their WordPress plugin got hacked and somewhere on their site they are actually hosting malware.
I've contacted the owners of hacked websites hosting phishing and malware content several times, and most of the time I've been accused of being the actual hacker or I've been told that I'm lying. I've given up trying to be the good guy and report the websites to Google and Microsoft these days to protect the innocent.
Google's lack of transparency about which exact URLs are hosting bad material does play a role there.
The last point is actually the one I'm trying to make.
From spamblocking that builds heuristics fed by the spam people manually flag in GMail to Safe Browsing using attacks on users' Chrome as a signal to their voice recognition engine leapfrogging the industry standard a few years back because they trained it on the low-quality signal from GOOG411 calls, Google keeps building product by harvesting user data... And users keep signing up because the resulting product is good.
This puts a lot of power in their hands but I don't think it's default bad... If it becomes bad, users leave and Google starts to lose their quality signal, so they're heavily incentivized to provide features users want to retain them.
This does make it hard to compete with them. In the US at least, antitrust enforcement has generally been about consumer harm, not market harm per se. If a company has de facto control but customers aren't getting screwed, that's fine, because ultimately the customer matters (and nobody else is owed a shot at being a Google).
Folks around here are generally uneasy about tracking in general too, but remove big brother monitoring from Safe Browsing and this story could still be the same: whole domain blacklisted by Google, only due to manual reporting instead.
"Oh, but a human reviewer would've known `*.statichost.eu` isn't managed by us"—not in a lot of cases, not really.
If Facebook became a trap that frequently hosted malware to strangers, the rest of the net would begin to interpret it as damage and route around it.
There is no real way a normal person can even flag Facebook.
Search Console always points to my internal login page, which isn’t public and definitely isn’t phishing.
They clear it quickly when I appeal, and since it’s just for me, I’ve mostly stopped worrying about it.
My workaround is to use an IPv6 ULA for my publicly hosted private IP addresses, which is extremely unlikely to ever be reused by a bad actor.
https://www.apple.com/legal/privacy/data/en/safari/
https://support.mozilla.org/en-US/kb/how-does-phishing-and-m...
Microsoft seems to do its own thing for Edge, though.
https://learn.microsoft.com/en-us/deployedge/microsoft-edge-...
As a result, some ISPs apparently block the domain. Why is it listed? I have no idea. There are no ads, there is no user content, and I've never sent any email from the domain. I've tried contacting spamhaus, but they instantly closed the ticket with a nonsensical response to "contact my IT department" and then blocked further communication. (Oddly enough, my personal blog does not have an IT department.)
Just like it's slowly become quasi-impossible for an individual to host their own email, I fear the same may happen with independent websites.
Either that or your DNS provider hosts a lot of spam.
This is the infuriating part. I get that someone buying cheap hosting may end up with an IP address that used to send spam, but spam lists are not reliable indicators of website security.
Overzealous security products are a blight on the internet. I'd be less annoyed at them if they weren't so trivial to bypass as a hacker with access to a stolen credit card.
Anyone who can upload HTML pages to subdomain.domain.com can read and write cookies for *.domain.com, unless you declare yourself a public suffix and enough time has passed for all the major browsers to have updated themselves.
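That follows from RFC 6265's domain-matching rules: a cookie set with a `Domain` attribute is attached to requests for the named domain and every host beneath it. A small sketch, using `statichost.eu` from the article as the example (browsers additionally refuse `Domain=` cookies scoped to a known public suffix, which is what PSL inclusion buys you):

```python
# Sketch of RFC 6265 domain-matching: a cookie set by any subdomain with
# Domain=statichost.eu is sent to the apex and to every sibling
# subdomain, until the domain is listed as a public suffix (browsers
# refuse Domain= cookies scoped to a public suffix).
def domain_match(request_host: str, cookie_domain: str) -> bool:
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower()
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)

# A cookie planted by attacker.statichost.eu with Domain=statichost.eu...
cookie_domain = "statichost.eu"
# ...is attached to requests for the apex and every sibling subdomain.
assert domain_match("statichost.eu", cookie_domain)
assert domain_match("victim.statichost.eu", cookie_domain)
# ...but not to unrelated domains that merely contain the string.
assert not domain_match("statichost.eu.evil.example", cookie_domain)
```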
I've seen web hosts in the wild who could have their control panel sessions trivially stolen by any customer site. Reported the problem to two different companies. One responded fairly quickly, but the other one took several years to take any action. They eventually moved customers to a separate domain, so the control panel is now safe. But customers can still execute session fixation attacks against one another.
By putting UGC on the same apex domain you also put your own security at risk, so they basically did you a favor…
Do you think I'm reading/writing sensitive data to/from subdomain-wide cookies?
Also, yes, the PSL is a great tool to mitigate (in practice, eliminate) the problem of cross-domain cookies between mutually untrusting parties. But getting on that list is non-trivial, and the volunteer maintainers even explicitly state that you can forget getting on there before your service is big enough.
Despite being a paying Google Workspace customer, I can't get in touch with anyone who can help.
The bigger issue is that the internet needs governance. And, in the absence of regulation, someone has stepped in and done it in a way that the author didn't like.
Perhaps we could start by requiring that Google provide ways to contact a living, breathing human. (Not an AI bot that they claim is equivalent.)
Hopefully, this helps you understand why your living, breathing human is such a farcical idea for theGoogs to consider.
So you can't take one part of the responsibility and abdicate the other part!
> Static site hosting you can trust
is more like amateur hour static site hosting you can’t trust. Sorry.
I'm also trusting my users not to expose their cookies for the whole *.statichost.eu domain. And all "production" sites use a custom domain, which avoids all of this anyway.
I have read A LOT of blogs/rants/incidents on social media about startups, small businesses, and individuals getting screwed by large companies in similar capacities. I am VERY sympathetic to those cries into the sky, shaking fists at clouds, knowing very well we are all very small and how the large providers seem to not care. With that in mind, I am not blind to the privilege my organization has to rope in Google to discuss root causes for incidents.
I am writing about it here because I believe most people will never be able to pull a key Google stakeholder into a 40 minute video call to deeply discuss the RCAs. The details of the discussion are probably protected by NDA so I'll be speaking in general terms.
Google has a product called Web Risk (https://cloud.google.com/web-risk/docs/overview), I hear it's mostly used by Google Enterprise customers in regulated verticals and some large social media orgs. Web Risk protects the employees of these enterprise organizations by analyzing URLs for indicators of risk, such as phishing, brand impersonation, etc.
My SaaS platform is well established and caters mostly to large enterprise. I provide enterprise customers with optional branded SSO landing pages. Customers can either use sign-in from the branded site (SP-initiated) or redirect from their own internal identity provider to sign-in (IdP-initiated). The SSO site branding is directed by the customer, think along the lines of what Microsoft does for Entra ID branded sign-in pages. Company logo(s), name, visual styling, and other verbiage may be included. The branded/vanity FQDN is (company).productname.mydomain.com.
You may be able to see where I'm headed at this point... Why was my domain blocked? For suspected phishing.
A mutual enterprise customer was subscribed to Google's Web Risk. When their employees navigated to their SSO site, Google scanned it. Numerous heuristics flagged the branded SSO site as phishing and we were blocked by Safe Browsing across all major web browsers (Safari, Chrome, Firefox, Edge, and probably others). Google told us that had our customer put the SSO site on their Web Risk allow-list, we wouldn't have been blocked.
I'm no spring chicken; I can neither rely on nor expect a customer to do that, so I pressed for more, which led to a lengthy conversation about risk and the seemingly (from my perspective) arbitrary decisions made by a black box without any sort of feedback loop.
I was provided a tiny bit of insight into the heuristic secret sauce, which led to prescribed guidance on what could be done to greatly reduce the risk of getting a false-positive phishing flag again. I assume I cannot detail those specifics here; however, the overall gist of it is domain reputation. Google was unable to positively ascertain my domain's reputation.
My recommendation is for those of you out there in the inter-tubes who have experienced false positive Safe Browsing blocks, think about what you can do to increase your domain's public reputation. Also, get a GCP account so if you do get blocked, you can open a ticket from that portal. I was told it would be escalated to the appropriate team and be actioned on within 10-15 minutes.
That list doesn't have a clear way to get off of it. I would be happy to give them a heads-up that their users are complaining about a website being broken, but no such channel exists, neither for users nor for me. Looking around, there are many "sources" that allegedly independently decided, around the same day, that my site needs to stop working, so now there's a dozen parties I need to talk to, with more popping up the further you look. Netcraft started sending complaints to the registrar (which got the whole domain put on hold), another list said they sent abuse reports to the IP space owner (my ISP), public resolvers have started delisting the domain (pretending "there is no such domain" by returning NXDOMAIN), as have the adblockers mentioned above.
There's only one person who hasn't been contacted: the owner. I could actually do something about the abusive content...
It's like the intended path is that users start to complain "your site doesn't work" (works for me, wdym?) and you need to figure out what software they're using (what DNS resolver, what antivirus, what browser, whether a DoH provider is enabled...) to find out who it might be that's breaking the site. People don't know how many blocklists they're using, and the blocklists don't give a shit if you're not a brand name they recognize. That's the only difference between my site and a place like GitHub: if I report "github.com hosts malware", nobody thinks "oh, we need to nuke that malicious site asap!"
I'd broaden the submitted post to say that it's not only Google that has too much power: these blocklists have no notification mechanism or general recourse method either. It's a whack-a-mole situation which I, running an open source site with no profit model (intentionally so), will never win. Big tech is what wins. Idk if these lists do a trademark registration check or how they decide who's okay and who's not, but I suspect it's simply a brand name thing and your reviewer needs to know you.
> Luckily, Google provided me with a helpful list of the offending sites
Google is doing better than most others with that feature. Most "intelligence providers", which other blocklists like e.g. Quad9 uses, are secretive about why they're listing you, or never even respond at all
I love that IT is a field where there's no required formal education track and you can succeed with any learning path, but we definitely need some better way to make sure new devs are learning about some of these gotchas.
On top of that, it is also recommended to serve user content from another domain for security reasons. It's much easier to avoid entire classes of exploits this way. For the site admins: treat it as a learning experience instead of lashing out at goog. In the long run you'll be better off, having learned a good lesson.
Like, I get that Google has a lot of power, but you'd think they would use a case where Google was actually in the wrong.