> Google is pretty accurate in figuring out transactional versus marketing. They don't tell their heuristics, but you don't think engineers who build web crawlers cannot build email classifiers?
Yes, I definitely think that. The engineers can build anything, but where the company focuses matters.
I've seen transactional emails get sorted into people's spam/junk/newsletter folders too many times.
I also get tons of spam to my inbox despite regularly marking it as such, so if they are classifying marketing emails, they're not doing anything with that information.
How hard is it to classify a message that literally contains the string "this is an advertisement"?
It's also not an email. I've never seen a legitimate email with that string, and all of the illegitimate ones should be triggering other heuristics, such as the existence of an unsubscribe link, or things like "$x off".
In any case, the false positive rate on that would likely be incredibly low, so it's a good heuristic considering how bad the false negative rate is right now.
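
To make the heuristics above concrete, here's a rough sketch in Python. Everything in it is my own assumption for illustration: the function names, the regexes, and the "disclosure string or any two signals" rule are made up, and none of it reflects what Google actually does, since they don't publish their heuristics.

```python
import re

# Hypothetical signals taken from the comment above; the patterns are illustrative only.
AD_DISCLOSURE = re.compile(r"this is an advertisement", re.IGNORECASE)
UNSUBSCRIBE = re.compile(r"\bunsubscribe\b", re.IGNORECASE)
# Matches things like "$20 off", "$ 5.00 off", or "15% off".
DISCOUNT = re.compile(r"(?:\$\s?\d+(?:\.\d{2})?|\d{1,3}\s?%)\s+off\b", re.IGNORECASE)


def marketing_signals(body: str) -> list[str]:
    """Return the names of the marketing heuristics that fire on the message body."""
    signals = []
    if AD_DISCLOSURE.search(body):
        signals.append("ad_disclosure")
    if UNSUBSCRIBE.search(body):
        signals.append("unsubscribe")
    if DISCOUNT.search(body):
        signals.append("discount")
    return signals


def looks_like_marketing(body: str) -> bool:
    """Flag a message if it carries the explicit disclosure, or at least two weaker signals."""
    signals = marketing_signals(body)
    return "ad_disclosure" in signals or len(signals) >= 2


if __name__ == "__main__":
    promo = "This is an advertisement. Get $20 off today! Click here to unsubscribe."
    receipt = "Your order #1234 has shipped. Total charged: $20."
    print(marketing_signals(promo), looks_like_marketing(promo))
    print(marketing_signals(receipt), looks_like_marketing(receipt))
```

The explicit disclosure string is treated as sufficient on its own because the false positive rate should be tiny, while the weaker signals only count in combination, since a legitimate transactional email can easily contain an unsubscribe link by itself.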