Google just made public their latest version of the Quality Raters Guidelines (PDF linked from the blog post)
Which is just a summary of what the blog post is about (probably a G+ share that got plus-one'd a lot). Nothing against the commentator; it's their ranking/voting system that needs to be fixed.
Later in the document they talk about doing reputation research on the site. They say, "[F]or Page Quality rating, you must also look for outside, independent reputation information about the website."
So they are evaluating content. They are encouraging their reviewers to make a judgement. This strikes me as unreliable, maybe even a slippery slope for them.
Ratings from evaluators do not determine individual site rankings, but are used to help us understand our experiments. The evaluators base their ratings on guidelines we give them; the guidelines reflect what Google thinks search users want.
Often an AI can be trained on a corpus, like training a spell corrector by feeding it the text of a dictionary. But the data Google works with is the entire web, which changes all the time, so they need constant feedback on what is good or bad.
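To make the corpus-training idea concrete, here is a minimal sketch loosely following Norvig's well-known spell-corrector approach: learn word frequencies from a corpus, then correct an unknown word to the most frequent known word within one edit. The tiny corpus here is a hypothetical stand-in for a real dictionary text.

```python
from collections import Counter

# Toy "dictionary corpus" -- a hypothetical stand-in for real training text.
CORPUS = "the quick brown fox jumps over the lazy dog the fox".split()
WORDS = Counter(CORPUS)  # word -> frequency learned from the corpus

def edits1(word):
    """All strings one edit (delete, swap, replace, insert) away from `word`."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    swaps = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + swaps + replaces + inserts)

def correct(word):
    """Return the most frequent known word within one edit, else the word itself."""
    if word in WORDS:
        return word
    candidates = [w for w in edits1(word) if w in WORDS]
    return max(candidates, key=WORDS.get) if candidates else word

print(correct("teh"))  # -> "the"
```

The point of the analogy: with a static dictionary, training is a one-time pass like building WORDS above; with the ever-changing web, that frequency table would be stale immediately, which is why Google needs a continuous stream of human quality judgments instead.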
Google also personalizes results now, so the AI needs to learn not just what is a good result, but what is a good result for that particular person, given everything else the AI knows about that person. That would require inputs from a wide variety of people.
Their ML algos really come down to codified, scaled human intuition and decisions. However, to avoid having to answer for them (Google has the ability to make or break many internet companies), they go on and on about ML and pretend it's just math, as if there were a first-principles answer to, e.g., which candle company should rank highest, passed down from Gauss.
The reason this framing is valuable to Google is that if they admit it is just collective human judgement, governments are much more likely to demand input.