It's not so much a function of how many Chinese people are using it as how many instances of the word being posted result in a comment being flagged.
Correlations like "the word is rarely used, but when it is used, it tends to appear in political contexts where someone is more likely to hit the flag button" would teach an ML model that the word is unwelcome in general.
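To make the failure mode concrete, here's a minimal sketch. All numbers, words, and the threshold are made up for illustration; real moderation pipelines are far more complex, but the core hazard is the same: a model trained on "fraction of comments containing the word that got flagged" can condemn a rare word outright.

```python
# Hypothetical corpus statistics: word -> (comments containing it, flags received).
# A common word gets flagged occasionally; a rare word used mostly in heated
# political threads gets flagged most of the time it appears.
stats = {
    "hello": (1_000_000, 500),
    "rare_political_term": (200, 150),
}

def flag_rate(word):
    """Fraction of comments containing `word` that were flagged."""
    seen, flagged = stats[word]
    return flagged / seen

# A naive threshold filter keyed on flag rate would auto-remove the rare
# word, even though it accounts for far fewer flagged comments in total.
THRESHOLD = 0.5
for word in stats:
    rate = flag_rate(word)
    verdict = "auto-remove" if rate > THRESHOLD else "keep"
    print(f"{word}: flag rate {rate:.4f} -> {verdict}")
```

The point is that raw usage volume never enters the decision: the rare word's 75% flag rate dominates, exactly the "intersection" effect described above.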
Couldn't you just have bot accounts that search YT comments for key phrases and flag those? Enough flags results in auto-removal regardless of YT's decisions on the acceptability of those words/phrases. This wouldn't be very hard to set up, either.
One challenge is that Google actually has some pretty solid signals for finding and killing bots. It's not impossible to botnet their services, just harder than doing it to the average online service, which doesn't have an army of engineers trained on the adversarial space of people trying to automate ad clicks for real-money revenue.