It was submitted as https://news.ycombinator.com/item?id=40623629
Again, there is effectively zero real data showing this. Further, RLHF isn't likely to reinforce such word selection regardless.
A more logical and likely explanation is that the training data is heavily biased towards higher-grade-level material, so word selection veers towards the vocabulary you find in that kind of writing.