Also, my sense is that Python is something of an outlier in having mean and median functions in its standard library. I could be wrong, but AFAIK Go and JS do not, for example -- so people using those languages would surely bomb (at least on the median calculation part -- again, assuming they interpreted your question as "calculate from scratch").
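To make that concrete, here is a rough sketch of what the "from scratch" version might look like in Python, next to the standard-library call. The function name `median_from_scratch` and the sample data are made up for illustration; the even-length and empty-input branches are the parts I'd expect people to trip over.

```python
import statistics


def median_from_scratch(values):
    """Hypothetical helper: median without using the statistics module."""
    if not values:
        # Assumption: treat empty input as an error rather than returning None.
        raise ValueError("median of empty input is undefined")
    ordered = sorted(values)  # sort a copy; the caller's list is left alone
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: the middle element is the median.
        return ordered[mid]
    # Even count: average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [7, 1, 3, 10]  # made-up sample
print(statistics.median(data))    # standard-library version
print(median_from_scratch(data))  # hand-rolled version, same result here
```

In a language without a built-in, the candidate has to write something like the function above on the spot, which is a different exercise from remembering a library call.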
I don't mean to be pedantic or split hairs. The point I'm trying to make is that even simple-seeming problems can have gotchas to them, depending on the context.
Interviewers could do better by either thinking just a bit more about the problems they select, or just communicating better. But many do not, unfortunately -- I have the sense they just pull problems out of the air and see what sticks, while counting the high number of fails as a success signal.