If the evaluation doesn't make sense, then the conditions placed on the evaluation don't really matter.
Similarly, "assimilated the material" only makes sense if the live coding interview really does cover "the material". To use your analogy, measuring a football prospect’s cycling times isn't that good a test of football-playing skills. I mean, yes, there's some overlap, but there are more useful ways to measure them.
And one of the example tests was "handle scores for bowling", which is far removed from most work-related problems.
"consistent process that’s fair for everyone."
The author addressed this idea at several points, including "Any belief that a live coding interview is a consistently reliable way to make an objective assessment represents willful ignorance at best."
Picking a name out of a hat containing potential employee names is also consistent and fair.
Just because it's easy to measure doesn't mean it's an effective predictor.