This is an underrated comment. Compare: playing top-level chess is a good indicator that a person will be good at maths problems, but not that a chess computer will be.
So what is missing? Could we add up those missing competencies to create a new test of "general knowledge-worker capabilities" and then train an AI to pass this test?