Your first scenario: possible, but quite unlikely. If James cannot even perform migrations without causing outages or dropping data, the chances are that "flagship order processing pipeline" he made is similarly bad, and even if it works, it's likely has outages and missing data. I've never seen a developer who can only do hard tasks but is genuinely bad at simple tasks (They may refuse to do those, but if they start on them they'll do them well.)
Your second scenario is unfortunately very likely, people are jerks, and if they are also high performers (or high bullshitters) then can get away with it.
Either way I fully agree one one should be firing/promoting people based on a single metric, even if that metric is very relevant to the job description. That doesn't mean that "there is no such thing", or that if you really need to get that DB migration done, you want to choose a "quintessential communicator with jr devs".