I guess my answer is: developer performance is measured badly.
To measure it well you need someone with good taste (which is hard to find; developers often dislike other people's code for no good reason).
Of your 3 options, it's only the code you ship that's cared about. But it's both quality and volume that count.