My point is that the choice of 'p' (or, in the article's presentation, the choice of origin) cannot truly be arbitrary: if it reduces the expected squared difference between μ and û, then it necessarily carries information about μ. If all you truly know about μ is x and σ, you have no way to guess even the direction in which to shift your estimate û to reduce that error.
If you do have some additional information about μ, beyond just x alone, then sure, take advantage of it! But then don't call it a paradox.
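
To make the argument concrete, here is a rough one-dimensional sketch. It is a simplification I am assuming for illustration, not the estimator from the article: a single observation x ~ N(μ, σ²) and a fixed shrinkage weight c, so that û = x + c·(p − x). The function name shrunk_mse and the numbers are made up. Under those assumptions the expected squared error is (1 − c)²σ² + c²(p − μ)², which beats the plain estimate û = x only when p is close enough to μ.

```python
def shrunk_mse(mu, p, sigma, c):
    """Expected squared error of u_hat = x + c*(p - x), with x ~ N(mu, sigma^2).

    Hypothetical helper for illustration; c is a fixed (not data-dependent) weight.
    """
    # u_hat - mu = (1 - c)*(x - mu) + c*(p - mu): a noise term plus a bias term
    variance = (1 - c) ** 2 * sigma ** 2
    bias_sq = (c * (p - mu)) ** 2
    return variance + bias_sq


mu, sigma, c = 0.0, 1.0, 0.2

# c = 0 means no shrinkage (u_hat = x), so the error is just sigma^2
baseline = shrunk_mse(mu, mu, sigma, 0.0)

for p in (0.5, 2.0, 5.0):
    mse = shrunk_mse(mu, p, sigma, c)
    verdict = "better" if mse < baseline else "worse"
    print(f"p = {p}: MSE = {mse:.2f} vs {baseline:.2f} unshrunk ({verdict})")
```

With these made-up values, shifting toward p helps only while |p − μ| < 3σ, which is exactly the sense in which a helpful p has to know something about μ.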