The first reason is the whole category of optimizations that the compiler is worse than a human at (like register allocation) or cannot perform at all (custom calling conventions, computed jumps).
The second reason is more subtle: whenever you abstract yourself away from some part of a problem, you inherently create a less efficient solution.
For example, intrinsics mean that you don't have to manually allocate registers. But this also means that if your algorithm uses too many registers and it would be more efficient to modify it to require fewer (and thus not need spills), you will have no way of knowing such a thing. By insulating yourself from that layer of complexity, you've also limited your ability to make higher-level optimizations that improve lower-level performance.
This applies on practically every level possible: any method of abstraction, no matter how well designed, will always in some fashion reduce the maximum performance you can achieve. Of course, this doesn't mean abstraction is bad--it provides an often-useful tradeoff between developer time and performance.