1. Structural - designing the top-level system to perform its work with minimal activity.
2. Local - Calling conventions, component-local algorithms, etc.
Now this is fuzzy for the same reason that one man's strategy is another man's tactics. But the gist is simple: you don't tighten the bolts on the system until you have it in the right shape.
Local optimizations are easy to do b/c you can usually pull them out of a textbook, or figure it out by looking at one or to parts of code. Structural optimizations require an understanding of the system performance as a whole -- possibly allowing some suboptimal local behavior. Sadly the way I've mostly seen is super-tightening the bolts and having the system slowly distort from something suboptimal but flexible into a contorted, performs-ok-but-dont-touch-it terrible beast. That's the sort of optimization that's the root of all evil.
Example: I have a webserver, with a table of regexs mapping to handlers. The handlers can be static assets in files or dynamically generated in code.
Local optimization: I notice that after URL pattern match, my static assets are read through normal stdio file I/O. I use sendfile or whatever's the new hotness on my platform to make it fast.
Structural optimization: I notice that the majority of my hits are static for a small number of files. I write a new webserver (say static.foo.com) for them specifically, that looks up the URL in a hash table for an in-memory copy of the file. I just ping it with SIGHUP when I want to invalidate the hashtable's values.
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil"
Note, "small" and "97%".
That means that you should make the right large-scale decisions before you code. Or measure.