One of the lessons I'm still trying to absorb is how convinced we all were (and many still are) that there must be a ton of implicit parallelism in the world, but once we went looking for it, it turned out there was hardly any.
I've been chewing on this for years, trying to figure out whether this is something true about the world, something true about the problems we try to solve, or an artifact of the pervasive use of the imperative paradigm, which is highly ordered and makes it very easy to impose incidental ordering constraints. Now, clearly, the latter is a non-trivial component... but I still struggle with whether it is the full answer, because when people have sat down with clean sheets of paper and tried to solve problems with extensive parallelism, they've largely failed to get more than low single-digit speedup factors, and those efforts have included non-imperative paradigms like logic programming.
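To make the "incidental ordering" point concrete, here is a minimal sketch (the function names are mine, purely for illustration): an imperative accumulation loop spells out a strict sequential order through its loop-carried variable, even though the underlying computation is an order-free reduction that could, in principle, be split across cores.

```python
from functools import reduce
import operator

def total_sequential(xs):
    # The loop-carried variable `acc` forces one specific order on the
    # additions, even though addition here is associative -- the ordering
    # is an artifact of the imperative style, not of the problem.
    acc = 0
    for x in xs:
        acc += x
    return acc

def total_reduction(xs):
    # The same computation expressed as a reduction: no inherent order,
    # so a sufficiently smart runtime could evaluate it as a tree of
    # pairwise partial sums.
    return reduce(operator.add, xs, 0)

def total_chunked(xs, nchunks=4):
    # One way the parallelism could be exploited: each chunk's partial
    # sum is independent of the others and could run on a separate core;
    # only the final combine is sequential.
    chunks = [xs[i::nchunks] for i in range(nchunks)]
    partials = [sum(c) for c in chunks]
    return sum(partials)
```

All three compute the same result; the difference is only in how much ordering the notation bakes in, which is exactly the ambiguity a compiler looking for implicit parallelism has to fight through.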
It isn't exactly news that our intuitive beliefs about our code and its real characteristics are quite out of whack. This underlies the pervasive advice to go grab a profiler whenever you're trying to accelerate some code: even experts with decades of experience in the field are routinely completely wrong about what is slow in a given bit of code. This is the aspect I'm still struggling with. It still feels like there should be so much more parallelism available....
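For what that profiler advice looks like in practice, here is a minimal sketch using Python's standard-library `cProfile` (the function being profiled is a made-up stand-in): measure first, and let the stats table tell you where the time actually went, rather than trusting intuition.

```python
import cProfile
import io
import pstats

def slow_string_build(n):
    # Deliberately naive: repeated string concatenation, a classic
    # candidate for "I didn't think THAT was the slow part".
    s = ""
    for i in range(n):
        s += str(i)
    return s

# Profile the call and capture the report instead of guessing.
pr = cProfile.Profile()
pr.enable()
slow_string_build(10_000)
pr.disable()

buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

The printed table ranks functions by cumulative time, which is the ground truth the advice is pointing at: the hot spot it names is frequently not the one you would have bet on.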