I am sorry, do you want to say "performance" and "big-O" have nothing to do with trying to make the program go faster?
I think you have lost your way and need to backtrack a little bit.
The whole point of big-O analysis is to be able to reason about how fast a program will be as input size grows.
If I ask for an O(1) algorithm and you build something that is as fast as possible, faster in every case, but whose running time still grows with the input -- well, then it's not O(1) and not what I asked for. It may be fast, but fast is not the same as constant-time.
Thus, I do not consider big-O to be synonymous with speed. They are different, because the best possible big-O, O(1), does not necessarily mean the fastest.
Maybe not synonymous in the sense of being technically identical, but certainly a pretty good approximation -- and absolutely synonymous in the sense that the whole purpose of big-O notation is to let you reason about speed in more general terms, comparing algorithms in pseudocode instead of having to actually write the program and run it through a benchmark.
So, no: If "the best possible big-O, O(1), does not necessarily mean the fastest," then it isn't actually the best.
Asymptotic complexity is of course only loosely correlated with speed. In real systems with real workload sizes, constant factors matter a lot.
It's an important technicality that matters when you are doing performance tuning on modern CPUs. Big-O is asymptotic and omits the constant multiplier, so if you're comparing, for example, deletion from a naive binary search tree vs. an unsorted array, it's not obvious that a search on a naive pointer-based tree (O(log n)) is faster than a find, swap, and delete on the array (O(n)) until n is sufficiently large.
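To make the comparison concrete, here is a minimal sketch of the two deletion strategies. The names (`find_swap_delete`, `bst_contains`, etc.) are illustrative, not from any library; the point is that the O(n) version is a linear scan over contiguous memory, while the O(log n) version pays a pointer dereference (and potential cache miss) at every step.

```python
def find_swap_delete(arr, value):
    # O(n): linear scan, then O(1) removal by swapping with the last slot.
    # The scan walks contiguous memory, so the constant factor is tiny.
    for i, v in enumerate(arr):
        if v == value:
            arr[i] = arr[-1]
            arr.pop()
            return True
    return False

class Node:
    __slots__ = ("key", "left", "right")
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def bst_insert(root, key):
    # Naive, unbalanced insertion into a pointer-based BST.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    elif key > root.key:
        root.right = bst_insert(root.right, key)
    return root

def bst_contains(root, key):
    # O(log n) on a balanced tree, but each step chases a pointer to a
    # node that may miss cache -- a much larger constant per step.
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False
```

For small n, the "worse" O(n) scan routinely wins a real benchmark; only once n is large enough does the asymptotic advantage of the tree show up.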
Another example is iterating through the pixels of a very large bitmap, on the order of ten million pixels or more. Iterating through all pixels is linear in the number of pixels (O(width * height)) either way, but assuming it's a row-oriented bitmap, which most are, scanning row by row can be substantially faster than scanning column by column because of caching behavior.
Point being, big-O and actual performance are not always the same thing because the constant factor can sometimes dominate depending on what you're trying to do.
To take a real-world example, immutable operations in JavaScript are often more performant than mutable operations -- thanks to the fact that Chrome's V8 engine "cheats" on hot paths.
However, in Big O terms, constantly generating new data structures, even inside loops, clearly trashes the algorithm's space complexity. On engines that don't optimize for immutability, the mutable approach with the better Big O is more performant; on the ones that do, the immutable approach is.
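The complexity cost of the immutable style is easy to see in a sketch. This is Python rather than JavaScript, purely to keep the example self-contained, and the function names are made up for illustration; the same shape applies to `[...arr, x]` in a JS loop, modulo whatever the engine optimizes away.

```python
def running_totals_mutable(xs):
    # In-place accumulation: one output list, O(n) time and space.
    out = []
    acc = 0
    for x in xs:
        acc += x
        out.append(acc)
    return out

def running_totals_immutable(xs):
    # "Immutable" style: a fresh list per step. Each `out + [acc]`
    # copies the whole prefix, so the loop does O(n^2) work and churns
    # O(n^2) total allocation -- the complexity cost described above.
    out = []
    acc = 0
    for x in xs:
        acc += x
        out = out + [acc]  # new list every iteration
    return out
```

Both return the same result; on paper the mutable version clearly wins, and whether it wins in practice depends on the runtime, which is exactly the point.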
It's because of examples like these that you want to disconnect the concept of "performance" from Big O: context matters for the former, while the latter relates only to the algorithm itself.
> No it’s not, it’s about trying to quantify algorithmic complexity.
And what would you say is the goal of quantifying algorithmic complexity?
Thus, the answer to why we care about complexity is "it depends". But it is not always about speed.