At NetApp, when we were doing scaling analysis in the early 2000s, it became clear that memory bandwidth was limited more by the transaction rate of the memory controller than by the actual available bandwidth of the memory subsystem.
That is because a memory transaction involves "opening" a page and then "doing the operation", which can be anywhere from one to several hundred locations long. "Pointer chasing" (code that reads in a structure, then dereferences a pointer to another structure, then dereferences that pointer to still another structure, etc.) was really hard on the memory subsystem. It burned a lot of memory ops reading relatively small chunks of memory.
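To make the "open a page, then do the operation" cost concrete, here is a rough toy model, not anything we actually ran. It counts how many times a row has to be opened for a stream of addresses, assuming a single bank with one open row at a time and no caches or prefetching. The row size, node size, and every function name are assumptions picked for illustration.

```python
import random

ROW_BYTES = 8192   # assumed DRAM row ("page") size; real parts vary
NODE_BYTES = 64    # assumed size of one small structure


def row_activations(addresses, row_bytes=ROW_BYTES):
    """Count row opens for an address stream.

    Toy model: one bank, one open row at a time, no caches, no prefetch.
    """
    open_row = None
    activations = 0
    for addr in addresses:
        row = addr // row_bytes
        if row != open_row:
            # A different row must be opened before this access can proceed.
            activations += 1
            open_row = row
    return activations


def sequential_addresses(n_nodes, node_bytes=NODE_BYTES):
    """Addresses touched when walking structures laid out back to back."""
    return [i * node_bytes for i in range(n_nodes)]


def pointer_chase_addresses(n_nodes, node_bytes=NODE_BYTES, seed=0):
    """Same structures, visited in a shuffled order to mimic following
    pointers between nodes scattered around memory."""
    order = list(range(n_nodes))
    random.Random(seed).shuffle(order)
    return [i * node_bytes for i in order]


if __name__ == "__main__":
    n = 100_000
    seq = row_activations(sequential_addresses(n))
    chase = row_activations(pointer_chase_addresses(n))
    print(f"sequential walk: {seq} row opens for {n} nodes")
    print(f"pointer chase:   {chase} row opens for {n} nodes")
```

Both walks read exactly the same bytes. The sequential walk gets many nodes out of each row it opens, while the shuffled walk pays for a row open on most accesses, which is the "lots of memory ops for small chunks" effect in miniature. Real controllers have multiple banks, caches, and prefetchers, so treat the counts as a qualitative picture only.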
It's a great topic in systems architecture, and there are a number of papers on it.