Just to complement what I said above, here's a link to a presentation I made a couple of years ago about performance optimizations in simulation:
https://indico.cern.ch/event/1052654/contributions/4521602/
We also have meetings dedicated to performance. Some of them are not public, but this series from ROOT is:
https://indico.cern.ch/category/14122/
If you search above, you will see many discussions about performance. The CI for ROOT also has a set of benchmarks to catch regressions. Geant4 has two systems to track performance: a CI job that checks every merge request, which I set up myself (not publicly accessible), and a more complex system run by FNAL:
https://g4cpt.fnal.gov/
These are just some examples from the projects I've worked on. There are also efforts to port code to GPUs and HPC systems, and many other projects, such as event generators, are undergoing performance work for HL-LHC as well. If you search online, you can probably find a lot more than what I've mentioned here.
Cheers,