Shor's original paper discusses the fundamental need for several runs and gives the time complexity.
https://arxiv.org/pdf/quant-ph/9508027.pdf
The paper you linked to is just saying that the need for several runs and 2006 readout error rates together led to awful scaling (though still better than classical factorisation). That isn't surprising. Error rates in all parts of even modern quantum computers are still too high to scale well. Especially when repetition is called for.
Ultimately, complexity is calculated using logical qbits. Kalai doesn't think that even with error correction codes we can get close enough. Most other people think we can. That paper you linked to is just making the point that we're definitely not there yet. When we do, algorithms in BQP will still need several runs on many algorithms (the exact same way as algorithms in BPP do) but that should already be incorporated into those algorithm's complexity.