That's pretty much exactly how I started. Ran whisper.cpp locally for a while on a 3070 Ti. It worked quite well when n=1, i.e. one file at a time.
For our use case, we may get 1 audio file at a time, or we may get 10. Of course queuing them is possible, but we decided to prioritize speed and reliability over self-hosting.