Which part of this can't be done on TPUs?
https://en.wikipedia.org/wiki/Overlap%E2%80%93save_method#Ps... As far as I can tell, all of those operations can be done on TPUs. In fact, I linked to the operation list that shows they can be.
You'll need to link me to a specific implementation that you want me to port over, not just name-drop an algorithm. Got a link to a GitHub repo?
If your point is "There isn't a preexisting operation for overlap-save FFT" then... yes, sure, that's true. There's also no preexisting operation for any of the hundreds of other signal-processing algorithms you might want to run. But they can all be implemented efficiently out of the operations that do exist.
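To make that concrete, here is a minimal sketch of overlap-save written in JAX, using only padding, a gather, batched FFTs, an elementwise multiply, and a slice. The names `overlap_save` and `block_len` are made up for illustration and are not taken from any existing library. It has not been profiled on a TPU, so treat it as evidence that the algorithm decomposes into ordinary array operations, not as a performance claim.

```python
import jax
import jax.numpy as jnp


def overlap_save(x, h, block_len=256):
    """Full linear convolution of 1-D x with FIR filter h via overlap-save.

    block_len is the FFT size N (a hypothetical parameter name); each block
    contributes N - (M - 1) valid output samples, where M = len(h).
    """
    m = h.shape[0]
    step = block_len - (m - 1)          # valid output samples per block
    if step <= 0:
        raise ValueError("block_len must exceed len(h) - 1")
    out_len = x.shape[0] + m - 1        # length of the full linear convolution
    n_blocks = -(-out_len // step)      # ceiling division

    # Prepend M-1 zeros (history for the first block) and zero-pad the tail
    # so that every block is exactly block_len samples long.
    total = (n_blocks - 1) * step + block_len
    padded = jnp.pad(x, (m - 1, total - (m - 1) - x.shape[0]))

    # Gather the overlapping blocks with one static index matrix.
    idx = jnp.arange(n_blocks)[:, None] * step + jnp.arange(block_len)[None, :]
    blocks = padded[idx]                # shape (n_blocks, block_len)

    # Circular convolution of every block with h, as one batched FFT.
    h_fft = jnp.fft.rfft(h, n=block_len)
    y = jnp.fft.irfft(jnp.fft.rfft(blocks, axis=-1) * h_fft,
                      n=block_len, axis=-1)

    # The first M-1 samples of each block are aliased; keep only the rest.
    return y[:, m - 1:].reshape(-1)[:out_len]


# All shapes are static, so the whole function can be compiled as one program.
overlap_save_jit = jax.jit(overlap_save, static_argnames="block_len")
```

Calling `overlap_save_jit(x, h, block_len=128)` should match `jnp.convolve(x, h)` up to floating-point error. If there's a specific step in a real implementation that you think can't be expressed this way, that's the code I'd like to see.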
Yet it remains a fact that a TPU cannot run certain workloads without offloading to the CPU (making them orders of magnitude slower), and that's somehow okay?
I think this is the crux of the issue: you're saying X can't be done, I'm saying X can be done, so please link to a specific code example. Emphasis on "specific" and "code".