Training the T2S model from scratch takes around 8h on 96 A100 GPUs. Training the `tiny` S2A model is around 3x faster (training the HQ `small` variant takes about as long as T2S).
I think you would get good results with fine-tuning, but unfortunately we don't have a user-friendly notebook or script for that right now. The biggest model is only 800MB (FP32), so you won't need a very big GPU to fine-tune it: with Adam, weights + gradients + the two optimizer moments come to roughly 4× the model size, so about 3.2GB plus activations.
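If you want to experiment in the meantime, a plain PyTorch training loop is enough to get started. The sketch below is a generic fine-tuning skeleton, not a WhisperSpeech API: the checkpoint path, the `model(x) -> logits` call, and the random token dataset are all placeholder assumptions you'd swap for the real model and your own (input, target) token pairs.

```python
# Minimal fine-tuning sketch (plain PyTorch, NOT a WhisperSpeech API).
# The checkpoint path, model call signature, and dataset below are
# placeholders -- replace them with the real model and your own data.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder: load a pretrained checkpoint (path is an assumption;
# weights_only=False is needed to unpickle a full model object).
model = torch.load("t2s-small.pt", map_location=device, weights_only=False)
model.train()

# Placeholder dataset: random token sequences standing in for real
# fine-tuning pairs of (input tokens, target tokens).
ds = TensorDataset(torch.randint(0, 512, (256, 128)),
                   torch.randint(0, 512, (256, 128)))
dl = DataLoader(ds, batch_size=8, shuffle=True)

# A low learning rate is the usual choice when fine-tuning.
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

for epoch in range(3):
    for x, y in dl:
        x, y = x.to(device), y.to(device)
        logits = model(x)  # assumes model(x) returns per-token logits
        loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                               y.reshape(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```

With an 800MB model this loop should fit comfortably on a single consumer GPU at small batch sizes.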