rockemsockem 3y ago · 0 points

No. You are vastly overestimating it. There is a reason there is no broadly available TTS service like there is for text-to-image. Anyone who says you can clone a voice in a few minutes is not talking about human-quality results.

lvl102 3y ago

I’ve done a few models on my own. Stephen Fry is a relatively easy one. This is from 2020, so I am sure the state of the art is far better now.

rockemsockem (OP) 3y ago

I'm not saying you can't do it, I'm saying it likely does not sound good enough for the average person to listen to for a long time.

Got a sample?
