lvl102 3y ago

I think you’re underestimating the state of the art in this area. You can do amazing things with just a few minutes of readings.

rockemsockem 3y ago

No. You are vastly overestimating it. There is a reason there is no broadly available TTS service like there is for text-to-image. Anyone who says you can clone a voice in a few minutes is not talking about human-quality output.

lvl102 (OP) 3y ago

I’ve done a few models on my own. Stephen Fry is a relatively easy one. This is from 2020, so I am sure the state of the art is far better now.

rockemsockem 3y ago

I'm not saying you can't do it. I'm saying it likely does not sound good enough for the average person to listen to for a long time.

Got a sample?
