Better HN
0 points
lvl102
3y ago
I think you’re underestimating the state of the art in this area. You can do amazing things with just a few minutes of readings.
rockemsockem
3y ago
No. You are vastly overestimating it. There is a reason there is no broadly available TTS service like there is for text-to-image. Anyone who says you can clone a voice in a few minutes is not talking about human-quality output.
lvl102
OP
3y ago
I’ve done a few models on my own. Stephen Fry is a relatively easy one. This is from 2020, so I am sure the state of the art is far better now.
rockemsockem
3y ago
I'm not saying you can't do it, I'm saying it likely does not sound good enough for the average person to listen to for a long time.
Got a sample?