Better HN

0 points · convery · 5y ago · 0 comments

You seem to know a lot about the topic. Any idea about the current state of text-to-speech? I haven't seen any open-source projects that would make, for example, an ebook enjoyable.

nshm · 5y ago

A recent, more or less reasonable one is https://github.com/TensorSpeech/TensorFlowTTS; it implements all the latest algorithms. For simple business books it will be OK; for emotional fiction it's probably not there yet.

liability · 5y ago

Extant TTS is already there for fiction, if you approach it with the right expectations (more an alternative to visual reading than dramatically read audiobooks). I've 'read' numerous fiction books using macOS's TTS ('Alex') and with my Kindle (3rd gen 'keyboard' model from 2010).

These extant solutions require an effort-investment from the user to work up to fast speeds, but once the user becomes acclimatized they work great. The neuroplasticity of the human brain seems to do a great job of smoothing out the wrinkles.

JZL003 · 5y ago

I agree - I've been using Google's TTS API for audiobooks and it's great. I switch between professional audiobooks (OverDrive is amazing and free through public libraries) and TTS and, while professionals can add something, you get used to TTS pretty fast. Google's TTS gives 1 million free characters a month, which is pretty generous for a single person, and it sounds pretty good. I read books with pretty weird character names (like the Wandering Inn web serial) and it never explodes. Sometimes it spells out character names, but even for very non-standard names it does fine.
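To make the free-tier arithmetic concrete, here is a rough sketch of how one might split a book into request-sized pieces and estimate how much of the 1 million character allowance it would use. The function names and the 4500-character chunk size are assumptions for illustration, not part of any TTS library; check the provider's own per-request limit and how it counts billable characters before relying on these numbers.

```python
import re

FREE_CHARS_PER_MONTH = 1_000_000  # free allowance quoted in the comment above
MAX_CHUNK_CHARS = 4500            # assumed per-request limit; check your provider


def split_into_chunks(text, max_chars=MAX_CHUNK_CHARS):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single over-long sentence is hard-split so no chunk exceeds the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def quota_fraction(text, free_chars=FREE_CHARS_PER_MONTH):
    """Fraction of the monthly free allowance this text would consume."""
    return len(text) / free_chars
```

Each chunk would then be sent to whichever TTS service or local model you use, and the resulting audio files concatenated in order. Breaking at sentence ends keeps the pauses between chunks from landing mid-sentence.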

I've experimented with some of Tacotron TTS/ESPnet to do the TTS on my computer and they work alright (and even if your laptop doesn't have a GPU, Google Colab works well for quick audiobook generation). Sometimes you hit weird edge cases and it makes some pretty weird sounds. I don't hit the million characters that often so it hasn't been a big deal, but I'll probably move to home-made just because I like tweaking it.

The way I think about it is that the written word doesn't have much intonation anyway, so as long as the audiobook doesn't offend me, it's a pretty good solution (and helps prevent eye strain after working on a computer all day).
