Show HN: MP3 to Text

(veed.io)

28 points | sabbakeynejad | 5y ago | 18 comments

18 comments

remram 5y ago

"MP3 to Text" seems very inaccurate since you can only upload video files. In fact uploading an .mp3 file shows "File type not supported".

edit: I get it, OP just keeps submitting his service with different descriptions until one gets some upvotes. Only took 25 tries to get 30 points. Shameful.

arteez 5y ago

Just goes to show that people upvote anything if it sounds cool, and don't bother checking it out.

sabbakeynejad (OP) 5y ago

We add major new product features every few weeks and think it's important to see if people like them!

fredley 5y ago

MP3 to text? Why does it ask me to upload a video?

sabbakeynejad (OP) 5y ago

Oops, that's a UX problem. Will get this fixed now.

rock_artist 5y ago

Very cool, but how do I know which languages are supported? It says "VEED is able to recognise and transcribe languages from all over the world - English, Spanish, French, Chinese, and many more".

From my experience with NLP/AST the tricky part is models for some less common languages.

sabbakeynejad (OP) 5y ago

This is true. We support over 55 languages. The more popular the language, the better the results.

rubyron 5y ago

What’s the pricing? What speech-to-text engine is being used?

Clicking on the Sign Up button on iOS Safari does nothing.

Clicking on the Get Started button takes me to an Upload Video form - not what I expected from an MP3-to-text service.

remram 5y ago

Apparently you're limited to 50 MB for free, which is pretty short if you can't send audio files but only videos.

mxuribe 5y ago

This would have been genius in the Napster days of yore; why seek out and download MP3s yourself? Just sit back and have people send stuff to you! I kid, I kid! ;-)

remram 5y ago

Is there even a good offline version of this? There are some open-source tools for speech-to-text, but what about batch processing of audio files?

synesthesiam 5y ago

You may be interested in voice2json for offline batch processing: https://voice2json.org

Here's an example using GNU parallel: http://voice2json.org/recipes.html#parallel-wav-recognition
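For readers who would rather drive the batch from a script than from GNU parallel, here is a minimal Python sketch of the same idea. It is not taken from the voice2json docs: the helper names (`default_command`, `transcribe_one`, `transcribe_batch`) are made up for illustration, and the default command assumes voice2json is installed, a profile is already set up, and `voice2json transcribe-wav` accepts a WAV path as an argument. Swap in a different command builder if your setup differs.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, Iterable, List

CommandBuilder = Callable[[Path], List[str]]


def default_command(wav_path: Path) -> List[str]:
    # Assumption: `voice2json transcribe-wav <file>` prints one result for the file.
    # Check the voice2json command reference before relying on this exact form.
    return ["voice2json", "transcribe-wav", str(wav_path)]


def transcribe_one(wav_path: Path, build_command: CommandBuilder = default_command) -> str:
    # Run the external tool once and return whatever it wrote to stdout.
    result = subprocess.run(
        build_command(wav_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return result.stdout.strip()


def transcribe_batch(
    wav_paths: Iterable[Path],
    build_command: CommandBuilder = default_command,
    workers: int = 4,
) -> Dict[str, str]:
    # Threads are enough here: the heavy work happens in the child processes.
    paths = [Path(p) for p in wav_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        outputs = list(pool.map(lambda p: transcribe_one(p, build_command), paths))
    return {str(p): out for p, out in zip(paths, outputs)}
```

Something like `transcribe_batch(sorted(Path("recordings").glob("*.wav")))` would then return a dict mapping each file path to the tool's raw output. The command builder is a parameter so the same loop can wrap any other offline engine.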

remram 5y ago

Wow, this is exactly what I had in mind for "open-source tools for speech-to-text". I didn't know it did this too. Thanks a lot!

yorwba 5y ago

> voice2json is optimized for:

> Sets of voice commands that are described well by a grammar

> Commands with uncommon words or pronunciations

> Commands or intents that can vary at runtime

Doesn't sound like what you'd want for a generic transcription service.

synesthesiam 5y ago

It supports open-ended transcription too: https://voice2json.org/commands.html#open-transcription

Users have reported good accuracy with the English Deepspeech profile: https://github.com/synesthesiam/voice2json-profiles

Lemmih 5y ago

How does this work? And is it more accurate than YouTube's automatic captions?

asah 5y ago

Multi speaker?

236dev 5y ago

Reminds me of Descript.
