Ask HN: Review my idea: Online transcription service with ASR

2 points | jcrocholl | 16y ago | 0 comments

I want to make online transcription (speech to text) more accurate, faster, and cheaper at the same time. Users can upload MP3 files of interviews, podcasts, or any other voice recording through a web form or an API. Alternatively, they can call a phone number, enter their account number on the keypad, and dictate their message after the beep.

My first step would be to cut the input into 5-minute chunks and use automatic speech recognition (ASR) to generate a rough draft of the transcript. Each draft chunk is then posted automatically to Amazon Mechanical Turk for proofreading and editing. Turkers earn points for good work, which qualifies them for premium tasks that pay more.
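A minimal sketch of the chunking step, assuming the total duration of the recording is already known in seconds. The function name `chunk_boundaries` and its parameters are made up for illustration. It only computes the time ranges; actually cutting the MP3 and running ASR on each piece would be handled by separate tools.

```python
def chunk_boundaries(duration_s, chunk_s=300.0):
    """Return (start, end) pairs in seconds that cover the whole recording.

    chunk_s defaults to 300 seconds, i.e. the 5-minute chunks described above.
    The last chunk may be shorter than chunk_s.
    """
    duration_s = float(duration_s)
    chunk_s = float(chunk_s)
    if duration_s <= 0 or chunk_s <= 0:
        raise ValueError("duration_s and chunk_s must be positive")

    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # A 12-minute interview becomes two full chunks and one 2-minute remainder.
    for start, end in chunk_boundaries(12 * 60):
        print(f"{start:.0f}s to {end:.0f}s")
```

Cutting at fixed offsets can split a word or sentence between two chunks. A real prototype would probably want to move each boundary to a nearby pause, or overlap the chunks by a few seconds.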

The resulting audio and text can be used to improve the acoustic models for the speech recognition engine, so the automatic transcripts get better over time and less work is required for proofreading and editing. It would be possible to train several classes of speaker-independent acoustic models, e.g. one for adult female speakers with a German accent. Languages other than English are possible too.
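One way to check whether the automatic transcripts really get better over time would be to compare each ASR draft against the version the Turkers corrected. The sketch below uses word error rate (WER), a standard ASR metric. Choosing WER here is my assumption, not something decided above, and the function name is made up for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance between two transcripts, divided by the
    number of words in the reference (the human-corrected text).

    Comparison is case-insensitive and splits on whitespace only, so
    punctuation attached to a word counts as part of that word.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # word in reference missing from draft
            insertion = curr[j - 1] + 1  # extra word in the draft
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    corrected = "the cat sat on the mat"
    asr_draft = "the cat sat on a mat"
    print(word_error_rate(corrected, asr_draft))
```

If the average WER per chunk falls as the acoustic models are retrained, the proofreading step is needing fewer corrections, which is the effect described above.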

This service is very similar to castingwords.com, but it should be faster and cheaper because it uses self-improving speech recognition technology.

Please let me know what you think. I'm planning to implement a simple prototype in Seattle during the next few weeks. Want to brainstorm with me over beer or coffee? We could be co-founders if we work well together.
