I'm really glad to see work and progress being made in this arena, and I'm hoping to see more people playing around with it.
Some improvement in text comprehension would be great. Right now it doesn't understand many of my queries. Keep up the good work!
My training set is specifically designed around conversational interview and personal questions, but I think a lot of the people who reach the site don't grasp that.
Here are some examples of the input it gets and has no clue what to do with:
- "um changes nice so when you change the things december" matched to: -1: no match
- "nexus 10" matched to: 13: Huh?
- "pictures" matched to: 14: Huh?
- "call" matched to: 17: Cool
- "videos" matched to: 16: Huh?
- "change" matched to: 15: Huh?
And this is a set of inputs where people did get it:
- "hi what's your name" matched to: 23: My name
- "what's your name" matched to: 27: My name
- "what do you do" matched to: 31: What I do
- "why should we hire you" matched to: 38: Why you should hire me
- "what's your favorite food" matched to: 35: Food
- "wendy's see yourself in 5 years" matched to: 28: Goals
Like if you could just build smart NLP around SMS menus, you'd solve the developing world's SMS-as-a-helpdesk frustrations.
Or think of the premium subscription services you could charge for when people can interact on the level of natural language instead of just replying with simple commands.
"for the first time, the developers themselves do not have to be experts in the field, or face the prospect of huge expense to bring in that technical knowledge from elsewhere." - I love that the building blocks of cool experiences are becoming better polished and easier to fit together.
It's a good time to be alive, that's for sure!
However, the cool thing about Wit is that they're constantly updating their suite of NL recognizers. The more you use the service, the better it gets, and it does so without you having to buy a new release of Dragon. :)
That's disappointing since the only problem I ran into with doing home automation via a web application was the speech-to-text, not processing commands once they were in text. A list of regular expressions works quite well for that.
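For the curious, a regex command table like that can be tiny. This is just a sketch of the idea; the patterns, device names, and return strings are made up for illustration:

```javascript
// Ordered list of command patterns; first match wins.
// Each entry maps a regex over the transcribed text to an action.
const commands = [
  { pattern: /turn (on|off) the (\w+) lights?/i,
    run: (m) => `lights:${m[2]}:${m[1].toLowerCase()}` },
  { pattern: /set (?:the )?thermostat to (\d+)/i,
    run: (m) => `thermostat:${m[1]}` },
];

// Try each pattern against the speech-to-text output.
function dispatch(text) {
  for (const { pattern, run } of commands) {
    const m = text.match(pattern);
    if (m) return run(m);
  }
  return null; // no command recognized
}
```

So `dispatch("turn off the kitchen lights")` yields an action while anything unmatched falls through to `null`, which is plenty for a fixed set of home-automation commands.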
The HTML5 Speech Recognition API in Chrome kinda sucks. It does speech to text well, but reliably keeping the API listening for speech at all has been challenging. Even a bunch of code basically checking "has the webkitSpeechRecognition object borked itself yet? recreate it and restart listening" every two seconds doesn't work reliably.
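One pattern that seems less flaky than polling is restarting from the recognizer's own `onend` handler, since in Chrome an error event is followed by `end` anyway. A rough sketch (the injectable factory and all names here are illustrative, not part of Chrome's API; in the browser the factory would be `() => new webkitSpeechRecognition()`):

```javascript
// Keep a recognizer alive by recreating it whenever it stops on its own.
function keepListening(makeRecognizer, onResult) {
  let rec = null;
  const start = () => {
    rec = makeRecognizer();
    rec.continuous = true;      // keep listening across utterances
    rec.onresult = onResult;
    // When the engine gives up (silence timeout, network error, ...),
    // throw the old instance away and start a fresh one.
    rec.onend = () => start();
    rec.start();
  };
  start();
  // Returned function stops for good: detach the restart hook first.
  return () => { rec.onend = null; rec.stop(); };
}
```

No guarantees this dodges every way `webkitSpeechRecognition` can bork itself, but restarting on `onend` at least reacts immediately instead of on a two-second timer.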
I'd love a JavaScript API that can listen to the microphone, determine if anything has been spoken (versus silence or background noise), and when something that may be speech is detected, send it to another API endpoint that converts it to text.
Edit: They do take audio input, woo :) Thanks for the correction. https://wit.ai/docs/api#toc_9
Can't recommend this service enough.