Today we're highlighting two recent posts from Abhijit Jana, both speech-related.
If you've been following along for a while, you know how I feel about speech and why I think it's a killer feature. Abhijit's two posts help you understand and make better use of speech in your next app...
When speech is recognized by the Speech Recognizer engine, the engine returns the recognized words as a collection of the type RecognizedWordUnit. This set of words is extremely useful for working with sentences. In my previous post, I discussed recognizing a statement like “draw a red circle” or “draw a green circle”, where we had to identify the sentences in the audio captured by Kinect and then split them into a series of words.
With the Microsoft Speech API, the SpeechRecognitionEngine class handles all speech-related operations. You can attach an event handler to the SpeechRecognized event, which fires whenever the audio is internally converted into text and identified by the recognizer. The following code shows how you can create an instance of SpeechRecognitionEngine and register the SpeechRecognized event handler.
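As a minimal sketch of that setup (not the original post's listing), the example below creates a SpeechRecognitionEngine, registers a SpeechRecognized handler, and prints each RecognizedWordUnit from the result. It assumes the Microsoft Speech Platform runtime and a recognizer language pack are installed. The grammar, the class name, and the use of the default audio device are illustrative assumptions; a Kinect application would normally feed the engine from the sensor's audio stream instead.

```csharp
using System;
using Microsoft.Speech.Recognition;

class RecognizedWordsDemo
{
    static void Main()
    {
        // Assumption: a Microsoft Speech recognizer is installed; the
        // parameterless constructor uses the system default recognizer.
        using (var engine = new SpeechRecognitionEngine())
        {
            // Hypothetical grammar: "draw a red circle" / "draw a green circle".
            var builder = new GrammarBuilder();
            // The grammar culture has to match the recognizer culture.
            builder.Culture = engine.RecognizerInfo.Culture;
            builder.Append("draw a");
            builder.Append(new Choices("red", "green"));
            builder.Append("circle");
            engine.LoadGrammar(new Grammar(builder));

            // Register the handler that receives recognized phrases.
            engine.SpeechRecognized += OnSpeechRecognized;

            // Stand-in for the Kinect audio stream (assumption).
            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening. Press Enter to quit.");
            Console.ReadLine();
            engine.RecognizeAsyncCancel();
        }
    }

    static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Result.Words is the collection of RecognizedWordUnit objects.
        foreach (RecognizedWordUnit word in e.Result.Words)
        {
            Console.WriteLine("{0} (confidence {1:F2})", word.Text, word.Confidence);
        }
    }
}
```

Each RecognizedWordUnit carries its own Text and Confidence, so a command parser can inspect individual words (for example, the color) rather than re-splitting the full phrase.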
Project Information URL: http://dailydotnettips.com/2014/01/20/get-the-list-of-recognized-words-from-kinect-speech-commands/
In my Kinect for Windows SDK Tips series, the last few posts have discussed speech recognition using the Kinect for Windows SDK. You have seen how to load and unload multiple grammars, how to use wildcards with the grammar builder, and how to get the list of recognized words from Kinect. This post is about the confidence level of recognized words, which I think is required for every type of speech-enabled application using Kinect.
In a speech-enabled Kinect application, whenever speech is recognized, we usually invoke a method to parse the command and then perform an action based on the recognized command. But before we parse it:
The speech recognizer also reports a confidence level for the speech it identified. If speech is detected but does not match properly, or has a very low confidence level, the SpeechRecognitionRejected event handler will fire.
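A rough sketch of how a confidence check might sit in front of command parsing is shown below. The ConfidenceFilter class, its Attach and IsConfident methods, and the 0.5 threshold are hypothetical names and values chosen for illustration. Only the SpeechRecognized and SpeechRecognitionRejected events and the Result.Confidence and Result.Text properties come from the Microsoft Speech API.

```csharp
using System;
using Microsoft.Speech.Recognition;

static class ConfidenceFilter
{
    // Hypothetical cut-off (confidence ranges from 0 to 1); tune per application.
    public const float Threshold = 0.5f;

    // Pure check, kept separate so it can be reused and tested.
    public static bool IsConfident(float confidence)
    {
        return confidence >= Threshold;
    }

    // Wires both events onto an engine that the caller has already configured.
    public static void Attach(SpeechRecognitionEngine engine)
    {
        engine.SpeechRecognized += (sender, e) =>
        {
            if (!IsConfident(e.Result.Confidence))
            {
                // Recognized, but too uncertain to act on.
                Console.WriteLine("Ignored low-confidence phrase: {0}", e.Result.Text);
                return;
            }

            Console.WriteLine("Accepted: {0} ({1:F2})", e.Result.Text, e.Result.Confidence);
            // Parse the command and perform the action here.
        };

        engine.SpeechRecognitionRejected += (sender, e) =>
        {
            // Raised when speech was heard but no grammar matched well enough.
            Console.WriteLine("Rejected (confidence {0:F2})", e.Result.Confidence);
        };
    }
}
```

Calling ConfidenceFilter.Attach(engine) before starting recognition keeps the threshold logic in one place, so the command parser only ever sees phrases that passed the check.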