Kinect for Windows Quickstart Series

Audio Fundamentals

Description

In the final installment of the Kinect for Windows Quickstart series, we’ll discuss how you can leverage the audio features of Kinect in your application, including:

  • How to use your Kinect to determine the angle and confidence for where a sound is coming from
  • How to use the KinectAudioSource to record audio synchronously and asynchronously
  • How to build a basic speech recognition application to dynamically turn application features on/off
  • How the speech recognition engine can be used even when the application is not the current active application 

    The Discussion

    • Ray

      I was wondering if anyone else has problems with the Audio Demo. It will not run for me at all. It says that I need the Kinect SDK installed or that there is an error starting it. I have the Kinect SDK language pack and the V11 SDK/runtime, etc.

    • Dan

      @Ray: Assuming you are talking about the KinectAudioDemo, can you send in the exact error that's causing the issue? There is a message box that displays the message below if it doesn't get a Kinect recognizer or there was an issue creating the speech recognition engine. Is this the error message you are seeing?  

      "There was a problem initializing Speech Recognition. Ensure you have the Microsoft Speech SDK installed.","Failed to load Speech SDK",

    • gswitz

      This is great! I remember you when you were an evangelist presenting in Richmond, Va (2001?). I'm having a great time with the Kinect, Xbox Controller, C#, VPL and Lego Mindstorms. I hadn't yet played with the positional data for the sound. This must mean that there are two microphones on the Kinect getting a stereo image? Does this risk sound issues related to phase issues in the recordings? Is the recorded output the sum of both mics, or one mic, or the other? Can you get both to make a stereo recording?

    • kendrick0772

      @Dan:

      Hello! Sorry for my bad English and for being so direct.

      I have a few questions, as I don't fully understand this.

      There are no coding details on how to develop this section.

      Step-by-step guidance is vital for beginners like us.

      Is there another source we can refer to?

      Is there a basic code sample that combines audio tracking with skeleton tracking?

      I need some guidance here.

      Thanks in advance!

    • HugoBringas

      Agreed. I can't follow this very well either; I'm a beginner.

    • Dan

      @kendrick0772: Thanks for your feedback :)

      The issue with audio programming is that there is a lot to cover: sound source localization, audio recording, and speech recognition. I wanted to make sure I covered all of them and how they work, rather than coding each demo individually, since there is a lot of monotonous code that you can just reuse from the samples. Either way, it's good feedback, and perhaps I should look at coding just one of the demos.

    • Antonis

      @Dan Hello Dan. I have a rather small problem, I think, with the speech aspect. I want to add some speech elements to my project, but when I add the line
      'private EnergyCalculatingPassThroughStream stream;' I get an error saying that I don't have a reference to it. I've added all the references and using statements that are needed. Can you please tell me what I am missing?
      Thank you in advance.

    • KinectFJamal

      @Dan Hello Dan. I have a question about image detection.

      What do I do at the beginning?

      Please help.

      Best regards

       

    • Dan

      @Antonis: Hmm, I'm not sure what could be missing. Can you copy/paste the exact error?

    • Dan

      @KinectFJamal: Can you give me more detail on exactly what you're trying to do?

    • KimZe

      Hi Dan, first of all, congrats on these GREAT video tutorials about Kinect.

      Just one question: what about Portuguese speech recognition? Do you know if there will be a language pack available for Kinect? With Portuguese being the 5th most spoken language in the world, with 272.9 million people speaking it, it's important that Microsoft keep this in mind.

      Thanks and keep up the good work!

      Obrigado.

    • Dan

      @KimZe: Thanks for the feedback :)

      The team has announced what's coming in the next version on their blog, and the next language packs coming are: French, Spanish, Italian, and Japanese. If you want to give feedback to the Kinect team, I'd suggest you do it directly on that blog post.

      http://blogs.msdn.com/b/kinectforwindows/archive/2012/03/26/what-s-ahead-a-sneak-peek.aspx

    • ehsan rahim

      Hello,
      I am working with the Kinect sensor and want to record video through it. I have set up the environment and created the buttons, but I need some idea of how to record the video.

      Best regards,

    • Dan

      @ehsan rahim: The 1.5 release of the SDK, releasing in May, will include the ability to record video.

      Here's a snippet from the blog post:

      Among the most exciting new capabilities is Kinect Studio, an application that will allow developers to record, playback and debug clips of users engaging with their applications.  

       

      http://blogs.msdn.com/b/kinectforwindows/archive/2012/03/26/what-s-ahead-a-sneak-peek.aspx

    • Stevie Giovanni

      Hi Dan, I'm trying to integrate Kinect speech recognition into the face tracking visualization sample application. I copied some of the code from speechbasics-d2d, but I'm having a problem. When I use NuiInitialize(dwNuiInitDepthFlag | NUI_INITIALIZE_FLAG_USES_SKELETON | NUI_INITIALIZE_FLAG_USES_COLOR | NUI_INITIALIZE_FLAG_USES_AUDIO); the SREngineConfidence is always very low, but when I initialize only the audio it works fine. Do you know why that is? Is there sample code that combines skeleton tracking, face tracking, and speech recognition? Thanks.

    • Stevie Giovanni

      OK, so it seems that it sometimes works fine with NuiInitialize(dwNuiInitDepthFlag | NUI_INITIALIZE_FLAG_USES_SKELETON | NUI_INITIALIZE_FLAG_USES_COLOR | NUI_INITIALIZE_FLAG_USES_AUDIO); but do you know how I can make it always work like the original speechbasics-d2d?

    • smithdavidp

      Okay, here's a new one for you. I am trying to modify the MRDS4 code for Robotics tutorial 7 (Speech and vision in Robots). I am using the Kinect for Windows (no Xbox) and the Microsoft.Speech SDK 11, and I am programming in MS/VS 2010 (C#). My goal is to make the follow-me and drive-by-voice examples work with the Kinect 1.5 and Microsoft.Speech 11 SDKs. Has anyone accomplished this yet? Running the original tutorial with my Eddie Mark platform netted some interesting results: nice slow turns when you say left or right, but say forward or backward and the thing is a jackrabbit. If you don't say stop within a second of the forward command, the thing will have traveled 6 feet. I know that I just need to cut the speeds down in the original code, but Microsoft is all happy over the Kinect 1.5 and the Speech SDK 11, so I want to run the tutorial under those libraries and manifests. Needless to say, the iRobot was the target of that tutorial. Can anyone help me with this?

    • Kimberly Shane

      Hi Dan

      Is it possible for a program to record video (with audio) and have voice recognition at the same time? Thanks!

    • juanpibanez

      Hi Dan, I was wondering the following after taking a look at the source code of this example:

      - Why do we need the EnergyCalculatingPassThroughStream class? Is it really necessary for just recognizing single words like "play" or "stop"?

      - In which part of the code do I set the language pack, and how do I do it, for example, to load the Spanish (Mexico) one?

    • Roger

      Hi Dan.
      This video is great, but I'm not sure that I can follow it.
      I'm just a newbie in Kinect programming and have a favor to ask you.
      - Can you show me how to write simple C# code so that when you say something to the Kinect, it immediately displays your sentence in a text box?

      Looking forward to your response ASAP.
      Thanks.
