Another cool NUI example, this time simplifying control of a media player. What I liked was that it shows off not only generic voice commands (start, stop, and so on) but also support for specific tracks: "Kinect Play Track [song name here]".

Introducing KinecTunes - Control iTunes with Your Voice!

A few weeks ago, Microsoft released an SDK for the Kinect. As I had just gotten a Kinect, I was excited to check it out. I love the voice command feature of the Kinect (especially for Netflix on Xbox), so naturally I decided to make an app that incorporated it. I wanted something useful, something beyond just a workable demo. Today, KinecTunes is complete: it lets you control iTunes with voice commands!


  • Play specific songs or artists
  • Pause, Resume, or Stop the currently playing song
  • Two versions: a command-line tool and a Windows Forms app

Project Information URL:

Project Download URL:

Project Source URL:


private Grammar BuildItunesGrammar(RecognizerInfo ri)
{
     Choices choices = new Choices();
     itunesConnector = new ItunesConnector();
     List<KinecTrack> tracks = itunesConnector.GetFullLibrary();

     // Get the list of all artists and add a command for each
     List<string> artistList = tracks.Select(song => song.Artist).Distinct().ToList();
     foreach (var artist in artistList)
     {
         if (artist != null)
             choices.Add(string.Format("kinect play artist {0}", artist));
     }

     // Get the list of all song names and add a command for each
     List<string> songList = tracks.Select(song => song.Name).Distinct().ToList();
     foreach (var song in songList)
     {
         if (song != null)
             choices.Add(string.Format("kinect play song {0}", song));
     }

     // Add playback commands
     choices.Add("kinect pause");
     choices.Add("kinect play");
     choices.Add("kinect stop");
     choices.Add("kinect terminate");

     // Specify the culture to match the recognizer in case we are running in a different culture.
     var gb = new GrammarBuilder();
     gb.Culture = ri.Culture;
     // Attach the phrases; without this the grammar would contain no commands
     gb.Append(choices);

     return new Grammar(gb);
}
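
The grammar above only lists the phrases the recognizer should listen for; once a phrase is recognized, the app still has to work out which action it stands for. The following is a minimal sketch of that step, not the KinecTunes implementation. The names VoiceAction and VoiceCommandParser are hypothetical and do not appear in the project source; the only things taken from the excerpt are the phrase formats ("kinect play artist {0}", "kinect play song {0}", and the four playback commands).

```csharp
using System;

// Hypothetical types for illustration; not part of the KinecTunes source.
public enum VoiceAction { Unknown, PlayArtist, PlaySong, Pause, Play, Stop, Terminate }

public static class VoiceCommandParser
{
    // Prefixes mirror the string.Format patterns used when building the grammar.
    private const string ArtistPrefix = "kinect play artist ";
    private const string SongPrefix = "kinect play song ";

    // Maps recognized text to an action. For artist and song commands,
    // 'argument' receives the artist or song name; otherwise it is null.
    public static VoiceAction Parse(string recognizedText, out string argument)
    {
        argument = null;
        if (string.IsNullOrWhiteSpace(recognizedText))
            return VoiceAction.Unknown;

        string text = recognizedText.Trim();

        if (text.StartsWith(ArtistPrefix, StringComparison.OrdinalIgnoreCase))
        {
            argument = text.Substring(ArtistPrefix.Length);
            return VoiceAction.PlayArtist;
        }

        if (text.StartsWith(SongPrefix, StringComparison.OrdinalIgnoreCase))
        {
            argument = text.Substring(SongPrefix.Length);
            return VoiceAction.PlaySong;
        }

        // Fixed playback commands carry no argument.
        switch (text.ToLowerInvariant())
        {
            case "kinect pause": return VoiceAction.Pause;
            case "kinect play": return VoiceAction.Play;
            case "kinect stop": return VoiceAction.Stop;
            case "kinect terminate": return VoiceAction.Terminate;
            default: return VoiceAction.Unknown;
        }
    }
}
```

A speech-recognized event handler could pass the recognized text (for example e.Result.Text) to VoiceCommandParser.Parse and then call whatever iTunes wrapper method matches the returned action. Checking the two prefixes before the fixed commands keeps "kinect play artist ..." from being mistaken for the plain "kinect play" command.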

Contact Information:

The Discussion

  • Great idea!

  • I have a question about speech with the Kinect.
    Does the Kinect speech recognition need training? Or does it just use the Kinect as a microphone together with the speech engine that Windows 7 provides?


  • @CristiR: Hi, there is no "training" involved. You provide the Kinect with the words (the Choices in the above code) that it should recognize. The Kinect does have a built-in microphone, so when you say a command to it and it finds a matching Choice for that command, it fires an event that you can then write more code against. You also need to download additional speech libraries to get this to work; see the project page.

  • Great job on this, it works really well. I hope someday to learn as much as you. The only problem I ran into was that while music was playing, the Kinect would not recognize anything I said. After I turned the music down it would, which kind of defeats the purpose: to change the song you can't just say it, you have to go up to the computer and turn the music down. But obviously getting as far as you did and making it work so smoothly takes a lot of skill. Mad respect, and hopefully someone can think of a solution to the problem I just mentioned. Do you have one?

  • I also thought of this problem, and when I created this video showing my own run-through of this program, I placed my headphones over my camera so that the Kinect could hear my commands.

    One solution would be to wear a headset, so that the microphone is close to your mouth rather than near a speaker.
