Project Lily and Context-Aware Dialogue with Kinect

Today's project focuses on a part of Kinect development that might not be as awe-inspiring as augmented reality, games, 3D modeling, NUIs, etc., but in the end is one of the killer features of the Kinect: speech recognition, and using it to add communication capabilities to your projects...

Context-Aware Dialogue with Kinect

Meet Lily, my office assistant. We converse often, and at my direction Lily performs common business tasks such as looking up information and working with Microsoft Office documents. But more important, Lily is a virtual office assistant, a Microsoft Kinect-enabled Windows Presentation Foundation (WPF) application that’s part of a project to advance the means of context-aware dialogue and multimodal communication.

Before I get into the nuts-and-bolts code of my app—which I developed as part of my graduate work at George Mason University—I’ll explain what I mean by context-aware dialogue and multimodal communication.

Context-Aware Dialogue and Multimodal Communication

As human beings, we have rich and complex means of communicating. Consider the following scenario: A baby begins crying. When the infant notices his mother is looking, he points at a cookie lying on the floor. The mother smiles in that sympathetic way mothers have, bends over, picks up the cookie and returns it to the baby. Delighted at the return of the treasure, the baby squeals and gives a quick clap of its hands before greedily grabbing the cookie.

This scene describes a simple sequence of events. But take a closer look. Examine the modes of communication that took place. Consider implementing a software system where either the baby or the mother is removed and the communication is facilitated by the system. You can quickly realize just how complex and complicated the communication methods employed by the two actors really are. There’s audio processing in understanding the baby’s cry, squeal of joy and the sound of the clap of hands. There’s the visual analysis required to comprehend the gestures represented by the baby pointing at the cookie, as well as inferring the mild reproach of the mother by giving the sympathetic smile. As often is the case with actions as ubiquitous as these, we take for granted the level of sophistication employed until we have to implement that same level of experience through a machine.

Let’s add a little complexity to the methods of communication. Consider the following scenario. You walk into a room where several people are in the middle of a conversation. You hear a single word: “cool.” The others in the room look to you to contribute. What could you offer? Cool can mean a great many things. For example, the person might have been discussing the temperature of the room. The speaker might have been exhibiting approval of something (“that car is cool”). The person could have been discussing the relations between countries (“negotiations are beginning to cool”). Without the benefit of the context surrounding that single word, one stands little chance of understanding the meaning of the word at the point that it’s uttered. There has to be some level of semantic understanding in order to comprehend the intended meaning. This concept is at the core of this article.
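To make the "cool" example concrete, here is a minimal sketch of how surrounding words might be used to pick a meaning. It is a toy illustration only and not how Project Lily works: the sense names, the cue-word lists and the guess_sense function are all hypothetical, and a real system would need far richer semantic modeling than keyword overlap.

```python
# Toy illustration only: the sense names and cue words below are hypothetical
# and are not taken from Project Lily.
SENSE_CUES = {
    "temperature": {"room", "thermostat", "air", "weather", "degrees"},
    "approval": {"car", "love", "awesome", "looks", "nice"},
    "relations": {"negotiations", "countries", "talks", "diplomats", "treaty"},
}


def guess_sense(context_words):
    """Return the sense whose cue words overlap most with the context, or None."""
    context = {word.lower() for word in context_words}
    # Score each sense by how many of its cue words appear in the context.
    scores = {sense: len(cues & context) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # With no overlapping cue words there is no basis for a guess.
    return best if scores[best] > 0 else None


print(guess_sense("the negotiations between the two countries".split()))  # relations
print(guess_sense("that car looks".split()))  # approval
print(guess_sense([]))  # None
```

The last call mirrors the scenario above: hearing only the single word, with no surrounding context, the sketch has nothing to go on and declines to guess.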

Project Lily

I created Project Lily as the final project for CS895: Software for Context-Aware Multiuser Systems at George Mason University, taught by Dr. João Pedro Sousa. As mentioned, Lily is a virtual assistant placed in a typical business office setting. I used the Kinect device and the Kinect for Windows SDK beta 2. Kinect provides a color camera, a depth-sensing camera, an array of four microphones and a convenient API that can be used to create natural UIs. Also, the Microsoft Kinect for Windows site (microsoft.com/en-us/kinectforwindows) and Channel 9 (bit.ly/zD15UR) provide a plethora of useful, related examples. Kinect has brought incredible capabilities to developers in a (relatively) inexpensive package. This is demonstrated by Kinect breaking the Guinness World Records “fastest selling consumer device” record (on.mash.to/hVbZOA). The Kinect technical specifications (documented at bit.ly/zZ1PN7) include:

Project Information URL: http://msdn.microsoft.com/en-us/magazine/hh882450.aspx

Project Source URL: http://archive.msdn.microsoft.com/mag201204Kinect

Follow the Discussion

  • sounds cool

  • SDK beta, not version 1.0 :(

  • Greg Duncan (gduncan411):

    GRRRR. Confirmed. Given how long v1 has been out and the fact the code was released at the end of March and was in the MSDN Mag April edition, I didn't even think about checking (nor did I catch the comments about it being Beta 2).

    Sorry about that. Again I try to avoid anything Beta 1 or 2 in the Gallery now, but this slipped through.
