A Simple Gesture Processing Framework for the Kinect For Windows
- Posted: Sep 25, 2012 at 6:00 AM
Gesture handling is one of those things with the current Kinect for Windows SDK that we are all re-inventing. Until something is baked into the SDK, this is a ripe area for innovation and experimentation. Today's project is a good example of that...
This sample shows the basic components of a gesture processing system for the Microsoft Kinect for Windows. Gesture processing is one of the most powerful abilities of an augmented reality application, and the Kinect makes processing gestures straightforward. The sample is very rudimentary, but it demonstrates the basic gesture processing pipeline that you would have to implement in a real-world application. In this case, the application is a simple Kinect controller for PowerPoint that sends keyboard shortcuts to control the presentation. Along the way, it also demonstrates some other useful concepts, such as how to draw a tracked skeleton onto the Kinect depth image. I hope you find this sample useful. Please leave any feedback or questions that you may have.
Description
This sample builds on the Beginning Kinect for Windows Programming sample. If you have not looked at that sample and are not familiar with Kinect for Windows programming, I recommend starting there before tackling this one.
This sample assumes that you already have a Kinect test environment set up with all of the tools and technologies installed. If you do not, the previous sample will help you set one up.
When you are ready to proceed, open the attached source code and take a look at the gesture.cs file within the GestureFramework project.
This simple gesture framework is based upon the relationships of skeletal joints to each other over some period of time. The enumeration JointRelationship describes the possible relationships of joints that are supported by the framework.
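To make the idea of a joint relationship concrete, here is a minimal sketch. It is not the code from gesture.cs: the enum members shown, the `Point3` stand-in for a joint position, the `JointRelations.Holds` helper and the `margin` tolerance are all assumptions made for illustration, and the axis directions are assumed as well.

```csharp
using System;

// Hypothetical relationship names; the sample's actual JointRelationship
// members may differ.
public enum JointRelationship
{
    None,
    Above,
    Below,
    LeftOf,
    RightOf
}

// Minimal stand-in for a joint position so the sketch compiles without the SDK.
public struct Point3
{
    public readonly float X, Y, Z;
    public Point3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public static class JointRelations
{
    // Returns true when joint 'a' stands in relationship 'rel' to joint 'b'.
    // Assumes +Y is up and +X increases to the right; flip the comparisons
    // if your coordinate convention differs. 'margin' (in the same units as
    // the positions) keeps small jitter from flipping the result.
    public static bool Holds(JointRelationship rel, Point3 a, Point3 b, float margin = 0.05f)
    {
        switch (rel)
        {
            case JointRelationship.Above:   return a.Y > b.Y + margin;
            case JointRelationship.Below:   return a.Y < b.Y - margin;
            case JointRelationship.LeftOf:  return a.X < b.X - margin;
            case JointRelationship.RightOf: return a.X > b.X + margin;
            default:                        return false;
        }
    }
}
```

With these definitions, a check such as `JointRelations.Holds(JointRelationship.Above, hand, head)` asks whether the hand is currently above the head. A gesture is then a matter of watching how such checks change over time.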
The framework has two major class hierarchies. One is a static description of the gestures. The other is a very simple state engine that examines incoming skeletal data from the Kinect and uses the static model to look for the joint relationships it is interested in. The figure below shows the relationships between the major classes in the framework.
Here are the descriptions of the major classes in the hierarchy:
GestureComponent - Describes a single relationship between two joints. The relationship contains only two joints and has two temporal parts: a beginning relationship and an ending relationship.
Gesture - Essentially a list of GestureComponent objects with a unique identifier and a description.
GestureMap - Contains all of the utility functions to load a set of gestures from an XML file and manage their lifecycle.
GestureComponentState - Tracks the state of the relationships for a GestureComponent.
GestureState - Manages the state model for a set of GestureComponents and reports on the state. In our application this class also maps the gesture to the keycode sent to PowerPoint, although this is probably a sub-optimal design.
GestureMapState - Rolls up the current state of all gestures and is the primary interface of the application into the gesture state model.
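The split between a static description and a state tracker can be sketched as follows. This is a simplified illustration under stated assumptions, not the sample's GestureComponentState: `ComponentPhase`, `ComponentTracker` and the delegate-based relationship tests are hypothetical names, and each joint is reduced to a single coordinate to keep the example short.

```csharp
using System;

// Hypothetical phases for one gesture component.
public enum ComponentPhase { Waiting, Began, Completed }

// Tracks one component: it completes once the beginning relationship has
// been observed and the ending relationship is observed afterwards.
public sealed class ComponentTracker
{
    private readonly Func<float, float, bool> _begin;
    private readonly Func<float, float, bool> _end;

    public ComponentPhase Phase { get; private set; }

    public ComponentTracker(Func<float, float, bool> begin, Func<float, float, bool> end)
    {
        _begin = begin;
        _end = end;
        Phase = ComponentPhase.Waiting;
    }

    // Feed one frame: the positions of the two joints along a single axis.
    public ComponentPhase Update(float a, float b)
    {
        if (Phase == ComponentPhase.Waiting && _begin(a, b))
        {
            Phase = ComponentPhase.Began;
        }
        else if (Phase == ComponentPhase.Began && _end(a, b))
        {
            Phase = ComponentPhase.Completed;
        }
        return Phase;
    }

    // Return to the initial phase, for example after the gesture has fired.
    public void Reset()
    {
        Phase = ComponentPhase.Waiting;
    }
}
```

For example, `new ComponentTracker((hand, shoulder) => hand > shoulder, (hand, shoulder) => hand < shoulder)` would model a swipe in which the hand starts on one side of the shoulder and ends on the other. A gesture made of several components would be accepted once every tracker reports `Completed`.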
If you look through the above classes, you will see that there is absolutely no magic or rocket science inside of them. The gesture recognition that they provide is very primitive, but they provide a platform onto which you could build a much more sophisticated model if you desire.
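As an illustration of how little machinery is involved, loading gesture definitions from XML can be sketched in a few lines with LINQ to XML. The XML layout below (element and attribute names included) is invented for this example, as are the `GestureDef`, `ComponentDef` and `GestureXml` types; the schema that the sample's GestureMap actually reads may look quite different.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical static description of one joint relationship pair.
public sealed class ComponentDef
{
    public string Joint1;
    public string Joint2;
    public string Begin;
    public string End;
}

// Hypothetical static description of one gesture.
public sealed class GestureDef
{
    public string Id;
    public string Description;
    public List<ComponentDef> Components = new List<ComponentDef>();
}

public static class GestureXml
{
    // Invented layout, used only to exercise the parser below.
    public const string Sample = @"
<Gestures>
  <Gesture id='next-slide' description='Right hand crosses the body'>
    <Component joint1='HandRight' joint2='ShoulderCenter' begin='RightOf' end='LeftOf' />
  </Gesture>
</Gestures>";

    // Parses every <Gesture> element under the root into a GestureDef.
    // Missing attributes are left as null rather than treated as errors.
    public static List<GestureDef> Parse(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Root.Elements("Gesture").Select(g => new GestureDef
        {
            Id = (string)g.Attribute("id"),
            Description = (string)g.Attribute("description"),
            Components = g.Elements("Component").Select(c => new ComponentDef
            {
                Joint1 = (string)c.Attribute("joint1"),
                Joint2 = (string)c.Attribute("joint2"),
                Begin = (string)c.Attribute("begin"),
                End = (string)c.Attribute("end")
            }).ToList()
        }).ToList();
    }
}
```

Calling `GestureXml.Parse(GestureXml.Sample)` yields a single gesture with one component. Keeping the definitions in data rather than code means new gestures can be tried without recompiling, which is presumably part of the appeal of the XML-driven approach.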
Project Information URL: http://code.msdn.microsoft.com/Simple-Gesture-Processing-097c5527
Project Download URL: http://code.msdn.microsoft.com/Simple-Gesture-Processing-097c5527
Project Source URL: http://code.msdn.microsoft.com/Simple-Gesture-Processing-097c5527
void SensorSkeletonFrameReady(AllFramesReadyEventArgs e)
{
    using (SkeletonFrame skeletonFrameData = e.OpenSkeletonFrame())
    {
        // No skeleton frame is available for this event
        if (skeletonFrameData == null)
        {
            return;
        }

        var allSkeletons = new Skeleton[skeletonFrameData.SkeletonArrayLength];
        skeletonFrameData.CopySkeletonDataTo(allSkeletons);

        foreach (Skeleton sd in allSkeletons)
        {
            // If this skeleton is no longer being tracked, skip it
            if (sd.TrackingState != SkeletonTrackingState.Tracked)
            {
                continue;
            }

            // If there is not already a gesture state map for this skeleton, then create one
            if (!_gestureMaps.ContainsKey(sd.TrackingId))
            {
                var mapstate = new GestureMapState(_gestureMap);
                _gestureMaps.Add(sd.TrackingId, mapstate);
            }

            // Evaluate this skeleton against its gesture state map
            var keycode = _gestureMaps[sd.TrackingId].Evaluate(sd, false, _bitmap.Width, _bitmap.Height);
            GetWaitingMessages(_gestureMaps);

            // A keycode other than NONAME means a gesture was accepted
            if (keycode != VirtualKeyCode.NONAME)
            {
                rtbMessages.AppendText("Gesture accepted from player " + sd.TrackingId + "\r");
                rtbMessages.ScrollToCaret();
                rtbMessages.AppendText("Command passed to System: " + keycode + "\r");
                rtbMessages.ScrollToCaret();

                // Send the keystroke, then reset this skeleton's gesture state
                InputSimulator.SimulateKeyPress(keycode);
                _gestureMaps[sd.TrackingId].ResetAll(sd);
            }

            PlayerId = sd.TrackingId;

            // Draw the skeleton on top of the depth bitmap
            if (_bitmap != null)
            {
                _bitmap = AddSkeletonToDepthBitmap(sd, _bitmap, false);
            }
        }
    }
}