"The Future of Kinect"
Zombies don’t have to be scary – especially when kids can create them in their own image. Using the Kinect for Windows v2 sensor and an app called YAKiT, children can step into the role of the undead and see it come to life using performance-based animation. Like so many who use the Kinect sensor, kids don’t need a laundry list of instructions to use it. They just step in front of it, creep like zombies and instantly, their animated figures move like them, sparking a cacophony of giggles.
While the latest version of Kinect has been available since the launch of Xbox One, the Kinect for Windows version becomes available for preorder to all developers today. Both sensors are built on a set of shared technologies.
Companies such as Freak’n Genius, the Seattle-based maker of YAKiT, have already had the chance to try the Kinect for Windows v2 sensor through the Developer Preview Program. “It’s so magical, honestly,” says Kyle Kesterson, Freak’n Genius founder. “We put people in front of it, and they light up without even having to do anything.”
But behind that magic is the culmination of years of machine learning. It’s all part of a complex 24-7 process that involves a legion of people and resources that gather data on voices, body gestures and facial expressions, then test the information and analyze it before the software makes its way to your living room.
Machine learning: Teaching software how to behave
At Microsoft, a whole team within the NUI (natural user interface) group focuses on taking requests from different teams and gathering information about how people move and express themselves.
“We start with designing the hardware, getting the best eyes and ears into the living room. Then we go through the process of building the software for it – the brain that takes that raw signal and takes it into an understanding of the room and the people in it,” says Evans.
When it was released as part of Xbox One, Kinect was already programmed to recognize certain movements and objects as a baseline. But to improve that software, Microsoft first needs to document real people using it in their natural environments, then manually compare what Kinect sees with reality (“ground truth”). That data is then fed into a system that runs algorithms to find where the software’s recognition doesn’t match the ground truth – and that’s where it knows to improve.
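The comparison step described above can be sketched in a few lines of Python. This is only an illustration, not Microsoft’s actual pipeline: the function name `find_mismatches`, the data layout (one dictionary of joint positions per frame, in meters) and the 5 cm tolerance are all assumptions made for the example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def find_mismatches(predicted, ground_truth, tolerance_m=0.05):
    """Compare tracked joints with hand-tagged joints, frame by frame.

    Both arguments are lists of frames; each frame is a dict mapping a
    joint name to an (x, y, z) position in meters (hypothetical layout).
    Returns (frame_index, joint_name, error) tuples for every joint that
    is missing or lies farther than tolerance_m from the tagged position.
    """
    mismatches = []
    for index, (pred_frame, true_frame) in enumerate(zip(predicted, ground_truth)):
        for joint, true_pos in true_frame.items():
            pred_pos = pred_frame.get(joint)
            if pred_pos is None:
                # The tracker produced no estimate for a tagged joint.
                mismatches.append((index, joint, None))
                continue
            error = dist(pred_pos, true_pos)
            if error > tolerance_m:
                mismatches.append((index, joint, error))
    return mismatches


# Illustrative data: the left hand drifts 20 cm in the second frame.
tagged = [
    {"head": (0.0, 1.7, 2.0), "hand_left": (0.3, 1.0, 2.0)},
    {"head": (0.0, 1.7, 2.0), "hand_left": (0.3, 1.0, 2.0)},
]
tracked = [
    {"head": (0.0, 1.7, 2.0), "hand_left": (0.3, 1.0, 2.0)},
    {"head": (0.0, 1.7, 2.0), "hand_left": (0.5, 1.0, 2.0)},
]
for frame_index, joint, error in find_mismatches(tracked, tagged):
    print(frame_index, joint, error)
```

The flagged frames are the ones worth feeding back into training, since they mark where recognition and ground truth disagree.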
Collecting data for Kinect means bringing volunteers to labs on the Microsoft campus, suiting them up for motion capture sessions and visiting the homes of Microsoft employees – a diverse group that spans age, gender, language and ethnicity – to record video clips of bodies in natural motion.
The Ground Truth
All that data then goes to taggers who establish “ground truth.” It’s a tedious but necessary set of tasks that involves skeleton tracking: electronically tagging 25 joints on the human body, frame by frame. This is how movement is documented in 3D space and fed into machine learning. About 20 in-house taggers have to define where the head, shoulders, hands and feet are – as well as other areas on a body.
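To make the frame-by-frame tagging concrete, here is a minimal sketch of how one tagged frame might be stored and checked for completeness. The joint names follow the 25-entry JointType enumeration in the Kinect for Windows SDK 2.0; that the taggers’ tools use exactly these names is an assumption, and the `TaggedFrame` class is hypothetical.

```python
from dataclasses import dataclass, field

# The 25 joints, named after the Kinect for Windows SDK 2.0 JointType enum.
# Assumption: the tagging tool tracks this same set.
JOINT_NAMES = (
    "SpineBase", "SpineMid", "Neck", "Head",
    "ShoulderLeft", "ElbowLeft", "WristLeft", "HandLeft",
    "ShoulderRight", "ElbowRight", "WristRight", "HandRight",
    "HipLeft", "KneeLeft", "AnkleLeft", "FootLeft",
    "HipRight", "KneeRight", "AnkleRight", "FootRight",
    "SpineShoulder", "HandTipLeft", "ThumbLeft",
    "HandTipRight", "ThumbRight",
)


@dataclass
class TaggedFrame:
    """Ground-truth joint positions for one video frame (hypothetical type)."""
    frame_index: int
    joints: dict = field(default_factory=dict)  # joint name -> (x, y, z)

    def tag(self, joint, position):
        """Record the 3D position a tagger assigned to one joint."""
        if joint not in JOINT_NAMES:
            raise ValueError(f"unknown joint: {joint}")
        self.joints[joint] = tuple(position)

    def missing_joints(self):
        """List the joints that still need a tag in this frame."""
        return [name for name in JOINT_NAMES if name not in self.joints]

    def is_complete(self):
        """True once all 25 joints have been tagged."""
        return not self.missing_joints()


frame = TaggedFrame(frame_index=0)
frame.tag("Head", (0.0, 1.7, 2.0))
print(len(frame.missing_joints()), "joints left to tag")
```

A clip would then be a list of such frames, and only clips whose frames are all complete would count as usable ground truth.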
Passing the Gauntlet
Vince Ortado’s team at Microsoft processes up to 180,000 video clips an hour, running machine learning algorithms that improve Kinect’s software. More than 300 Xbox developer kits operate 24-7, divided into groups testing anything from hand gestures to identity.
It’s important to have all these millions of frames of video go through as fast as possible, as the teams working on Kinect can only act after they’ve received the results. And they’re on a schedule to act at a brisk pace with monthly software releases that give users an experience that continuously improves.
Right now, people can experience Kinect through Xbox One: playing games, choosing movies and using Skype. Or they might be out and about and interact with a Kinect for Windows sensor as part of a retail experience, or in other spaces such as museums, hotels or corporate offices. Or they may happen upon interactive animation experiences such as those Freak’n Genius has staged, that put people on stage dancing as a company mascot. The availability of preorders on Thursday will allow even more Kinect for Windows v2 sensors to get into the hands of developers and enable a wider variety of user scenarios.
As for the teams of people who continue working to improve Kinect, Kinect’s Evans says, “It’s all about making Kinect work whether or not you have a puffy couch or a ficus in your living room that might look like a person. Being able to always get it right and understand who you are in your natural environment, in every living room with every person. That’s the investment we make in doing the machine learning. It’s to get it right for everybody.”
Project Information URL: https://www.microsoft.com/en-us/news/features/2014/jun14/06-05kinect.aspx