How does the Kinect work for robotics?


Today's Kinecting to Hardware post provides a little background on how the Kinect actually works, from a robotics perspective. [Insert very weak quip, "Do Kinects dream of electronic skeletons?" here]

Though this project doesn't come with any source code, I thought it fitting given yesterday's post, FIRST Kinect kickoff.

Kinect for Robotics

The Kinect sensor is one of the best devices, for its price, to become available for robotics in the last decade. For $150 you get 3D range (distance) data and RGB color (webcam video) data. There is a microphone array thrown in as well. But wait! There’s more. The Kinect can detect people and generate skeleton data and you don’t have to write a single line of code to do the processing – the Kinect for Windows SDK does all the hard work for you.

To a programmer, the Kinect provides an array of depth values that correspond to the pixels of the RGB image. This unusual coordinate space, with x and y as pixel coordinates and z as a distance in millimeters, is called the Depth Image Space. All of the distances are measured from a virtual plane passing through the Kinect camera. You can convert the data into conventional (x, y, z) coordinates in meters, but this adds processing overhead.
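The conversion from Depth Image Space to metric coordinates can be sketched with a simple pinhole camera model. This is only an illustration, not the SDK's own mapping: the 640x480 resolution and the 43 degree vertical field of view are assumptions, and the function name is hypothetical. The 57 degree horizontal field of view is the figure quoted later in this post.

```python
import math

def depth_pixel_to_meters(px, py, depth_mm,
                          width=640, height=480,
                          hfov_deg=57.0, vfov_deg=43.0):
    """Convert a depth pixel (px, py, depth in mm) to (x, y, z) in meters.

    Assumes an ideal pinhole camera. The resolution and the vertical
    field of view are assumed values, not taken from the SDK.
    """
    # Focal lengths in pixels, derived from the fields of view.
    fx = (width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    fy = (height / 2.0) / math.tan(math.radians(vfov_deg) / 2.0)

    # Depth is already the distance from the camera plane, so it maps to z.
    z = depth_mm / 1000.0
    x = (px - width / 2.0) * z / fx
    # Image rows grow downward, so flip the sign to make y point up.
    y = (height / 2.0 - py) * z / fy
    return (x, y, z)
```

The two tangent calculations depend only on the camera, so a real application would compute fx and fy once rather than per pixel, which is where most of the overhead can be saved.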

Skeleton Space, on the other hand, uses conventional (x, y, z) coordinates. A skeleton consists of a set of 20 joints, each with its own 3D coordinates ...

How it Works

The Kinect uses an infrared (IR) laser to spray out a pseudo-random pattern of dots. An IR camera captures an image of the dots that are reflected off objects (as in the picture below) and the electronics inside the Kinect figure out how much the dot pattern has been distorted. The amount of distortion indicates the distance from the camera. This approach is commonly called Structured Light. All this happens at 30 frames per second, not bad for a cheap consumer device.
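The link between pattern distortion and distance can be illustrated with the general triangulation relation used by stereo and structured-light systems: the sideways shift of a dot (its disparity) is inversely proportional to depth. This is a sketch of the principle only, not the Kinect's actual processing, and the focal length and baseline below are placeholder values chosen for illustration.

```python
def depth_from_disparity(disparity_px,
                         focal_length_px=580.0,  # assumed, illustrative only
                         baseline_m=0.075):      # assumed projector-to-camera spacing
    """Estimate depth in meters from dot disparity using z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px
```

Because depth varies as 1/d, a one-pixel error in disparity matters far more for distant objects than for near ones, which is one reason depth precision falls off with range in sensors of this kind.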


Comparison to Laser Range Finders

Laser Range Finders (LRFs), or LIDAR (Light Detection and Ranging) devices, have been on the market for a long time. An LRF works by sending out pulses of infrared laser light and timing the return signal, which is why these are known as Time of Flight devices.
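The Time of Flight principle comes down to one line of arithmetic: the pulse travels to the target and back, so the distance is half the round-trip time multiplied by the speed of light. A minimal sketch (the function name is hypothetical):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def tof_distance_m(round_trip_s):
    """Distance in meters to a target, given the pulse's round-trip time in seconds."""
    if round_trip_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # Divide by two because the pulse covers the distance twice.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0
```

A target 4 meters away returns the pulse in roughly 27 nanoseconds, which gives a sense of the timing precision these devices need.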

The German SICK brand of LRFs has long been the workhorse of the research community, but these units cost thousands of dollars. More recently, Hokuyo in Japan has been selling a cheaper range of LRFs that are approaching the $1,000 barrier. However, for this amount of money you can buy six Kinect sensors, although you would need enough USB ports to plug them all in and the necessary processing power to make use of all the 3D data.

The primary differences between an LRF and a Kinect are the Field of View (FOV), Maximum Range and Resolution. A conventional LRF has a FOV of 180 degrees, and some reach 270 degrees. In contrast, the Kinect has a FOV of only 57 degrees.
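To put the FOV gap in concrete terms, here is a rough count of how many Kinects it would take to match an LRF's horizontal coverage. It assumes the sensors are mounted edge to edge with no overlap, which is an idealization, and the function name is hypothetical.

```python
import math

def sensors_to_cover(target_fov_deg, sensor_fov_deg=57.0):
    """Minimum number of sensors needed to span target_fov_deg with no overlap."""
    if target_fov_deg <= 0 or sensor_fov_deg <= 0:
        raise ValueError("fields of view must be positive")
    return math.ceil(target_fov_deg / sensor_fov_deg)
```

Under these assumptions, matching a 180 degree LRF takes four Kinects and matching a 270 degree unit takes five.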


The Future for Kinect

Currently the Kinect has a limited range of 80 cm to 4 meters. This means that the Kinect cannot see objects that are right in front of the robot, so you still need traditional obstacle sensors such as sonar (which is why they are included on the RDS Reference Platform). Next year, when the Kinect for Windows Hardware is released, there will be a new “near mode” with a range of 50 cm to 3 meters. Although this helps with detecting nearby obstacles, a robot should still have other sensors for redundancy.
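A robot's code has to treat readings outside these limits as unknown rather than as free space. The sketch below classifies a distance against the two ranges quoted above. The names are hypothetical and this is not how the SDK itself reports out-of-range pixels.

```python
DEFAULT_RANGE_M = (0.8, 4.0)  # current range quoted in this post
NEAR_RANGE_M = (0.5, 3.0)     # announced "near mode" range

def classify_depth(distance_m, near_mode=False):
    """Return 'too_near', 'in_range' or 'too_far' for a distance in meters."""
    low, high = NEAR_RANGE_M if near_mode else DEFAULT_RANGE_M
    if distance_m < low:
        return "too_near"
    if distance_m > high:
        return "too_far"
    return "in_range"
```

An obstacle at 0.6 meters, for example, is invisible in the default mode but in range in near mode, while one at 3.5 meters is the other way around. Neither mode sees anything closer than half a meter, which is the gap the sonar sensors fill.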

Going beyond next year, Microsoft will continue researching even better Kinect hardware. This means that 3D depth data is now here to stay, so sharpen up your 3D geometry skills and get cracking on applications that take full advantage of these new devices.
