Say No to the noise... Real-Time Kinect depth frame smoothing


This is going to be an exciting week for the Kinect for Windows device and SDK, and I wouldn't be surprised if we have some "Special Edition" posts later this week. But in the meantime, we're going to be talking code. The rest of this week is going to be a "Skeleton" week, but today's project is one that fired up the Coding4Fun team (I got a number of "Hey, did you see this?" messages).

We've all seen depth "images" right? And we've all seen that they appear a little noisy/blotchy? Well Karl Sanford saw the same thing and decided to apply some smooth moves to the problem...

Smoothing Kinect Depth Frames in Real-Time

Removing noise from the Kinect Depth Frames in real-time using pixel filters and weighted moving average techniques.

I've been working with the Microsoft Kinect for Xbox 360 on my PC for a few months now, and overall I find it fantastic! However, one thing that has continued to bug me is the seemingly poor quality of rendered Depth Frame images. There is a lot of noise in a Depth Frame, with missing bits and a pretty serious flickering issue. The frame rate from the Kinect isn't bad, with a maximum of around 30 fps; however, the random noise in the data draws your eye to every refresh. In this article I am going to show you my solution to this problem. I will be smoothing Depth Frames in real-time as they come from the Kinect and are rendered to the screen. This is accomplished through two combined methods: pixel filtering and a weighted moving average.


The Problem of Depth Data

Before I dive into the solution, let me better express the problem. Below is a screenshot of raw depth data rendered to an image for reference. Objects that are closer to the Kinect are lighter in color and objects that are farther away are darker.


What you're looking at is an image of me sitting at my desk. I'm sitting in the middle; there is a bookcase to the left and a fake Christmas tree to the right. As you can already tell, even without the flickering of a video feed, the quality is pretty low. The maximum resolution that you can get for depth data from the Kinect is 320x240, but even for this resolution the quality looks poor. The noise in the data manifests itself as white spots continuously popping in and out of the picture. Some of the noise comes from the IR light being scattered by the object it's hitting, and some comes from the shadows of objects closer to the Kinect. I wear glasses and often have noise where my glasses should be because they scatter the IR light.

Another limitation of the depth data is how far the sensor can see. The current limit is about 8 meters. Do you see that giant white square behind me in the picture? That's not an object close to the Kinect; the room I'm in actually extends beyond that white square about another meter. This is how the Kinect handles objects that it can't see with depth sensing: it returns a depth of zero.
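To make the rendering convention concrete, here is a minimal Python sketch of how a depth value could be turned into a gray level so that closer objects come out lighter and a depth of zero comes out white. It is not taken from the demo application: the function names, the millimeter units, and the 8000 mm ceiling (chosen to match the roughly 8 meter limit above) are all assumptions made for illustration.

```python
# Assumed ceiling in millimeters, loosely based on the ~8 m limit described above.
MAX_DEPTH_MM = 8000


def depth_to_gray(depth_mm, max_depth_mm=MAX_DEPTH_MM):
    """Map a depth reading to an 8-bit gray level (closer = lighter).

    A reading of 0 (depth unknown) maps to 255, which is why unseen
    regions show up as white in this kind of rendering.
    """
    clamped = max(0, min(depth_mm, max_depth_mm))
    return 255 - (255 * clamped) // max_depth_mm


def render(frame):
    """Convert a 2D list of depth readings into a 2D list of gray levels."""
    return [[depth_to_gray(depth) for depth in row] for row in frame]
```

With this mapping an unknown pixel is indistinguishable from a very close one, which is exactly why the white square in the screenshot is easy to misread as a nearby object.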

The Solution

As I mentioned briefly, the solution I have developed uses two different methods of smoothing the depth data: pixel filtering and a weighted moving average. The two methods can be used either separately or in series to produce a smoothed output. While the solution doesn't remove all noise, it does make an appreciable difference. The methods I have used do not degrade the frame rate and are capable of producing real-time results for output to a screen or recording.
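The pixel filtering idea can be sketched roughly as follows. This is an illustrative Python version, not the code from the demo application, and it makes several assumptions: unknown pixels are the ones with a depth of zero, each one is examined through two rings ("bands") of neighbors at distance 1 and 2, the hypothetical thresholds decide whether enough neighbors are known, and the fill value is the most common neighboring depth.

```python
from collections import Counter


def filter_pixels(frame, inner_threshold=2, outer_threshold=5):
    """Fill zero-depth pixels from their neighbors when enough are known.

    frame is a 2D list of depth readings. The thresholds are hypothetical
    tuning knobs: the minimum number of non-zero neighbors required in the
    inner band (distance 1) or outer band (distance 2) before filling.
    """
    height, width = len(frame), len(frame[0])
    result = [row[:] for row in frame]  # never read from partially filtered data
    for y in range(height):
        for x in range(width):
            if frame[y][x] != 0:
                continue  # only unknown pixels are candidates for filling
            inner, outer = [], []
            for dy in range(-2, 3):
                for dx in range(-2, 3):
                    if dx == 0 and dy == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < height and 0 <= nx < width):
                        continue  # neighbor falls outside the frame
                    value = frame[ny][nx]
                    if value == 0:
                        continue  # unknown neighbors carry no information
                    if max(abs(dx), abs(dy)) == 1:
                        inner.append(value)
                    else:
                        outer.append(value)
            if len(inner) >= inner_threshold or len(outer) >= outer_threshold:
                # Best guess: the most common depth among the known neighbors.
                result[y][x] = Counter(inner + outer).most_common(1)[0][0]
    return result
```

Using the most common value rather than the mean keeps the guess on a surface that actually exists nearby, instead of inventing a depth halfway between foreground and background.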



As you can see, the demo application does a side-by-side comparison of the Raw Depth Image and the Smoothed Depth Image. You can experiment with the smoothing settings in the application as well. The settings that you will find when you first run the application are what I recommend for general-purpose use. They provide a good mix of smoothing for stationary and moving objects, and they don't try to "fill in" too much from the filtering method.

For example, you can turn both band filters down to 1 and turn the weighted moving average up to 10, and you'll have the lowest flicker and noise for stationary blunt objects. However, once you move, you will have a very noticeable trail, and your fingers will all look like they are webbed if you don't have a wall close behind you.
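The trade-off between flicker and trailing is easier to see with a small sketch of a weighted moving average over recent frames. The Python below is only an illustration under stated assumptions: the class and parameter names are hypothetical, the weights grow linearly so that newer frames count more, and unknown (zero) readings are averaged in like any other value.

```python
from collections import deque


class WeightedMovingAverage:
    """Average the last few depth frames, weighting newer frames more."""

    def __init__(self, frame_count=4):
        # frame_count is a hypothetical setting: how many frames to remember.
        self.frames = deque(maxlen=frame_count)

    def smooth(self, frame):
        """Add a new frame (2D list of depths) and return the smoothed frame."""
        self.frames.append(frame)
        height, width = len(frame), len(frame[0])
        # Oldest remembered frame gets weight 1, the newest gets the largest.
        total_weight = sum(range(1, len(self.frames) + 1))
        sums = [[0] * width for _ in range(height)]
        for weight, past in enumerate(self.frames, start=1):
            for y in range(height):
                for x in range(width):
                    sums[y][x] += past[y][x] * weight
        return [[value // total_weight for value in row] for row in sums]
```

A larger frame_count smooths out more flicker on stationary objects, but every remembered frame keeps contributing to the output, so a moving hand leaves a longer trail.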

Points of Interest

I have really enjoyed playing around with these smoothing techniques and learning that there is probably no 'one-size-fits-all' solution for it. Even with the same hardware, your physical environment and intentions will drive your choice for smoothing more than anything. I would like to encourage you to open the code and take a look for yourself, and share your ideas for improvement! At the very least, go borrow your neighbor kid’s Kinect for a day and give the demo application a whirl.

Another point of interest has been seeing how reducing noise for rendering purposes can actually introduce noise for depth calculation purposes. The pixel filtering method works well on its own to reduce noise in depth information prior to further calculations, removing 'white' noise and providing a best guess as to what that information should be. The weighted moving average method works well to reduce noise prior to rendering, but will actually introduce noise along the Z,Y perspective due to the effects of averaging depth information. I hope to continue learning about these effects and how to use them properly for different types of applications.
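A small worked example shows why averaging can hurt depth calculations. The numbers below are invented for illustration: one pixel sees a hand at 1000 mm for two frames, then the wall at 3000 mm after the hand moves away. The linear weighting is an assumption of this sketch.

```python
def weighted_average(samples):
    """Weighted average of readings, oldest first, with linear weights 1..n."""
    weights = range(1, len(samples) + 1)
    return sum(w * s for w, s in zip(weights, samples)) // sum(weights)


# Hypothetical readings for a single pixel across four consecutive frames.
HAND_MM, WALL_MM = 1000, 3000
history = [HAND_MM, HAND_MM, WALL_MM, WALL_MM]

# The result lies between the hand and the wall, where no surface exists.
blended = weighted_average(history)
```

The blended value looks fine as a shade of gray on screen, but any calculation that treats it as a real distance would place a surface in empty space between the hand and the wall.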

