Building a sound and movement detection application


Have you ever used a webcam application with motion detection, one that would "do stuff" when it detected some kind of movement? Ever wondered how that was done and, more importantly, how you could add it to your own apps?

Ever seen an application that did sound threshold detection, where sound changes would trigger something to happen? Ever wondered how that was done and, more importantly, how you could add it to your own apps?

Then check out today's project, courtesy of CodeProject...

How to use video and sound information to detect intruder



This article shows you how to use video and sound information to detect an intruder. Image data and sound data are continually collected from the environment through a webcam and a microphone, respectively. Once the conditions of the surrounding environment change, a corresponding alarm is raised immediately.

The package contains:

  • DirectShowLib: Manipulates the webcam and grabs images. I have wrapped the functions related to IntruderDetection in the class CManipulateWebcam. More information about DirectShowLib, please see
  • ManipulateMicrophone: Manipulates the microphone, grabs sound and plays sound. The functions related to IntruderDetection are wrapped in the class CManipulateMicrophone. For more information about ManipulateMicrophone, please see the "Using the code" part of the article: Sending and playing microphone audio over network.

  • WavStream: Reads .wav files and saves data into .wav files. The class CManipulateMicrophone uses WavStream.dll to save sound data into a .wav file. The following code shows how:

    WaveStreamWriter wavwrite = new WaveStreamWriter(FileName, SamplingRate, Channels, BitPerSample);
    // SoundBuffer is an array which contains the sound data you want to save
    wavwrite.Write(SoundBuffer, BufferLength);

  • IntruderDetection: The main application. It needs DirectShowLib.dll, ManipulateMicrophone.dll and WavStream.dll.

DirectShowLib, ManipulateMicrophone and WavStream are independent projects. You can use DirectShowLib in your application to interact with the webcam, ManipulateMicrophone to interact with the microphone, and WavStream to interact with .wav files. And, of course, use IntruderDetection to detect intruders.

How does the IntruderDetection work


While the math hurt my brain, I thought it pretty interesting to see it done natively, and not via a third-party component...

Background knowledge of IntruderDetection

Video detection

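For video, define the distance between two images a and b, whose pixels at position (i,j) have the RGB values $(r_a, g_a, b_a)$ and $(r_b, g_b, b_b)$, as

$$D_1 = \sum_{i,j} \frac{|r_a - r_b| + |g_a - g_b| + |b_a - b_b|}{3}, \quad i = 1, 2, \ldots;\ j = 1, 2, \ldots$$ or

$$D_2 = \sum_{i,j} \frac{|r_a - r_b|^2 + |g_a - g_b|^2 + |b_a - b_b|^2}{3}, \quad i = 1, 2, \ldots;\ j = 1, 2, \ldots$$ or

$$D_3 = \max_{i,j} \frac{|r_a - r_b| + |g_a - g_b| + |b_a - b_b|}{3}, \quad i = 1, 2, \ldots;\ j = 1, 2, \ldots$$

If $D_1$, $D_2$ or $D_3$ is bigger than a threshold value, we believe something is happening.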
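
To make the three distances a bit more concrete, here is a minimal sketch that computes D1, D2 and D3 over two plain byte buffers. It is not taken from the IntruderDetection source: the FrameDistance class, the Compute method and the buffer layout (tightly packed, 3 bytes per pixel, no row padding) are all assumptions made for illustration.

```csharp
using System;

// Hypothetical helper for illustration; not part of the IntruderDetection package.
public static class FrameDistance
{
    // Assumes both frames are tightly packed 24-bit buffers
    // (3 bytes per pixel, no row padding) of identical length.
    public static (double D1, double D2, double D3) Compute(byte[] a, byte[] b)
    {
        if (a.Length != b.Length || a.Length % 3 != 0)
            throw new ArgumentException("Frames must be the same size, 3 bytes per pixel.");

        double sumAbs = 0;  // D1: sum of mean absolute channel differences
        double sumSq = 0;   // D2: sum of mean squared channel differences
        double maxAbs = 0;  // D3: largest mean absolute channel difference

        for (int i = 0; i < a.Length; i += 3)
        {
            // byte - byte is promoted to int, so the subtraction cannot wrap around.
            double d0 = Math.Abs(a[i] - b[i]);
            double d1 = Math.Abs(a[i + 1] - b[i + 1]);
            double d2 = Math.Abs(a[i + 2] - b[i + 2]);

            double perPixelAbs = (d0 + d1 + d2) / 3.0;
            sumAbs += perPixelAbs;
            sumSq += (d0 * d0 + d1 * d1 + d2 * d2) / 3.0;
            if (perPixelAbs > maxAbs)
                maxAbs = perPixelAbs;
        }

        return (sumAbs, sumSq, maxAbs);
    }
}
```

Calling FrameDistance.Compute(previousFrame, currentFrame) on two consecutive frames gives all three values in one pass, and any of them can then be compared against a threshold. Because D1 and D2 grow with the number of pixels, it probably makes sense to divide them by the pixel count before comparing, so that the threshold does not depend on the image size.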



And a code snippet from the video comparison function:

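    // Needs System.Drawing and System.Drawing.Imaging, and must be compiled with /unsafe.
    private bool CompareImage(Bitmap a, Bitmap b)
    {
        int width = b.Width;
        int height = b.Height;
        BitmapData data_a = a.LockBits(new Rectangle(0, 0, width, height), ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
        BitmapData data_b = b.LockBits(new Rectangle(0, 0, width, height), ImageLockMode.ReadWrite, PixelFormat.Format24bppRgb);
        double temp1 = 0, temp2 = 0;   // per-pixel difference and its running sum
        double max = 0;                // largest per-pixel difference seen so far
        unsafe
        {
            byte* pa = (byte*)data_a.Scan0;
            byte* pb = (byte*)data_b.Scan0;
            // Padding bytes at the end of each row (stride minus the pixel data)
            int offset = data_b.Stride - width * 3;
            double Ra, Ga, Ba, Rb, Gb, Bb;
            for (int y = 0; y < height; y++)
            {
                for (int x = 0; x < width; x++)
                {
                    // 24bpp pixels are stored in B, G, R byte order
                    Ra = (double)pa[2];
                    Ga = (double)pa[1];
                    Ba = (double)pa[0];
                    Rb = (double)pb[2];
                    Gb = (double)pb[1];
                    Bb = (double)pb[0];

                    temp1 = (Math.Abs(Ra - Rb) + Math.Abs(Ga - Gb) + Math.Abs(Ba - Bb)) / 3;
                    temp2 += temp1;

                    if (temp1 > max)
                        max = temp1;
                    pa += 3;
                    pb += 3;
                }
                pa += offset;
                pb += offset;
            }
        }
        // Release the locked bits (not shown in the original snippet)
        a.UnlockBits(data_a);
        b.UnlockBits(data_b);

        // Mean per-pixel difference over the whole image
        temp2 = temp2 / (height * width);

        if (max > 200 || temp2 > 5)
            return true;
        else
            return false;
    }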

I also thought the DirectShowLib project was cool in and of itself:

The purpose of this library is to allow access to Microsoft's DirectShow functionality from within .NET applications. This library supports both Visual Basic .NET and C#, and theoretically, should work with any .NET language.

Microsoft's managed solution to allowing access to DirectShow from .NET isn’t nearly as complete as the DirectShow interfaces for C++. For developers who want the complete range of functionality of DirectShow in .NET, this library provides the enums, structs, and interface definitions to access them.

Reviewing the source code will show that there is very little executable code in this library. There are a few helper functions (mostly in DsUtils.cs), but everything else in the library is just definitions.

Although there are ~541 interfaces defined in the source code, only some of them have been tested to ensure that they are working. See ReadMe.rtf for a discussion about the difference between tested and untested.

[Added as well worthy of a post of its own] ;)



If you're looking to play with DirectShow, compare two video frames, detect sound changes and build an app on top of all that, this project could easily be what you are looking for.
