.NET Voice Recorder


In this article I demonstrate how to record from the microphone in .NET, with support for setting the recording level, trimming noise from the start and end, visualizing the waveform in WPF and converting to MP3.

Audio Recording in .NET

The .NET framework does not provide any direct support for recording audio, so I will make use of the open source NAudio project, which includes wrappers for a number of Windows audio recording APIs.

Note: It is important to point out that .NET is not an appropriate choice for high sample rate and low latency audio recording, such as that found in Digital Audio Workstation software used in recording studios. This is because the .NET garbage collector can interrupt the process at any point. However, for purposes of recording speech from the microphone, the .NET framework is more than capable. By default, NAudio asks the soundcard to give us data every 100ms, which gives plenty of time for the garbage collector to run as well as our own code.

We will make use of the wrappers for the waveIn APIs, as these are the most universally supported and leave us free to choose the sample rate. We will record in mono, 16 bit at 8kHz, which is more than good enough audio quality for speech and will not overly tax the processor. That matters because we want to visualize the waveform as well.

Choosing a Capture Device

Normally, you will be able to use the default audio capture device without any difficulties, but should you need to offer the user a choice, NAudio will allow you to do so. You can use WaveIn.DeviceCount and WaveIn.GetCapabilities to find out how many recording devices are present, and to query each one for its name and number of supported channels.

On my computer, I have a single waveIn device (Microphone Array) until I plug my headset in, at which point, a new device appears and becomes the default (device 0 is always the default).

int waveInDevices = WaveIn.DeviceCount;
for (int waveInDevice = 0; waveInDevice < waveInDevices; waveInDevice++)
{
    // query the name and channel count of each capture device
    WaveInCapabilities deviceInfo = WaveIn.GetCapabilities(waveInDevice);
    Console.WriteLine("Device {0}: {1}, {2} channels", 
        waveInDevice, deviceInfo.ProductName, deviceInfo.Channels);
}

This produces the following output on my computer:

Device 0: Microphone / Line In (SigmaTel , 2 channels
Device 1: Microphone Array (SigmaTel High, 2 channels

Unfortunately these device names are truncated because the WAVEINCAPS structure only supports 31 characters. There is a way of getting the full device name, but it is rather convoluted.

Normally, you will choose Device 0 (the default), but if you wish to select a different input device, simply set the DeviceNumber property on your WaveIn object to the desired number.

Checking the Recording Level

The first step in recording is usually to help the user determine whether their microphone is working. This is especially important if the user has more than one input on their soundcard. We achieve this simply by starting recording and displaying the detected audio level to the user with a volume meter. The waveIn APIs do not write anything to disk, so no audio is actually being 'recorded' at this point. We are simply examining the input level and then throwing the captured audio samples away.

To begin capturing audio from the soundcard, we use the WaveIn class in NAudio. We configure it with the WaveFormat in which we would like to record (in our case 8kHz mono), before calling StartRecording, to start capturing audio from the device.

waveIn = new WaveIn();
waveIn.DeviceNumber = selectedDevice;
waveIn.DataAvailable += waveIn_DataAvailable;
int sampleRate = 8000; // 8 kHz
int channels = 1; // mono
waveIn.WaveFormat = new WaveFormat(sampleRate, channels);
waveIn.StartRecording(); // begin capturing; buffers now arrive via DataAvailable

The DataAvailable event handler will notify us whenever a buffer of audio has been returned to us from the sound card. The data comes back as an array of bytes, representing PCM sample data. This is fine if we are planning to write the audio directly to disk, but what if we wish to have a look at the audio data itself? Each audio sample is 16 bits, i.e. two bytes, meaning that we will need to convert pairs of bytes into shorts to be able to make sense of the data.

Note: if we were recording in stereo, the 16 bit samples would themselves come in pairs, first the left sample, then the right sample.

The following code shows how we might process the raw bytes in the DataAvailable event, and read the individual audio samples out. Notice that we use the BytesRecorded field, not the buffer's Length property. Also, I have chosen to convert the samples to 32 bit floating point format and scaled them so the maximum volume is 1.0f. This makes processing them through effects and visualizing them much easier.

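void waveIn_DataAvailable(object sender, WaveInEventArgs e)
{
    // only the first BytesRecorded bytes of the buffer hold valid audio
    for (int index = 0; index < e.BytesRecorded; index += 2)
    {
        // combine two little-endian bytes into one signed 16 bit sample
        short sample = (short)((e.Buffer[index + 1] << 8) | 
                                e.Buffer[index + 0]);
        // scale to the range -1.0f to just under 1.0f
        float sample32 = sample / 32768f;
    }
}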
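The conversion above can be pulled out into a helper that does not depend on NAudio, which makes it easy to check in isolation. The following is a minimal sketch: the class and method names (PcmConverter, ToFloatSamples, SplitStereo) are hypothetical and not part of NAudio or the sample project. SplitStereo illustrates the left/right interleaving described in the stereo note.

```csharp
using System;

public static class PcmConverter
{
    // Converts 16 bit little-endian PCM bytes to floats in the range -1.0 to just under 1.0.
    // Only the first bytesRecorded bytes of the buffer are read.
    public static float[] ToFloatSamples(byte[] buffer, int bytesRecorded)
    {
        int sampleCount = bytesRecorded / 2;
        var samples = new float[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            short sample = (short)((buffer[2 * i + 1] << 8) | buffer[2 * i]);
            samples[i] = sample / 32768f;
        }
        return samples;
    }

    // Splits interleaved stereo samples (left, right, left, right, ...) into two channels.
    public static void SplitStereo(float[] interleaved, out float[] left, out float[] right)
    {
        int frames = interleaved.Length / 2;
        left = new float[frames];
        right = new float[frames];
        for (int i = 0; i < frames; i++)
        {
            left[i] = interleaved[2 * i];
            right[i] = interleaved[2 * i + 1];
        }
    }
}
```

Inside a DataAvailable handler this would be called as PcmConverter.ToFloatSamples(e.Buffer, e.BytesRecorded).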

Note: One complication of using the waveIn and waveOut APIs is deciding on a callback mechanism. NAudio offers three options. First is function callbacks. This means that the waveIn API is given a (pinned) function pointer which it calls back onto. This means that your DataAvailable callback will come in on a background thread. In some ways this is the cleanest approach, but you need to beware of rogue soundcard drivers that can hang in calls to waveOutReset when using function callbacks (the SoundMAX chipset found on a lot of laptops is particularly prone to this problem).

The second is to supply a window handle. The waveIn APIs will post a message back to be handled on the message queue of that window handle. This method tends to be the most reliable and most commonly used. One gotcha to watch out for is that if you stop recording and immediately restart, a message from the old recording session could get handled in the new session resulting in a nasty exception.

The third is to let NAudio create its own new window and post messages to that. This gets round any danger of messages from one recording session getting muddled up with another. This is the callback method that NAudio will use by default if you call the default WaveIn constructor. But don't use this from a background thread or from a console application, or the new window that NAudio creates won't actually get round to processing its message queue.

Visualizing the Recording Level

We have seen how we can begin to capture audio from the soundcard for the purposes of checking the recording level. Now we need to give the user some visual feedback. We will use WPF for our sample recording application. The simplest control we have available to display a single numeric value graphically is the ProgressBar. And because it is WPF, we can fully customize the graphical appearance of the progress bar to look a little more like a volume meter. I have used a gradient going from green to red to show the current volume level. You can read more about how I created this ProgressBar template here.

Figure 1 - A Progress Bar Showing the Current Microphone Volume Level

To help provide the volume level to display, I have created a SampleAggregator class. This is passed every audio sample value we receive and keeps track of the maximum and minimum values. Then, after a specified number of samples, it raises an event allowing the GUI components to respond. We need to be careful not to raise too many of these events or performance will be badly affected. I am raising one every 800 samples, meaning we get 10 updates per second to the screen.
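As a rough illustration of the idea, here is a minimal sketch of such an aggregator. It is not the class from the sample project: the Add method, the NotificationCount property and the constructor shape of MaxSampleEventArgs are assumptions made for this example, and only the names SampleAggregator, MaximumCalculated, MaxSample and MinSample come from the article.

```csharp
using System;

public class MaxSampleEventArgs : EventArgs
{
    public MaxSampleEventArgs(float minSample, float maxSample)
    {
        MinSample = minSample;
        MaxSample = maxSample;
    }

    public float MinSample { get; private set; }
    public float MaxSample { get; private set; }
}

public class SampleAggregator
{
    private float maxValue;
    private float minValue;
    private int count;

    // Number of samples to collect before raising MaximumCalculated (assumed name).
    public int NotificationCount { get; set; }

    public event EventHandler<MaxSampleEventArgs> MaximumCalculated;

    // Call once per sample; raises the event after every NotificationCount samples.
    public void Add(float sample)
    {
        maxValue = Math.Max(maxValue, sample);
        minValue = Math.Min(minValue, sample);
        count++;
        if (NotificationCount > 0 && count >= NotificationCount)
        {
            var handler = MaximumCalculated;
            if (handler != null)
            {
                handler(this, new MaxSampleEventArgs(minValue, maxValue));
            }
            // start a fresh window
            count = 0;
            maxValue = 0;
            minValue = 0;
        }
    }
}
```

With NotificationCount set to 800 and an 8kHz mono input, the event fires ten times a second, matching the update rate described above.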

Because I am using data binding, when one of these updates fires, I must raise the PropertyChanged event on my DataContext object (also known as the “ViewModel” in the MVVM pattern). Here's the XAML syntax for binding to my CurrentInputLevel property:

<ProgressBar Orientation="Horizontal" 
    Value="{Binding CurrentInputLevel, Mode=OneWay}" 
    Height="20" />

And here's the code in the ViewModel that ensures that the GUI updates whenever we calculate a new maximum input level:

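private float lastPeak;

void recorder_MaximumCalculated(object sender, MaxSampleEventArgs e) 
{
    // the peak is the larger of the positive and negative excursions
    lastPeak = Math.Max(e.MaxSample, Math.Abs(e.MinSample));
    // then raise PropertyChanged for "CurrentInputLevel" so the binding refreshes
}

// multiply by 100 because the Progress bar's default maximum value is 100 
public float CurrentInputLevel { get { return lastPeak * 100; } }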
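For readers new to data binding, the notification step can be sketched with the standard INotifyPropertyChanged interface. This is a minimal stand-alone example, not the sample project's ViewModel: the class name RecorderViewModel, the OnMaximumCalculated method and the RaisePropertyChanged helper are hypothetical.

```csharp
using System;
using System.ComponentModel;

public class RecorderViewModel : INotifyPropertyChanged
{
    private float lastPeak;

    public event PropertyChangedEventHandler PropertyChanged;

    // Scaled to 0..100 to match the ProgressBar's default range.
    public float CurrentInputLevel { get { return lastPeak * 100; } }

    // Would be called from the aggregator's MaximumCalculated handler.
    public void OnMaximumCalculated(float maxSample, float minSample)
    {
        lastPeak = Math.Max(maxSample, Math.Abs(minSample));
        RaisePropertyChanged("CurrentInputLevel");
    }

    private void RaisePropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}
```

WPF re-reads CurrentInputLevel whenever PropertyChanged is raised with that property name, so the meter follows the microphone without any code in the view.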

Note: Model View ViewModel (MVVM) is a pattern that is growing in popularity amongst WPF and Silverlight developers. The basic idea is that you have no code behind whatsoever on your View (i.e. your xaml markup file), and simply specify all communications with your business logic by means of data binding. The ViewModel serves as an adapter to ease the process of data binding. This approach gives very good separation of appearance and behavior. For the most part, this pattern works very well, but there are a few tricky areas, for which you will need to either write a few lines of code behind, or make use of some cunning tricks such as attached dependency properties or custom triggers. There are several excellent open source helper libraries that can take some of the work out of getting an MVVM application up and running. Have a look here for a comprehensive list.

Adjusting the Recording Level

Suppose the current input level is too loud or too quiet. We would like to let the user adjust the recording level, again by means of data binding, so we will add a volume slider to our XAML:

<Slider Orientation="Horizontal" 
    Value="{Binding MicrophoneLevel, Mode=TwoWay}" 
    Margin="5" />

Now we have to get hold of the MixerLine that will allow us to access the input volume control for our waveIn device. This requires us to make use of the Windows mixer APIs, which also have wrappers in NAudio. Getting hold of this volume control is not always as straightforward as you might hope (and can require different approaches for XP and Vista), but the following is code that seems to work on most systems:

private void TryGetVolumeControl()
{
    int waveInDeviceNumber = 0;
    var mixerLine = new MixerLine((IntPtr)waveInDeviceNumber, 
                                   0, MixerFlags.WaveIn);
    // look for the volume control among the controls on this mixer line
    foreach (var control in mixerLine.Controls)
    {
        if (control.ControlType == MixerControlType.Volume)
        {
            volumeControl = control as UnsignedMixerControl;
        }
    }
}

Now we can use the Percent property on the UnsignedMixerControl to set volume to a value anywhere between 0 and 100.

Starting Recording

Now we have got our recording levels set up correctly, we are ready to actually start recording. But since we have already opened our waveIn device, all we need to do is start writing the data we have received into a file.

NAudio has a class called WaveFileWriter which will allow us to write our recorded data to a file. For now, we will write it to a temporary file in PCM format, and convert it later into a better compressed format such as MP3. The following code creates a new WAV file:

writer = new WaveFileWriter(waveFileName, recordingFormat);

Now we can write to the file as we receive notifications from the waveIn device:

void waveIn_DataAvailable(object sender, WaveInEventArgs e)
{
    // only write to disk once the user has actually started recording
    if (recordingState == RecordingState.Recording)
    {
        writer.WriteData(e.Buffer, 0, e.BytesRecorded);
    }

    // ...
}

Note: There are three main options for how to store audio while it is being recorded. First, you can write it to a MemoryStream. This saves the inconvenience of dealing with a temporary file, but you need to be careful not to run out of memory. Also, if your recording program crashes half way through, you have lost everything. At the sample rate we are using for this demo, one minute of audio takes just under 1 MB of memory, but if you were recording at 44.1kHz stereo (the standard for music), you would need about 10 MB per minute.

Second, you can write to a temporary WAV file to be converted to another format later, as we are doing here. While this is not a disk space efficient format, it is very easy to work with, and particularly useful if you are planning to apply any effects or edit the audio in any way after recording.

Third, you can pass the audio directly to an encoder (such as WMA or MP3) as it is being recorded. This might be the best option if you are making a longer recording, and have no need to edit it after recording.
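The memory figures quoted in the first option follow directly from the PCM format: bytes per minute = sample rate × channels × bytes per sample × 60. A small sketch of that arithmetic (the PcmSize class and its method name are made up for this example):

```csharp
using System;

public static class PcmSize
{
    // Uncompressed PCM storage needed for one minute of audio, in bytes.
    public static long BytesPerMinute(int sampleRate, int channels, int bitsPerSample)
    {
        long bytesPerSecond = (long)sampleRate * channels * (bitsPerSample / 8);
        return bytesPerSecond * 60;
    }
}
```

For 8kHz mono 16 bit this gives 960,000 bytes (just under 1 MB), and for 44.1kHz stereo 16 bit it gives 10,584,000 bytes (about 10 MB).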

Stopping Recording

Obviously we will stop when the user clicks the stop recording button, but we might also want to set a maximum recording duration to stop the user inadvertently filling up their hard disk. For this example, we will allow one minute of recording.

long maxFileLength = this.recordingFormat.AverageBytesPerSecond * 60;
// never write past the one minute limit, even if the buffer holds more
int toWrite = (int)Math.Min(maxFileLength - writer.Length, bytesRecorded);
if (toWrite > 0)
    writer.WriteData(buffer, 0, toWrite);

Note: Something that can be slightly confusing is that when using window callbacks with WaveIn, the last buffer of audio arrives after you have asked recording to stop, so make sure you don't close the file you are saving to until you have received all the audio. The RecordingStopped event on the WaveIn object tells you when it is safe to close the WaveFileWriter and clean up your resources.

Visualizing the Wave Form

It is often desirable to display the audio waveform to the user. Displaying the waveform while you are recording is sometimes called “confidence recording”, because it allows you to see that audio is being recorded as expected and the levels are still right.

There are a variety of possible approaches for drawing audio waveforms. The simplest is to draw a vertical line showing the minimum and maximum values every time our sample aggregator fires:

Figure 2 - Audio Waveform using vertical lines

At first glance it may seem that this would be trivial to implement in WPF, but there is a real danger of consuming too many resources. For example, simply adding a new line to a Canvas every time a new maximum sample is calculated performs very badly, so it is better to have a fixed number of vertical lines and resize them dynamically.

Another approach is to create a polygon. This requires us to add two points to a Polygon's Points collection every time we receive a new sample. The trick is to add these points in the middle of the Points collection, rather than at the end, so that the end result is a single shape. This means our waveform can have a different outline color and fill color. To stop the edges from appearing too jagged, we plot points two units apart along the X axis.

Figure 3 - Audio Waveform rendered using a Polygon

Like the microphone volume meter, the waveform drawing control needs to receive several notifications a second of the maximum and minimum sample values received by the SampleAggregator. When each sample value is received, we either insert new points into our polygon, or, if the whole screen is full, we go back to the left-hand edge and continue drawing from there.

For the confidence recording display I have used the Polygon method, which is in a class called PolygonWaveFormControl. Here's the code which calculates the new points or updated point locations as we receive a new maximum sample:

public void AddValue(float maxValue, float minValue)
{
    int visiblePixels = (int)(ActualWidth / xScale);
    if (visiblePixels > 0)
    {
        CreatePoint(maxValue, minValue);

        // wrap back to the left-hand edge once the control is full
        if (renderPosition > visiblePixels)
        {
            renderPosition = 0;
        }

        // flatten the column a little ahead of the write position
        int erasePosition = (renderPosition + blankZone) % visiblePixels;
        if (erasePosition < Points)
        {
            double yPos = SampleToYPosition(0);
            waveForm.Points[erasePosition] = 
               new Point(erasePosition * xScale, yPos);
            waveForm.Points[BottomPointIndex(erasePosition)] = 
               new Point(erasePosition * xScale, yPos);
        }
    }
}

private void CreatePoint(float topValue, float bottomValue)
{
    double topYPos = SampleToYPosition(topValue);
    double bottomYPos = SampleToYPosition(bottomValue);
    double xPos = renderPosition * xScale;
    if (renderPosition >= Points)
    {
        // first pass: insert a new top/bottom pair in the middle of the collection
        int insertPos = Points;
        waveForm.Points.Insert(insertPos, new Point(xPos, topYPos));
        waveForm.Points.Insert(insertPos + 1, new Point(xPos, bottomYPos));
    }
    else
    {
        // after wrapping: overwrite the existing pair for this column
        waveForm.Points[renderPosition] = new Point(xPos, topYPos);
        waveForm.Points[BottomPointIndex(renderPosition)] = 
              new Point(xPos, bottomYPos);
    }
    renderPosition++; // advance to the next column
}

The erase position calculation is to blank out some previous sample values to make it obvious where the new data is appearing after we have wrapped around once:

Figure 4 - PolygonWaveForm control's “blank zone”

Note: There are faster ways to perform rendering in WPF. One option is to use the WriteableBitmap class and draw directly onto it. This could be a good approach if you were using the vertical lines method of rendering. The second is to use DrawingVisual objects, which are lightweight drawing objects offering better performance than using classes derived from Shape. The down-side is the loss of features such as DataBinding and the ability to fully describe the picture in XAML, but for WaveForm drawing this is not really a drawback. I use the DrawingVisual method in the Save Audio part of this application.

Another challenge was how the waveform drawing control could receive notifications. Because I am using MVVM, the control has no direct access to the SampleAggregator. A simple way around this was to create a Dependency Property on PolygonWaveFormControl:

public static readonly DependencyProperty SampleAggregatorProperty = 
    DependencyProperty.Register(
          "SampleAggregator", 
          typeof(SampleAggregator), 
          typeof(PolygonWaveFormControl), 
          new PropertyMetadata(null, OnSampleAggregatorChanged));

public SampleAggregator SampleAggregator
{
    get { return (SampleAggregator)this.GetValue(SampleAggregatorProperty); }
    set { this.SetValue(SampleAggregatorProperty, value); }
}

private static void OnSampleAggregatorChanged(object sender, DependencyPropertyChangedEventArgs e)
{
    PolygonWaveFormControl control = (PolygonWaveFormControl)sender;
    // ... subscribe the control to the new aggregator's events here
}

This allows us to bind the PolygonWaveFormControl to the SampleAggregator made public on our DataContext:

<my:PolygonWaveFormControl 
    SampleAggregator="{Binding SampleAggregator}" />

Trimming the Audio

We have created a temporary WAV file, but before the user saves it to a file of their choosing, we want to allow them to trim off any unwanted parts from the start and end of the recording. To do this I would like to display the entire recorded waveform, with a selection rectangle superimposed on top to allow a sub-range to be selected.

Figure 5 - GUI to allow selection of a portion of the recorded audio

To accomplish this kind of interface we need three components. The first is a ScrollViewer. The ScrollViewer allows us to scroll left and right through the WaveForm if it is too big to fit onto a screen, which is likely if you record more than a few seconds of audio.

The second is a new type of WaveForm renderer that will render an entire file, rather than my PolygonWaveFormControl which started again at the left when the screen filled up. For this I created WaveFormVisual which uses DrawingVisual objects to draw the entire WaveForm. Obviously if we wanted to record for a long period, this approach would need to be optimised as the polygon it creates would have thousands of points, but for short recordings, it works fine.

The third piece was the hardest to get right – the selection rectangle to support mouse dragging selection of the waveform. For this I created the RangeSelectionControl.

The RangeSelectionControl is simply a blue rectangle with a solid outline and semi-transparent fill sitting on a Canvas. The magic occurs in the mouse handler. We need to detect when the user hovers over the left or right edge of the rectangle, and set the cursor to show a horizontal resizing icon. This can be done in the MouseMove event, checking the X coordinate and then setting the Cursor property:

Cursor = Cursors.SizeWE;
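The edge check itself is plain arithmetic on the X coordinate. Here is a minimal sketch of how it might look. The article only names EdgeAtPosition and Edge.None, so the Left and Right enum members, the extra parameters and the tolerance are assumptions for this example.

```csharp
using System;

public enum Edge { None, Left, Right }

public static class EdgeHitTest
{
    // Returns which edge of the rectangle (if any) lies within 'tolerance' units of x.
    public static Edge EdgeAtPosition(double x, double left, double width, double tolerance)
    {
        if (Math.Abs(x - left) <= tolerance)
        {
            return Edge.Left;
        }
        if (Math.Abs(x - (left + width)) <= tolerance)
        {
            return Edge.Right;
        }
        return Edge.None;
    }
}
```

In the MouseMove handler, a result other than Edge.None would switch the cursor to Cursors.SizeWE, and Edge.None would restore the default arrow.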

When the user clicks the left-button while over the edge, we begin to drag. Key to this is calling Canvas.CaptureMouse. If we don't do this, as soon as you try to drag the rectangle bigger, the mouse move events are lost to other controls underneath.

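void RangeSelectionControl_MouseDown(object sender, MouseButtonEventArgs e)
{
    if (e.LeftButton == MouseButtonState.Pressed)
    {
        Point position = e.GetPosition(this);
        Edge edge = EdgeAtPosition(position.X);
        DragEdge = edge;
        if (DragEdge != Edge.None)
        {
            // keep receiving mouse events while dragging;
            // "canvas" stands for the Canvas hosting the rectangle (name assumed)
            canvas.CaptureMouse();
        }
    }
}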

Now in the MouseMove methods, we can change the Canvas.Left and Width properties of the rectangle to resize it.
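The resize arithmetic can be sketched separately from the WPF event handling. The example below is an illustration only: SelectionRange, RangeResize, MinWidth and the clamping rules are all assumptions, not code from the sample project. The returned values are what would be assigned to Canvas.Left and Width.

```csharp
using System;

public struct SelectionRange
{
    public double Left;
    public double Width;
    public SelectionRange(double left, double width) { Left = left; Width = width; }
}

public static class RangeResize
{
    // Smallest width the selection may shrink to (arbitrary value for this sketch).
    public const double MinWidth = 1.0;

    // Dragging the left edge moves Left and keeps the right edge fixed.
    public static SelectionRange DragLeftEdge(SelectionRange current, double mouseX)
    {
        double right = current.Left + current.Width;
        double newLeft = Math.Max(0, Math.Min(mouseX, right - MinWidth));
        return new SelectionRange(newLeft, right - newLeft);
    }

    // Dragging the right edge keeps Left fixed and changes only the width.
    public static SelectionRange DragRightEdge(SelectionRange current, double mouseX, double canvasWidth)
    {
        double newRight = Math.Min(canvasWidth, Math.Max(mouseX, current.Left + MinWidth));
        return new SelectionRange(current.Left, newRight - current.Left);
    }
}
```

Clamping to zero, to the canvas width and to a minimum width stops the rectangle from being dragged off the waveform or turned inside out.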

The ScrollViewer is quite straightforward to use, but you must remember to set the CanContentScroll property to true, and also to set the size of the items within the ScrollViewer correctly.

<ScrollViewer CanContentScroll="True">
    <Grid>
       <my:WaveFormVisual Height="100" />
       <my:RangeSelectionControl x:Name="rangeSelection" />
    </Grid>
</ScrollViewer>

We set the appropriate Width of the WaveFormVisual and RangeSelectionControl based on the total number of points we have drawn in the waveform.

Saving the Audio

So we are finally ready to save the audio. We will offer the user two choices of format to save in. The first is simply to save as a WAV file. If the user has selected the entire recording, we only need to copy the audio across to their desired location. If, however, the user has selected a sub-range, then we need to trim the WAV file. This can be quickly accomplished using a TrimWavFile utility function that copies from a WAV file reader to a WAV file writer, skipping over a certain number of bytes from the beginning and end.

public static void TrimWavFile(string inPath, string outPath, 
                TimeSpan cutFromStart, TimeSpan cutFromEnd)
{
    using (WaveFileReader reader = new WaveFileReader(inPath))
    {
        using (WaveFileWriter writer = 
               new WaveFileWriter(outPath, reader.WaveFormat))
        {
            int bytesPerMillisecond = 
                reader.WaveFormat.AverageBytesPerSecond / 1000;

            // round both positions down to a whole block (sample frame)
            int startPos = (int)cutFromStart.TotalMilliseconds * 
                bytesPerMillisecond;
            startPos = startPos - startPos % reader.WaveFormat.BlockAlign;

            int endBytes = (int)cutFromEnd.TotalMilliseconds * 
                bytesPerMillisecond;
            endBytes = endBytes - endBytes % reader.WaveFormat.BlockAlign;
            int endPos = (int)reader.Length - endBytes; 

            TrimWavFile(reader, writer, startPos, endPos);
        }
    }
}

private static void TrimWavFile(WaveFileReader reader, 
                    WaveFileWriter writer, int startPos, int endPos)
{
    reader.Position = startPos;
    byte[] buffer = new byte[1024];
    while (reader.Position < endPos)
    {
        int bytesRequired = (int)(endPos - reader.Position);
        if (bytesRequired > 0)
        {
            int bytesToRead = Math.Min(bytesRequired, buffer.Length);
            int bytesRead = reader.Read(buffer, 0, bytesToRead);
            if (bytesRead > 0)
            {
                writer.WriteData(buffer, 0, bytesRead);
            }
            else
            {
                break; // the reader ran dry before endPos; avoid looping forever
            }
        }
    }
}
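One detail worth noting in the listing above is that AverageBytesPerSecond / 1000 is an integer division. At 8kHz mono 16 bit it is exact (16 bytes per millisecond), but at 44.1kHz stereo 16 bit it truncates 176.4 to 176, so long cuts drift slightly. A possible variation, sketched here with hypothetical names (TrimMath, AlignedByteOffset), works from TimeSpan ticks and divides last:

```csharp
using System;

public static class TrimMath
{
    // Converts a time offset to a byte offset, rounded down to a whole block.
    public static long AlignedByteOffset(TimeSpan time, int averageBytesPerSecond, int blockAlign)
    {
        // multiply before dividing so no fraction of a millisecond is lost
        long bytes = time.Ticks * averageBytesPerSecond / TimeSpan.TicksPerSecond;
        return bytes - bytes % blockAlign;
    }
}
```

For a cut of 1.5 seconds at 8kHz mono 16 bit this gives 24,000 bytes. For 3 milliseconds at 44.1kHz stereo 16 bit the raw value of 529 bytes is rounded down to 528, so the read position always lands on a sample frame boundary.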

We also want to offer the ability to save as MP3. The easiest way to create MP3 files is to use the open source LAME MP3 encoder (do a web search for lame.exe to get hold of this application if you haven't already got it). Our application will look in the current directory, and prompt the user to find lame.exe if it is not present, as we do not include it in the application download. Assuming you do provide a valid path, we can then convert our (trimmed) WAV file to MP3 by simply calling lame.exe with the appropriate parameters.

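public static void ConvertToMp3(string lameExePath, 
     string waveFile, string mp3File)
{
   // -V2 selects LAME's VBR quality level 2; both paths are quoted in case they contain spaces
   Process converter = Process.Start(lameExePath, "-V2 \"" + waveFile 
                            + "\" \"" + mp3File + "\"");
   converter.WaitForExit(); // block until the MP3 has been written
}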
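If you would rather not have a console window flash up during conversion, one option is to launch the process through ProcessStartInfo. The sketch below is a variation on the listing above, not the sample project's code: the LameRunner class and its method names are hypothetical, and it assumes lame.exe returns a non-zero exit code on failure.

```csharp
using System;
using System.Diagnostics;

public static class LameRunner
{
    // Builds the same command line as the listing above: -V2 "input.wav" "output.mp3"
    public static string BuildArguments(string waveFile, string mp3File)
    {
        return "-V2 \"" + waveFile + "\" \"" + mp3File + "\"";
    }

    // Runs the encoder without showing a console window and returns its exit code.
    public static int Convert(string lameExePath, string waveFile, string mp3File)
    {
        var startInfo = new ProcessStartInfo(lameExePath, BuildArguments(waveFile, mp3File))
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process converter = Process.Start(startInfo))
        {
            converter.WaitForExit();
            return converter.ExitCode;
        }
    }
}
```

Keeping the argument building in its own method makes the quoting easy to check without actually running the encoder.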

We end up with a nice compact MP3 file containing the selected portion of our microphone recording.

Exploring the Sample Code Solution

The main WPF sample application is found in the VoiceRecorder project. This contains the main window along with the three views and their associated ViewModels. VoiceRecorder.Core contains some WPF helper classes and user controls to help with the plumbing and GUI of the application, while VoiceRecorder.Audio contains the classes that actually perform the recording, editing and converting of audio.

About the Author

Mark Heath is a software developer currently working for NICE CTI Systems in Southampton, UK. He specializes in .NET development with a particular focus on client side technologies and audio playback. He blogs about audio, WPF, Silverlight and software engineering best practices at http://mark-dot-net.blogspot.com. He is the author of several open source projects hosted at CodePlex, including NAudio, a low-level .NET audio toolkit (http://www.codeplex.com/naudio).

The Discussion

  • I like the fact that you don't gloss over architecture. One might think that because it is Coding4Fun the design would be flat and monolithic. You have separate assemblies exposing different levels of functionality. You also have various implementations, or the start, of some common design patterns, like IoC, Command Mediator, Helper/Services, MVVM, etc. This is being done without saying too much about it other than a small blurb on MVVM. Great job!

  • @Nate Greenwood, we may or may not have an article in the works for that :)

  • Nate Greenwood: Awesome. Great article, and just in time as I was pondering a project to learn to take advantage of my built-in monitor webcam and microphone.

  • You do an incredible job, I'm a fan!!

  • @SomeONe Thanks man, we try to make the articles both useful and show useful ways of doing stuff. May not always be successful but we try.

  • Robson Felix: Can this be used inside an XAML Browser Application (XBAP)?


  • How can I change the recording time to a value bigger than 60 seconds?

  • You do the best tutorials... that was really nicely explained... congrats to the authors.

  • This has to be the most useful article on audio there has come to exist for C# programmers. Thanks a LOT for it! It'll help me develop my program for sure.

  • Bram Osterhout: I want to save a series of notes/sounds which I had in an array which stored each note's frequency, amplitude, and duration. What would be the procedure for accomplishing this?


  • Reply to the person asking how to remove the 60 second limit.

    In AudioRecorder.cs of the project, change the WriteToFile function to the following:

    private void WriteToFile(byte[] buffer, int bytesRecorded)
    {
        if (recordingState == RecordingState.Recording
            || recordingState == RecordingState.RequestedStop)
        {
            writer.WriteData(buffer, 0, bytesRecorded);
        }
    }

    Go to your solution and delete the VoiceRecorder.Audio and Core DLLs.

    Add references and browse to the bin location of the built VoiceRecorder project and import those DLLs. Seeing how the VoiceRecorder project was on my desktop the dll location is.


    Voila, the limit is totally removed.

  • Thank you! It is exactly what I've needed!

  • So what happened to: waveIn.DeviceNumber = selectedDevice;

    I've been trying to figure out how to select a device. I thought NAudio would make capturing mic audio easy. Unfortunately, it appears that all the examples were written before the changes to the DLL.

    OK, how do I select my sound card now?

  • @John - if you noticed, in the first part he gives an example of how to see the devices currently available. After that you can use the device numbers (the waveInDevice loop variable at the beginning of the example). In most cases this will be 0 (the default device), so you actually just need to set waveIn.DeviceNumber = 0; in your code.

  • Ryan Smith: Probably a really really ignorant question but this is the first time I've been on this site, could someone please tell me what wavein is and where it has come from? Would be much appreciated.

  • opening thread

  • @golnazal: Hello.

    Thank you very much for following up on correcting the files. I don't understand what you mean by opening a thread. Oh, by the way, I'm very happy to see an Iranian on the Microsoft channel :D
