Autotune.NET

The Discussion

  • Eric

    That's cool! :)

  • John Dyer

    Great work Mark. Now I just have to find a use for it.

  • Lauri Kreutzwald

    Nice work Mark.

  • Ravi

    Whoa, awesome! Great to see Awesomebox being used in the wild :)

  • BitFlipper

    I have also implemented a pitch detection algorithm, which is used to display a realtime pitch graph in a VST plugin (although VST uses an unmanaged API, my plugin is written in C# and uses reverse P/Invoke). This plugin is used as a visual guide to train a singer to sing in key (or for vocal exercises), and can also display a "grade" based on previously entered notes (or a MIDI clip) that have to be hit throughout the song. My algorithm is loosely based on auto-correlation, but it is heavily modified to solve the following two problems with it:

    1. Auto-correlation is slow:
      Since you need to compare every sample with every shifted sample, and repeat that once for every single frequency you want to detect, this quickly becomes slow.

      The way I solved this is by first down-sampling to reduce the overall number of samples to work with (with a filter to prevent aliasing noise, which also conveniently removes frequencies I don't care about), and then running 3 passes, one each with low, medium, and high resolution.

      In the first pass, I only test a total of 5 samples that are spaced out within the two sample windows. You can quickly tell that there just isn't going to be correlation if the 5 samples are all different from the shifted window's equally spaced samples. In the second pass I use more samples, and the frequency it detects becomes the starting point for the final pass, which uses more samples still.

    2. Auto-correlation is inaccurate, especially at higher frequencies:
      As you have clearly seen in your results, the higher the frequency, the less accurate it becomes. This is because a higher frequency waveform has fewer total samples per cycle, and you are stepping in whole sample values, so the frequency steps become coarser the higher it is.

      To solve this, for the 3rd pass in my algorithm (at which point the search is centered around the frequency that was detected in the second pass), I use interpolation in order to compare samples that are not limited to whole-number positions. So sample 0 will be at position 0.0, sample 1 will be at position 0.674, or whatever. This allows me to space the sample steps so that they land exactly at the frequency I want to detect during that pass, as opposed to being quantized into ever coarser frequency steps. I use a 4-point, 3rd-order Hermite interpolator.

      Each high resolution pass is 1.005 times the previous frequency, so I don't use linear frequency steps. Also, once I have found the two passes with the highest correlation, I interpolate between those two, so the final detected frequency is even higher resolution than the 3rd pass's step size.

    This results in a very fast and accurate pitch detection algorithm. From my tests, the accuracy is within 0.1% of the input frequency, which, IIRC, is about 50 times higher than what humans can distinguish.
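
    The third pass described above (fractional lags read through a 4-point, 3rd-order Hermite interpolator, with candidate frequencies stepped by a factor of 1.005) might look roughly like the sketch below. It is not BitFlipper's code: the class and method names, the normalized correlation, and the ±3% search span around the coarse estimate are all assumptions made for illustration, and the final interpolation between the two best candidates is left out.

    ```csharp
    using System;

    // Hypothetical helper for illustration only; not taken from any published pitch tracker.
    public static class FractionalLagPitch
    {
        // 4-point, 3rd-order Hermite interpolation: returns y1 at t = 0 and y2 at t = 1.
        public static float Hermite(float t, float y0, float y1, float y2, float y3)
        {
            float c0 = y1;
            float c1 = 0.5f * (y2 - y0);
            float c2 = y0 - 2.5f * y1 + 2f * y2 - 0.5f * y3;
            float c3 = 1.5f * (y1 - y2) + 0.5f * (y3 - y0);
            return ((c3 * t + c2) * t + c1) * t + c0;
        }

        // Reads the signal at a non-integer position, clamping neighbours at the edges.
        public static float SampleAt(float[] x, double pos)
        {
            int i = Math.Min(Math.Max((int)Math.Floor(pos), 0), x.Length - 1);
            float t = (float)(pos - i);
            float y0 = x[Math.Max(i - 1, 0)];
            float y1 = x[i];
            float y2 = x[Math.Min(i + 1, x.Length - 1)];
            float y3 = x[Math.Min(i + 2, x.Length - 1)];
            return Hermite(t, y0, y1, y2, y3);
        }

        // Normalized correlation between the window and a copy shifted by a fractional lag.
        public static double Correlation(float[] x, int window, double lag)
        {
            double sum = 0, e1 = 0, e2 = 0;
            for (int i = 0; i < window; i++)
            {
                double a = x[i];
                double b = SampleAt(x, i + lag);
                sum += a * b;
                e1 += a * a;
                e2 += b * b;
            }
            return (e1 > 0 && e2 > 0) ? sum / Math.Sqrt(e1 * e2) : 0.0;
        }

        // Searches around a coarse estimate in geometric steps (x1.005 per step).
        public static double Refine(float[] x, int sampleRate, double coarseHz, int window)
        {
            const double span = 1.03;  // assumed search range: about +/-3% around the coarse estimate
            const double step = 1.005; // ratio between successive candidates, as in the comment
            double lo = coarseHz / span, hi = coarseHz * span;
            if (window + sampleRate / lo + 3 > x.Length)
                throw new ArgumentException("buffer too short for this window and lag range");

            double bestHz = coarseHz, bestCorr = double.NegativeInfinity;
            for (double hz = lo; hz <= hi; hz *= step)
            {
                double lag = sampleRate / hz; // period in samples, usually not a whole number
                double c = Correlation(x, window, lag);
                if (c > bestCorr) { bestCorr = c; bestHz = hz; }
            }
            return bestHz;
        }
    }
    ```

    With a 44.1 kHz buffer and a coarse estimate from an earlier pass, `FractionalLagPitch.Refine(buffer, 44100, coarseHz, 2048)` returns the best candidate on the 0.5% grid; interpolating between the two best candidates, as described above, would narrow it further.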

  • mary

    Can you please share the code for pitch detection? I have implemented an FFT algorithm and just need to detect the pitches at high frequencies. How can I choose the amplitude threshold to get the correct peaks? If the amplitude threshold is too high I get 0 peaks; if it's too low I get incorrect peaks.

  • mary

    @BitFlipper: Do you have the pitch detection code? I have implemented an FFT and need a pitch detection algorithm. I need to detect all peaks in the spectrum, for example all peaks at 18000 Hz. I also work with an amplitude threshold, but if it's too high it doesn't show me the peaks, and if it's too low it doesn't show them correctly.

  • BitFlipper

    @mary:

    Unfortunately, right now my code isn't quite fit for public release. I think there are some dependencies on parts of other code that I don't want to post and that aren't really pitch-related. For instance, I have a DSP class that the pitch-correction algorithm uses, but most of that code is unrelated to it.

    If I have some time available I will clean it up and post it. Most likely before the end of the weekend.

  • BitFlipper

    OK I had some time to clean up my code. I am working on creating a CodePlex project in order to publish it. I also first want to create some sort of sample app in order to demonstrate the code in use (even though you need just three lines of code to instantiate and get your first pitch results back). Hopefully I will be done with it by this weekend.

  • BitFlipper

    @Christian Louboutin Bottes:

    Wow is that spam or some weird C9 bug? I can't tell.

  • pollo

    It's spam that's been through an auto-tune-like comment modifier to make it look like a C9 bug.

  • Tania

    Hi Mark,
    "Autotune.NET" is really a very different topic. I am very excited to read this blog; it is very interesting and useful. Only after reading your blog did I come to know about this technology in .NET. Thanks for sharing your knowledge with us. Your code also works well.
    http://gloriatech.com/microsoft-net-development-services.aspx

  • BitFlipper

    OK, I finally had some time to isolate my pitch tracker class and create a CodePlex project. I would be interested to find out whether you can use my pitch tracker in your project, and what the results are.

    From my tests, my algorithm has an error of less than 0.02% over the frequency range of 55 Hz to 1.5 kHz. Accuracy is unaffected by amplitude, frequency or complexity of the waveform.

    Please let me know if you end up trying it out.

  • markheath

    Hi BitFlipper, looks interesting. Fancy submitting a patch to the VoiceRecorder project (http://voicerecorder.codeplex.com/) that uses your algorithm?

  • Eugene

    Hi Mark, thanks for the code. It is rather interesting, but there is one thing I cannot understand. In the code

    public float DetectPitch(float[] buffer, int frames)
    {
        if (prevBuffer == null) { prevBuffer = new float[frames]; }
        float maxCorr = 0;
        int maxLag = 0;
        for (int lag = maxOffset; lag >= minOffset; lag--)
        {
            float corr = 0; // sum of squares
            for (int i = 0; i < frames; i++)
            {
                int oldIndex = i - lag;
                float sample = ((oldIndex < 0) ? prevBuffer[frames + oldIndex] : prevBuffer[oldIndex]);
                corr += (sample * buffer[i]);
            }
            ..........................................

    we first initialize the array with prevBuffer = new float[frames], so evidently all its elements are zero.
    Then we have "sample = prevBuffer[frames + oldIndex]" or "sample = prevBuffer[oldIndex]", but since prevBuffer holds only zeros, sample will always be zero.
    Could you explain this, or am I wrong?

  • Wonde

    Hey, it is a nice lib, but I have a question: how do I record a voice for a longer time, say an hour or more?

  • markheath

    @Wonde, to record for that length of time I would recommend storing the saved audio in a WAV file rather than using the current implementation, which keeps it in memory. Use the WaveFileWriter class.

  • markheath

    @Eugene - look at the end of the function - the contents of the current buffer are copied across into prevBuffer
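
    To make that concrete, here is a minimal sketch of an autocorrelation detector that carries the previous buffer across calls. The class name `CarryOverCorrelator` and its constructor parameters are made up for illustration, and this is not the VoiceRecorder source. Reading non-negative indices from the current buffer is also an assumption of this sketch. On the very first call prevBuffer is all zeros, so only delayed samples that fall inside the current buffer contribute; from the second call on, the wrapped-around samples come from real audio.

    ```csharp
    using System;

    // Hypothetical class for illustration only.
    public class CarryOverCorrelator
    {
        private readonly int sampleRate;
        private readonly int minLag; // shortest period tested, in samples
        private readonly int maxLag; // longest period tested, in samples
        private float[] prevBuffer;  // samples kept from the previous call

        public CarryOverCorrelator(int sampleRate, float minHz, float maxHz)
        {
            this.sampleRate = sampleRate;
            minLag = (int)(sampleRate / maxHz);
            maxLag = (int)(sampleRate / minHz);
        }

        public float DetectPitch(float[] buffer, int frames)
        {
            if (prevBuffer == null || prevBuffer.Length != frames)
            {
                prevBuffer = new float[frames]; // all zeros until the first copy below
            }

            float maxCorr = 0;
            int bestLag = 0;
            // A lag longer than the buffer would index before the start of prevBuffer.
            for (int lag = Math.Min(maxLag, frames); lag >= minLag; lag--)
            {
                float corr = 0;
                for (int i = 0; i < frames; i++)
                {
                    int oldIndex = i - lag;
                    // Delayed sample: from the previous call if it falls before this buffer.
                    float delayed = (oldIndex < 0) ? prevBuffer[frames + oldIndex] : buffer[oldIndex];
                    corr += delayed * buffer[i];
                }
                if (corr > maxCorr)
                {
                    maxCorr = corr;
                    bestLag = lag;
                }
            }

            // The step the answer above points at: keep this buffer for the next call.
            Array.Copy(buffer, prevBuffer, frames);

            return bestLag == 0 ? 0f : (float)sampleRate / bestLag;
        }
    }
    ```

    Feeding two consecutive 1024-sample buffers of a 200 Hz sine at 8 kHz into `new CarryOverCorrelator(8000, 150f, 500f)` shows the effect: the first call correlates against zeros for the wrapped part, while the second uses the samples saved by the copy.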

  • Cleveland

    @Eric: Hi, this is a very nice introduction to recording voice on .NET or from a microphone. I am looking for a streaming audio recorder that can record voice or any sound passing through my computer's audio device.

  • Asad Siddiqi

    Hi there,
    I want to know if it's possible (or if there is an algorithm I could use) to take sound data in a byte array and change its pitch or frequency. I did it with SoundEffectInstance in XNA, but there is no way I could save the stream that was passed in. I would greatly appreciate any help. This is some code that I wrote (I want to change the pitch of bStream). Thank you very much:

    if (states == PlayerStates.Ready || states == PlayerStates.Stopped)
    {
        InitTimer();

        byte[] bStream = stream.ToArray();

        sound = new SoundEffect(bStream, microphone.SampleRate, AudioChannels.Mono);
        SoundEffectInstance soundInstance = sound.CreateInstance();
        // Pitch ranges from -1 (one octave down) to 1 (one octave up)
        soundInstance.Pitch += 1;
        soundInstance.Play();
        textBlock1.Text = "Now Playing";
        states = PlayerStates.Playing;
    }

  • browncha

    Are you allowed to post the full source code for the project, not just snippets here and there?
