Tech Off Thread

6 posts

Forum Read Only

This forum has been made read only by the site admins. No new threads or comments can be added.

Want to give me a function for X-MAS? :-)

Back to Forum: Tech Off
  • User profile image
    dotnetjunkie

    Hi, and Merry Christmas to everyone!

    Well, what's more fun to do on xmas than some coding in VS2005 Big Smile

    Now, I was wondering what would be the most efficient way to do the following:

    I have a couple of arrays (let's say, 2 to 5 arrays typical), each containing a (potentially large) set of numbers.

    Now I need a resulting array (or list, or whatever) with only the numbers that are present in each of those original arrays.

    Really simple indeed, but can anybody advice what would be the most performant way to do this in .NET 2.0?
    It needs to be very fast, because the function needs to work real-time and the arrays can contain thousands of numbers.

    Thanks!!!

  • User profile image
    SimonJ

    Arrange the lists in order, shortest to longest.
    Compare the items in the shortest list to those in the next longest.
    If a number is found in the list, look for it in the next longest list.
    Use "AndAlso" to ensure shortcutting of boolean logic.

    SimonJ

  • User profile image
    geekling

    Have you considered using a Dictionary instead? I believe they are hash-based in implementation, so look-ups should be very efficient; the basis would be to keep the shortest array, but put all the bigger arrays into a seperate Dictionary per array.

    Then iterate through the smallest array, query each Dictionary to see if the current element is present.

  • User profile image
    Blue Ink

    A lot depends on the kind of data you are going to manipulate... replicates and sorting comes to mind.
    As far as sorting is concerned, I would say that if the data are not already sorted it would be a safe bet to sort them anyway. A linear scan will be O(N^2), while the sorted + binary search version will be O(N*log(N))... well, ok, the Quicksort has a wort case O(N^2), so in that case you would lose something, but that rather rare.

    For the algorithm, here is a crude attempt that seems to be reasonably efficient, at least with random values.

    public static int [] Match (int [] [] arrays) {
    for (int i = 0; i < arrays.Length; i++) {
    Array.Sort (arrays [i]);
    }
      ArrayList res = new ArrayList ();
      int current = 0;
    int start = 0;
    int index = 0;
      while (index < arrays [current].Length) {
    int currentValue = arrays [current] [index];
    for (;;) {
    current = (current + 1) % arrays.Length;
    if (current == start) {
    res.Add (currentValue);
    index++;
    break;

    } else {
    int foundIndex = Array.BinarySearch (arrays [current], currentValue);
    if (foundIndex < 0) {
    index = ~foundIndex;
    start = current;

    break;
    }
    }
    }
    }

    return (int []) res.ToArray (typeof (int));
    }
    Happy new year (too late for Christmas).
    --m

  • User profile image
    WillemM

    Assuming you need to get the arrays sorted and merged you can best use a heapsort/merge combo. I dont have a sample ready here, but I could look it up for you on my code CD's once my office at home is ready (I'm getting new paint and a new floor Big Smile)

    If I remember it correctly heapsort is O(N log N) and merging should be O(N). I'm not sure on the merging, but the heapsort is really fast and cheap in memory usage if you implement it right Smiley. And it doesnt have a worstcase scenario of O(N^2)

  • User profile image
    Rossj

    If I were playing about, I'd be trying to do this with bit ops where the bit set at position X means that X is present in that array.  Then you could test with & - unfortunately C# doesn't support Integers with that many bits Sad 

    It would of course mean that you'd have to have as many bits as the largest number in the array, and it would be saving space rather than increasing performance.. but that's what playing with code is for I guess.

Conversation locked

This conversation has been locked by the site admins. No new comments can be made.