0

I need to write an algorithm to merge two sorted integer arrays into a single one without duplicates.

I managed to merge them, but I am not sure how to remove the duplicates from the final array, also initiating the merged array with the sum of both arrays doesn't seem to be correct, since the duplicates items will have the default int value of 0.

public static int[] MergeArrays(int[] firstArray, int[] secondArray)
{
    var firstArrayLength = firstArray.Length;
    var secondArrayLenght = secondArray.Length;
    var mergedArray = new int[firstArrayLength + secondArrayLenght];
    var i = 0;
    var j = 0;
    var k = 0;

    while (i < firstArrayLength && j < secondArrayLenght)
    {
        if (firstArray[i] < secondArray[j])
        {
            mergedArray[k] = firstArray[i];
            i++;
        }
        else
        {
            mergedArray[k] = secondArray[j];
            j++;
        }

        k++;
    }

    while (i < firstArrayLength)
    {
        mergedArray[k++] = firstArray[i++];
    }

    while (j < secondArrayLenght)
    {
        mergedArray[k++] = secondArray[j++];
    }

    return mergedArray;
}

How you would do this without using LINQ or array specific methods?

7
  • 5
    If they're all sorted, surely all you need to do is check that mergedArray[k - 1] (where k > 0) isn't equal to the value you're trying to insert? And obviously not increment k until you've inserted a new value. Commented Oct 19, 2020 at 7:57
  • 2
    Since you don't know how many duplicates there are in advance, you'll either have to do this in two passes (one to count dupes, one to merge) so you can size the results array correctly or use a list for the output (and convert to an array after merging, if you really need the results in an array). Or just resize the final array at the end of the merge (less efficient since it will often oversize the array, but that's probably not going to be a problem unless the arrays are huge). Commented Oct 19, 2020 at 7:57
  • 1
    I'm afraid it's not very clear what you're asking here. The question seems like a homework question, since the usual way to approach this in real code would be to use LINQ. All you have to do is exclude new values as you iterate through the merge sort that you've already got, when those values are the same as what you've just written (since duplicates will necessarily be grouped together). Did you try that? Did you try anything? What is it specifically you need help with? Commented Oct 19, 2020 at 8:02
  • @PeterDuniho "the usual way to approach this in real code would be to use LINQ" Well, that depends. Given sorted input, a merge should be O(N) but using Linq is likely to make it O(N.Log(N)), so if the arrays are large this could be an issue. If the OP is sorting the arrays first just to do a merge afterwards, then of course Linq would be much better. Commented Oct 19, 2020 at 8:07
  • @Matt: LINQ's Distinct() method is O(n) (hash set). It's not as cheap as the merge sort O(n), but there's nothing in the question that suggests the solution requires the fastest option. Commented Oct 19, 2020 at 8:09

1 Answer 1

1

You are almost there. You have the main part of the merge algorithm working.

There's just two things left for you to do:

  1. Return an array of the correct size. Since you are already tracking the size of the merged array in your k variable, you can use that to resize the array using Array.Resize(ref mergedArray, k);.

  2. You want to deduplicate the array. Because you are maintaining the merged array in sorted order, you can check for a duplicate value before you add a new value to the array, and just skip adding it if it is already at the end of the merged array. Note that you must be careful to handle the case where the merged array is currently empty!

Trying to preserve your original code as much as possible, here's one solution.

It introduces a candidate value which we use to check if the value that would be inserted into the result array is already in that array. It also resizes the array to the correct size before returning it:

public static int[] MergeArrays(int[] firstArray, int[] secondArray)
{
    var firstArrayLength  = firstArray.Length;
    var secondArrayLength = secondArray.Length;
    var mergedArray       = new int[firstArrayLength + secondArrayLength];
    var i                 = 0;
    var j                 = 0;
    var k                 = 0;
    int candidate;

    while (i < firstArrayLength && j < secondArrayLength)
    {
        if (firstArray[i] < secondArray[j])
        {
            candidate = firstArray[i];
            i++;
        }
        else
        {
            candidate = secondArray[j];
            j++;
        }

        if (k == 0 || mergedArray[k-1] != candidate)
        {
            mergedArray[k] = candidate;
            k++;
        }
    }

    while (i < firstArrayLength)
    {
        candidate = firstArray[i++];

        if (k == 0 || mergedArray[k-1] != candidate)
        {
            mergedArray[k] = candidate;
            k++;
        }
    }

    while (j < secondArrayLength)
    {
        candidate = secondArray[j++];

        if (k == 0 || mergedArray[k-1] != candidate)
        {
            mergedArray[k] = candidate;
            k++;
        }
    }

    Array.Resize(ref mergedArray, k);
    return mergedArray;
}

You say that you cannot use "array specific" methods. If this precludes the use of Array.Resize() then you could do something like this instead:

int[] result = new int[k];

for (int i = 0; i < k; ++i)
    result[i] = mergedArray[k];

return result;

This more-or-less does what Array.Resize() does - it creates a new array of the correct size and copies the elements of the source array into it.

You might think that you could use Linq to do this using something like

var result = array1.Union(array2).ToArray();

The problem with this is that the result is not sorted, so given

int[] array1 = {1, 3, 5, 7, 9};
int[] array2 = {2, 4, 6, 8, 10};
int[] merged = array1.Union(array2).ToArray();

Console.Write(string.Join(", ", merged));

The output is 1, 3, 5, 7, 9, 2, 4, 6, 8, 10 which is not sorted.

You'd have to add a sorting step:

int[] merged = array1.Union(array2).OrderBy(i => i).ToArray();

But now we've turned an O(N) algorithm into an O(N.Log(N)) one because OrderBy() is O(N.Log(N)). Whether this is actually an issue depends on your circumstances.

Also note that you could also remove duplicates from the array before returning it by returning mergedArray.Distinct().ToArray(), but that introduces an unnecessary extra copy of the data plus an addition O(N) operation. If this is homework, that's probably not what the tutor is looking for!

As a final note, you may notice that there's some repeated code that checks for duplicated values. You could put that code into a local function to avoid repeating it. If you do that, the final method could look something like this:

public static int[] MergeArrays(int[] firstArray, int[] secondArray)
{
    var firstArrayLength  = firstArray.Length;
    var secondArrayLength = secondArray.Length;
    var mergedArray       = new int[firstArrayLength + secondArrayLength];
    var i                 = 0;
    var j                 = 0;
    var k                 = 0;

    void addIfNotDupe(int candidate) // Local function.
    {
        if (k != 0 && mergedArray[k - 1] == candidate)
            return;

        mergedArray[k++] = candidate;
    }

    while (i < firstArrayLength && j < secondArrayLength)
    {
        addIfNotDupe(firstArray[i] < secondArray[j] ? firstArray[i++] : secondArray[j++]);
    }

    while (i < firstArrayLength)
    {
        addIfNotDupe(firstArray[i++]);
    }

    while (j < secondArrayLength)
    {
        addIfNotDupe(secondArray[j++]);
    }

    Array.Resize(ref mergedArray, k);
    return mergedArray;
}
Sign up to request clarification or add additional context in comments.

1 Comment

If you are going to post an answer to a homework question, it would be better to follow community guidelines for such answers. See also meta.stackoverflow.com/questions/274630/…

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.