0

I was looking for a good search algorithm, and I ran into the interpolation search one, which has a time complexity of O(log(log(n))) and can only be applied on uniformly distributed arrays. But, I found it was possible to create a search algorithm that requires the same conditions, but that has a time complexity of O(1). Here's what I've come up with: (Code in C++)

int search(double Find, double* Array, const int Size) {
    if (Array[0] == Array[1] && Find == Array[0]) { return 0; }
    if (Array[0] == Array[1] && Find != Array[0]) { return -1; }
    const double Index = (Find - Array[0]) / (Array[Size - 1] - Array[0]) * (Size - 1.0);
    if (Index < 0 || Index >= Size || Array[(int)(Index + 0.5)] != Find) { return -1; }
    return Index + 0.5; 
}

In this function, we pass the number we want to find, the array pointer, and the size of the array. This function returns the index where the number to be found is, or -1 if not found.

Explanation: Well, explaining this without any picture is going to be difficult, I'm going to try my best...

Because the array is uniformly distributed, we can represent all its values on a graph, with the index number(let's say x) as abscissa, and the value contained in the array at this index(let's say Arr[x]) as ordinate. Well, we can see that all the points represented on the graph belong to a function of equation:

Array[x] = tan(A) * x + Arr[0] -> with A as the angle formed between the function and the x-axis of the graph.

So now, we can transform the equation to:

x = (Arr[x] - Arr[0]) / tan(A)

And that's it, x is the index to find in relation to Arr[x](the given value to search). We can just change the formula to x = (Arr[x] - Array[0]) / (Array[Size - 1] - Array[0]) * (Size - 1.0) because we know that tan(A) = (Array[Size - 1] - Array[0]) / (Size - 1.0).

The question (Finally...)

I guess that this formula is already used in some programs, it was rather easy to find... So my question is, why would we use the interpolation search instead of this? Am I not understanding something?

Thanks for you patience and help.

1 Answer 1

1

When speaking of a uniform distribution, this does not mean that once you know the first and last value in a sorted array, that you also know the other values. This is what your algorithm is assuming. A uniform distribution has to do with probabilities involved when an array is generated, but in actual arrays produced along this uniform distribution principle, you can still get arrays that look not so evenly spread. It is a matter of probability. On average the values will be evenly spread, but this is not a guarantee.

Secondly, your algorithm is in fact an interpolation algorithm, with that difference that it assumes that after the first lookup it must have arrived at the place where the value should be. Yet in randomly produced arrays (uniformly distributed) this is not assured, and so the operation must be repeated on a smaller interval, until the interval closes in to one index.

See also:

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.