79

It's not an interview question per se, as I came across this in my project, but I figured it could be a decent intervew question.

You have N pairs of intervals, say integers. You're required to indentify all intervals that overlap with each other in O(N) time. For example, if you have

{1, 3} {12, 14} {2, 4} {13, 15} {5, 10}

the answer is {1, 3}, {12, 14}, {2, 4}, {13, 15}. Note that you don't need to group them, so the result can be in any order like in the example.

I just threw in O(N) time because the KMP algorithm takes O(N) for string search. :D

The best I came up with and what I'm using right now in the project is O(N^2). Yeah, brute force is pretty sad, but no one complains so I won't refactor it. :P Still, I was curious if a greater mind has a more elegant solution.

8
  • 8
    Two things are not clear: (1) you say "N pairs of intervals", though I'm fairly sure you actually mean "N intervals", since if there are only N pairs all overlaps can be trivially found in O(N) :-P Assuming N = number of intervals: (2) It is not possible to report all overlapping pairs in O(N) time because there could be O(N^2) of them! OTOH it is reasonable to ask for the O(N)-sized set of all intervals that overlap at least one other interval. Is that what you're asking for? Commented Aug 1, 2013 at 12:36
  • 4
    gbenison's answer (stackoverflow.com/a/9775727/47984) is the only one of the 9 currently here that actually answers the question in O(nlog n). Please consider marking that answer correct. Commented Aug 1, 2013 at 13:06
  • 2
    It's funny, because I had an interview with amazon and they asked me a similar question .... Commented Mar 4, 2014 at 22:06
  • @j_random_hacker: Can you please explain why the answer from marcog from is not O(n lg n)? Commented Aug 3, 2018 at 0:04
  • 1
    @stackoverflowuser2010: The problem is mainly that the question is very badly formulated, as I wrote in my first comment. Interpreted literally, it has no solution, so answerers have (reasonably) looked for "similar" problems that do. If we interpret marcog's claim "We can find which intervals overlap with which ..." to mean listing all pairs of overlapping intervals, this contradicts his later claim that "This is an O(N logN) solution" -- there could be O(n^2) pairs, which no algorithm can list in O(n log n) time. Commented Aug 3, 2018 at 10:32

10 Answers 10

107

Throw the endpoints of the intervals into an array, marking them as either start- or end-points. Sort them by breaking ties by placing end-points before start-points if the intervals are closed, or the other way around if they're half-open.

1S, 2S, 3E, 4E, 5S, 10E, 12S, 13S, 14E, 15E

Then iterate through the list, keeping track of how many intervals we're in (this equates to number of start-points processed minus number of end-points). Whenever we hit a start-point while we are already in an interval, this means we must have overlapping intervals.

1S, 2S, 3E, 4E, 5S, 10E, 12S, 13S, 14E, 15E
    ^                          ^
   overlap                    overlap

We can find which intervals overlap with which by storing data alongside the end-points, and keeping track of which intervals we're in.

This is an O(N logN) solution, with sorting being the main factor.

Sign up to request clarification or add additional context in comments.

11 Comments

"breaking ties by placing end-points before start-points" - depending how the intervals are defined. If they're half-open, then {1,2} and {2,3} don't overlap. If they're closed intervals, then that is an overlap. Question doesn't specify which.
@marcog: Not sure about this, but is the algorithm really O(nlogn)? If you need to return which intervals overlap with each other, it seems more like O(n^2). When all intervals overlap (like in {1,8}, {2,7}, {3,6}, {4,5}) O(n^2) intervals are in the solution.
@Gruber: I think you're right. Still, if we just want the O(N_intervals)-size set of intervals that overlap some other interval, we can get this by repeating the algorithm a second time, but running backwards from the end, and taking the union of this and the result of the first run. We must also check top-level intervals as we go. Why? If an interval X overlaps some other interval Y, then at least one of the following is true: Y's starting point precedes X's (X caught on phase 1); Y's end follows X's (X caught on phase 2); Y is contained entirely within X and X is at the top level.
@Gruber: See gbenison's cute answer for a different approach.
@faizan The question asked by the original poster is not clear, which causes the confusion. As j_random_hacker said in his comment at the top: "It is not possible to report all overlapping pairs in O(N) time because there could be O(N^2) of them! OTOH it is reasonable to ask for the O(N)-sized set of all intervals that overlap at least one other interval". marcog and gcbenison both answer the second question in O(N log N)
|
34

Sort the intervals by start point. Then fold over this list, merging each interval with its neighbor (i.e. (1,4),(2,6) -> (1,6)) if they overlap. The resulting list contains those intervals with no overlapping partner. Filter these out of the original list.

This is linear in time after the initial sort operation which could be done with any (n log n) algorithm. Not sure how you'd get around that. Even the 'filter out duplicates' operation at the end is linear if you take advantage of the already-sorted order of the input and output of the first operation.

Here is an implementation in Haskell:

-- Given a list of intervals, select those which overlap with at least one other inteval in the set.
import Data.List

type Interval = (Integer, Integer)

overlap (a1,b1)(a2,b2) | b1 < a2 = False
                       | b2 < a1 = False
                       | otherwise = True
                                     
mergeIntervals (a1,b1)(a2,b2) = (min a1 a2, max b1 b2)
                                     
sortIntervals::[Interval]->[Interval]
sortIntervals = sortBy (\(a1,b1)(a2,b2)->(compare a1 a2))

sortedDifference::[Interval]->[Interval]->[Interval]
sortedDifference [] _ = []
sortedDifference x [] = x
sortedDifference (x:xs)(y:ys) | x == y = sortedDifference xs ys
                              | x < y  = x:(sortedDifference xs (y:ys))
                              | y < x  = sortedDifference (x:xs) ys

groupIntervals::[Interval]->[Interval]
groupIntervals = foldr couldCombine []
  where couldCombine next [] = [next]
        couldCombine next (x:xs) | overlap next x = (mergeIntervals x next):xs
                                 | otherwise = next:x:xs
                                               
findOverlapped::[Interval]->[Interval]
findOverlapped intervals = sortedDifference sorted (groupIntervals sorted)
  where sorted = sortIntervals intervals
                                     
sample = [(1,3),(12,14),(2,4),(13,15),(5,10)]

3 Comments

This is actually the only answer I see here that will indeed find all intervals that overlap some other interval, and do so in O(nlog n) time. (marcog's algorithm is a start but is actually O(n^2).) I like the idea of "subtracting out" the combined intervals (which include all those that don't overlap anything else) to find the overlapping ones.
I must say that i'm generally a little slow, but i think i do not fully grasp this solution. Are you sure this solution finds all possible pairs of overlapping intervals? I also see, in the header of your solution, the comment saying: -- Given a list of intervals, select those which overlap with at least one other inteval in the set. Which is not the same as the question asked. Am I missing something here?
I think your overlap condition can be simplified to overlap (_, b1) (a2,_) | b1 > a2 = True | otherwise = False, or simply: overlap (_, b1) (a2,_) = b1 > a2 assuming that the a1's are sorted. Otherwsie, just overlap (_, b1) (a2,_) = (b1>a2) && (a1<a2)
10

The standard approach for intervales-on-the-line problems is to sort them according to starting point and then just walk from first to last. O(n*logn) (O(n) if already sorted)

end = 0;
for (current in intervals) {
    if current.start < end {
        // there's an intersection!
        // 'current' intersects with some interval before it
        ...
    }
    end = max(end, current.end)
}

9 Comments

You still need to check if the current interval intersects with the one to come before declaring it isolated.
@jbx I didn't say current interval is 'declared isolated' immediately, did I? I didn't even say this is a solution. There're many ways to adapt the approach to this particular problem, e.g. isolated[current - 1] = false or the one you mentioned.
This is only a tiny part of the solution. You're forgetting that there can be independent sets of overlapping intervals. For example: {0, 5}, {1, 6}, {40, 45}, {41, 46}
@Nikita so u just gave part of the solution and left the rest to the imagination? :)
@Ilya In the pseudocode comment it doesn't say "let's check statement X", it says "statement X is true".
|
5

Not sure about O(N) but what if we first sort them by the first number in each tuple, and then sequentially find those where the first number of the tuple is greater than that of the largest number seen in previous tuples, which also do not overlap with the next tuple.

So you would first get:

{1, 3}, {2,4}, {5, 10}, {12, 14}, {13, 15}

since 4 (largest) < 5 and 10 < 12, {5, 10} is isolated.

This would entail that we keep track of the largest number we encounter, and each time we find a tuple whose starting number is greater we check if it overlaps with the next.

This becomes dependent on the efficiency of the sorting algorithm then, because the latter process would be O(N)

2 Comments

Not so simple. Your algorithm will say last two intervals here are 'isolated': {1, 10} {2, 3} {4, 5} {6, 7}
You're right... we'd have to keep track of the largest 2nd number.
2

Suppose the difference between start and endpoints is small, say < 32. eg. 1..32. Then each interval can be written as a bit pattern in a 32 bit word. e.g [1, 2] -> 001; [2, 3]-> 010; [1, 3] -> 011; [2, 3, 4] -> 110. Two intervals, or combinations of intervals, overlap if their bitwise AND is non-zero. eg. [1,2] overlaps [1,3] because 001&011 == 001, non-zero. O(n) alg is to keep a running bitwise OR of intervals seen so far and AND each new one:

bitsSoFar = 0
for (n=0; n < bitPatterns.length; n++)
    if (bitPatterns[n] & bitsSoFar != 0)
        // overlap of bitPatterns[n] with an earlier pattern
        bitsSoFar |= bitPatterns[n]

Left as an exercise:

  • modify algorithm to also identify overlap of a bit pattern with a later one

  • work out the bit pattern for an interval in O(1)

Comments

2

If N pairs of intervals is integers, then we can get it in O(n).

Sort it by first number in the pair then the second number. If all are integers, we can use bucket sort or radix sort to get it by O(n).

{1, 3}, {2,4}, {5, 10}, {12, 14}, {13, 15}

Then combine one by one,

{1,3}

{1,4} with overlap {1,3} and {2,4}

{1,4}, {5,10}

{1,4}, {5,10}, {12,14}

{1,4}, {5,10}, {12,15} with overlap {12,14} and {13,15}

The combination would take O(N) time

Comments

2

Here's an O(N lg N) implementation in Java that extends the answer provided by @Nikita Rybak.

My solution finds every interval that overlaps with at least one other interval and counts them both as overlapping intervals. For example, the two intervals (1, 3) and (2, 4) from OP's original question overlap each other, and so in this case there are 2 overlapping intervals. In other words, if interval A overlaps with interval B, then I add both A and B to the resulting set of intervals that overlap.

Now consider the intervals (1, 100), (10, 20) and (30, 50). My code will find that:

[ 10,  20] overlaps with [  1, 100]
[ 30,  50] overlaps with [  1, 100]

Resulting intervals that overlap with at least one other interval:
[  1, 100]
[ 30,  50]
[ 10,  20]

In order to prevent (1, 100) from being counted twice, I use a Java Set that keeps only unique Interval objects.

My solution follows this outline.

  1. Sort all intervals by starting point. This step is O(N lg N).
  2. Keep track of intervalWithLatestEnd, the interval with the latest end point.
  3. Iterate over all the intervals in the sorted list. If an interval overlaps with intervalWithLatestEnd, then add both to a Set. Update intervalWithLatestEnd when needed. This step is O(N).
  4. Return the Set (and convert to a List if needed).

The total running time is O(N lg N). It requires an output Set of size O(N).

Implementation

In order to add intervals to a set, I created a custom Interval class that override equals(), as expected.

class Interval {
    int start;
    int end;
    Interval(int s, int e) { 
        start = s; end = e; 
    }

    @Override
    public String toString() {
        return String.format("[%3d, %3d]", start, end);
    }
    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + start;
        result = prime * result + end;
        return result;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        final Interval other = (Interval) obj;
        if (start != other.start)
            return false;
        if (end != other.end)
            return false;
        return true;
    }
}

And here is the code that runs the algorithm:

private static List<Interval> findIntervalsThatOverlap(List<Interval> intervals) {

    // Keeps unique intervals.
    Set<Interval> set = new HashSet<Interval>();

    // Sort the intervals by starting time.
    Collections.sort(intervals, (x, y) -> Integer.compare(x.start, y.start));

    // Keep track of the interval that has the latest end time.
    Interval intervalWithLatestEnd = null;

    for (Interval interval : intervals) {

        if (intervalWithLatestEnd != null &&
            interval.start < intervalWithLatestEnd.end) {

            // Overlap occurred.
            // Add the current interval and the interval it overlapped with.
            set.add(interval); 
            set.add(intervalWithLatestEnd);

            System.out.println(interval + " overlaps with " +
                               intervalWithLatestEnd);
        }

        // Update the interval with latest end.
        if (intervalWithLatestEnd == null ||
            intervalWithLatestEnd.end < interval.end) {

            intervalWithLatestEnd = interval;
        }
    }
    // Convert the Set to a List.
    return new ArrayList<Interval>(set);
}

Test cases

Here is a test case that runs the OP's original intervals:

public static void testcase() {

    List<Interval> intervals = null;
    List<Interval> result = null;

    intervals = new ArrayList<Interval>();

    intervals.add(new Interval(1, 3));
    intervals.add(new Interval(12, 14));
    intervals.add(new Interval(2, 4));
    intervals.add(new Interval(13, 15));
    intervals.add(new Interval(5, 10));


    result = findIntervalsThatOverlap(intervals);
    System.out.println("Intervals that overlap with at least one other interval:");
    for (Interval interval : result) {
        System.out.println(interval);
    }
}

with the result:

[  2,   4] overlaps with [  1,   3]
[ 13,  15] overlaps with [ 12,  14]
Intervals that overlap with at least one other interval:
[  2,   4]
[  1,   3]
[ 13,  15]
[ 12,  14]

Finally, here is a more advanced test case:

public static void testcase() {

    List<Interval> intervals = null;
    List<Interval> result = null;

    intervals = new ArrayList<Interval>();

    intervals.add(new Interval(1, 4));
    intervals.add(new Interval(2, 3));
    intervals.add(new Interval(5, 7));
    intervals.add(new Interval(10, 20));
    intervals.add(new Interval(15, 22));
    intervals.add(new Interval(9, 11));
    intervals.add(new Interval(8, 25));
    intervals.add(new Interval(50, 100));
    intervals.add(new Interval(60, 70));
    intervals.add(new Interval(80, 90));


    result = findIntervalsThatOverlap(intervals);
    System.out.println("Intervals that overlap with at least one other interval:");
    for (Interval interval : result) {
        System.out.println(interval);
    }
}

with the result:

[  2,   3] overlaps with [  1,   4]
[  9,  11] overlaps with [  8,  25]
[ 10,  20] overlaps with [  8,  25]
[ 15,  22] overlaps with [  8,  25]
[ 60,  70] overlaps with [ 50, 100]
[ 80,  90] overlaps with [ 50, 100]
Intervals that overlap with at least one other interval:
[  2,   3]
[  8,  25]
[  9,  11]
[ 50, 100]
[  1,   4]
[ 15,  22]
[ 10,  20]
[ 60,  70]
[ 80,  90]

Comments

0

It's been quite a while since I've used it, but the solution I used was an derivative of the red-black tree described in Introduction to Algorithms called an interval tree. It is a tree sorted by interval start, so you can quickly (binary search) first the first eligible node. IIRC, the nodes were ordered by a property that let's you stop "walking" the tree when the candidates nodes can't match your interval. So I think it was O(m) search, where m is the number of matching intervals.

I search found this implementation.

Brett

[edit] Rereading the question, this isn't what you asked. I think this is the best implementation when you have a list of (for instance) meetings already scheduled in conference rooms (which are added to the tree) and you want to find which rooms are still available for a meeting with a new start and duration (the search term). Hopefully this solution has some relevance, though.

Comments

0

This problem can be reduced to the element uniqueness problem.

Element uniqueness has Omega(n log n) lower bound (counting number of comparisons), so you can't do better than that.

1 Comment

When providing an answer, you are expected to be clear and specific. I'm not sure what's your deleted post look like, but in this post you can at least tell people how to reduce element uniqueness into interval overlapping.
-1

You can go over the list once and keep a hash table of all the intervals encountered so far. If an input interval is part of some interval from the hash table, merge it into the hash table interval. Mark the non-primitive intervals (merged intervals that consist of more than one interval).

Then you go over the list a second time, and for each interval check in the hash table whether it's contained in a merged interval or not.

I don't know if it's O(N), but it's much better than O(N^2).

4 Comments

The only problem is, hashtable doesn't support 'interval intersection' operation :)
Are you referring to some specific implementation of a hash table? Because I was talking about the concept. You can always implement the operation by yourself.
@IlyaKogan I think the point Nikita is trying to make is that there is no obvious way to quickly find the intervals that intersect with a given query interval. The naive approach would take O(n) time per query, which would be an O(n^2) algorithm. You could use an interval tree, but that isn't really related to hash tables at all.
@JohnKurlak Agree. Looks like I missed this part.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.