1

In one of my projects it is necessary to remove certain elements from a std::vector<double> values. The indices I have to remove are given as a vector of intervals. For example {1,3} means, that I have to remove indices from 1 to 3 inclusive from values.

I can assume that the intervals given are mutually exclusive.

The code shown below illustrates, what the desired behavior should look like.

#include <iostream>
#include <vector>

int main(int argc, char** args) {
    // Intervals of indices I have to remove from values
    std::vector<std::pair<int, int>> intervals = { {1,3},{7,9},{13,13} }; 

    // Vector of arbitrary values. 
    std::vector<double> values = {4.2,6.4,2.3,3.4,9.1,2.3,0.6,1.2,0.3,0.4,6.4,3.6,1.4,2.5,7.5 }
    removeIntervals(values, intervals);
    // intervals should contain 4.2,9.1,2.3,0.6,6.4,3.6,1.4,7.5
}

What might be the shortest amount of code necessary to achieve this?

My best solution so far is:

 void removeIntervals(std::vector<double>& values, const std::vector < std::pair<int, int>>& intervals) {
    std::vector<bool> flags(values.size(), true);
    std::vector<double> ret;
    for (auto interval : intervals) {
        std:fill(flags.begin() + interval.first, flags.begin()+interval.second+1, false);
    }
    for (auto i = 0; i < values.size(); i++) {
        if (flags[i]) ret.push_back(values[i]);
    }
    values = ret;
 }

I can assume, that my intervals are non-overlapping and consecutive. It seems, that it boils down to perform erase from back to front.

void removeIntervals2(std::vector<double>& values, const std::vector < std::pair<int, int>>& intervals) {
    auto revIntervals = intervals;
    std::reverse(revIntervals.begin(), revIntervals.end());
    for (auto interval : revIntervals) {
        values.erase(std::begin(values) + interval.first, std::begin(values) + interval.second + 1);
    }
}
7
  • take a look at std::remove_if. It should be actually quite straightforward Commented Dec 12, 2016 at 12:10
  • 2
    in the example scenario, if you delete {1, 3} and {7, 9}, your vector will have 8 elements, hence deleting {13, 13} won't be possible. Maybe your problem description needs an update? Commented Dec 12, 2016 at 12:11
  • 1
    Can you assert that your intervals don't overlap; and that they are always in incrementing order? Commented Dec 12, 2016 at 12:23
  • 1
    @UKMonkey I can assure that the intervals don't overlap. And I can assume in my case, that they are in increasing order. Maybe it is possible to call consecutive erase, but starting from the back? Commented Dec 12, 2016 at 12:49
  • 2
    Asking for the minimum amount of code is (1) ill-defined and (2) non-sensical. Better ask for efficient code or similar. Commented Dec 12, 2016 at 13:06

6 Answers 6

2

Since you can assume that the intervals don't overlap and are increasing order, the solution is to start at the back (so that the indexes don't change) and remove each range in turn:

So for the minimum amount of code, which you asked for:

for (auto& it = intervals.rbegin(); it != intervals.rend(); ++it) {
  values.erase(values.begin() + it->first, std::next(values.begin() + it->second));

The down side of this is that this will involve lots of shuffling of the vector. Really what you would want to do is swap the last unswapped item at the end of the vector, with the item you want to remove, and then resize when you're finished to cut off the end; but this requires more code.

Sign up to request clarification or add additional context in comments.

4 Comments

Many thanks. That seems to be the shortest possible solution, that I was looking for. So easy! :-)
erase() erases a range of elements [first,last). So your code would fail for inclusiveness. Ex: in case of {13, 13}, there won't be any deletion. i have edited your answer to get real last by using std::next().
This may be shortest in terms of your code, but not necessarily in terms of overall code or code generated. Moveover, it certainly is not the most efficient solution, as it requires a lot of copying/moving of elements.
@Walter as stated in my answer - there's a better algorithm that would do less shuffling in the general case, but the number of shuffles required may be minimal depending on what's being erased, and improvements are premature optimisation.
1

The problem is non-trivial since after a first call to vector::erase() all indices/iterators to elements past the first erased one are invalidated, including further interval to be removed.

Therefore, using vector::erase() has to be done in descending order of the elements to be erased.

Another inconvenience originates from using int indices instead of iterators for the interval boundaries. Finally, vector::erase() copies (ore moves) all elements past the last removed elements to fill the gap. This preserves the order of values, but causes excessive copying (moving) in case of several intervals.

A more efficient way is to swap only the elements to be removed and finally down-size the vector.

2 Comments

Using a reverse iterator is still pretty trivial.
Using a reverse_iterator is straightforward. 'Trivial' relates to the problem, not to the solution. Note also that if efficiency is of importance your solution is terrible.
1

What you surely want is a solution with not only short code but also good efficiency, minimizing the copies and shifts in the vector of values.

I would definitely go with the first part of your solution, that is falgging the positions to be kept or deleted.

std::vector<bool> flags(values.size(), true);
for (auto interval : intervals) {
    std:fill(flags.begin() + interval.first, flags.begin()+interval.second+1, false);
}

For the second part, the shortest and most efficient would be the erase/remove_if idiom:

 values.erase(std::remove_if(begin(values), end(values),
    [&](const auto& v) { return !flags[&v - &(*values.begin())];}),
  values.end());

The efficiency here is due to that remove_if will first mark the elements that need to be removed, then it will compact the vector by putting first the elements to stay and returning the position of the first element to remove. Finally, the erase will shrink the vector. From an algorithmic point of view, this solution is likely optimal. It should pay out for large vectors.

Comments

1

Thought I'd post an answer that was a little more fault tolerant. If your intervals are larger than the input array for example if intervals had included {15, 15} this will still function correctly. Additionally this is faster than UKMonkey's solution because it does all the work in a single pass:

It has come to my attention that this code is implementation defined, and only works on Clang and Visual Studio 2015 Update 3:

values.resize(distance(begin(values), remove_if(begin(values), end(values), [i = 0U, it = cbegin(intervals), end = cend(intervals)](const auto&) mutable { return it != end && ++i > it->first && (i <= it->second || (++it, true)); })));

Live Example

You can accomplish the same thing in a for-loop though:

size_t write = 0U;
auto it = cbegin(intervals);

for (size_t read = 0U; read < size(values); ++read) {
    if (it == cend(intervals) || read < it->first) {
        values[write++] = values[read];
    } else if (read == it->second) {
        ++it;
    }
}

values.resize(write);

Live Example

If you are hooked on "the shortest amount of code necessary to achieve this," you can use my evil , from the lambda in the for-loop too:

for (size_t read = 0U; read < size(values); ++read) if (it == cend(intervals) || read < it->first || (read == it->second && (++it, false))) values[write++] = values[read];

2 Comments

this solution seems broken to me. (1) vector::erase takes iterator arguments, not int indices; (2) after the first call to vector::erase() all iterators past the last erased one will be invalidated, breaking further calls of vector::erase() (including UB).
@Walter Good call. You were correct. I've updated my answer.
1

Well, the answers so far are all bad -- either making whole new vectors or requiring O(N^2) time -- so I'll add this one.

Instead of erasing the elements you don't want to keep, and shifting the rest every time, you move the ones you do want to keep into the proper position, and then just truncate the vector.

O(N) time and no extra space:

void removeIntervals(std::vector<double>& values, const std::vector < std::pair<int, int>>& intervals) {
    if (intervals.size()<=0)
        return;

    //keep the part before the first interval
    auto dest = values.begin()+intervals[0].first;

    for (size_t i=0; i<intervals.size(); ++i) {

        //copy the part to keep after each interval
        auto s = values.cbegin()+intervals[i].second+1;
        auto e = (i+i >= intervals.size() ?
                  values.cend() : 
                  values.cbegin()+intervals[i+1].first);
        while(s<e) {
            *dest++=*s++;
        }
    }
    values.erase(dest,values.end());
 }

Comments

1

In complement of Matt Timmermans answer : It's not the question, but if you want to keep only values in intervals, in C++17, you can write :

void remove_if_not_in_interval(std::vector<double>& result, const std::vector<std::pair<int,int> >& intervals)
    {
      if (intervals.size() == 0)
        result.clear();

      auto dest = result.begin();
      for (auto [first, last] : intervals)
        {
          while(first!=last+1)
            {
              *dest++ = *(result.begin() + first++);
            }
        }

      result.erase(dest,result.end());
    }

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.