Recently I found a bug in a large codebase that was caused by dereferencing set.begin() while the set was empty. Is there a way (such as a compiler flag) to force containers to throw an exception when an invalid iterator is dereferenced, like an empty set's .begin() or .end()?

std::set<int> s;
*s.begin(); // force this to throw an exception because std::set is empty
  • "set.begin() will never throw an exception. I want to know what is the reason of that" - what would you have it do otherwise, throw even when set.begin() == set.end() ? It sounds like the real problem is a bonehead move on the coder; not the architects of the standard library. Commented Oct 8, 2016 at 18:25
  • If you can use Visual Studio, it provides debug iterator support. It won't throw exceptions, but incorrect use cannot be ignored either. Commented Oct 8, 2016 at 18:28
  • Some implementations of the standard library have debug versions that check iterator validity; check your documentation. Caution: this can make your code run very slowly. Commented Oct 8, 2016 at 18:28
  • @IInspectable My bad. Of course .begin() should never throw an exception. And *s.begin() throwing exception doesn't have anything to do with .begin(). I edited the question. Commented Oct 8, 2016 at 18:38

2 Answers


set.begin() will never throw an exception. I want to know what is the reason of that.

An empty container is a perfectly valid container, and it is perfectly reasonable to run algorithms on one. It would be pretty surprising if:

if (std::find(c.begin(), c.end(), v) == c.end()) {
   // not present
}

worked fine for a container with 1 or more elements that didn't contain v, but threw an exception if it were empty! That would be insane. It's not exceptional for a container to be empty, and throwing would force every programmer to check for emptiness at every call site as a special case, when it isn't one.

The rule is simply that dereferencing the end() iterator is undefined behavior. For empty containers, begin() == end() so that extends to begin() as well. It is up to your code to control those accesses.
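If you want an exception at a specific call site, the emptiness check can be made explicit. The following is only a minimal sketch: checked_front is a hypothetical helper name, not something the standard library provides, and the choice of std::out_of_range is an assumption.

```cpp
#include <set>
#include <stdexcept>

// Hypothetical helper (not part of the standard library): returns the
// smallest element of a set, or throws if the set is empty.
template <class T>
const T& checked_front(const std::set<T>& s) {
    if (s.empty()) {  // for an empty set, s.begin() == s.end()
        throw std::out_of_range("checked_front: set is empty");
    }
    return *s.begin();  // safe: begin() refers to a real element here
}
```

This keeps the cost of the check at the places that ask for it, rather than in every dereference of every iterator.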


Also is there a way (like setting a compiler flag) to force containers to throw an exception when dereferencing an invalid iterator like an empty set's .begin() or .end()?

Some standard library implementations will assert if you try to dereference an invalid iterator when built in their debug mode (e.g. libstdc++ with -D_GLIBCXX_DEBUG). But you could always simply write a wrapper implementation that will do this for you:

#ifdef NDEBUG
    template <class T>
    using my_set = std::set<T>;
#else
    template <class T>
    class my_set {
        // implement your own set that carefully manages all the lifetimes
        // of its entries such that it's possible to check the validity
        // of them in the iterators, and then throw on bad dereference
    };
#endif

4 Comments

My bad. Of course .begin() should never throw an exception. And *s.begin() throwing exception doesn't have anything to do with .begin(). I edited the question.
@Tempux Think about all the work that set would have to do to be able to verify the validity of nodes on dereference. I don't want to have to pay for all of that.
@Tempux Besides, dereferencing invalid iterators is programmer error, not exceptional control flow. Why would that be an exception?
What about just .begin() and .end()? Is that expensive too? Just to prevent the programmer from making these mistakes. I had a very hard time finding this bug in a huge codebase. Maybe if there were such an exception, this code would have been fixed right away. I am not sure, but I think Java throws such an exception.

There's nothing about any particular iterator that announces whether it can be validly dereferenced. C++ does not run on a virtual machine, as Java or C# do, with the virtual machine keeping track of each object's validity.

There may be compiler-specific options that enable additional run-time sanity checks. But since you haven't even identified your compiler, the answer here would simply be "check your compiler's documentation".
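As a rough illustration of what such options look like, here is a sketch that reports whether one of two checking modes is active in the current build. The function name checked_iterators_enabled is made up for this example; the macros are the ones used by libstdc++ (_GLIBCXX_DEBUG, which you define yourself, e.g. with g++ -D_GLIBCXX_DEBUG) and by the Visual C++ library (_ITERATOR_DEBUG_LEVEL). Other libraries have their own switches, so treat the list as incomplete.

```cpp
#include <set>  // include a standard header before testing library macros

// Hypothetical helper: true if a checked-iterator mode known to this
// sketch is active in the current build. Other modes are not detected.
constexpr bool checked_iterators_enabled() {
#if defined(_GLIBCXX_DEBUG)
    return true;   // libstdc++ debug mode
#elif defined(_ITERATOR_DEBUG_LEVEL) && _ITERATOR_DEBUG_LEVEL > 0
    return true;   // Visual C++ iterator checking
#else
    return false;  // nothing this sketch knows about
#endif
}
```

Note that these modes typically report a bad dereference by aborting or asserting, not by throwing a catchable exception.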

And if none of the options works for you, the answer will be "write it yourself". Based on a compile-time macro and by using certain coding conventions it would be possible, for example, to implement an iterator-compatible interface that performs additional sanity checks. For example, instead of declaring

std::set<int> set_of_ints;

and

std::set<int>::iterator b=set_of_ints.begin();

// Or something else that references std::set<int>::iterator

Declare and always use aliases:

typedef std::set<int> set_of_ints_t;

set_of_ints_t set_of_ints;

set_of_ints_t::iterator p=set_of_ints.begin();

... and so on. With this coding convention in place, it becomes easy to use a compile-time macro to enable additional sanity checks:

#ifndef DEBUG
    typedef std::set<int> set_of_ints_t;
#else
    typedef my_sanity_checked_set set_of_ints_t;
#endif

Here my_sanity_checked_set is a custom container that's interface-compatible with std::set<int>, with an iterator whose operators perform additional sanity checks on every operation (for example, refusing to increment or decrement past the set's boundaries, or to dereference the end() value).

All this checking comes with extra overhead, of course. You would use this during development, then turn the whole thing off and compile using the native std::set for the release build. That's how it's done.
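To give an idea of the shape of such a container, here is a deliberately small sketch of my_sanity_checked_set. It is an assumption-laden illustration, not a complete std::set replacement: it only supports insert, empty, begin and end, the choice of std::out_of_range is mine, and it does not detect iterators invalidated by erasing elements or destroying the set.

```cpp
#include <set>
#include <stdexcept>

// Sketch of the hypothetical my_sanity_checked_set: wraps std::set<int> and
// hands out iterators that throw where a plain iterator would have
// undefined behavior at the set's boundaries.
class my_sanity_checked_set {
    std::set<int> impl_;

public:
    class iterator {
        const std::set<int>* owner_;        // set this iterator belongs to
        std::set<int>::const_iterator it_;  // the real iterator underneath

    public:
        iterator(const std::set<int>* owner, std::set<int>::const_iterator it)
            : owner_(owner), it_(it) {}

        const int& operator*() const {
            if (it_ == owner_->end())
                throw std::out_of_range("dereferencing end()");
            return *it_;
        }
        iterator& operator++() {
            if (it_ == owner_->end())
                throw std::out_of_range("incrementing past end()");
            ++it_;
            return *this;
        }
        iterator& operator--() {
            if (it_ == owner_->begin())
                throw std::out_of_range("decrementing before begin()");
            --it_;
            return *this;
        }
        bool operator==(const iterator& o) const { return it_ == o.it_; }
        bool operator!=(const iterator& o) const { return it_ != o.it_; }
    };

    iterator begin() const { return iterator(&impl_, impl_.begin()); }
    iterator end() const { return iterator(&impl_, impl_.end()); }
    bool empty() const { return impl_.empty(); }
    bool insert(int v) { return impl_.insert(v).second; }
};
```

With this in place, *s.begin() on an empty set throws, because begin() == end() there. A real version would also need erase, find, the iterator typedefs that standard algorithms expect, and some way to track invalidation.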
