0

I assume that there is a reason behind this design choice. Boost seems to have an implementation for it, hence it should be possible to use vectors as hash table keys. Are there any theoretical properties for hash functions applied to arrays that make them more prone to collisions or other undesirable behavior?

6
  • I don't think there's any reason why you can't use vectors as keys to an unordered_map; you'll just need to specify a hash function that computes a hash value from a vector. Commented May 15, 2020 at 0:52
  • Can you provide an example of a hash function suitable for vectors? Commented May 15, 2020 at 1:00
  • 3
    Not being constant time complexity seems like something that might have influenced the decision not to specialize std::hash for std::vector, making sure that the user knows what they're getting into if they want to use one as a key. That's a guess, though. Commented May 15, 2020 at 1:03
  • Good observation @chris Commented May 15, 2020 at 1:07
  • 1
    You need to roll your own hash function for that. Not difficult at all. Commented May 15, 2020 at 1:09

1 Answer

3

You'll notice that Boost's support isn't really specific to vector<T>: it is built on boost::hash_range, which accepts any iterator range, and it generates the hash only as a function of the range's contents...

This may or may not be desirable, depending on:

  • If you want to use object-identity rather than object-value as the basis for hashing... or not.
  • If you're comfortable with hashing being a non-constant-time operation... or not.
    • Just because boost::hash_range appears to be O(n) doesn't mean the underlying iterable won't take 5 minutes to return all hashable values for each call...
  • If the order of elements does, or doesn't, matter.
    • (I believe) applying boost::hash_range or boost::hash_combine to two distinct but equal unordered_set objects can produce different hash-codes despite their value-equivalence, because the two sets may iterate their elements in different orders.
  • If two conceptually different objects that can iterate over the same values (e.g. a vector<uint8_t> representing a data buffer and a queue<SomeEnum>, where SomeEnum : uint8_t, representing a queue of values) should have the same hash-code... or not.
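To make the ordering point concrete, here is a minimal sketch (not Boost's actual implementation). The names ordered_hash and unordered_hash are hypothetical, and the mixing step is only modelled on the classic hash_combine formula. The first function depends on element order; the second does not, because addition is commutative.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: order-sensitive hash. Each element is folded into a
// running seed, so permuting the elements generally changes the result.
std::size_t ordered_hash(const std::vector<int>& v) {
    std::size_t seed = 0;
    for (int x : v) {
        // Mixing step modelled on the classic hash_combine formula (assumption).
        seed ^= std::hash<int>{}(x) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    }
    return seed;
}

// Hypothetical helper: order-insensitive hash. Unsigned addition is
// commutative and wraps on overflow, so any permutation of the same
// elements yields the same sum.
std::size_t unordered_hash(const std::vector<int>& v) {
    std::size_t sum = 0;
    for (int x : v) {
        sum += std::hash<int>{}(x);
    }
    return sum;
}
```

A plain sum is a weak hash (for instance, {1, 4} and {2, 3} can collide when std::hash<int> is the identity), so treat it as an illustration of the trade-off rather than something to ship.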

I suspect the team behind the STL doesn't like the fact that there are so many contextual "if"s described above. With that many, it wouldn't be sensible to provide default behaviour, so they require you to always be explicit about hash generation for arbitrary objects. (Besides, if you want Boost's behaviour, just use Boost in the first place; it's not like Boost is competing with the STL.)
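If you do want value-based, order-sensitive hashing, being explicit is only a few lines. The following is a rough sketch under my own assumptions: VectorHash, VecMap, and make_example_map are hypothetical names, the key type is fixed to std::vector<int>, and the mixing step is modelled on the classic hash_combine formula rather than taken from any particular library.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical hasher for std::vector<int>: value-based, order-sensitive,
// and O(n) in the number of elements.
struct VectorHash {
    std::size_t operator()(const std::vector<int>& v) const noexcept {
        std::size_t seed = v.size();
        for (int x : v) {
            // Fold each element's hash into the seed (assumed mixing formula).
            seed ^= std::hash<int>{}(x) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// The hasher is passed as the third template argument; key equality
// falls back to std::vector's operator==.
using VecMap = std::unordered_map<std::vector<int>, std::string, VectorHash>;

// Small usage example: two keys with the same elements in different order
// are distinct keys because vector equality is order-sensitive.
VecMap make_example_map() {
    VecMap m;
    m[std::vector<int>{1, 2, 3}] = "abc";
    m[std::vector<int>{3, 2, 1}] = "cba";
    return m;
}
```

Every lookup or insertion walks the whole key vector once to hash it, and again to compare on a bucket hit, which is the non-constant-time cost mentioned above.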

Also see this QA: C++ unordered_map using a custom class type as the key
