In the implementation of GetHashCode below, a null Collection and an empty Collection both produce a hash code of 0.

A colleague suggested returning an arbitrary hard-coded number like 19 for an empty collection, to differentiate it from a null collection. Why would I want to do this? Why would I care whether a null and an empty collection produce different hash codes?

public class Foo
{
    public List<int> Collection { get; set; }
    // Other properties omitted.

    public override int GetHashCode()
    {
        var hashCode = 0;
        if (this.Collection != null)
        {
            foreach (var item in this.Collection)
            {
                // int is a value type, so item can never be null here.
                var itemHashCode = item.GetHashCode();
                hashCode = ((hashCode << 5) + hashCode) ^ itemHashCode;
            }
        }

        return hashCode;
    }
}
  • 2
    "should I return a random hard coded number like 123 to differentiate from a null collection" How could this ever be answered correctly? It's completely up to your scenario. In fact, there is no single correct hash code implementation that fits every case. Personally, I wouldn't worry too much about one more collision, as long as not all possible values return the exact same hash code. Commented Jun 12, 2019 at 7:52
  • 3
    Why override GetHashCode in the first place? What do you want to achieve? Commented Jun 12, 2019 at 7:53
  • 1
    If you assign a different meaning to an empty collection than to a null reference (no collection at all), then yes, those two things should produce different hash codes. If you don't, you can use the same hash code for both; 0 is usually fine. Commented Jun 12, 2019 at 7:56
  • 2
    Jon Skeet has a good hashcode implementation for lists, but it really depends what you are trying to do here. Are you testing for list equality or list sequence equality? Commented Jun 12, 2019 at 7:57
  • 1
    My two cents is that you should never have a null-reference to a collection anyway, because usually you don't treat that differently from an empty one. Commented Jun 12, 2019 at 7:57

1 Answer

GetHashCode is meant to minimize the number of collisions as far as it reasonably can. Some hash collisions are inevitable, but you should be mindful of which objects collide and what kind of data will be stored in your hash-based collections, and work to ensure that objects stored together in the same collection are unlikely to collide.

So if you know something about how hash-based collections of this type are going to be used, and they are likely to contain both objects with a null collection and objects with an empty one, then making those two cases hash differently would improve lookup performance. If you suspect that having both a null and an empty value in the same collection is not particularly likely, then letting them collide isn't actually a concern.
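If you do decide to separate the two cases, a minimal sketch could look like the following. It assumes that your Equals implementation (not shown in the question) also treats a null and an empty Collection as different; if Equals considers them equal, they must keep the same hash code. The seed 19 is just the arbitrary constant your colleague suggested, and the unchecked block is my addition so the arithmetic cannot throw when the project is compiled with overflow checking.

```csharp
using System;
using System.Collections.Generic;

public class Foo
{
    public List<int> Collection { get; set; }

    public override int GetHashCode()
    {
        // A null collection keeps the hash code 0, as in the question.
        if (this.Collection == null)
        {
            return 0;
        }

        unchecked
        {
            // Arbitrary non-zero seed (assumption: 19, per the colleague's
            // suggestion), so an empty list hashes to 19 instead of 0.
            var hashCode = 19;
            foreach (var item in this.Collection)
            {
                // Same mixing step as the question: hashCode * 33 XOR item hash.
                hashCode = ((hashCode << 5) + hashCode) ^ item.GetHashCode();
            }

            return hashCode;
        }
    }
}
```

With this version a Foo whose Collection is null and a Foo whose Collection is an empty list land in different buckets of a HashSet<Foo> or Dictionary<Foo, T>. That only pays off if both kinds of object really do end up in the same collection.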
