1

In this MSDN page it says:

Warning:

If you override the GetHashCode method, you should also override Equals, and vice versa. If your overridden Equals method returns true when two objects are tested for equality, your overridden GetHashCode method must return the same value for the two objects.

I have also seen many similar recommendations and I can understand that when overriding the Equals method I would also want to override the GetHashCode. As far as I can work out though, the GetHashCode is used with hash table look-ups, which is not the same as equality checking.

Here is an example to help explain what I want to ask:

public class Temperature /* Immutable */
{
    public Temperature(double value, TemperatureUnit unit) { ... }

    private double Value { get; set; }
    private TemperatureUnit Unit { get; set; }

    private double GetValue(TemperatureUnit unit)
    {
        /* return value converted into the specified unit */
    }

    ...

    public override bool Equals(object obj)
    {
        Temperature other = obj as Temperature;
        if (other == null) { return false; }
        return (Value == other.GetValue(Unit));
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode() + Unit.GetHashCode();
    }
}

In this example, two Temperature objects are considered equal even if they are not storing the same things internally (e.g. 295.15 K == 22 Celsius). At the moment the GetHashCode method will return different values for each. These two temperature objects are equal but they are also not the same, so is it not correct that they have different hash codes?

9
  • 1
    If you are never going to store your Temperature object in a HashSet or a Dictionary, then you could get away with ignoring GetHashCode, but you really, really shouldn't. It would be horrible practice. Commented Jan 15, 2016 at 19:29
  • 2
    @MattBurland I would add to the list also LINQ Distinct, GroupBy, ToLookup, Union, Intersect etc. Commented Jan 15, 2016 at 19:34
  • 1
    @IvanStoev: Absolutely, which really illustrates why you shouldn't. You may be confident that you didn't put something into a HashSet, but, especially with something like LINQ, you can't be sure that the internal implementation of one of those is using the HashCode (and almost certainly it is). Commented Jan 15, 2016 at 19:36
  • Just to clarify I am trying to implement both. Commented Jan 15, 2016 at 19:37
  • 2
    @Ben: You really need to distinguish between what it means to be equal in the context of the physical world of temperature measurements and what it means to be equal in the world of your C# code. If your Equals method returns true, then the two objects must return the same hash code. If they don't, your code will break in many interestingly frustrating ways... Commented Jan 15, 2016 at 20:12

3 Answers

6

When storing a value in a hash table, such as Dictionary<>, the framework first calls GetHashCode() and checks whether there is already a bucket in the hash table for that hash code. If there is, it calls .Equals() to see whether the new value is indeed equal to the existing value. If it is not (meaning the two objects are different but produce the same hash code), you have what's known as a collision. In this case, the items in this bucket are stored as a linked list, and retrieving a value from that bucket becomes O(n) in the number of items it holds.
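A minimal sketch of a collision, to make the paragraph above concrete. CollidingKey and CollisionDemo are hypothetical names invented for this illustration; the constant hash code is deliberately bad so that every key lands in the same bucket and Equals has to tell the keys apart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key type: every instance reports the same hash code,
// so all keys share one bucket and Equals must distinguish them.
public sealed class CollidingKey : IEquatable<CollidingKey>
{
    public string Name { get; }

    public CollidingKey(string name) { Name = name; }

    public bool Equals(CollidingKey other) => other != null && Name == other.Name;

    public override bool Equals(object obj) => Equals(obj as CollidingKey);

    // Legal but slow: equal objects still share a hash code, as required.
    public override int GetHashCode() => 1;
}

public static class CollisionDemo
{
    public static Dictionary<CollidingKey, int> Build()
    {
        var table = new Dictionary<CollidingKey, int>();
        table[new CollidingKey("a")] = 1;
        table[new CollidingKey("b")] = 2; // same hash, not equal: a collision
        table[new CollidingKey("a")] = 3; // same hash, equal: replaces the first entry
        return table;
    }
}
```

The dictionary still behaves correctly here (two entries, with "a" mapped to 3). Collisions cost lookup time, not correctness, as long as equal objects never report different hash codes.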

If you implemented GetHashCode() but did not implement Equals(), the framework would fall back to reference equality, so distinct instances with the same hash code would never be treated as equal and each one would be stored as a collision.

If you implemented Equals() but did not implement GetHashCode(), you might run into a situation where you had two objects that were equal, but resulted in different hash codes meaning they'd maintain their own separate values in your hash table. This would potentially confuse anyone using your class.

As far as what objects are considered equal, that's up to you. If I create a hash table based on temperature, should I be able to refer to the same item using either its Celsius or Fahrenheit value? If so, they need to result in the same hash value and Equals() needs to return true.

Update:

Let's step back and take a look at the purpose of a hash code in the first place. Within this context, a hash code is used as a quick way to identify if two objects are most likely equal. If we have two objects that have different hash codes, we know for a fact they are not equal. If we have two objects that have the same hash code, we know they are most likely equal. I say most likely because an int can only be used to represent a few billion possible values, and strings can of course contain the complete works of Charles Dickens, or any number of possible values. Much in the .NET framework is based on these truths, and developers that use your code will assume things work in a way that is consistent with the rest of the framework.

If you were to have two instances that have different hash codes, but have an implementation of Equals() that returns true, you're breaking this convention. A developer who compares two objects might then use one of those objects to refer to a key in a hash table and expect to get an existing value out. If the hash code is suddenly different, this code might throw a runtime exception instead, or perhaps return a reference to a completely different object.
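The failure described above can be sketched with a cut-down copy of the question's class. The members of TemperatureUnit and the Celsius/Kelvin conversion are assumptions (the question elides both), and BrokenTemperature and BrokenDemo are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Assumed enum; the question does not show its members.
public enum TemperatureUnit { Celsius, Kelvin }

public sealed class BrokenTemperature
{
    private readonly double value;
    private readonly TemperatureUnit unit;

    public BrokenTemperature(double value, TemperatureUnit unit)
    {
        this.value = value;
        this.unit = unit;
    }

    // Assumed conversion, limited to the two units above.
    private double GetValue(TemperatureUnit target)
    {
        if (target == unit) return value;
        return target == TemperatureUnit.Kelvin ? value + 273.15 : value - 273.15;
    }

    // Same logic as the question: equal if the physical temperature matches.
    public override bool Equals(object obj)
    {
        var other = obj as BrokenTemperature;
        return other != null && value == other.GetValue(unit);
    }

    // Same logic as the question: depends on the stored value and unit.
    public override int GetHashCode() => value.GetHashCode() + unit.GetHashCode();
}

public static class BrokenDemo
{
    public static bool LookupSucceeds()
    {
        var freezingC = new BrokenTemperature(0.0, TemperatureUnit.Celsius);
        var freezingK = new BrokenTemperature(273.15, TemperatureUnit.Kelvin);

        var labels = new Dictionary<BrokenTemperature, string>();
        labels[freezingC] = "water freezes";

        // freezingC.Equals(freezingK) is true, yet the hash codes differ,
        // so the dictionary rejects the key before Equals is ever consulted.
        return labels.ContainsKey(freezingK);
    }
}
```

With these two values the lookup reports that the key is missing even though Equals says the objects are equal. Indexing with labels[freezingK] would throw a KeyNotFoundException.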

Whether 295.15 K and 22 C are equal within the domain of your program is your choice (in my opinion, they are not). However, whatever you decide, objects that are equal must return the same hash code.


11 Comments

If you don't override equals it will default to reference equality.
Moreover, when the framework looks in the dictionary, it'll use the hash code first, so you could store 295.15K in the dictionary, but when you look for 22C, it won't be found. Is this what you want? Maybe, maybe not.
Yes, this is what I would expect. 295.15 K is not the same as 22 C, so it should be stored in a different bucket, right?
If it were me, I wouldn't do unit conversion implicitly like that. A Temperature class should have a Unit property, and instances that differ in which unit they're measured in should not be considered equal. Now, you could have a .ToCelsius() and .ToKelvin() method, and that would return an instance that is equal.
Agreed. Anyone who uses your class expects objects that are equal can be used to refer to the same key in a hash table. Every other class works this way. If you break this rule, you can potentially create some really weird, really hard to track down bugs in the future. The same hash code suggests objects are probably equal, Equals tells you if they are definitely equal. Those two concepts need to go hand in hand.
2

Warning:

If you override the GetHashCode method, you should also override Equals, and vice versa. If your overridden Equals method returns true when two objects are tested for equality, your overridden GetHashCode method must return the same value for the two objects.

This is a convention in the .NET libraries. It's not enforced at compile time, or even at run-time, but code in the .NET library (and likely any other external library) expects this statement to always be true:

If two objects return true from Equals, they will return the same hash code

And:

If two objects return different hash codes they are NOT equal

If you don't follow that convention, then your code will break. Worse, it will probably break in ways that are really hard to trace (like putting two identical objects in a dictionary, or getting a different object from a dictionary than the one you expected).

So, follow the convention, or you will cause yourself a lot of grief.

In your particular class, you need to decide: either Equals returns false when the units are different, or GetHashCode returns the same hash code regardless of unit. You can't have it both ways.

So you either do this:

public override bool Equals(object obj)
{
    Temperature other = obj as Temperature;
    if (other == null) { return false; }
    return (Value == other.Value && Unit == other.Unit);
}

Or you do this:

public override int GetHashCode()
{
    // Hash the value converted to one fixed base unit (Celsius here), so that
    // instances Equals treats as equal produce the same hash code.
    // Note that the converted value should probably be cached as a private
    // member, especially if your class is supposed to be immutable.
    return GetValue(TemperatureUnit.Celsius).GetHashCode();
}

Note that nothing is stopping you from also implementing:

public bool TemperaturesAreEqual(Temperature other)
{
    if (other == null) { return false; }
    return (Value == other.GetValue(Unit));
}

And using that when you want to know if two temperatures represent the same physical temperature regardless of units.
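As a rough sketch of the first option combined with the separate comparison method, the class below keeps Equals and GetHashCode strict (same value and same unit) and leaves the physical comparison outside that contract. StrictTemperature, the TemperatureUnit members, and the conversion arithmetic are assumptions made for this illustration and do not come from the original post.

```csharp
using System;

// Assumed enum; the original post does not list its members.
public enum TemperatureUnit { Celsius, Kelvin }

public sealed class StrictTemperature : IEquatable<StrictTemperature>
{
    public double Value { get; }
    public TemperatureUnit Unit { get; }

    public StrictTemperature(double value, TemperatureUnit unit)
    {
        Value = value;
        Unit = unit;
    }

    // Assumed conversion, limited to the two units above.
    public double GetValue(TemperatureUnit target)
    {
        if (target == Unit) return Value;
        return target == TemperatureUnit.Kelvin ? Value + 273.15 : Value - 273.15;
    }

    // Equality and hashing read exactly the same two members.
    public bool Equals(StrictTemperature other) =>
        other != null && Value == other.Value && Unit == other.Unit;

    public override bool Equals(object obj) => Equals(obj as StrictTemperature);

    public override int GetHashCode()
    {
        unchecked { return (Value.GetHashCode() * 397) ^ Unit.GetHashCode(); }
    }

    // Physical comparison lives outside the Equals/GetHashCode contract.
    public bool TemperaturesAreEqual(StrictTemperature other) =>
        other != null && Value == other.GetValue(Unit);
}
```

Because GetHashCode is computed from the same members that Equals compares, two instances that are equal always hash alike, while 0 C and 273.15 K are unequal as objects but still match through TemperaturesAreEqual.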

1 Comment

Thanks for this answer Matt, I think I always knew I was going to have to follow the convention, but when the convention seems to be hard to understand, then it makes me want to find out more about why the convention is the way it is.
1

Two objects that are equal should return the same HashCode (two objects that are different could return the same hashcode too, but that's a collision).

In your case, neither your Equals nor your GetHashCode implementation is a good one. The problem is that the "real value" of the object is dependent on a parameter: there's no single property that defines the value of the object. You only store the initial unit in order to do the equality comparison.

So, why don't you settle on an internal definition of what the Value of your Temperature is?

I'd implement it like:

public class Temperature
{
    public Temperature(double value, TemperatureUnit unit) { 
       Value = ConvertValue(value, unit, TemperatureUnit.Celsius);
    }

    private double Value { get; set; }

    private double ConvertValue(double value, TemperatureUnit originalUnit,  TemperatureUnit targetUnit)
    {
       /* return value from originalUnit converted to targetUnit */
    }
    private double GetValue(TemperatureUnit unit)
    {
        // Value is always stored in Celsius, so convert from Celsius on the way out.
        return ConvertValue(Value, TemperatureUnit.Celsius, unit);
    }

    public override bool Equals(object obj)
    {
        Temperature other = obj as Temperature;
        if (other == null) { return false; }
        return (Value == other.Value);
    }

    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }
}

That way, your internal Value is what defines whether two objects are the same, and it is always expressed in the same unit.

You don't really care what Unit the object was created with: storing it makes no sense, because to get the value back you'll always pass a unit. The unit is only needed for the initial conversion.
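A runnable sketch of this normalized-storage idea follows, under some assumptions: the TemperatureUnit members and the conversion formulas are filled in for illustration (the answer leaves ConvertValue as a placeholder), and NormalizedTemperature is a hypothetical name.

```csharp
using System;

// Assumed enum; the original post does not list its members.
public enum TemperatureUnit { Celsius, Kelvin, Fahrenheit }

public sealed class NormalizedTemperature : IEquatable<NormalizedTemperature>
{
    // Always stored in Celsius, whatever unit the caller supplied.
    private readonly double celsius;

    public NormalizedTemperature(double value, TemperatureUnit unit)
    {
        celsius = ToCelsius(value, unit);
    }

    private static double ToCelsius(double value, TemperatureUnit unit)
    {
        switch (unit)
        {
            case TemperatureUnit.Kelvin: return value - 273.15;
            case TemperatureUnit.Fahrenheit: return (value - 32.0) * 5.0 / 9.0;
            default: return value;
        }
    }

    public double GetValue(TemperatureUnit unit)
    {
        switch (unit)
        {
            case TemperatureUnit.Kelvin: return celsius + 273.15;
            case TemperatureUnit.Fahrenheit: return celsius * 9.0 / 5.0 + 32.0;
            default: return celsius;
        }
    }

    // Equality and hashing both read the single normalized field.
    public bool Equals(NormalizedTemperature other) =>
        other != null && celsius == other.celsius;

    public override bool Equals(object obj) => Equals(obj as NormalizedTemperature);

    public override int GetHashCode() => celsius.GetHashCode();
}
```

Since Equals and GetHashCode read the same stored double, equal objects always share a hash code. One caveat: the comparison is exact, so a conversion that introduces floating-point rounding can make two physically identical temperatures compare unequal. Whether to round or quantize the stored value is a separate design decision.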

2 Comments

Nope, they won't be equal. The value is converted in the constructor. The stored "value" will be celsius
Value is private in OPs example, who cares how you internally store it?
