
I am wondering what the time complexity of resizing a Java HashMap is when the load exceeds the threshold. As far as I understand, a HashMap's table size is always a power of 2, so when we resize the table we don't necessarily need to rehash all the keys (correct me if I am wrong); all we need to do is allocate additional space and copy over all the entries from the old table (I am not quite sure how the JVM deals with that internally), correct? Hashtable, on the other hand, uses a prime number as the table size, so we need to rehash all the entries whenever we resize the table. So my question is: does resizing a HashMap still take O(n) linear time?

  • You could always just study the source for HashMap. :) Commented Jan 10, 2013 at 5:25

2 Answers


Does it still take O(N) time for resizing a HashMap?

Basically, yes.

One consequence is that an insertion that triggers a resize takes O(N) time. But only an O(1/N) fraction of insertions trigger a resize, so (under certain assumptions) the amortized insertion time is O(1).
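The amortized argument can be checked with a small counting model. This is only a sketch under stated assumptions, not HashMap's real code: the class `ResizeCost` and the method `entriesMoved` are hypothetical names, and the model simply assumes the table doubles whenever the size exceeds capacity times load factor and that a resize touches every entry once.

```java
public class ResizeCost {
    // Hypothetical model: counts how many entries are moved by all resizes
    // while inserting n keys into a table that starts at initialCapacity
    // and doubles whenever size > capacity * loadFactor.
    static long entriesMoved(int n, int initialCapacity, double loadFactor) {
        long moved = 0;
        int capacity = initialCapacity;
        for (int size = 1; size <= n; size++) {
            if (size > capacity * loadFactor) {
                moved += size;   // one resize touches every entry: O(size)
                capacity *= 2;   // table doubles, so the next resize is twice as far away
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 100_000, 10_000_000}) {
            long moved = entriesMoved(n, 16, 0.75);
            // The ratio moved / n stays bounded by a small constant as n grows.
            System.out.printf("n=%d moved=%d moved/n=%.2f%n", n, moved, (double) moved / n);
        }
    }
}
```

Because the resize sizes form a geometric series, the total work across all resizes stays proportional to n, which is why the per-insertion average is constant even though a single resize is linear.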

So could a good load factor affect this performance? Like better and faster than O(N)?

Choice of load factor affects performance, but not complexity.

If we make normal assumptions about the hash function and key clustering, when the load factor is larger:

  • the average hash chain length is longer, but still O(1),
  • frequency of resizes reduces, but is still O(1/N),
  • the cost of a resize remains about the same, and the complexity is still O(N).
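To illustrate the point about resize frequency, here is a rough sketch using the same simplified doubling model (resize when size exceeds capacity times load factor). The class `LoadFactorEffect` and the method `resizeCount` are hypothetical names for illustration; the real HashMap is not instrumented this way.

```java
public class LoadFactorEffect {
    // Hypothetical model: number of resizes needed to insert n keys into a
    // table that starts at initialCapacity and doubles when
    // size > capacity * loadFactor.
    static int resizeCount(int n, int initialCapacity, double loadFactor) {
        int capacity = initialCapacity;
        int resizes = 0;
        for (int size = 1; size <= n; size++) {
            if (size > capacity * loadFactor) {
                capacity *= 2;
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        for (double loadFactor : new double[] {0.5, 0.75, 1.0}) {
            System.out.println("loadFactor=" + loadFactor
                    + " resizes=" + resizeCount(n, 16, loadFactor));
        }
    }
}
```

In this model a larger load factor shifts the resize points later and may save a resize or two, but the count remains logarithmic in n either way, so the complexity class does not change.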

... so whenever we resize the table we don't necessarily need to rehash all the keys (correct me if I am wrong).

Actually, you would need to rehash all of the keys, in the sense of reassigning every entry to a bucket. When you double the hash table size, the hash chains need to be split. To do this, you need to test which of two chains the hash value for every key maps to. (Indeed, you would need to do the same if the hash table had an open organization too.)

However, in the current generation of HashMap implementations, the hashcode values are cached in the chained entry objects, so that the hashcode for a key doesn't ever need to be recomputed.
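The chain split can be sketched as follows. This is an illustration of the idea rather than HashMap's actual source: `SplitDemo`, `indexFor` and `indexAfterDoubling` are hypothetical names, and the sketch assumes a power-of-two table whose bucket index is the low bits of the cached hash.

```java
public class SplitDemo {
    // Assumption: for a power-of-two capacity, the bucket index is the
    // low bits of the (cached) hash value.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // After doubling, an entry either stays in its old bucket or moves to
    // oldIndex + oldCapacity; a single bit of the hash decides which.
    static int indexAfterDoubling(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        return (hash & oldCapacity) == 0 ? oldIndex : oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        // Example cached hash values that all share bucket 5 at capacity 16.
        int[] cachedHashes = {5, 21, 37, 53};
        for (int h : cachedHashes) {
            System.out.println("hash " + h + ": bucket " + indexFor(h, oldCapacity)
                    + " -> bucket " + indexAfterDoubling(h, oldCapacity));
        }
    }
}
```

Every entry still has to be visited and tested, so the resize stays O(N), but the test only uses the cached hash and never calls `hashCode()` on the key again.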


One comment mentioned the degenerate case where all keys hash to the same hashcode. That can happen either due to a poorly designed hash function, or a skewed distribution of keys.

This affects performance of lookup, insertion and other operations, but it does not affect either the cost or frequency of resizes.


13 Comments

So does that mean insertion takes O(n) in the worst case?
What insertion? We were talking about resizing weren't we?
Insertion would be O(n) only in a degenerate case where ALL the keys hash to the same value.
I mean, resizing occurs (when the threshold is exceeded) right after an insertion, doesn't it?
@user1389813 - In that case, yes. The average cost of HashMap.put() is O(1) but the worst case is O(N). But this isn't that strange. The same thing happens with StringBuffer.append, appending to an ArrayList and so on.

When the table is resized, the entire contents of the original table must be copied to the new table, so it takes O(n) time to resize the table, where n is the number of elements in the original table. The amortized cost of any operation on a HashMap (under the uniform hashing assumption) is O(1), but yes, the worst-case cost of a single insertion operation is O(n).
