
I am trying to understand HashMap's internal implementation and need some help. HashMap stores data using a linked list per bucket; since Java 8 it converts a bucket to a tree data structure when it grows too large. Below is the Node class constructor. Could you please help me understand why a node stores the hash value of its key?

Node(int hash, K key, V value, Node<K,V> next) {
    this.hash = hash;
    this.key = key;
    this.value = value;
    this.next = next;
}
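To make the question concrete, here is a minimal sketch (not the real java.util.HashMap, and with no duplicate-key handling) of how a node's cached hash gets used: lookups compare the cheap cached int first and only call equals() when the hashes match.

```java
// Minimal sketch of a hash table whose nodes cache the key's hash.
// This is a simplified illustration, not java.util.HashMap itself.
public class CachedHashDemo {
    static class Node<K, V> {
        final int hash;   // cached once, at insertion time
        final K key;
        V value;
        Node<K, V> next;
        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<String, Integer>[] table = new Node[16];

    void put(String key, Integer value) {
        int h = key.hashCode();          // computed exactly once per insertion
        int i = (table.length - 1) & h;  // bucket index derived from the hash
        table[i] = new Node<>(h, key, value, table[i]); // prepend (no dup check)
    }

    Integer get(String key) {
        int h = key.hashCode();
        for (Node<String, Integer> n = table[(table.length - 1) & h]; n != null; n = n.next) {
            // cheap int comparison first; equals() only when the hashes match
            if (n.hash == h && n.key.equals(key)) return n.value;
        }
        return null;
    }

    public static void main(String[] args) {
        CachedHashDemo m = new CachedHashDemo();
        m.put("alpha", 1);
        m.put("beta", 2);
        System.out.println(m.get("alpha")); // 1
        System.out.println(m.get("gamma")); // null
    }
}
```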
  • Don't you think it would be easier to understand how it works by reading an external source instead of the internal implementation itself? Even a Wiki article would help a lot: en.wikipedia.org/wiki/Hash_table Commented Aug 20, 2019 at 14:03
  • Well, calculating the hash can be costly, so caching it is wise. Also note that the decision whether to use a linked list or a tree doesn't depend on the Java version (although there might be internal changes) but rather on the size of the list: a short list is often simpler and faster than a tree, so you'll mostly get a list, and only if there are too many collisions will it be converted into a tree (via the internal method treeifyBin(...)). Commented Aug 20, 2019 at 14:04
  • The source code for Java 8's HashMap, for those who wish to read it. Commented Aug 20, 2019 at 14:12
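The list-to-tree conversion mentioned in the comments can be observed indirectly. The sketch below forces every key into one bucket with a deliberately bad hashCode(); once the bucket passes HashMap's internal threshold (8 entries in current JDKs), it is treeified, yet lookups still work because equals() disambiguates the keys. The threshold value is an implementation detail, not something this code depends on.

```java
import java.util.HashMap;

// Every BadKey collides into the same bucket, driving the bucket well
// past HashMap's treeify threshold. The map still behaves correctly.
public class CollidingKeysDemo {
    static final class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }  // all keys collide
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        HashMap<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 20; i++) {   // well past the treeify threshold
            map.put(new BadKey(i), i * 10);
        }
        System.out.println(map.get(new BadKey(7)));  // 70
        System.out.println(map.size());              // 20
    }
}
```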

1 Answer


The cached hash is used to compute the placement of an element within the HashMap. It would be supremely inefficient if adding a key to your HashMap required re-computing the hash every time a collision needed to be resolved. Not only that, but some objects have expensive hash functions (for example, String.hashCode() walks every character of the String). It is always desirable to compute such a function once and cache the result for later, which is safe because you should not change the hashCode/equals behavior of an object while it is a key in a Map.
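One place this caching pays off is resizing: on a standard JDK, growing the table reuses each node's stored hash rather than calling hashCode() again. The sketch below instruments hashCode() with a counter to observe this; the exact call count is an implementation detail of the JDK's HashMap, so treat the result as illustrative.

```java
import java.util.HashMap;

// A key whose hashCode() counts its own invocations. Inserting enough
// entries to force several internal resizes shows that existing keys
// are not re-hashed: the hash cached in each node is reused.
public class HashCountDemo {
    static int hashCalls = 0;

    static final class Key {
        final int id;
        Key(int id) { this.id = id; }
        @Override public int hashCode() { hashCalls++; return id; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    public static void main(String[] args) {
        HashMap<Key, Integer> map = new HashMap<>(); // default capacity 16
        for (int i = 0; i < 100; i++) {              // forces multiple resizes
            map.put(new Key(i), i);
        }
        // One hashCode() call per put; the resizes reused the cached hashes.
        System.out.println(hashCalls); // 100
    }
}
```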

Let's consider a HashMap that is half full (n/2 of n slots), with independent elements being placed into the entry set. Any element being added then collides with probability at least 1/2. The table can hold n entries, but the default load factor is 0.75, which means there are 3n/4 - n/2 = n/4 insertions left before the map expands. Each collision among them benefits from the cached hash, since no recomputation is needed. Because each insertion avoids a collision with probability at most 1/2, the probability that none of the remaining n/4 insertions collides is at most (1/2)^(n/4). This means that for any sizeable HashMap (n = 29 or more, of which only 0.5 * 29 ≈ 15 keys need to be filled), there is a greater than 99% chance that caching the hash values saves you time before the next expansion.
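A quick numeric check of that back-of-the-envelope bound: with n = 29 slots and at least a 1/2 collision probability per remaining insertion, the chance that none of the last n/4 insertions collides is (1/2)^(29/4) ≈ 0.0066, which is indeed below 1%.

```java
// Verifying the bound from the paragraph above: (1/2)^(n/4) for n = 29.
public class CollisionBound {
    public static void main(String[] args) {
        int n = 29;
        double pNoCollision = Math.pow(0.5, n / 4.0);
        System.out.println(pNoCollision);            // about 0.0066
        System.out.println(pNoCollision < 0.01);     // true
        System.out.println(1 - pNoCollision > 0.99); // true
    }
}
```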

tl;dr It saves time, especially once collisions become frequent during additions and lookups.
