2

Why the capacity must be a multiple or 2? Why use "&" in the indexFor functions? Why recompute the hash in the hash function instead of directly using the key's hash code?

I think there are some important differences between the this implementation and the description on the "Introduction to Algorithm".

What does ">>>" mean?

static int hash(int h) { // This function ensures that hashCodes that differ only by // constant multiples at each bit position have a bounded // number of collisions (approximately 8 at default load factor). h ^= (h >>> 20) ^ (h >>> 12); return h ^ (h >>> 7) ^ (h >>> 4); } 

Can anyone give me some guide ? I appreciate If some one can explain the hash algorithm. Thanks a lot!

2
  • I know using "&", the key can be mapped to the limited slots. What about influence to the collision in the hash map? Commented Apr 3, 2012 at 21:30
  • >>> is unsigned right shift. Regular >> in Java will preserve and propagate the sign bit and leave a negative number negative. >>> will fill the sign bit with zeros as shifting occurs. Commented Apr 4, 2012 at 16:10

2 Answers 2

5

This is a performance optimization. The usual way to map a hash code to a table index is

table_index = hash_code % table_length; 

The % operator is expensive. If table_length is a power of 2, then the calculation:

table_index = hash_code & (table_length - 1); 

is equivalent to the (much) more expensive modulo operation.

Sign up to request clarification or add additional context in comments.

3 Comments

This is why it's a good idea to learn to code in assembly on an old processor.
Danger Will Robinson! 50% of hash codes are negative numbers, so hash_code % table_length could be negative, resulting in an ArrayOutOfBoundsException. You need to use Math.abs(hash_code % table_length) to get a usable (ie safe) index
@Bohemian - Yes, negative numbers are an added complication. That's another advantage of the second method, which works just fine with negative hash code values.
0

Pay no attention to the man behind the curtain.

The actual algorithm is no doubt a combination of "what feels good" to the developer, fixes for some odd degenerate cases, and simple tradition (for which users often develop obscure dependencies).

And note this:

 * Applies a supplemental hash function to a given hashCode, which * defends against poor quality hash functions. This is critical * because HashMap uses power-of-two length hash tables, that * otherwise encounter collisions for hashCodes that do not differ * in lower bits. Note: Null keys always map to hash 0, thus index 0. 

Net: So long as it works and the performance is good, you don't care.

1 Comment

Thanks for everyone's answer. They are answered my question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.