Skip to main content
1 of 3
JimmyJames
  • 31.1k
  • 3
  • 59
  • 111

It's hard to know the exact reason for this. That it truncates the hash to fit into 31 bits suggests that it's related to compatibility with 32-bit signed integers.

One thing to realize, though, is that, in practice, you are unlikely to ever have anywhere near 2 billion 'buckets' in a hashmap. It just doesn't make sense to have an underlying list that large when you are only using a small fraction of the locations. The typical way these are implemented is with some small number of buckets (by default) and a load factor which determines how when to grow the underlying array and rehash the items.

For example, you could start with 16 'buckets'. When an item is added, its hash is calculated and that is then modded by 16 to determine which bucket it will end up in. Each 'bucket' is a list which can hold multiple items. At some point, when the number of items nears or exceeds 16, the number of buckets is increased, and the hashes are re-modded to find their new location.

Due to this, in effect, the hash will always be truncated to something smaller than 32 bits anyway. I'm struggling to imagine a case where these extra bits are going to have a significant impact on the distribution of items in the map.

JimmyJames
  • 31.1k
  • 3
  • 59
  • 111