1

I have been failing to understand the design of hash functions. I was going through an example. As you can see in the function comments, why should you choose 31 as the number to be multiplied. How do you decide? Is that something random or does it mean something?

unsigned int hash(hash_table_t *hashtable, char *str) { unsigned int hashval; /* we start our hash out at 0 */ hashval = 0; /* for each character, we multiply the old hash by 31 and add the current * character. Remember that shifting a number left is equivalent to * multiplying it by 2 raised to the number of places shifted. So we * are in effect multiplying hashval by 32 and then subtracting hashval. * Why do we do this? Because shifting and subtraction are much more * efficient operations than multiplication. */ for(; *str != '\0'; str++) hashval = *str + (hashval << 5) - hashval; /* we then return the hash value mod the hashtable size so that it will * fit into the necessary range */ return hashval % hashtable->size; } 
1
  • 3
    That's known as the Bernstein hash, Torek hash, or simply the "times 33 hash". I suggest reading the comments from the Apache Portable Runtime. Commented Jan 21, 2012 at 0:09

2 Answers 2

3

The hash in question is known as the Bernstein Hash, Torek Hash, or simply the "times 33" hash. It is pretty popular due to its simplicity, speed, and decent distribution with English string data.

Your comments note it is actually multiplying by 31, which seemed arbitrary to you and actually is a bit arbitrary. The Apache Portable Runtime has a comment in their hash algorithm source which notes that many possible constants work well (33 being the most common). They all are odd and close to a power of 2, meaning they translate well into shifts and addition or subtraction.

Some other resources to help understand hashing:

Sign up to request clarification or add additional context in comments.

1 Comment

I was basically pondering over "what is a good way for designing a hash function".. Now atleast the fact that most of them are trial and error relieves me. And i understand the times33 hash. Thank you
1

Here is a lecture on hash functions with 65k views. On youtube: http://www.youtube.com/watch?v=KW0UvOW0XIo

This is not exactly what you are asking for, however your questions reveals that your knowledge in hashing is limited. Better to read a tutorial or check a presentation.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.