
I am given a key in the format XYYYYZ, where X is a char from 'A' to 'Z', YYYY is an int from 0 to 9999, and Z is a char from 'A' to 'C'. I am supposed to write a hash function for these keys that produces no collisions.

I was told the smallest table size anyone has managed is 780,000, but I have no idea how.

The one I can think of is taking X - 'A' to get a number from 0 to 25, multiplying that by 100,000, then adding YYYY multiplied by 10, and then adding (Z - 'A').

So Z1025A would be 2,510,250 and L4444C would be 1,144,442.

The maximum possible value is 2,599,992 (for Z9999C), so the table needs 2,600,000 slots, and 780,000 / 2,600,000 gives a usage rate of about 30%.

But is there any other way to reduce the size of the table?

2 Answers

Just compute:

((Z - 'A') * 26 + (X - 'A')) * 10000 + YYYY 
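For anyone who wants to try this on actual key strings, here is a minimal sketch of that formula in C. The function name hash_key and the choice to return -1 for malformed keys are my own assumptions, not part of the assignment, and it assumes 'A' to 'Z' have contiguous codes, as in ASCII.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: maps a key such as "Z1025A" to an index in
   0..779999 using ((Z - 'A') * 26 + (X - 'A')) * 10000 + YYYY.
   Returns -1 if the key is not in XYYYYZ form.
   Assumes 'A'..'Z' are contiguous (true for ASCII). */
long hash_key(const char *key)
{
    long yyyy = 0;
    int i;

    if (key == NULL || strlen(key) != 6)
        return -1;
    if (key[0] < 'A' || key[0] > 'Z')       /* X: 'A'..'Z' */
        return -1;
    if (key[5] < 'A' || key[5] > 'C')       /* Z: 'A'..'C' */
        return -1;

    /* Accumulate the four decimal digits of YYYY. */
    for (i = 1; i <= 4; i++) {
        if (!isdigit((unsigned char)key[i]))
            return -1;
        yyyy = yyyy * 10 + (key[i] - '0');
    }

    return ((key[5] - 'A') * 26L + (key[0] - 'A')) * 10000L + yyyy;
}
```

With this sketch, "A0000A" maps to 0 and "Z9999C" maps to 779999, so a table of exactly 780,000 slots is enough.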

1 Comment

Thank you! It was bothering me a lot that it was so much.

The smallest possible hash table size for a key in this format is 780000, because there are 26 ways to choose the first character, 10 ways to choose each of the next four, and 3 ways to choose the final character. That is, there are 26 * 10 * 10 * 10 * 10 * 3 = 780000 possible keys. To find a hash function, think of the hash key like a counter. Rearrange the elements like this:

ZXYYYY

Starting with all elements at zero, each of the 'Y' elements rolls over after reaching 9. 'X' rolls over after reaching 25, and 'Z' rolls over after reaching 2. So, we can assign a number to the four 'Y' elements with:

y4 y3 y2 y1 --> y1 + (y2 * 10) + (y3 * 100) + (y4 * 1000)

This part of the key is just a base 10 counter. The remaining pair of elements forms a base 26 counter: assign 'X' a number x from 0 to 25 and 'Z' a number z (also from 0 to 25 in the general case), multiply z by 26, and add the results:

z x --> x + (z * 26)

For y4 y3 y2 y1 we will get a value from 0 to 9999, and for z x we will get a value from 0 to 675. If we multiply the z x value by 10000, we can add the value obtained for y4 y3 y2 y1 to get a unique value for the key. That is, the four low order positions count from 0 to 9 in ones, 0 to 90 in tens, 0 to 900 in hundreds, and 0 to 9000 in thousands, while the two high order positions can be viewed as counting from 0 to 6750000 in ten-thousands. This gives a possible 6760000 unique keys with this hash function. But since your specific case limits 'Z' to three characters ('A' to 'C'), z only ranges from 0 to 2, so we have just 3 * 26 = 78 possibilities for z x. There are therefore 780000 unique hashes obtainable with this method, running from 0 to 779999, and the hash function can then be written:

hval = y1 + (y2 * 10) + (y3 * 100) + (y4 * 1000) + (x + z * 26) * 10000

where y1, y2, y3, y4, x, and z all represent integer values. Or, using C chars:

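int y1, y2, y3, y4;   /* digits of YYYY: y1 = ones, ..., y4 = thousands */
char x, z;            /* first and last characters of the key */
long hval;

/* 10000L keeps the multiplication in long, in case int is only 16 bits */
hval = y1 + (y2 * 10) + (y3 * 100) + (y4 * 1000)
     + ((x - 'A') + (z - 'A') * 26) * 10000L;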
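If you want to convince yourself that there really are no collisions, a brute-force check over all 780000 keys is cheap. The following is only a sketch: the names hval_of, count_collisions, and TABLE_SIZE are made up for illustration, and YYYY is passed as a single int instead of four separate digits (it equals y1 + y2 * 10 + y3 * 100 + y4 * 1000).

```c
#include <stdlib.h>

#define TABLE_SIZE 780000L

/* Same formula as above, with the four digits already combined
   into one number yyyy in 0..9999. Assumes ASCII-style letters. */
static long hval_of(char x, int yyyy, char z)
{
    return yyyy + ((x - 'A') + (z - 'A') * 26L) * 10000L;
}

/* Hypothetical check: walks every valid key and counts hash values
   that are out of range or already taken. A result of 0 means the
   function is a perfect hash onto 0..TABLE_SIZE-1.
   Returns -1 if the scratch table cannot be allocated. */
long count_collisions(void)
{
    unsigned char *seen = calloc(TABLE_SIZE, 1);
    long bad = 0;
    char x, z;
    int y;

    if (seen == NULL)
        return -1;

    for (z = 'A'; z <= 'C'; z++) {
        for (x = 'A'; x <= 'Z'; x++) {
            for (y = 0; y <= 9999; y++) {
                long h = hval_of(x, y, z);
                if (h < 0 || h >= TABLE_SIZE || seen[h])
                    bad++;          /* out of range or collision */
                else
                    seen[h] = 1;    /* mark this slot as used */
            }
        }
    }

    free(seen);
    return bad;
}
```

Since there are exactly 780000 keys and 780000 slots, zero collisions also means every slot is used exactly once.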

I should add that converting the characters of the Latin alphabet to integers in this way is not guaranteed to work by the C standard, which only guarantees that the digits '0' to '9' have consecutive values. So long as you have an ASCII-compatible character set (which includes UTF-8), it will work.
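If portability to other character sets matters, one option is to look each letter up in a string instead of subtracting 'A'. This is just a sketch of that idea; letter_index is a name I made up, and you would use letter_index(x) and letter_index(z) in place of (x - 'A') and (z - 'A').

```c
#include <string.h>

/* Hypothetical helper: returns 0..25 for 'A'..'Z' without assuming
   the letters have consecutive codes, or -1 for any other character. */
int letter_index(char c)
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *p;

    if (c == '\0')              /* strchr would match the terminator */
        return -1;

    p = strchr(alphabet, c);
    return p ? (int)(p - alphabet) : -1;
}
```

The lookup is slower than a subtraction, so it is only worth it if you actually expect to run on a non-ASCII system.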
