"I'm more interested in the way Hash Tables look up the key and how the key is generated."
Hashing transforms ana key object to a number. This is called "hashing" -- it makes a hash out of the object. See Hash Function. Summing the bytes of a string, for example, is a standard hash technique. You compute the sum modulo 232 to keep the hash to a manageable size. Hash always gives the same answer. This is O(1).
The number isgives you a "slot" in the HashTable. Given an arbitrary key object, the hash value computes a hash value. The hash value then gives you the slot in table. Usually
mod( hash, table size ). This is O(1), also.
That's itthe general solution. Two numeric calculations and you've gone from arbitrary object as key to arbitrary object as value. Few things can be as fast.
The transformation from object to numberhash value happens in one of threethese common ways.
If it's a "primitive" object of 4 bytes, then the object's native value is a number.
The object's address is 4 bytes, then the object's address iscan be used as a numberhash value.
A simple hash algorithm hash function (MD5, SHA1, whatever) sumsaccumulates the bytes of the object to create a 4-byte number. The advanced hashes aren't simple sums of bytes, a simple sum doesn't reflect all the original input bits fairly enough.
The slot in the hash table is mod( number, size of table ).
If that slot has the desired value, you're done. If that's not the desired value, you need to look somewhere else. RandomThere are several popular probing works wellalgorithms to look for a free spot in the table. Linear is a simple search for the next free spot. Quadratic is a non-linear hopping around looking for a free slot. A random number generator (with a fixed seed) can be used to generate a series of probes that will spread data evenly but arbitrarily.
The probing algorithms are not O(1). If the table's big enough, the odds of collision are low, and probes don't matter. If the table's too small, then collisions happen and probing happens. At that point, it becomes a matter of "tuning and tweaking" to balance probing and table size to optimize performance. Usually we just make the table bigger.
See Hash Table.