Fastest hash code generator .NET

Question

I'm implementing a custom GetHashCode for the System.Drawing.Point class in C#. My method currently fails the following requirement:

var hashA = MyGetHashCode(new Point(1, 0)); var hashB = MyGetHashCode(new Point(0, 1)); var hashC = MyGetHashCode(new Point(0, 0)); var hashD = MyGetHashCode(new Point(1, 1)); Assert.AreNotEqual(hashA ^ hashB, hashC ^ hashD);

To pass this test I'm sure that using new SHA256Managed().ComputeHash(currentHash) will do. But there is any other faster hashing algorithm? I know SHA256 is all about security, and I don't need that.

How did you come up with the idea that your hash function should pass that test? — mqp
– mqp, Commented Jun 8, 2009 at 13:08
@mquander Surely seems strange. But some other class Equals functions, relies on a simple GetHashCode implementation, which in its turn relies on my custom Point.GetHashCode method — Jader Dias
– Jader Dias, Commented Jun 8, 2009 at 13:16
@mquander It's all about not repeating code in Equals and GetHashCode, and making them equivalent. — Jader Dias
– Jader Dias, Commented Jun 8, 2009 at 13:19

Marc Gravell · Accepted Answer · 2009-06-08 13:13:43Z

6

A simple hash? how about something like:

 (17 * point.X) + (23 * point.Y);

Or for more obvious entropy:

int hash = -1047578147; hash = (hash * -1521134295) + point.X; hash = (hash * -1521134295) + point.Y;

(numbers from C#'s anonymous type code)

edited Jun 8, 2009 at 13:13

answered Jun 8, 2009 at 13:05

Marc Gravell

1.1m273 gold badges2.6k silver badges3k bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Jorge Córdoba Over a year ago

Marc, this will surely fulfill the Assert but is not bounded (large X or Y will overflow)... if you allow overflow then it doesn't have a good distribution

Marc Gravell Over a year ago

It will wrap (unchecked); the distribution is, AFAIK, fine... This is the approach used by the C# compiler ;-p

Marc Gravell Over a year ago

Those numbers are just (as stated) the numbers that the C# compiler will use for an anon type of the form: new {X = 123, Y = 456}

mqp · Accepted Answer · 2009-06-08 13:07:35Z

3

Why are you doing this? Surely System.Drawing.Point has a fine hashing function already?
You understand that test doesn't represent a strict requirement, right? Hash codes don't have to be unique.
If you really want a really good hash of the coordinates in question, you might want to start at this page about hashing multiple integers.

answered Jun 8, 2009 at 13:07

mqp

72.3k15 gold badges99 silver badges124 bronze badges

3 Comments

Marc Gravell Over a year ago

Re "fine hashing function" - x ^ y... that isn't great; it means that anything on the diagonal is zero, and anything symmetric, i.e (5,7) and (7,5) - is equal.

mqp Over a year ago

It's not great, but unless you have a fairly pathological distribution of points, it's OK. I have the feeling the OP isn't working off any concrete performance requirement, if he's considering using an SHA hash, so I doubt anything better is required.

Jader Dias Over a year ago

I answered the first question in the comments of the question.

Joel Martinez · Accepted Answer · 2009-06-08 13:06:44Z

I know this isn't going to answer your question, but I must mention for the sake of other readers that you are changing the default behavior of a built in method of the framework. As per the documentation:
http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx

The default implementation of the GetHashCode method does not guarantee unique return values for different objects. Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value it returns will be the same between different versions of the .NET Framework. Consequently, the default implementation of this method must not be used as a unique object identifier for hashing purposes.

Could you please use a blockquote, not a code block (prefix the quote with ">" instead of indenting)? It'll make it much easier to read.
That is no problem in this case. What they mean is that you should not use the default GetHashcode as a unique value, because different objects can return the same hash. When Jader implements it as a (practically) unique value (by using sha256 or whatever) it will not break anything. It will be just slow ...
To be fair, it will be quite slow if he actually has a big hash table of points -- slow enough that it's pretty ridiculous.
Yeah sure - it is not really clever and he should never use a sha256 or something similar for this purpose. But the statement by Joel is not right, nevertheless.

TWA · Accepted Answer · 2009-06-08 13:10:10Z

1

Here's an interesting article about hashing speed:

A Hash Function for Hash Table Lookup

answered Jun 8, 2009 at 13:10

TWA

12.9k13 gold badges59 silver badges92 bronze badges

Comments

Jorge Córdoba · Accepted Answer · 2009-06-08 13:12:26Z

A simple Elf hash implementation (it's in delphi, shoudl be easy to translate)

function ElfHash(id : string; tableSize : integer) : integer; var i : integer; h,x : longint; begin h := 0; // Obtener el valor numérico for i := 1 to Length(id) do begin h := (h shl 4) + Ord(id[i]); x := h and $F0000000; if x <;>; 0 then h = h xor (x shr 24) xor x; end; // Ajustar al tamaño de la tabla result := h mod tableSize; end;

:) Talk to the stackoverflow sanitizer code... That's ovbiously "<>"
Says "Easy to translate" while asking to do shift left, shift right, and exclusive or in managed code.

Fantius · Accepted Answer · 2009-06-08 17:09:17Z

I don't know what your application is, but you may be looking for Zobrist hashing.

http://en.wikipedia.org/wiki/Zobrist_hashing

It can be updated incrementally, which makes it very fast.

Brian · Accepted Answer · 2009-06-08 17:31:43Z

If you know in advance that your point value is between 0 and N, you can use hashcode = X+Y*N; This is a rather obvious possible hash. It's not random at all, has ugly repetition, and is generally pretty silly. It's equivalent to concatenating the bits of your two points, assuming N is a power of 2. And it's got perfect uniform distribution and no collisions.

I've used this strategy to excellent effect in the past, but admit it has some real (but obvious) limitations. The biggest one being what happens when N is large enough that N^2 won't fit into your hash value (i.e. painful collisions.

My current implementation is ((x << 16) | (x >> 16)) ^ y (in C#), that fits in your description

Collectives™ on Stack Overflow

Fastest hash code generator .NET

7 Answers 7

3 Comments

3 Comments

4 Comments

Comments

3 Comments

Comments

1 Comment

Hot Network Questions

Collectives™ on Stack Overflow

7 Answers 7

3 Comments

3 Comments

4 Comments

Comments

3 Comments

Comments

1 Comment

Related