Hashcodes - need a math whiz
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
15 seems like it might be better.
My rationale is this:
1. It's unlikely to be the result of a bug in the method.
2. It's smaller than the prime seed used for most hash calculations.
3. 2^32 + 15 (4294967311) is prime, and so can't (I think) be the result of a multiplication that results in an overflow.
Basically, I'm wondering if my 3rd assumption is correct. I get tied in knots when 2's complement signs get involved. Any math/binary whizzes out there?
Winston
"Leadership is nature's way of removing morons from the productive flow" - Dogbert
Articles by Winston can be found here
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Build a man a fire, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life.
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Winston Gutkowski wrote:
3. 2^32 + 15 (4294967311) is prime, and so can't (I think) be the result of a multiplication that results in an overflow.
Basically, I'm wondering if my 3rd assumption is correct. I get tied in knots when 2's complement signs get involved. Any math/binary whizzes out there?
I'm a math whiz, and no, that isn't correct. Multiplication of signed 32-bit integers can be anything up to 2^62, just about. So if you're concerned about primes (and it's not clear to me why you are) then you'd have to be looking at a very large set of numbers of the form 2^32 * N + 15 -- and I can guarantee you that not all 2^30-ish of them are prime. (If N = 3 then it isn't prime, for example.)
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Paul Clapham wrote:I'm a math whiz, and no, that isn't correct. Multiplication of signed 32-bit integers can be anything up to 2^62, just about.
Hol' on hoss. They then need to be truncated in order to fit back into a 32-bit integer.
My thinking was along these lines: If we find a number that is larger than 2^32 and prime (And the smallest number greater than 2^32 that is prime is 2^32 + 15), and work out what it will be when truncated (15), then the result of a multiplication that overflows can't be equal to it (I realise that lots of other ops might also be involved in a hash, but '*' is usually one of them, and we can certainly eliminate any results that are negative).
I've had a bit of time to think about it, and I'm pretty sure I'm right for positive integers, but plainly not for other combinations; what I'm not sure is if the choice of a prime helps for other combos. The only thing (I suspect) that helps is that prime numbers are odd which, given the nature of hash calculations is likely to give a 3:1 chance of 15 NOT being the answer to a multiplication.
As I say, Tuesday night nonsense; but you haven't convinced me I'm entirely wrong - at least not yet.
Winston
"Leadership is nature's way of removing morons from the productive flow" - Dogbert
Articles by Winston can be found here
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Very famous paper written about the database and the hashing (i.e.: how they got the db that fast ).
Anyway one point in the paper where they had demonstrated everything, but the prime capacity issue. So i wrote a utility that would "pre hash" everything for a range of keys. we ran it for millions and millions of iterations - prime capacity (or key bases) turned out to be the worst in most cases.
If i remember correctly:
integer keys worked best with large, odd, non-prime numbers.
character based actually did well with large powers of 2
and real keys (yes they used them) actually did very well with 31 and i think 79.
-steve
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Steve Fahlbusch wrote:integer keys worked best with large, odd, non-prime numbers.
character based actually did well with large powers of 2
and real keys (yes they used them) actually did very well with 31 and i think 79.
I've heard of #1 and #3 (especially the magical '31', which is also often an optimized constant) on your list - not familiar with #2.
The only other thing I rather like about 15 is that the business of most hash algorithms I know is "bit shuffling", and 15 is 28 0's followed by 4 '1's.
But I'm enough of a mathematician to know that that don't mean nothin'
Winston
"Leadership is nature's way of removing morons from the productive flow" - Dogbert
Articles by Winston can be found here
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Winston Gutkowski wrote:
Paul Clapham wrote:I'm a math whiz, and no, that isn't correct. Multiplication of signed 32-bit integers can be anything up to 2^62, just about.
Hol' on hoss. They then need to be truncated in order to fit back into a 32-bit integer.
Right. That's what you're doing, isn't it? You are finding a prime number which truncates to 15 as a 32-bit integer.
My thinking was along these lines: If we find a number that is larger than 2^32 and prime (And the smallest number greater than 2^32 that is prime is 2^32 + 15), and work out what it will be when truncated (15), then the result of a multiplication that overflows can't be equal to it (I realise that lots of other ops might also be involved in a hash, but '*' is usually one of them, and we can certainly eliminate any results that are negative).
Which is backwards. I've just demonstrated that there are plenty of composite 64-bit numbers which truncate to 15 as a 32-bit integer. And since those numbers are composite, that means they are the product of smaller numbers. Not all of them will be the product of two smaller numbers which are both less than 2^32 but many of them will be. So there are two 32-bit numbers A and B whose product truncates to 15.
Why that's relevant to anything I still have no idea, but that's the math of it.
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Paul Clapham wrote:Which is backwards. I've just demonstrated that there are plenty of composite 64-bit numbers which truncate to 15 as a 32-bit integer.
Gotcha (finally). Thanks.
I still reckon that an odd number has better odds of not being the product of two random numbers than an even one though.
Winston
"Leadership is nature's way of removing morons from the productive flow" - Dogbert
Articles by Winston can be found here
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
-
-
Number of slices to send:Optional 'thank-you' note:
-
-
Mike Simmons wrote:Winston - you also seem to be assuming that the last operation in generating a hashCode() will be a multiplication...
Yup, you're quite right, all of you. Sometimes these things just pop into my head and I have to get them purged. And thanks, Paul, for setting me straight; I was plainly having a "Homer moment".
Winston
"Leadership is nature's way of removing morons from the productive flow" - Dogbert
Articles by Winston can be found here
| Let's get him boys! We'll make him read this tiny ad! The new gardening playing cards kickstarter is now live! https://www.kickstarter.com/projects/paulwheaton/garden-cards |











