2

I'm looking for a good hash utility or framework to map the string to some int value in range say {1 .. k} I should use it in the runtime component that receives a lot of concurrent requests from users with 'userId' that is string and i want to map that value to defined integer range.

Of course the length of the userId value could be different but it is ok with me to assumt on some minimum lenght for using this function

It should be very fast

Thanks

6
  • Are you after a hash for a hashtable, or a unique ID? Commented Mar 7, 2012 at 7:49
  • What happens when two userids map to the same integer? This is called a "collision" and must be dealt with. String.hashCode() already does a reasonable job, and you can fit the result into {1..k} by doing str.hashCode()%k + 1 Commented Mar 7, 2012 at 8:01
  • But i see the problem - this would not work since the hashcode is for the reference thus two same strings will get different hashcode ` public static void main(String[] args) { Scanner input=new Scanner(System.in); System.out.println("Enter str: " ); String str1=input.nextLine(); System.out.println("Enter str: " ); String str2=input.nextLine(); System.out.println(str1.hashCode() + " " + str2.hashCode()); } ` This is the output: Commented Mar 7, 2012 at 8:29
  • @Julias That is not correct. I suggest you look at the code for String.hashCode(), or try it yourself. Commented Mar 7, 2012 at 8:31
  • If you want to add code and examples, I suggest you include this in your question. Comments don't handle code well. :( Commented Mar 7, 2012 at 8:33

4 Answers 4

3

Every Java object has a builtin hashCode method, which returns an int. For String it is pre-defined for you (it needs to be implemented for custom objects).

To map this to 1..k, where k is an integer, consider using the modulus:

String hi = "Hello"; int hash = (hi.hashCode() % K) + 1; 
Sign up to request clarification or add additional context in comments.

5 Comments

hashCode() returns an int not a long
All genious ideas are the simplest :-) Thanks, Will i have a good distribution of values?
Note that using % destroys the presumably even hashCode distribution. You might consider scaling it with division instead.
hashCode() can be negative e.g. try a longer string. ;)
@StevenSchlansker This is a problem for very large values, in which case you need to use a long hashCode (or even BigDecimal). Normally, small values (less than a million)
1

You can use String.hashCode().

String a1 = "Hello World"; String a2 = new String(a1); // don't do this unless you have to have a different object. System.out.println("Identity hashCode " + System.identityHashCode(a1) + " != " + System.identityHashCode(a2)); System.out.println("String.hashCode " + a1.hashCode() + " == " + a2.hashCode()); 

prints

Identity hashCode 551677275 != 1353056826 String.hashCode -862545276 == -862545276 

In terms of performance, hashCode() is much faster than creating the String itself. If this is not fast enough, I would avoid using/creating String in the first place. (highly unlikely you need to so this)

The identity hash code changes each time the program is run. Note: hashCode() can be negative so you have to adjust for this.

int hash = (text.hashCode() & 0x7FFFFFFF) % K + 1; 

or if you don't want to discard the top bit

int hash = (int) ((text.hashCode() & 0xFFFFFFFFL) % K + 1); 

Comments

0

If you're after a secure hash (one that cannot be reversed easily), then use message digest:

try { MessageDigest msgDigest = MessageDigest.getInstance("MD5"); byte digest[] = msgDigest.digest(username.getBytes()); int secureHash = 1 + new BigInteger(digest).mod(BigInteger.valueOf(k)).intValue(); System.out.println("Secure hash " + secureHash); } catch (NoSuchAlgorithmException e) { e.printStackTrace(); } 

Comments

0

You could go ahead with HashTable or HashMap that comes with JDK. Choosing between the two, goes down to this,

  • access to the Hashtable is synchronized on the table while HashMap isn't.

  • Iterator in the HashMap is fail-safe while the enumerator for the Hashtable isn't.

  • HashMap permits null values in it, while Hashtable doesn't.

HashMap is part of the new collections framework (since JDK 1.2)

If you have millions of entries go for a database. Go for NoSQL.

Similar question here HashMap in Java, 100 Million entries

1 Comment

I have millions of different users

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.