
I have the C# code below, which prints each generated GUID string together with its hash code:

    public static void Main()
    {
        //var bytes = "7A5EC53415E34F288269EBB05B73AFD4".ToByteArray();
        for (int i = 0; i < 1000; i++)
        {
            var a = Guid.NewGuid().ToString("N").ToUpper();
            Console.WriteLine($"{a}->{a.GetHashCode()}");
        }
    }

This produces output like:

    C5D9DB1ACBFB43E2B0DC2812F22C3899-> -1816084758
    42B7C0BF6C0341DB96DE5084BA403193-> 1197814565
    B4062E3C129E478BA8E69552DDC700F2-> 1349563395
    863C9FBF0369496E90A1B0246F855D6E-> -772372816
    B42562EDB97346F48DE37ADE3FB6620E-> 2019192158

My question is: will the returned values be unique, so that I can use them to generate random unique numbers? Sometimes GetHashCode() returns a negative value, and taking the absolute value seems like a bad idea, since I need an unsigned number.

  • Why not use Random for generating a random number? Anyway, you might want to look up the "pigeonhole principle". Like all hashes, GetHashCode is not guaranteed to be unique. Commented Apr 6, 2020 at 2:41
  • While Guid.NewGuid() should be unique, and is usually pretty good, if you are generating lots of them, you'll still get collisions. string.GetHashCode() is even less likely to be unique. Random numbers from a large enough address space should be unique, but you will still have to cope with duplicates. Commented Apr 6, 2020 at 2:53
  • Go read up on the birthday paradox, if you haven't already. Commented Apr 6, 2020 at 2:54
  • There are 2^122 possible GUIDs, but only 2^32 possible int values, so there's no way that all GUIDs will have different hashes. If you want only positive numbers, instead of making negative values positive (which of course leaves you only half the combinations, 2^31 in total), use a uint instead. Whether 2^32 combinations are enough for you is something you need to determine. Commented Apr 6, 2020 at 3:16
  • I wonder if you have an XY problem? Commented Apr 6, 2020 at 3:44

1 Answer


Sorry, no. It's harder than that, especially since you mentioned in the comments that this runs across multiple processes.

Ideas that won't work

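GUIDs are unique for all practical purposes, even across machines (Guid.NewGuid() produces random, version 4 GUIDs; it does not use the MAC address), but GetHashCode() is a big problem: it squeezes the value into a 32-bit int and is in no way guaranteed to be unique. It is not guaranteed to be stable either: the documentation warns that string hash codes can differ across processes, platforms, and .NET versions, which matters when this runs across multiple processes.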
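To put a rough number on that, the birthday approximation p ≈ 1 − e^(−n(n−1)/(2·2^32)) estimates the chance that at least two of n values collide. The sketch below assumes the hash codes behave like uniformly random 32-bit values, which is an idealization (real hash codes are not guaranteed to be that well distributed). The class and method names are made up for illustration.

```csharp
using System;

public static class HashCollisionEstimate
{
    // Birthday approximation: probability that at least two of n values,
    // drawn uniformly at random from 2^32 possibilities, are equal.
    // Assumption: hash codes are treated as uniformly random.
    public static double CollisionProbability(long n)
    {
        const double space = 4294967296.0; // 2^32 distinct int values
        double pairs = n * (n - 1) / 2.0;  // number of distinct pairs
        return 1.0 - Math.Exp(-pairs / space);
    }

    public static void Main()
    {
        foreach (long n in new long[] { 1_000, 10_000, 77_163, 1_000_000 })
        {
            Console.WriteLine($"{n,9} values -> {CollisionProbability(n):P4}");
        }
    }
}
```

Under that assumption, the chance of at least one collision reaches roughly 50% at about 77,000 values, far below the 2^32 values an int can hold.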

You also can't rely on Random() for multiple reasons:

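  • On .NET Framework, the parameterless constructor seeds from Environment.TickCount, so two instances created within the same clock tick (typically around 10 to 16 ms apart), even in different processes, can get the same seed and produce the same sequence. Newer .NET versions no longer seed from the clock, but that alone does not make the values unique.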
  • Even if the seed was unique, some of the resulting values won't be. At least not at scale. We are looking for uniqueness, not just randomness.
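The seed problem is easy to reproduce by passing the same seed explicitly, which stands in for two processes that happened to pick the same time-based seed. A minimal sketch (the seed value 12345 and the class name are arbitrary):

```csharp
using System;
using System.Linq;

public static class SameSeedDemo
{
    // Returns the first `count` values produced by a Random created with `seed`.
    public static int[] FirstValues(int seed, int count)
    {
        var rng = new Random(seed);
        return Enumerable.Range(0, count).Select(_ => rng.Next()).ToArray();
    }

    public static void Main()
    {
        // Two "processes" that ended up with the same seed.
        int[] processA = FirstValues(12345, 5);
        int[] processB = FirstValues(12345, 5);

        Console.WriteLine(processA.SequenceEqual(processB)); // True
    }
}
```

Both sequences are identical, so every "unique" number handed out by one process would also be handed out by the other.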

One idea that will definitely work

If you want to be absolutely certain of uniqueness in a mechanical way across distributed processes running on different machines (or, worse, the same machine...), you could delegate the creation of unique values to a service. That service could save every value it issues to a database table, either forever or, if your requirements allow, only for some retention period, which would be much easier.

Any time the service generates a new value, it would check the table first. If the value has ever been used before, it would try again until it finds an unused one, then save it to the database and return it.
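A minimal in-process sketch of that loop might look like the following. It is only an illustration: UniqueNumberAllocator is a hypothetical name, a HashSet stands in for the database table, a lock stands in for whatever keeps the check and the save atomic, and exhaustion of the 32-bit range is not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

// Hypothetical allocator: the HashSet plays the role of the database table.
public sealed class UniqueNumberAllocator
{
    private readonly HashSet<uint> _issued = new HashSet<uint>();
    private readonly RandomNumberGenerator _rng = RandomNumberGenerator.Create();
    private readonly object _gate = new object();

    // Returns a random unsigned 32-bit value that this instance has never returned before.
    public uint Next()
    {
        var buffer = new byte[4];
        lock (_gate) // check-and-save must happen as one step
        {
            while (true)
            {
                _rng.GetBytes(buffer);
                uint candidate = BitConverter.ToUInt32(buffer, 0);

                // HashSet.Add returns false if the value was already issued,
                // in which case we simply draw again.
                if (_issued.Add(candidate))
                {
                    return candidate;
                }
            }
        }
    }

    public static void Main()
    {
        var allocator = new UniqueNumberAllocator();
        for (int i = 0; i < 5; i++)
        {
            Console.WriteLine(allocator.Next());
        }
    }
}
```

With a real database, a unique constraint on the column plus a retry when the insert is rejected would play the role of the lock, so that two service instances cannot both pass the check for the same value. That part is not shown here.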

Doing this efficiently at scale would probably require a bit of engineering. There may be a more elegant purely mathematical solution, but this would be reliable and easy to understand (it does not rely on magic).


2 Comments

To be a bit picky :) The idea that will definitely work may not work if all values within the integer range have been assigned to a string. At the end of the day, there can only be 4,294,967,296 (2^32) pairs in the table, much like how we ran out of IPv4 addresses.
@weichch I suppose that is true... If billions of values is not enough, you could switch to a different data type. However, someone might still come along and point out that even that has theoretical limits. But let's assume from here that scale issues can all be solved with more engineering, and that people will try to stop somewhere around what's appropriate for their problem. :) Otherwise we might end up writing a book, not an SO answer.
