6

I have a user_info table in five different locations. Now I need to integrate all the rows into a central user_info table. To do this I need a unique id for each row in source tables. Because when all the rows come into the central table, then each user must have a unique ID.

Now my questions are:

  1. if I use uniqueidentifier NEWID() for each source table then is it will be unique for globally & life time or have any chance to be duplicate?

  2. How does SQL Server create the NEWID()? I need to know the key generation structure.

0

1 Answer 1

8

Yes, there is no chance of a duplicate between machines.

NEWID() is based on a combination of a pseudo random number (from the clock) and the MAC address of the primary NIC.

However, inserting random numbers like this as the clustered key on a table is terrible for performance. You should consider either NEWSEQUENTIALID() or a COMB-type function for generating GUIDs that still offer the collision-avoidance benefits of NEWID() while still maintaining acceptable INSERT performance.

Sign up to request clarification or add additional context in comments.

5 Comments

Any documentation about what windows API call SQL Server uses for NEWID? I know that NEWSEQUENTIALID uses the MAC address as this is clearly visible in the GUID but couldn't see any documentation for NEWID.
CoCreateGuid most likely? Eric Lippert had a [blog post series][1] about GUIDs in general. See also: stackoverflow.com/questions/39771/… For NEWID() the first hex digit of the third section is always 4 (when I try it), for NEWSEQUENTIALID() it uses the MAC address mentioned above. I can't remember what happens when there is no MAC address in the computer - anyone know what happens in SQL in this case? [1]: blogs.msdn.com/b/ericlippert/archive/2012/05/07/…
How accurate time does newid() use? It sounds like if I fetch two ids at the same time I should get a duplicate, yet I've never seen it happening.
@tubelius, the newid() value is based on a pseudorandom number generated from the clock and other factors. It is not the clock value itself, so the chances of duplication are so infinitesimal that it isn't even worth worrying about. If you want the timestamp to be explicitly built into the value (to make it sort roughly sequentially without the disadvantages of newsequentiaid()), you'll need a COMB library. I have an open-source one (RT.Comb) on Github / Nuget.
@richardtallent No, you should mention that there is a negligible change. Saying there is no change means that there is zero probability, which is obvious not!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.