Efficiently generate a 16-character, alphanumeric string

Question

I'm looking for a very quick way to generate an alphanumeric unique id for a primary key in a table.

Would something like this work?

    def genKey():
        hash = hashlib.md5(RANDOM_NUMBER).digest().encode("base64")
        alnum_hash = re.sub(r'[^a-zA-Z0-9]', "", hash)
        return alnum_hash[:16]

What would be a good way to generate random numbers? If I base it on microtime, I have to account for the possibility of several calls of genKey() at the same time from different instances.

Or is there a better way to do all this?

Possible duplicate of Random string generation with upper case letters and digits in Python. (Amir Ali Akbari, commented Dec 26, 2014 at 17:43)

David Schumann · Accepted Answer · 2024-07-23 11:37:18Z

A lot of the answers do not provide a random string consisting of characters 0-9, a-z, A-Z: Here is a working solution which will give you one of approx. 62^16 = 4.76724 e+28 keys:

    import random, string
    x = ''.join(random.choice(string.ascii_uppercase + string.ascii_lowercase + string.digits) for _ in range(16))
    print(x)

It is also very readable without knowing ASCII codes by heart.

There is an even shorter version since Python 3.6, which added random.choices:

    import random, string
    x = ''.join(random.choices(string.ascii_letters + string.digits, k=16))
    print(x)

The number was actually wrong. I updated it. It is calculated by taking the number of possible characters to the power of the length of the string.
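As a minimal sketch of that calculation (the variable names are illustrative, not taken from the answer), the key count can be checked directly:

```python
import string

# 26 uppercase + 26 lowercase + 10 digits = 62 possible characters
alphabet = string.ascii_letters + string.digits
length = 16

# Number of distinct keys: alphabet size to the power of the string length
keyspace = len(alphabet) ** length
print(f"{keyspace:.5e}")
```

Python integers have arbitrary precision, so the exact value is available as well as the rounded scientific notation.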

Mark Byers · Accepted Answer · 2010-03-24 20:42:58Z

65

You can use this:

    >>> import random
    >>> ''.join(random.choice('0123456789ABCDEF') for i in range(16))
    'E2C6B2E19E4A7777'

There is no guarantee that the keys generated will be unique so you should be ready to retry with a new key in the case the original insert fails. Also, you might want to consider using a deterministic algorithm to generate a string from an auto-incremented id instead of using random values, as this will guarantee you uniqueness (but it will also give predictable keys).
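One possible shape for the deterministic alternative is a fixed-width base-62 encoding of the auto-incremented id. This is only a sketch: the function name key_from_id, the symbol order in ALPHABET, and the zero padding are assumptions made for illustration, not something the answer specifies.

```python
import string

# Hypothetical symbol order: digits, then uppercase, then lowercase (62 symbols)
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def key_from_id(row_id: int, length: int = 16) -> str:
    """Encode a non-negative integer id as a fixed-width base-62 string."""
    if row_id < 0:
        raise ValueError("row_id must be non-negative")
    if row_id >= len(ALPHABET) ** length:
        raise ValueError("row_id does not fit in the requested length")
    digits = []
    while row_id:
        row_id, remainder = divmod(row_id, len(ALPHABET))
        digits.append(ALPHABET[remainder])
    # Most significant digit first, left-padded with the zero symbol
    return "".join(reversed(digits)).rjust(length, ALPHABET[0])


print(key_from_id(1))
print(key_from_id(62))
```

Because the mapping from id to string is one-to-one, distinct ids always give distinct keys, but consecutive ids also give visibly consecutive keys.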

answered Mar 24, 2010 at 20:42

Mark Byers


6 Comments

nikola Over a year ago

random is not random but pseudo-random according to the documentation. Please use os.urandom instead.

aaronasterling Over a year ago

@prometheus. Is os.urandom not pseudo-random?

nikola Over a year ago

I was responding to Mark Byers' loose usage of the term "random values". os.urandom is still pseudo-random, but cryptographically secure pseudo-random, which makes it much more suitable for a wide range of use cases compared to random.

yamm Over a year ago

@nikola It doesn't really matter if the keys are only pseudo-random; they are used for indexing.

Mark Over a year ago

Perhaps obvious, but 'deterministic' doesn't mean unique, you have to actually check that the algorithm has a very long repetition period. get_key = lambda n: n % 10 is deterministic, but not unique for long.


Brachamul · Accepted Answer · 2019-03-17 15:26:31Z

In Python 3.6, released in December 2016, the secrets module was introduced.

You can now generate a random token this way:

    import secrets
    # 16 random bytes give 32 hex characters; token_hex(8) gives 16
    secrets.token_hex(16)

From the Python docs:

The secrets module is used for generating cryptographically strong random numbers suitable for managing data such as passwords, account authentication, security tokens, and related secrets.

In particular, secrets should be used in preference to the default pseudo-random number generator in the random module, which is designed for modelling and simulation, not security or cryptography.

https://docs.python.org/3/library/secrets.html

Docstring: "Return a random text string, in hexadecimal. The string has nbytes random bytes, each byte converted to two hex digits. If nbytes is None or not supplied, a reasonable default is used." Meaning 32 == len(secrets.token_hex(16)).

Jurakin · Accepted Answer · 2023-02-12 02:15:21Z

42

Have a look at the uuid module (Python 2.5+).

A quick example:

    import uuid
    uid = uuid.uuid4()
    print(uid.hex)

df008b2e24f947b1b873c94d8a3f2201

Note that the OP asked for a 16-character alphanumeric string, but UUID4 hex strings are 32 characters long. Do not truncate this string; use the complete 32 characters instead.

edited Feb 12, 2023 at 2:15

Jurakin


answered Mar 24, 2010 at 20:43

ChristopheD


5 Comments

Brian Over a year ago

This is 32 characters, and truncation of Guids is unsafe.

ChristopheD Over a year ago

True (about the truncation). On the other hand: I'd just store 32 characters (unless you have a very specific reason to only store 16).

Adiyat Mubarak Over a year ago

@Brian Hi, I need to know why GUIDs are not safe. Do you have a reference?

Brian Over a year ago

@AdiyatMubarak: Fundamentally, you don't need a reference. Guids are documented as being unique. Half of a Guid is not documented as being unique. That said, blogs.msdn.microsoft.com/oldnewthing/20080627-00/?p=21823 runs through what happens when you truncate one particular GUID algorithm.

kta Over a year ago

this returns hexadecimal with letters only from a - f.

mathandy · Accepted Answer · 2021-09-11 00:50:58Z

There's an official recipe:

    import string
    import secrets
    alphabet = string.ascii_letters + string.digits
    password = ''.join(secrets.choice(alphabet) for i in range(16))
    print(password)

This will create output similar to 'STCT3jdDUkppph03'.

I think this is the best answer now. More "properly" random; denser than token_hex.
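To put a number on "denser", the following sketch compares the entropy of a 16-character hex string with a 16-character alphanumeric one, assuming every character is drawn uniformly and independently. The helper name bits_per_string is made up for this illustration.

```python
import math
import string


def bits_per_string(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a string of uniformly, independently chosen characters."""
    return length * math.log2(alphabet_size)


# 16 hex characters, e.g. from secrets.token_hex(8)
hex_bits = bits_per_string(16, 16)

# 16 characters from the 62-symbol alphanumeric alphabet
alnum_bits = bits_per_string(len(string.ascii_letters + string.digits), 16)

print(hex_bits, round(alnum_bits, 1))
```

At the same length, the alphanumeric string carries roughly half again as many bits as the hex string.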

Jurakin · Accepted Answer · 2023-02-12 02:14:49Z

5

For random numbers a good source is os.urandom:

    import os
    import hashlib
    random_data = os.urandom(128)
    hashlib.md5(random_data).hexdigest()[:16]

edited Feb 12, 2023 at 2:14

Jurakin


answered Sep 28, 2010 at 21:53

rlotun


4 Comments

m3nda Over a year ago

I forgot the so much great urandom function :V and that's nice, better than adding charsets into a string then loop then. Builtin ;)

bman Over a year ago

this has been mentioned in other answers too, you should not truncate the md5 hash.

max Over a year ago

@bman: I'm aware that there are serious issues truncating certain UUIDs because the randomness is not linearly distributed. For MD5 this should not be an issue.

Jurakin Over a year ago

@rlotun You can't just cut out the hash and say it's random. If you do that, the hash function won't be guaranteed to be random.

Henry Ecker · Accepted Answer · 2023-02-12 02:15:56Z

3

    import random
    # Python 2 only: range() returns a list there, so the ranges can be concatenated with +
    ''.join(random.sample(map(chr, range(48, 57) + range(65, 90) + range(97, 122)), 16))

Outputs something like:

'CDh0geq3NpKtcXfP'

edited Feb 12, 2023 at 2:15

Henry Ecker♦


answered Mar 9, 2015 at 16:27

Jan Matejka


1 Comment

David Schumann Over a year ago

Your solution would omit the characters 9, Z and z. Also, sample() chooses every character only once, so it would give you far fewer permutations. This would give you a string of 16 random digits and upper/lower case letters: ''.join(random.choice(string.ascii_uppercase + string.ascii_lowercase + string.digits) for _ in range(16))
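The difference between sample() and choice() can be seen in a short sketch like the one below. It uses the full 62-character alphabet rather than the ranges from the answer, and math.perm requires Python 3.8 or later.

```python
import math
import random
import string

alphabet = string.ascii_letters + string.digits

# sample() draws without replacement: the 16 characters are always distinct
sampled = "".join(random.sample(alphabet, 16))

# choice() draws with replacement: characters may repeat
chosen = "".join(random.choice(alphabet) for _ in range(16))

# Possible strings: ordered selections without repetition vs. with repetition
without_repeats = math.perm(len(alphabet), 16)
with_repeats = len(alphabet) ** 16

print(sampled, chosen, without_repeats < with_repeats)
```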

Brian · Accepted Answer · 2010-03-24 21:02:45Z

This value is incremented by 1 on each call (it wraps around). Deciding where the best place to store the value will depend on how you are using it. You may find this explanation of interest, as it discusses not only how Guids work but also how to make a smaller one.

The short answer is this: Use some of those characters as a timestamp and the other characters as a "uniquifier," a value that increments by 1 on each call to your uid generator.

pchatzou · Accepted Answer · 2021-09-30 15:04:09Z

I would prefer urandom over secrets.token_hex, as it samples from a richer character set and hence needs a smaller length to achieve the same entropy.

os.urandom, which reads from urandom, is considered secure (see the relevant answer to the question of whether urandom is secure). You can then read as much as you like from urandom and produce a random alphanumeric string as follows:

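    import base64
    import math
    import os

    def random_alphanumeric(str_len: int) -> str:
        # Read a multiple of 3 bytes so the base64 output has no '=' padding
        rand_len = 3 * (math.ceil(str_len / 3) + 1)
        # Replace '+' and '/' with 'a' and 'A' so the result is alphanumeric
        return base64.b64encode(os.urandom(rand_len), altchars=b'aA').decode('ascii')[:str_len]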

NOTE: The above function is not secure. Since you need a "very quick way to generate an alphanumeric", this function sacrifices security for performance: the frequencies of a and A (or whatever characters you choose to replace + and / with) will be increased compared to what urandom would give you otherwise.
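The size of that bias can be worked out without generating any random data. The sketch below simply lists the 64 symbols that base64 can emit once + and / are mapped to a and A, and counts how often each character appears; the variable names are illustrative.

```python
import string
from collections import Counter

# The 64 base64 symbols, with '+' and '/' replaced by 'a' and 'A' (altchars=b'aA')
symbols = string.ascii_uppercase + string.ascii_lowercase + string.digits + "aA"

# Each 6-bit group maps to one symbol, so a character listed twice is twice as likely
counts = Counter(symbols)
probabilities = {char: count / len(symbols) for char, count in counts.items()}

print(probabilities["a"], probabilities["b"])
```

So a and A each occur with probability 2/64, while every other character occurs with probability 1/64.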

If you put randomness above performance, you could do something like:

    import base64
    import math
    import os

    def secure_random_alphanumeric(str_len: int) -> str:
        ret = ''
        while len(ret) < str_len:
            # Over-read slightly, since '+' and '/' are discarded below
            rand_len = 3 * (math.ceil((str_len - len(ret)) / 3) + 2)
            ret += base64.b64encode(os.urandom(rand_len)).decode('ascii').replace('+', '').replace('/', '').replace('=', '')
        return ret[:str_len]

Note that chaining replace turns out to be faster than sequentially calling it, as per this answer.

Also, in the above, +1 is replaced by +2 when determining rand_len to reduce the number of iterations needed to achieve the requested length. You could even use +3 or more to further reduce the chance of another iteration, but then you would lose performance in the chained replace calls.

Edward Corrigall · Accepted Answer · 2021-05-09 21:15:12Z

    import math
    import secrets

    def random_alphanum(length: int) -> str:
        if length == 0:
            return ''
        elif length < 0:
            raise ValueError('negative argument not allowed')
        else:
            text = secrets.token_hex(nbytes=math.ceil(length / 2))
            is_length_even = length % 2 == 0
            return text if is_length_even else text[1:]

The uuid method is inefficient and limited because uuid only returns 36 characters, which are then truncated.
The default pseudo-random number generator is not suitable for security or cryptographic applications; the standard secrets module is available and is designed for these applications.

Chirag Maliwal · Accepted Answer · 2017-08-02 04:40:33Z

-3

Simply use Python's built-in uuid:

    import uuid
    print(uuid.uuid4().hex[:16].upper())

answered Aug 2, 2017 at 4:40

Chirag Maliwal



Praveen Kumar Sridhar · Accepted Answer · 2020-10-23 11:46:27Z

You could use the choice function in np.random which chooses the number of characters specified from a list of characters:

    import numpy as np
    chars = np.array(list('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'))
    np_codes = np.random.choice(chars, 16)
    print(''.join([val for val in np_codes]))

This outputs something like the following: 591FXwW61F4Q57av

Henry Ecker · Accepted Answer · 2023-02-12 02:16:27Z

-3

Simply use python builtin uuid:

If UUIDs are okay for your purposes, use the built-in uuid package.

One Line Solution:

    import uuid
    # .hex works in Python 2 and 3; get_hex() existed only in Python 2
    str(uuid.uuid4().hex.upper()[0:16])

Outputs something like:

'40003A9B8C3045CA'

edited Feb 12, 2023 at 2:16

Henry Ecker♦


answered Apr 24, 2014 at 11:51

Jaykumar Patel


2 Comments

Diaa Mohamed Kasem Over a year ago

A UUID is 32 characters long; by using only characters 0 to 15, you will get duplicates.

Brian Over a year ago

GUIDs are globally unique, but substrings of GUIDs aren't
