How to specify a random seed while using Python's numpy random choice?

Question

I have a list of four strings. In a Pandas dataframe, I want to create a variable by randomly selecting a value from this list and assigning it to each row. I am using numpy's random choice, but according to its documentation, there is no seed option. How can I specify a random seed so that the random assignment is the same every time?

service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
df['SERVICE_CODE'] = [np.random.choice(service_code_options) for i in df.index]

Trenton McKinney · Accepted Answer · 2020-07-31 21:42:11Z

You need to set the seed beforehand with numpy.random.seed. The list comprehension is also unnecessary, because numpy.random.choice accepts a size parameter:

np.random.seed(123)
df = pd.DataFrame({'a': range(10)})
service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
df['SERVICE_CODE'] = np.random.choice(service_code_options, size=len(df))
print(df)

   a SERVICE_CODE
0  0       13.59P
1  1       12.42R
2  2       13.59P
3  3       13.59P
4  4      899.59O
5  5       13.59P
6  6       13.59P
7  7       12.42R
8  8      204.68L
9  9       13.59P

Question: does "np.random.seed(123)" apply to all subsequent code that calls random functions from numpy? If so, is there a way to end it? And if I want to create another variable using a different seed, do I call "np.random.seed(897)" to affect the subsequent code?
Got the answer here: stackoverflow.com/questions/49966770/…. Thanks.
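
To illustrate the point raised in the comment above, here is a minimal sketch of how the global seed behaves. It assumes the legacy np.random functions; the variable names (first, second, first_again, other) are made up for the illustration.

```python
import numpy as np

options = ['899.59O', '12.42R', '13.59P', '204.68L']

# Seeding sets the state of NumPy's global generator. Every later
# np.random.* call draws from that same stream until it is re-seeded.
np.random.seed(123)
first = np.random.choice(options, size=5)
second = np.random.choice(options, size=5)  # continues the same stream

# Re-seeding with the same value restarts the stream from the beginning.
np.random.seed(123)
first_again = np.random.choice(options, size=5)

# Re-seeding with a different value starts a different stream for
# everything that follows.
np.random.seed(897)
other = np.random.choice(options, size=5)

print((first == first_again).all())  # True
```

There is no call that "ends" a seed; the global state simply keeps advancing until np.random.seed is called again.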

medium-dimensional · Accepted Answer · 2022-12-22 10:36:55Z

According to the notes of numpy.random.seed in numpy v1.24:

Best practice is to use a dedicated Generator instance rather than the random variate generation methods exposed directly in the random module.

Such a Generator is constructed using np.random.default_rng.

Thus, instead of np.random.seed, the current best practice is to pass a seed to np.random.default_rng to construct a Generator, which can then be used for reproducible results.

Combining jezrael's answer and the current best practice, we have:

import pandas as pd
import numpy as np

rng = np.random.default_rng(seed=121)
df = pd.DataFrame({'a': range(10)})
service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
df['SERVICE_CODE'] = rng.choice(service_code_options, size=len(df))
print(df)

   a SERVICE_CODE
0  0       12.42R
1  1       13.59P
2  2       12.42R
3  3       12.42R
4  4      899.59O
5  5      204.68L
6  6      204.68L
7  7       13.59P
8  8       12.42R
9  9       13.59P
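
One practical consequence of using Generator instances is that each one carries its own state, so different variables can use different seeds without affecting each other. The following is a small sketch of that idea; the names rng_a, rng_b and rng_c are hypothetical and only serve the illustration.

```python
import numpy as np

options = ['899.59O', '12.42R', '13.59P', '204.68L']

# Two generators built from the same seed produce the same stream.
rng_a = np.random.default_rng(seed=121)
rng_b = np.random.default_rng(seed=121)

# A third generator with its own seed has completely separate state.
rng_c = np.random.default_rng(seed=897)

codes_a = rng_a.choice(options, size=10)
_ = rng_c.choice(options, size=10)   # drawing from rng_c does not advance rng_b
codes_b = rng_b.choice(options, size=10)

print((codes_a == codes_b).all())  # True
```

Unlike np.random.seed, nothing here touches the global state, so other code that calls np.random functions is unaffected.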

piRSquared · Accepted Answer · 2018-10-25 14:30:47Z

Documentation numpy.random.seed

np.random.seed(this_is_my_seed)

The seed can be an integer or a list of integers:

np.random.seed(300)

Or

np.random.seed([3, 1415])

Example

np.random.seed([3, 1415])
service_code_options = ['899.59O', '12.42R', '13.59P', '204.68L']
np.random.choice(service_code_options, 3)

array(['899.59O', '204.68L', '13.59P'], dtype='<U7')

Notice that I passed a 3 to the choice function to specify the size of the array.

numpy.random.choice

What would a list of integers do? Use the n-th element as seed after n random() calls?
No, the whole list is just a thing that provides a starting point for the randomization.
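
As a quick check of the reply above, the sketch below seeds twice with the same list and compares the draws. It also assumes, per the NumPy documentation, that np.random.default_rng accepts a sequence of integers as its seed; the variable names are illustrative only.

```python
import numpy as np

options = ['899.59O', '12.42R', '13.59P', '204.68L']

# The whole list acts as one seed: the same list gives the same draws.
np.random.seed([3, 1415])
run_1 = np.random.choice(options, 3)
np.random.seed([3, 1415])
run_2 = np.random.choice(options, 3)
print((run_1 == run_2).all())  # True

# A Generator can be seeded with a sequence of integers in the same way.
rng_1 = np.random.default_rng([3, 1415])
rng_2 = np.random.default_rng([3, 1415])
gen_1 = rng_1.choice(options, 3)
gen_2 = rng_2.choice(options, 3)
print((gen_1 == gen_2).all())  # True
```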
