7

How can I write a function that gives me the random-index of a list-element, but based on the probabilities in the list?

The list looks like this with 5 elements.

a = [0.1, 0.2, 0.4, 0.2, 0.1] 

Is there a simple and fast solution for this? Thanks

2
  • 1
    I think this is what you are looking for: stackoverflow.com/questions/10803135/… Commented May 17, 2016 at 1:11
  • Not a duplicate. The question has some different particulars. Commented May 17, 2016 at 23:05

2 Answers 2

9

This sounds like a job for Numpy's numpy.random.choice(), and its p parameter:

p : 1-D array-like, optional The probabilities associated with each entry in a. If not given, the sample assumes a uniform distribtion over all entries in a. 

So if there's only one list (where an element is both the probability of each element, and the element itself to be selected, you can do it like this:

from numpy.random import choice elementsAndProbabilities = [0.1, 0.2, 0.4, 0.2, 0.1] randomElement = choice(elementsAndProbabilities, p=elementsAndProbabilities) print randomElement 

If you have a list of elements and a list of probabilities for each element (separate), you can do it like this:

from numpy.random import choice elements = ["first", "second", "third", "fourth", "fifth"] probabilities = [0.1, 0.2, 0.4, 0.2, 0.1] randomElement = choice(elements, p=probabilities) print randomElement 

Now, you said you wanted the index, not the element, so we can get the index like this:

from numpy.random import choice probabilities = [0.1, 0.2, 0.4, 0.2, 0.1] randomElement = choice(range(len(probabilities)), p=probabilities) print randomElement 
Sign up to request clarification or add additional context in comments.

Comments

3

NumPy will probably be faster if you have it, but if not, here's a pure Python solution.

from random import random a = [0.1, 0.2, 0.4, 0.2, 0.1] def randombin(bins): r = random() p = 0 for i, v in enumerate(bins): p += v if r < p: return i # p may not equal exactly 1.0 due to floating-point rounding errors # so if we get here, just try again (the errors are small, so this # should not happen very often). You could also just put it in the # last bin or pick a bin at random, depending on your tolerance for # small biases return randombin(bins) print randombin(a) 

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.