
I have a dictionary where the keys represent groups, and the values are a list of elements (for simplicity let's assume the values are integers).

An example of such a dictionary, d, is:

d = {2: [0, 1, 7, 8, 9], 1: [2, 4], 4: [3], 3: [5, 6]} 

I wish to sample from this dictionary multiple times, where each sample consists of one randomly chosen element from the first group, one from the second, and so on.

Simple output examples are:

    [9, 2, 3, 5]
    [1, 2, 3, 6]
    [7, 4, 3, 6]
    ...

I could loop M times and, on each pass, iterate over the keys and sample one element per group. I was wondering whether there is a simpler and, hopefully, more efficient method, since this will be part of a larger, more complex algorithm.

The naive approach is:

    import random

    ll = []
    for i in range(10):
        tmp = []
        # pick one random element from each group
        for k, l in d.items():
            tmp.append(random.choice(l))
        ll.append(tmp)
    print(ll)
    # [[5, 6, 9, 7], [8, 4, 3, 7], [2, 1, 3, 7], [5, 1, 3, 7], [0, 4, 3, 7], [2, 1, 3, 7], [2, 6, 9, 7], [2, 1, 9, 7], [8, 4, 9, 7], [2, 1, 9, 7]]

I'm not tied to the dictionary; I can use other structures as well, as long as the logic stays the same.

Would appreciate some help.

  • Check random.sample or random.choice: docs.python.org/3/library/random.html Commented Sep 7, 2020 at 15:01
  • @Mike67 yeah, I'm familiar with them, take a look at my naive approach. I was looking for a different approach. Commented Sep 7, 2020 at 15:18
  • You want an option without using a random library? Commented Sep 7, 2020 at 15:23
  • @Mike67 No, I wanted an option without the need to iterate over the keys. Commented Sep 7, 2020 at 15:24

2 Answers


There won't really be an alternative, since you're working with dictionaries and lists. You'll have to iterate over the dictionary values M times, as you mention, and take a random.choice of each inner list on each iteration. A simple way of doing so would be:

    import random

    n = 5  # number of samples to draw
    l = list(d.values())
    # for each of the n samples, pick one random element from every group
    [list(map(random.choice, l)) for _ in range(n)]
    # [[9, 2, 3, 6], [7, 2, 3, 5], [9, 4, 3, 5], [0, 2, 3, 5], [7, 4, 3, 5]]

Another way could be to sample n times from each inner list with replacement using random.choices, and then transpose the resulting nested list with zip. That way we reduce the above to a single iteration over the values:

    import random

    # draw n elements (with replacement) from each group, then transpose with zip
    list(zip(*(random.choices(i, k=n) for i in l)))
    # [(8, 4, 3, 5), (1, 2, 3, 6), (9, 4, 3, 6), (1, 2, 3, 5), (8, 4, 3, 5)]
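For reuse inside a larger algorithm, the choices-plus-zip idea can be wrapped in a small function. This is only a minimal sketch: the name sample_groups and its seed parameter are made up for illustration and are not part of the answer above. It assumes every group is a non-empty list and that Python 3.7+ is used, so samples follow the dictionary's insertion order.

```python
import random


def sample_groups(groups, n, seed=None):
    """Draw n samples, each holding one random element per group.

    `groups` maps a group key to a non-empty list of elements.
    Returns a list of n tuples, ordered like the dictionary's keys.
    (Hypothetical helper, written for illustration only.)
    """
    # Private generator, so seeding does not touch the global random state.
    rng = random.Random(seed)
    # One choices() call per group draws all n picks for that group at once.
    columns = [rng.choices(values, k=n) for values in groups.values()]
    # zip(*columns) transposes the per-group columns into per-sample rows.
    return list(zip(*columns))


d = {2: [0, 1, 7, 8, 9], 1: [2, 4], 4: [3], 3: [5, 6]}
samples = sample_groups(d, n=5, seed=0)
print(samples)
```

Passing a seed makes the draws reproducible, which may help when debugging the surrounding algorithm; leaving it as None gives different samples on each run.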

4 Comments

Nice one-liner. I guess there isn't a simpler way than that. Do you think it's worth using pandas maybe?
Probably overkill @DavidS I'd personally go with lists for this
Do you mean a list of lists? I built the dictionary from the output of scipy.cluster.hierarchy.cut_tree, to split my original list of elements into groups with the proper class. I calculate the distance matrix -> hierarchical clustering -> cut the tree into k groups, and then build the dictionary.
Yeh a list of lists, still from the list class @DavidS

This should work:

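    import random

    M = 10  # number of samples (example value; M is not defined in the original snippet)
    result = []
    for _ in range(M):
        # one random element from each group, in key order
        result.append([random.choice(d[k]) for k in d])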

3 Comments

you might as well just use the built-in random, no need for numpy
Yeah, I have amended it. I usually use numpy, that's why it came as the first option, thanks
@archer yeah this is a variation of the "naive" way I used, I was hoping for a simpler way
