
I am trying to generate datasets that follow a truncated negative binomial distribution, i.e. samples in which no value exceeds a given maximum.

    def truncated_Nbinom(n, p, max_value, size):
        import scipy.stats as sct
        temp_size = size
        while True:
            temp_size *= 2
            temp = sct.nbinom.rvs(n, p, size=temp_size)
            truncated = temp[temp <= max_value]
            if len(truncated) >= size:
                return truncated[:size]

I get results when max_value and n are small. However, when I try:

    input_1 = truncated_Nbinom(99, 0.3, 99, 5000).tolist()

The kernel keeps dying. I tried changing the Python port and raising the recursion limit, but neither helped. Do you have any ideas for making my code faster?

  • What do you mean by "dying"? Commented May 19, 2021 at 10:41
  • I am using jupyter notebook, while working on the code after some long time the console says "the kernel has died and it will be restarted" before returning the code. Commented May 19, 2021 at 10:56
  • I suspect you have a potential infinite loop, and doubling temp_size on every pass will eat memory. Commented May 19, 2021 at 11:09

1 Answer


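Here is one approach. Compute the probability of each value x under the negative binomial, keep only the values from 0 to max_value (inclusive), and normalize those probabilities so they sum to one. Then call np.random.choice with the normalized probabilities. This avoids rejection sampling entirely, so the run time does not depend on how rarely the untruncated distribution falls below max_value.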

    import numpy as np
    import pandas as pd
    from scipy import stats

    def truncated_Nbinom2(n, p, max_value, size):
        # All values the truncated distribution can take: 0..max_value
        support = np.arange(max_value + 1)
        probs = stats.nbinom.pmf(support, n, p)
        # Renormalize so the truncated probabilities sum to one
        probs /= probs.sum()
        return np.random.choice(support, size=size, p=probs)
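To see why the rejection loop in the question stalls, it may help to look at the fraction of raw draws that survive the `temp <= max_value` filter, which is the negative binomial CDF at max_value. The sketch below is only an illustration; `acceptance_rate` is a hypothetical helper name, not part of the question or of SciPy.

```python
from scipy import stats

def acceptance_rate(n, p, max_value):
    """Probability that a single untruncated draw is <= max_value.

    Hypothetical helper for illustration: this is the fraction of raw
    samples the rejection loop in the question would keep.
    """
    return stats.nbinom.cdf(max_value, n, p)

small = acceptance_rate(9, 0.3, 9)     # parameters that work in the question
large = acceptance_rate(99, 0.3, 99)   # parameters that kill the kernel

print(f"n=9,  max_value=9:  acceptance rate {small:.3g}")
print(f"n=99, max_value=99: acceptance rate {large:.3g}")

# Rough number of raw draws needed to collect 5000 accepted values
print(f"raw draws needed for 5000 samples: {5000 / large:.3g}")
```

For n=99 and p=0.3 the untruncated mean is n(1-p)/p = 231, far above 99, so almost every draw is rejected and temp_size keeps doubling until memory runs out.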

Here is an illustration:

    arr1 = truncated_Nbinom(9, 0.3, 9, 50000)
    arr2 = truncated_Nbinom2(9, 0.3, 9, 50000)
    df_counts = pd.DataFrame({
        "version_1": pd.Series(arr1).value_counts(),
        "version_2": pd.Series(arr2).value_counts(),
    })
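As a further check, here is a minimal sketch that applies the same normalize-and-choose idea to the parameters from the question (n=99, p=0.3, max_value=99, size=5000) and compares the empirical frequencies with the truncated probabilities. The helper name `truncated_nbinom_probs`, the seed, and the use of `np.random.default_rng` are assumptions made for this example.

```python
import numpy as np
from scipy import stats

def truncated_nbinom_probs(n, p, max_value):
    """Return the support 0..max_value and the renormalized pmf on it.

    Hypothetical helper for illustration only.
    """
    support = np.arange(max_value + 1)
    probs = stats.nbinom.pmf(support, n, p)
    return support, probs / probs.sum()

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
support, probs = truncated_nbinom_probs(99, 0.3, 99)
sample = rng.choice(support, size=5000, p=probs)

# Empirical frequency of each value 0..99 in the sample
freqs = np.bincount(sample, minlength=len(support)) / len(sample)
max_gap = np.abs(freqs - probs).max()

print("largest sampled value:", sample.max())
print("largest gap between empirical and target probability:", max_gap)
```

Because the probabilities are computed once over only max_value + 1 values, memory use stays small no matter how unlikely the truncated region is under the original distribution.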

