binomial distribution using numpy

Question

I do not understand this code, is it correct?

    # simulate 1 million tests of five fair coin flips
    tests = np.random.binomial(5, 0.5, int(1e6))
    # proportion of tests that produced 1 head
    (tests == 1).mean()
    print(tests)

Why is the first argument of binomial 5 if there are only 2 possible outcomes (heads=1, tails=0)? Next, the size param should IMO be a 2D array to take into account the 5 flips per test. I see in the numpy docs that a multi-dimensional size is available, but I did not find any example using size=array. Thanks for any further insight.

Stokolos Ilya · Accepted Answer · 2020-06-10 14:08:57Z

The terminology there is definitely a bit confusing, so let's consider an example:

Say you have a coin and you want to flip it 6 times. The probability of getting heads on each flip (i.e. a successful attempt) is 0.5. In total you will run this experiment 10 times (i.e. each experiment involves flipping the coin 6 times). Then the whole thing can be expressed as

    numpy.random.binomial(6, 0.5, 10)
    array([2, 2, 5, 3, 3, 2, 3, 4, 4, 3])

Here the first argument of the function corresponds to "you flipped the coin 6 times", 0.5 corresponds to "the probability of heads (success) is 0.5", and 10 to "you are doing the experiment 10 times (each experiment involves flipping the coin 6 times)".

Now, each entry in the output array represents the result of one "experiment". For example, the first value in the output array can be understood as: "You flipped a coin 6 times, where the probability of heads is 0.5, and as a result you got 2 heads".
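To make the "each entry is a count of heads" reading concrete, here is a minimal sketch that draws the counts directly and also rebuilds them from individual flips. It assumes the newer `np.random.default_rng` generator rather than the legacy `numpy.random.binomial` call used above, and the seed and variable names are only illustrative.

```python
import numpy as np

# Assumption: seeded Generator API, used here only for reproducibility.
rng = np.random.default_rng(0)

# 10 experiments, each one the number of heads in 6 fair flips.
heads_direct = rng.binomial(6, 0.5, 10)

# The same kind of result built by hand: simulate every flip (0 = tails,
# 1 = heads) in a 10 x 6 table, then count the heads in each row.
flips = rng.integers(0, 2, size=(10, 6))
heads_from_flips = flips.sum(axis=1)

print(heads_direct)      # 10 integers, each between 0 and 6
print(heads_from_flips)  # 10 integers, each between 0 and 6
```

The two arrays will not hold the same numbers, since they come from different random draws, but both follow the same distribution.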

Conor · Accepted Answer · 2020-06-10 14:12:43Z

So you're right that the trial-by-trial outcome is either heads or tails. But this outcome (heads = 1, tails = 0) only coincides with the output of the function np.random.binomial if the first parameter n is 1.

This is because the true output of the function is the total number (read: integer) of successful trials (where heads = 1). Note the distinction between this and whether a given trial is 1 or 0. The case of the output being an indicator of whether a given trial is 1 or 0 only occurs if the first parameter n is 1, because then the 'total number of successful trials' is the same thing as: 'is the current trial a success?'. This is because the total number of trials == 1 in such a case.

So in the case of n = 5 (the first parameter = 5) and size = 1e6, the output will be 1 million integers, each one telling you the number of successes within a single 5-trial binomial experiment. So you'll have the number of successes (x out of 5) for each of 1 million 5-trial binomial experiments. Does that make sense?
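As a rough check of this reading, the sketch below compares the n = 1 case (plain 0/1 outcomes) with the n = 5 case from the question, and compares the simulated proportion of "exactly one head" with the binomial formula. The seeded generator and the variable names are assumptions made for the example.

```python
import numpy as np
from math import comb

# Assumption: seeded Generator API, used here only for reproducibility.
rng = np.random.default_rng(0)

# n = 1: every entry is a single flip, so the values are only 0 or 1.
single_flips = rng.binomial(1, 0.5, 10)

# n = 5: every entry is the number of heads in one 5-flip test (0 to 5).
tests = rng.binomial(5, 0.5, int(1e6))

# Proportion of tests with exactly one head, simulated vs. theoretical.
observed = (tests == 1).mean()
expected = comb(5, 1) * 0.5**1 * 0.5**4  # C(5,1) / 2^5 = 5/32

print(single_flips)
print(observed, expected)
```

With a million tests the simulated proportion should land very close to 5/32 = 0.15625.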

Regarding the size parameter: it is not an array, but an integer (for a 1D output array) or a tuple of integers. This integer (or tuple of integers) specifies the dimensions, or shape, of the output array.
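A short sketch of how size shapes the output, again assuming a seeded `np.random.default_rng` generator and illustrative variable names. The last part shows one way to get the 2D "flips per test" layout from the question: set n = 1 and put the number of flips into size.

```python
import numpy as np

# Assumption: seeded Generator API, used here only for reproducibility.
rng = np.random.default_rng(0)

# An integer size gives a 1D array of head counts.
counts_1d = rng.binomial(5, 0.5, size=4)
print(counts_1d.shape)  # (4,)

# A tuple size gives an array of that shape; every entry is still the
# number of heads in its own 5-flip test.
counts_2d = rng.binomial(5, 0.5, size=(3, 4))
print(counts_2d.shape)  # (3, 4)

# To see the individual flips, use n = 1 with size=(tests, flips),
# then sum along each row to recover the heads per test.
flips = rng.binomial(1, 0.5, size=(4, 5))
heads_per_test = flips.sum(axis=1)
print(flips.shape, heads_per_test.shape)  # (4, 5) (4,)
```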

Thank you so much. The number of successes is the key to understanding the real intent. This is also what I missed before, i.e. with n=5 I was under the impression that other meaningless figures were showing up, like 2, 3, etc.
