
In the Monobit Test for randomness, the threshold for passing or failing is based on a confidence interval. I understand that a 1% significance level (99% confidence) results in a larger threshold compared to a 5% significance level (95% confidence).

However, this seems counterintuitive—since higher confidence should mean less deviation from the expected 50:50 ratio, why does it allow a larger difference between the number of ones and zeros? Shouldn't a stricter test (lower alpha) demand a smaller deviation instead?

Could someone clarify how the threshold is mathematically set and why this behavior occurs?

What I Tried: I reviewed the Monobit Test formula, which sets the threshold based on the Z-score corresponding to the chosen confidence level. I also looked into how the critical values for 95% and 99% confidence are derived from the normal distribution.

What I Expected: I expected that a higher confidence level (99%) would result in a stricter test, meaning a smaller allowed difference between the number of ones and zeros.

What Actually Happened: Instead, I found that the higher confidence level (99%) allows a larger difference between ones and zeros compared to 95%. This seems counterintuitive because I assumed that a more stringent test should tolerate less deviation. I’d like clarification on why this happens mathematically.
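
For reference, here is a minimal sketch of how I understand the threshold to be computed (Python with scipy, purely illustrative; the function name and numbers are my own, not from any particular standard):

    import math
    from scipy.stats import norm

    def monobit_threshold(n, alpha):
        """Largest allowed |#ones - #zeros| in n bits at two-sided
        significance level alpha (normal approximation)."""
        z_crit = norm.ppf(1 - alpha / 2)   # critical Z-score
        # Under the 50:50 hypothesis, #ones - #zeros has standard deviation sqrt(n)
        return z_crit * math.sqrt(n)

    print(monobit_threshold(20000, 0.05))  # ~277 bits (95% confidence)
    print(monobit_threshold(20000, 0.01))  # ~364 bits (99% confidence)

So the 99% threshold (~364) is larger than the 95% one (~277), which is the behaviour I find counterintuitive.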


1 Answer


This looks like a possible misunderstanding of "higher confidence". Here it seems

  • the null hypothesis is that each bit of a random number is equally likely to be $0$ or $1$
  • the alternative hypothesis is that either $0$ or $1$ bits are more likely to be generated than the other

So a $1\%$ significance level critical value ($99\%$ confidence) is one which, if the null hypothesis is correct, will only be reached or exceeded $1\%$ of the time and will not be reached $99\%$ of the time.

Similarly, a $5\%$ significance level critical value ($95\%$ confidence) is one which, if the null hypothesis is correct, will only be reached or exceeded $5\%$ of the time and will not be reached $95\%$ of the time.

This means a $1\%$ significance level critical value is more extreme than a $5\%$ significance level critical value, since it is less likely to be reached or exceeded when the null hypothesis is correct. In that sense it is a less stringent test of the null hypothesis: it rejects the null less often. This is precisely the behaviour you observed and found surprising.
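
As a quick numerical check (a minimal sketch in Python using scipy, purely for illustration and not part of any particular standard), the two-sided critical Z-values behave exactly this way:

    from scipy.stats import norm

    # Two-sided critical Z-values under the null hypothesis
    z_95 = norm.ppf(1 - 0.05 / 2)   # about 1.960
    z_99 = norm.ppf(1 - 0.01 / 2)   # about 2.576

    # Probability of reaching or exceeding each value when H0 is true
    print(2 * (1 - norm.cdf(z_95)))  # 0.05
    print(2 * (1 - norm.cdf(z_99)))  # 0.01

The $1\%$ critical value ($\approx 2.576$) sits further out in the tails than the $5\%$ value ($\approx 1.960$), so it is harder to reach by chance.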

In other words, if you want to be $99\%$ confident that the test statistic will fail to reach the critical value when the null hypothesis is correct, you need a more extreme (looser) critical value than if you only want to be $95\%$ confident of the same thing. Put that way, the behaviour should seem reasonable.

You get the same issue with "confidence intervals" for a parameter: a $99\%$ confidence interval results from a method where the probability of the generated interval covering the true value of the parameter is $99\%$. To get this high confidence, you need a looser or wider interval than if you only wanted a $95\%$ confidence interval.
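
The same effect is easy to see numerically; here is a small sketch for a normal-mean confidence interval (the sample mean, standard deviation and sample size are made-up illustrative values):

    import math
    from scipy.stats import norm

    xbar, sigma, n = 10.0, 2.0, 100           # illustrative values
    se = sigma / math.sqrt(n)                 # standard error of the mean

    for conf in (0.95, 0.99):
        z = norm.ppf(1 - (1 - conf) / 2)      # two-sided critical value
        print(conf, (xbar - z * se, xbar + z * se))
    # 0.95 -> (9.608, 10.392)   narrower interval
    # 0.99 -> (9.485, 10.515)   wider interval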

The confusion may arise because, if the test statistic reaches or exceeds the more extreme $1\%$ significance level critical value ($99\%$ confidence), you might want to say that you are more confident that the null hypothesis should be rejected and the alternative hypothesis accepted than if it only reaches or exceeds the $5\%$ significance level critical value ($95\%$ confidence); essentially this might be seen as a more stringent test of the alternative hypothesis. There is a sense in which that is reasonable, but you should not attach numbers such as $99\%$ or $95\%$ to that kind of confidence, because those numbers only describe behaviour under the null hypothesis. If you want to quantify how well the test detects a particular departure from randomness, you should instead do a power calculation, and do it before you look at the test statistic.
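
To illustrate what such a power calculation might look like for the Monobit Test (a hedged sketch using the normal approximation; the bias of $0.51$ and the sequence length are made-up values for the example):

    import math
    from scipy.stats import norm

    def monobit_power(n, p1, alpha):
        """Approximate probability that the two-sided monobit Z-test
        rejects H0: P(bit = 1) = 0.5 when the true P(bit = 1) is p1."""
        z_crit = norm.ppf(1 - alpha / 2)
        mu = (p1 - 0.5) * math.sqrt(n) / 0.5    # mean of Z under the alternative
        sd = math.sqrt(p1 * (1 - p1)) / 0.5     # sd of Z under the alternative
        return norm.cdf((-z_crit - mu) / sd) + 1 - norm.cdf((z_crit - mu) / sd)

    # A generator biased to emit 1s 51% of the time, over 20000 bits:
    print(monobit_power(20000, 0.51, 0.05))   # roughly 0.81
    print(monobit_power(20000, 0.51, 0.01))   # roughly 0.60

Note the trade-off: the $1\%$ level rejects a truly random generator less often, but it also has less power to detect this particular bias.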

Related: Is a 99% confidence level more precise than 95%?
