
The rule of three states that, if we observe $Y\sim \text{Bin}(n,p)$ to be 0, then $[0,3/n]$ is a 95% confidence interval for $p$. I'm confused about the derivation for this rule on Wikipedia and elsewhere.

Wikipedia equates finding a 95% confidence interval to finding all $p$ such that $P_p(Y=0)\geq 0.05$. I'm struggling to reconcile this with my own understanding that a 95% confidence interval is a random region $C(Y)$ such that $P_p(C(Y)\text{ covers }p)=0.95$ for all $p$.

Edit: I realized that my question was vague (and I've deleted a mistaken guess about Wikipedia's underlying logic). My main question is: how is the Wikipedia argument justified? My other, related question is: How do you verify the coverage probability for the interval, given that it's only defined for one possible value of $Y$?


2 Answers


Hanley and Lippman-Hand (1983) give something like the following argument, which provides motivation for the rule. Taking $n$ as fixed, $P(X=0 \mid p) = (1-p)^n$.

Solving $(1-p)^n \geq \alpha\,$ for $p\,$ we get $p\leq 1-\alpha^{\frac{1}{n}}$. The largest $p$ that keeps the probability of $0$ no less than $\alpha$ is $1-\alpha^{\frac{1}{n}}$.

Now $\alpha^{\frac{1}{n}}=e^{\frac{1}{n}\log{\alpha}} = 1+\frac{1}{n}\log{\alpha}+\frac12 (\frac{1}{n}\log{\alpha})^2 + ...$.

Truncating at first order, $1-\alpha^{\frac{1}{n}}\approx -\frac{1}{n}\log{\alpha}$, so the bound becomes $p\leq -\frac{1}{n}\log{\alpha}$. When $\alpha=0.05$, $-\log(0.05)/n\approx 3/n$.
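As a quick numerical check (my own sketch, not from the paper), the exact bound $1-\alpha^{1/n}$ and the first-order approximation $-\log(\alpha)/n \approx 3/n$ can be compared directly:

```python
# Compare the exact bound 1 - alpha**(1/n) with the rule-of-three
# approximation 3/n over a range of sample sizes.
import math

alpha = 0.05
for n in (10, 30, 100, 1000):
    exact = 1 - alpha ** (1 / n)     # largest p with (1-p)^n >= alpha
    approx = -math.log(alpha) / n    # first-order bound, roughly 3/n
    print(f"n={n:5d}  exact={exact:.5f}  -log(alpha)/n={approx:.5f}  3/n={3/n:.5f}")
```

The agreement is already close for moderate $n$ and improves as $n$ grows.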

Jovanovic & Levy (1997) improve on this, basing it more clearly in a CI argument by casting it as a Clopper-Pearson interval and obtaining the same $(1-p)^n=\alpha$ bound, and hence the same approximate upper bound on $p$:

if $X = x$ is the observed number of events in $n$ trials, the Clopper-Pearson (max-P) upper $(1-\alpha)100\%$ bound may be obtained as a solution to

$$\sum_{t=0}^x {n\choose t}p^t (1-p)^{n-t} =\alpha$$

Clearly, when $x = 0$ the expression reduces to $(1-p)^n = \alpha$.
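A short sketch (my own code, not from Jovanovic & Levy) that solves the Clopper-Pearson upper-bound equation by bisection for general $x$, and checks that $x=0$ recovers the closed form $1-\alpha^{1/n}$:

```python
# Solve sum_{t<=x} C(n,t) p^t (1-p)^(n-t) = alpha for p by bisection.
import math

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Bin(n, p)."""
    return sum(math.comb(n, t) * p**t * (1 - p)**(n - t) for t in range(x + 1))

def cp_upper(x, n, alpha=0.05, tol=1e-10):
    """Upper (1-alpha) Clopper-Pearson bound: the p solving P(X <= x) = alpha."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(x, n, mid) > alpha:   # CDF decreases in p, so move right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, alpha = 50, 0.05
print(cp_upper(0, n, alpha), 1 - alpha ** (1 / n))   # both ~ 0.0582
```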

They also discuss some other arguments.

Hanley, J. A., and Lippman-Hand, A. (1983),
"If nothing goes wrong, is everything all right? Interpreting zero numerators"
Journal of the American Medical Association, 249(13), 1743-1745.

Jovanovic, B. D. and Levy, P. S. (1997),
"A Look at the Rule of Three"
The American Statistician, 51(2), 137-139.

  • The approximation makes sense but am I right to be skeptical of the logic? For example, if you observe $Y\sim Pois(\lambda)$ to be 0, can you say that $[0,-\log\alpha]$ is a $1-\alpha$ CI for $\lambda$ because it's implied by $P_\lambda(Y=0)=\exp(-\lambda)\geq \alpha$? Commented Dec 19, 2017 at 23:24
  • As you'll see from my second reference, the Hanley & Lippman-Hand argument needs work to be a solid argument (it's not an argument about an interval), but Jovanovic & Levy give a confidence interval-based argument (I have now added more details). I believe you could probably follow a similar interval-based argument to Jovanovic and Levy's to obtain the bound you got for the Poisson; I have not attempted to do so, however. Commented Dec 20, 2017 at 0:03

Given $k=0$ successes in $n$ trials, the exact (Clopper–Pearson) confidence interval is $[p_1,p_2]$ such that $$P(K \ge k=0 \vert p=p_1)=\frac{\alpha}{2}$$ and $$P(K \le k=0 \vert p=p_2)=\frac{\alpha}{2},$$

where $K \sim \text{Binomial}(n,p)$. When $k=0$ the first condition holds trivially with $p_1=0$ (since $P(K\ge 0)=1$ for every $p$), so only the upper bound is informative, and for the rule of three one uses the one-sided level $\alpha$ for it. You can calculate $p_2$ using the inverse binomial function. However, it is also possible to solve by hand, since the cumulative binomial is very simple in this special case: $(1-p)^n = \alpha$. You can also use a Poisson approximation in the same way to get $e^{-n p}=\alpha$.
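To illustrate (my own sketch), both closed-form equations given above can be solved for the upper bound directly, the exact one from $(1-p)^n=\alpha$ and the Poisson approximation from $e^{-np}=\alpha$:

```python
# Upper bound on p given 0 successes in n trials:
# exact:   (1-p)^n = alpha      =>  p = 1 - alpha**(1/n)
# Poisson: exp(-n*p) = alpha    =>  p = -log(alpha)/n   (~ 3/n at alpha=0.05)
import math

alpha = 0.05
for n in (20, 50, 200):
    p_exact = 1 - alpha ** (1 / n)
    p_pois = -math.log(alpha) / n
    print(f"n={n:4d}  exact={p_exact:.5f}  poisson={p_pois:.5f}")
```

The Poisson bound is always slightly larger (more conservative), with the gap shrinking as $n$ grows.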

Here's a graph of the approximation as a function of sample size:

[figure: upper confidence bound versus sample size, exact and approximate]
