How to perform a 2 sided binomial test with the alternative being larger

Question

Given data for two groups:

Group A: $\left\{ (n_1, s_1), (n_2, s_2), \ldots, (n_k, s_k) \right\}$.
Group B: $\left\{ (n_1, s_1), (n_2, s_2), \ldots, (n_l, s_l) \right\}$.

Where $n_i$ is the number of trials and $s_i$ is the number of successes.

The assumption is that both groups are samples from binomial distributions with unknown $p$.
That is, each observation in the two samples comes from a binomial distribution with known $n$ but unknown $p$; $n$ varies from observation to observation, while $p$ is constant within each group.

I want to apply a binomial test of the null hypothesis $p_B = p_A$.
Yet on Wikipedia I found only a binomial test for one sample against a hypothesized value $p_0$.

Is there a two-sample binomial test?

utobi · Accepted Answer · 2022-12-28 12:24:37Z

Let $X_1,\ldots,X_k$ be the numbers of successes in group $A$, and $Y_1,\ldots,Y_l$ those in group $B$. Write $m_j$ for the number of trials of the $j$th observation in group $B$. By assumption

$$X_i \sim \text{Binomial}(n_i, \theta_A),\quad i=1,\ldots,k,$$ $$Y_j \sim \text{Binomial}(m_j, \theta_B), \quad j=1,\ldots,l,$$ where the $X_i$'s are mutually independent, the $Y_j$'s are mutually independent, and the $X_i$'s are independent of the $Y_j$'s.

Let $S = \sum_i X_i$, $T = \sum_j Y_j$, $n = \sum_i n_i$ and $m = \sum_j m_j$. Then by the closure properties of the binomial distribution

$$ S \sim \text{Binomial}(n, \theta_A), \quad T \sim \text{Binomial}(m, \theta_B). $$
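The pooling step can be sketched in a few lines of Python. This is only an illustration: the data are made up, and the names `group_a`, `group_b` and `pool` are hypothetical, not something from the question.

```python
# Hypothetical data: each pair is (number of trials, number of successes).
group_a = [(20, 12), (35, 21), (15, 8)]
group_b = [(25, 10), (30, 13), (18, 9)]


def pool(observations):
    """Return (total trials, total successes) for one group."""
    total_trials = sum(trials for trials, _ in observations)
    total_successes = sum(successes for _, successes in observations)
    return total_trials, total_successes


n, S = pool(group_a)  # pooled trials and successes for group A
m, T = pool(group_b)  # pooled trials and successes for group B
print(n, S, m, T)
```

Pooling is justified only because $\theta$ is assumed constant within each group; if it varied between observations, the sums would no longer be binomial.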

Thus the test for $H_0:\theta_A=\theta_B$ boils down to comparing two binomial proportions. This problem can be solved by a Wald test, a likelihood ratio test, a Rao score test or an exact $\alpha$-level test. I'll work out the details of the Wald test here.

Let $\hat\theta_A = S/n$ and $\hat\theta_B = T/m$ be the maximum likelihood estimators (MLEs) of $\theta_A$ and $\theta_B$ respectively. Then, by the large sample properties of the MLE, we have

$$ \hat\theta_A\,\dot\sim\, N\left(\theta_A, \frac{\hat\theta_A(1-\hat\theta_A)}{n}\right),\quad \hat\theta_B\,\dot\sim\, N\left(\theta_B, \frac{\hat\theta_B(1-\hat\theta_B)}{m}\right), $$ and $\hat\theta_A$ is independent from $\hat\theta_B$. By the closure properties of the Normal distribution we have

$$ \hat\theta_A-\hat\theta_B \,\dot\sim\, N\left(\theta_A-\theta_B,\frac{\hat\theta_A(1-\hat\theta_A)}{n}+\frac{\hat\theta_B(1-\hat\theta_B)}{m}\right). $$

Thus

$$ W = \frac{\hat\theta_A-\hat\theta_B-(\theta_A-\theta_B)}{\left(\frac{\hat\theta_A(1-\hat\theta_A)}{n}+\frac{\hat\theta_B(1-\hat\theta_B)}{m}\right)^{1/2}}\,\dot\sim\, N(0,1). $$

Here "$\dot\sim$" means "approximately distributed as, for large $n$ and $m$".

The Wald test is:

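Reject $H_0:\theta_A=\theta_B$ if $|W^{obs}|>z_{\alpha/2}$

where $W^{obs}$ is $W$ computed at the observed data with $\theta_A-\theta_B=0$, as specified by $H_0$, and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution, i.e. $P(Z>z_{\alpha/2})=\alpha/2$ for $Z\sim N(0,1)$.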
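As a rough sketch, the Wald test on the pooled counts can be coded with the Python standard library alone. The function name `wald_two_proportions`, its argument names and the counts in the usage line are my own hypothetical choices; the formula is the unpooled-variance statistic $W$ with $\theta_A-\theta_B=0$.

```python
from math import sqrt
from statistics import NormalDist


def wald_two_proportions(s, n, t, m, alpha=0.05):
    """Two-sided Wald test of H0: theta_A = theta_B.

    s, n: pooled successes and trials in group A.
    t, m: pooled successes and trials in group B.
    Returns (w_obs, z_crit, p_value, reject).
    """
    theta_a = s / n  # MLE of theta_A
    theta_b = t / m  # MLE of theta_B
    # Unpooled standard error of the difference; it is zero (and the
    # division below fails) if both estimates are exactly 0 or 1.
    se = sqrt(theta_a * (1 - theta_a) / n + theta_b * (1 - theta_b) / m)
    w_obs = (theta_a - theta_b) / se
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # upper alpha/2 quantile
    p_value = 2 * (1 - std_normal.cdf(abs(w_obs)))
    return w_obs, z_crit, p_value, abs(w_obs) > z_crit


# Hypothetical pooled counts: 41/70 successes in A, 32/73 in B.
w_obs, z_crit, p_value, reject = wald_two_proportions(41, 70, 32, 73)
print(w_obs, z_crit, p_value, reject)
```

Comparing $|W^{obs}|$ with $z_{\alpha/2}$ and comparing the two-sided p-value with $\alpha$ give the same decision.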

An approximate test of this kind can be computed in R with the prop.test command; see also chisq.test, which performs a chi-squared test on the corresponding $2\times 2$ table of successes and failures. For an exact test, you can either use Fisher's exact test (fisher.test) or have a look at the exact2x2 package. I presume that in your case the sample size is large enough that approximate tests, such as the Wald test, will be fine.
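For small samples, Fisher's exact test can also be sketched directly from the hypergeometric distribution. The function name `fisher_exact_two_sided` is hypothetical, and the two-sided rule used here (sum the probabilities of all tables no more probable than the observed one) is, as far as I know, the same convention that fisher.test follows; treat it as an illustration rather than a replacement for a tested library routine.

```python
from math import comb


def fisher_exact_two_sided(s, n, t, m):
    """Two-sided Fisher exact p-value for the table [[s, n - s], [t, m - t]].

    s, n: successes and trials in group A; t, m: the same for group B.
    """
    total_successes = s + t
    denom = comb(n + m, total_successes)

    def prob(x):
        # Hypergeometric probability that group A holds x of the successes,
        # conditional on the table margins.
        return comb(n, x) * comb(m, total_successes - x) / denom

    lo = max(0, total_successes - m)  # smallest feasible count in group A
    hi = min(n, total_successes)      # largest feasible count in group A
    p_obs = prob(s)
    # Sum over tables no more probable than the observed one; the small
    # relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= p_obs * (1 + 1e-7)
    )
    return min(1.0, p_value)


# Hypothetical small table: 3/4 successes in A versus 1/4 in B.
print(fisher_exact_two_sided(3, 4, 1, 4))
```

Because the test conditions on the total number of successes, it does not rely on the normal approximation behind the Wald test.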

Thank you @User1865345 for the advice! I added some details for the sake of completeness.
– utobi, Commented Dec 28, 2022 at 12:25
This is great! What would you do for small $m$ and $n$? Say about 10-40?
– Mark, Commented Dec 28, 2022 at 13:58
@Mark, the usual recommendations are: (i) Fisher's exact test, (ii) a chi-squared test with the p-value computed by simulation, (iii) permutation or bootstrap tests. I'd add (iv) the exact tests implemented in exact2x2; (i) and (ii) may be the easiest to perform.
– utobi, Commented Dec 28, 2022 at 14:11
Yes, you can find all the details on the help page of chisq.test.
– utobi, Commented Dec 28, 2022 at 14:20
