
The following question concerns the binomial distribution with known success probability $p$ but unknown number of trials $n$:

Binomial confidence interval over the number of trials

Trying to think of how a Bayesian interval would be constructed for such a case, my first step was to think about the Jeffreys prior. However, for a discrete parameter space this is not defined, because the required derivative does not exist.

Are there approaches to finding a prior based on the same ideas? Of course, invariance under reparameterization no longer applies, since probability mass functions do not transform the way probability density functions do. Is that invariance the only property/motivation behind the Jeffreys prior, or are there other properties that carry over to probability mass functions as well?

  • Possibly we could try some continuous analogue of the binomial distribution? That might lead to several different options, though some of them are presumably simpler or more desirable than others. Commented Dec 13, 2024 at 16:16
  • The binomial distribution has a relationship between variance and expectation of the form $Var[X] = (1-p) E[X]$. Are there known distributions that match that? Alternatively, we could model it as a normal distribution $X \sim N(np,\, np(1-p))$, in which case a Jeffreys prior should exist and could be computed. Commented Dec 13, 2024 at 16:22
  • Relevant: stats.stackexchange.com/questions/500781/…, stats.stackexchange.com/questions/275600/…, stats.stackexchange.com/questions/113851/…, stats.stackexchange.com/questions/588863/…, stats.stackexchange.com/questions/502124/… and search for more … Commented Dec 13, 2024 at 16:33
  • The number of trials needed to reach a known count of successes (the observed data) would be negative binomial. Commented Dec 14, 2024 at 2:08
  • The literature on Bayesian feature selection might contain some recipes, since it is concerned with inherently discrete parameters, such as the number of features in a model. Commented Dec 15, 2024 at 9:08

1 Answer

One approach is to approximate the binomial distribution with a normal distribution,

$$X \sim \mathcal{N}(\mu=\theta,\ \sigma^2=q\theta),$$ where $q = 1-p$ is fixed and $\theta = np$ is the unknown parameter for which we seek a Jeffreys prior.
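As a quick numerical illustration (my own addition; the values of $n$ and $p$ are arbitrary), one can check how closely this normal density tracks the binomial pmf at the integers:

```python
import math
import numpy as np

# Arbitrary example values (assumptions, not from the post)
n, p = 50, 0.3
theta, q = n * p, 1 - p  # theta = np, q = 1 - p as in the answer

k = np.arange(n + 1)
# Exact binomial pmf via math.comb
binom_pmf = np.array([math.comb(n, i) * p**i * (1 - p) ** (n - i)
                      for i in range(n + 1)])
# Normal density N(theta, q*theta) evaluated at the integers
normal_pdf = (np.exp(-((k - theta) ** 2) / (2 * q * theta))
              / np.sqrt(2 * np.pi * q * theta))

print(np.max(np.abs(binom_pmf - normal_pdf)))  # small for moderate n
```

For moderate $npq$ the pointwise discrepancy is already quite small, which is what makes the approximation attractive here.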

The Jeffreys prior is proportional to the square root of the Fisher information,

$$\mathcal{I}(\theta) = E\left[\left(\frac{\partial}{\partial\theta} \log f(x;\theta)\right)^2\right] $$

and

$$\begin{array}{rcl} \frac{\partial}{\partial\theta} \log f(x;\theta) &=& \frac{\partial}{\partial\theta} \left(-\tfrac{1}{2}\log(2\pi q\theta)-\frac{(x-\theta)^2}{2q\theta} \right) \\ &=& - \frac{1}{2\theta} + \frac{x^2/\theta^2-1}{2q} \end{array}$$

(note the factor $\tfrac12$ on the log term), and using

$$E[x^2] = \theta^2+q \theta, \qquad E[x^4] = \theta^4 + 6 q\theta^3 + 3q^2 \theta^2,$$

we get

$$\begin{array}{rcl} E\left[\left(- \frac{1}{2\theta} + \frac{x^2/\theta^2-1}{2q} \right)^2\right]& = &E\left[ \frac{1}{4\theta^2} - \frac{x^2/\theta^2-1}{2q\theta} + \frac{\left(x^2/\theta^2-1\right)^2}{4q^2} \right]\\ &=& \frac{1}{4\theta^2} - \frac{1}{2\theta^2} + \left(\frac{1}{q\theta} + \frac{3}{4\theta^2}\right)\\ &=& \frac{1}{q \theta }+\frac{1}{2\theta^2}. \end{array} $$

Then a prior could be

$$p(\theta) \propto \sqrt{\frac{1}{q \theta }+\frac{1}{2\theta^2} }. $$

For $q=1$ (the normal approximation to a Poisson distribution) this behaves like the Poisson Jeffreys prior $\theta^{-1/2}$ for large $\theta$, but it carries an extra $\frac{1}{2\theta^2}$ term because the variance of the approximating normal also depends on $\theta$.
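The derivation can be sanity-checked by Monte Carlo (my own addition; the values of $\theta$ and $q$ are arbitrary). For $X\sim\mathcal N(\theta, q\theta)$ the score is $-\frac{1}{2\theta}+\frac{x^2/\theta^2-1}{2q}$, where the $\tfrac12$ comes from the $-\tfrac12\log(2\pi q\theta)$ term, and the average of the squared score should recover $\frac{1}{q\theta}+\frac{1}{2\theta^2}$:

```python
import numpy as np

# Monte Carlo check of the Fisher information for X ~ N(theta, q*theta).
# theta and q are arbitrary test values (assumptions, not from the post).
rng = np.random.default_rng(0)
theta, q = 5.0, 0.7
x = rng.normal(loc=theta, scale=np.sqrt(q * theta), size=2_000_000)

# Score d/dtheta log f(x; theta); the 1/(2*theta) term comes from
# differentiating -0.5 * log(2*pi*q*theta).
score = -1.0 / (2 * theta) + (x**2 / theta**2 - 1) / (2 * q)

fisher_mc = np.mean(score**2)                        # Monte Carlo estimate
fisher_exact = 1 / (q * theta) + 1 / (2 * theta**2)  # closed form above
print(fisher_mc, fisher_exact)  # the two should agree to ~3 decimals
```

The same result also follows directly from the standard formula $\mathcal I(\theta) = (\mu'(\theta))^2/\sigma^2(\theta) + (\sigma^{2\prime}(\theta))^2/(2\sigma^4(\theta))$ for a normal family: here $\mu'=1$ and $\sigma^{2\prime}=q$, giving $\frac{1}{q\theta}+\frac{1}{2\theta^2}$.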

  • I thought these calculations would give some insight, but I am not sure what the result means. For example, with $q=1$ I would have expected something similar to the Jeffreys prior for the Poisson distribution, but that is not what comes out. Commented Dec 15, 2024 at 10:35
