Consider the problem of computing a maximum-likelihood estimate of the parameters of a finite Dirichlet distribution, given a set of observed probability vectors (multinomial parameter vectors) assumed to have been sampled from that Dirichlet. The following paper provides an iterative fixed-point algorithm that estimates the mean and precision of the Dirichlet separately:
Minka, Thomas. Estimating a Dirichlet distribution. (2000): 3.
The algorithm for estimating the mean $\mathbf{m}$, given fixed precision $s$, is summarized as follows:
- Using the current $\mathbf{m}$, compute updated concentration parameters $\pmb{\alpha}$ (an estimate of $s\mathbf{m}$) by inverting the digamma function.
- $\forall\, k$, set $m_k^{new} = \frac{\alpha_k}{\sum_j \alpha_j}$.
- Repeat until convergence.
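For concreteness, here is a minimal Python sketch of how I understand the steps above. Several things in it are my own assumptions rather than something stated in this post: the specific update $\Psi(\alpha_k) = \overline{\log p_k} - \sum_j m_j\,(\overline{\log p_j} - \Psi(s\,m_j))$ is written from my reading of the paper and should be checked against it, the function names (`estimate_mean`, `inv_digamma`, `mean_log`) are hypothetical, and the pure-Python `digamma`/`trigamma` approximations are stand-ins that could be swapped for `scipy.special.digamma` and `scipy.special.polygamma`.

```python
import math

def digamma(x):
    """Approximate psi(x) for x > 0 (recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x              # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series

def trigamma(x):
    """Approximate psi'(x) for x > 0 (recurrence, then asymptotic series)."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)        # psi'(x) = psi'(x + 1) + 1/x^2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))
    return result + inv + 0.5 * inv2 + series

def inv_digamma(y, tol=1e-12, max_iter=100):
    """Solve psi(x) = y for x > 0 with Newton's method."""
    # Starting point: exp(y) + 0.5 for moderate y, -1/(y - psi(1)) otherwise.
    if y >= -2.22:
        x = math.exp(y) + 0.5
    else:
        x = -1.0 / (y + 0.5772156649015329)
    for _ in range(max_iter):
        x_new = x - (digamma(x) - y) / trigamma(x)
        if x_new <= 0.0:               # guard: keep the iterate positive
            x_new = 0.5 * x
        if abs(x_new - x) < tol * max(1.0, x_new):
            return x_new
        x = x_new
    return x

def mean_log(observations):
    """Per-component average of log p over the observed probability vectors."""
    n = len(observations)
    dim = len(observations[0])
    return [sum(math.log(p[k]) for p in observations) / n for k in range(dim)]

def estimate_mean(log_p_bar, s, m_init, tol=1e-10, max_iter=1000):
    """Fixed-point iteration for the mean m with the precision s held fixed."""
    m = list(m_init)
    for _ in range(max_iter):
        # Correction term shared by all components (assumed form of the update).
        c = sum(mj * (lp - digamma(s * mj)) for mj, lp in zip(m, log_p_bar))
        # Step 1: invert the digamma function to get updated alpha_k.
        alpha = [inv_digamma(lp - c) for lp in log_p_bar]
        # Step 2: renormalise alpha to obtain the new mean.
        total = sum(alpha)
        m_new = [a / total for a in alpha]
        # Step 3: repeat until the mean stops changing.
        if max(abs(a - b) for a, b in zip(m_new, m)) < tol:
            return m_new
        m = m_new
    return m
```

With a list `obs` of observed probability vectors and a fixed `s`, this would be called as `estimate_mean(mean_log(obs), s, m_init)`, where `m_init` could be the plain arithmetic average of the vectors. Note that in this sketch the data enter only through the averaged logs of the observations, not through the averaged observations themselves.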
Why must we resort to an iterative algorithm to find the mean? Isn't the average of the observed data vectors already an accurate estimate of it? Moreover, isn't the expected value of the sample mean of a set of draws from a Dirichlet equal to the mean of the Dirichlet itself?
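To spell out that last point (the notation $\mathbf{p}_1,\dots,\mathbf{p}_N$ for $N$ i.i.d. draws from a Dirichlet with parameters $\pmb{\alpha}=s\mathbf{m}$ is mine), linearity of expectation gives

$$
\mathbb{E}\!\left[\frac{1}{N}\sum_{i=1}^{N}\mathbf{p}_i\right]
= \frac{1}{N}\sum_{i=1}^{N}\mathbb{E}[\mathbf{p}_i]
= \frac{\pmb{\alpha}}{\sum_j \alpha_j}
= \mathbf{m},
$$

so the sample average is at least an unbiased estimator of $\mathbf{m}$, which is what makes the need for an iterative procedure puzzling to me.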
Any insight here would be appreciated!