I stumbled upon this old question and I would like to share my solution. As mentioned in other answers, there is no analytical solution, but the function to be minimized behaves nicely and the optimal value of $\alpha$ can be found easily with a few Newton iterations. There is also a formula to check the optimality of the result.
The impulse response of the length $N$ FIR moving average filter is given by
$$h_{FIR}[n]=\frac{1}{N}(u[n]-u[n-N])\tag{1}$$
where $u[n]$ is the unit step function. The first order IIR filter
$$y[n]=\alpha x[n]+(1-\alpha)y[n-1]\tag{2}$$
has the impulse response
$$h_{IIR}[n]=\alpha(1-\alpha)^nu[n]\tag{3}$$
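As a quick sanity check, the closed form $(3)$ can be compared with the recursion $(2)$ driven by a unit impulse. This is only a minimal sketch: the value `alpha = 0.2` and the ten-sample window are arbitrary choices for illustration.

```octave
% Sketch: compare the closed form (3) with the recursion (2).
alpha = 0.2;                  % arbitrary example value (assumption)
n = 0:9;                      % first ten samples
x = [1, zeros(1, 9)];         % unit impulse

% Recursion (2): y[n] = alpha*x[n] + (1-alpha)*y[n-1]
h_rec = filter(alpha, [1, -(1 - alpha)], x);

% Closed form (3): h[n] = alpha*(1-alpha)^n for n >= 0
h_cf = alpha * (1 - alpha).^n;

max_dev = max(abs(h_rec - h_cf))   % expected to be at rounding level
```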
The goal is now to minimize the squared error
$$\epsilon=\sum_{n=0}^{\infty}\left(h_{FIR}[n]-h_{IIR}[n]\right)^2\tag{4}$$
Using $(1)$ and $(3)$, the error can be written as
$$\begin{align}\epsilon(\alpha)&=\sum_{n=0}^{N-1}\left(\alpha(1-\alpha)^n-\frac{1}{N}\right)^2+\sum_{n=N}^{\infty}\alpha^2(1-\alpha)^{2n}\\&=\alpha^2\sum_{n=0}^{\infty}(1-\alpha)^{2n}-\frac{2\alpha}{N}\sum_{n=0}^{N-1}(1-\alpha)^n+\sum_{n=0}^{N-1}\frac{1}{N^2}\\&=\frac{\alpha^2}{1-(1-\alpha)^2}-\frac{2\alpha}{N}\frac{1-(1-\alpha)^N}{1-(1-\alpha)}+\frac{1}{N}\\&=\frac{\alpha}{2-\alpha}-\frac{2}{N}\left(1-(1-\alpha)^N\right)+\frac{1}{N},\qquad 0<\alpha<2\tag{5}\end{align}$$
This expression is very similar to the one given in this answer, but it's not identical. The restriction on $\alpha$ in $(5)$ makes sure that the infinite sum converges, and it is identical to the stability condition for the IIR filter given by $(2)$.
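The closed form $(5)$ can be cross-checked against a direct, truncated evaluation of the sum $(4)$. The following is a sketch under assumed example values (`N = 10`, `alpha = 0.3`, truncation after `M = 2000` terms); the names `err_sum` and `err_cf` are just illustrative.

```octave
% Sketch: compare the closed-form error (5) with a truncated sum (4).
N = 10;                       % example FIR length (assumption)
alpha = 0.3;                  % example IIR coefficient (assumption)
M = 2000;                     % truncation length; (1-alpha)^(2*M) is negligible
n = 0:M-1;

h_fir = double(n < N) / N;            % moving average, eq. (1)
h_iir = alpha * (1 - alpha).^n;       % first-order IIR, eq. (3)

err_sum = sum((h_fir - h_iir).^2);    % eq. (4), truncated
err_cf  = alpha/(2 - alpha) - 2/N*(1 - (1 - alpha)^N) + 1/N;   % eq. (5)

dev = abs(err_sum - err_cf)           % expected to be at rounding level
```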
Setting the derivative of $(5)$ to zero results in
$$(1-\alpha)^{N-1}(2-\alpha)^2=1\tag{6}$$
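For reference, differentiating $(5)$ term by term should give

$$\frac{d\epsilon}{d\alpha}=\frac{2}{(2-\alpha)^2}-2(1-\alpha)^{N-1}$$

Equating this to zero and multiplying by $(2-\alpha)^2/2$ yields $(6)$.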
Note that the optimal $\alpha$ must be in the interval $(0,1]$: for $1<\alpha<2$ the factor $(1-\alpha)^n$ in $(3)$ alternates in sign, and such an impulse response cannot approximate the constant impulse response of the FIR moving average filter.
Taking the square root of $(6)$ and introducing $\beta=1-\alpha$, we obtain
$$\beta^{(N+1)/2}+\beta^{(N-1)/2}-1=0\tag{7}$$
This equation cannot be solved analytically for $\beta$, but it can be solved for $N$. Writing $(7)$ as $\beta^{(N-1)/2}(1+\beta)=1$ and taking logarithms gives
$$N=-2\frac{\log(1+\beta)}{\log(\beta)}+1,\qquad 0<\beta<1\tag{8}$$
Equation $(8)$ can be used to double-check a numerical solution of $(7)$; it must return the specified value of $N$.
Equation $(7)$ can be solved with a few lines of (Matlab/Octave) code:
    N = 50;     % desired filter length of FIR moving average filter

    if ( N == 1 )    % no iteration for trivial case
        b = 0;
    else
        % Newton iteration
        b = 1;       % starting value
        Nit = 7;
        n = (N+1)/2;
        for k = 1:Nit,
            f = b^n + b^(n-1) - 1;
            fp = n*b^(n-1) + (n-1)*b^(n-2);
            b = b - f/fp;
        end

        % check result
        N0 = -2*log(1+b)/log(b) + 1     % must equal N
    end

    a = 1 - b;
Below is a table with the optimal values of $\alpha$ for a range of filter lengths $N$:
       N    alpha
       1    1.0000e+00
       2    5.3443e-01
       3    3.8197e-01
       4    2.9839e-01
       5    2.4512e-01
       6    2.0809e-01
       7    1.8083e-01
       8    1.5990e-01
       9    1.4333e-01
      10    1.2987e-01
      20    6.7023e-02
      30    4.5175e-02
      40    3.4071e-02
      50    2.7349e-02
      60    2.2842e-02
      70    1.9611e-02
      80    1.7180e-02
      90    1.5286e-02
     100    1.3768e-02
     200    6.9076e-03
     300    4.6103e-03
     400    3.4597e-03
     500    2.7688e-03
     600    2.3078e-03
     700    1.9785e-03
     800    1.7314e-03
     900    1.5391e-03
    1000    1.3853e-03
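A few of the table entries can be reproduced by running the same Newton iteration over a vector of filter lengths. This is a sketch rather than the exact script used for the table: the selection `Nvec`, the variable names, and the output format are my own choices, and the case $N=1$ is left out because it needs no iteration.

```octave
% Sketch: repeat the Newton iteration for several filter lengths N > 1.
Nvec = [2 3 10 100 1000];     % example lengths (assumption)
Nit  = 7;                     % same iteration count as in the code above
a      = zeros(size(Nvec));   % resulting alpha values
Ncheck = zeros(size(Nvec));   % filter lengths recovered from the solution

for i = 1:numel(Nvec)
    n = (Nvec(i) + 1)/2;
    b = 1;                    % starting value
    for k = 1:Nit
        f  = b^n + b^(n-1) - 1;
        fp = n*b^(n-1) + (n-1)*b^(n-2);
        b  = b - f/fp;
    end
    a(i) = 1 - b;
    % same check as in the code above; should return Nvec(i)
    Ncheck(i) = -2*log(1 + b)/log(b) + 1;
end

% one row per filter length: N, alpha, recovered N
fprintf('%5d  %.4e  %.6f\n', [Nvec; a; Ncheck]);
```

The last column should agree with the first one up to rounding; a visible mismatch would suggest that the iteration has not converged for that $N$.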