By construction, `M` is real symmetric and almost certainly positive definite. 
The eigenvalues of `M` are `e = (d/d[[1]])^2`, where `d` is the vector of singular values of `A` in decreasing order (so `d[[1]]` is the largest). 
The eigenvalues of `C = M^2 + alpha M + beta I` (with `I` the identity matrix) are `e^2 + alpha e + beta`. 
The eigenvectors of `M` and `C` are the same as the right singular vectors of `A`, say `V`. 
The off-diagonal elements of `V'CV` are zero. 
Because `V` is orthogonal, the Frobenius norm is invariant under the transformation, so the sum of squares of the diagonal elements of `V'CV` equals the sum of squares of all the elements of `C`. 
But those diagonal elements are just the eigenvalues of `C`, so we can simplify the problem: 
minimize `Tr[(e^2 + alpha e + beta)^2]`. 
You already know `e`, because you constructed it from `d`, and this is just an ordinary least-squares regression of `-e^2` on `e`, so the solution is immediate: 
`{alpha = Covariance[e,-e^2]/Variance[e], beta = Mean[-e^2] - alpha*Mean[e]}`. 
That's the exact least-squares solution.
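
Putting the steps together, here is a minimal sketch of the whole computation; the construction `M = Transpose[A].A/d[[1]]^2` is my assumption from the eigenvalues quoted above, so adapt it to your actual `M`:

    (* minimal sketch; the construction of M is an assumption -- only *)
    (* the singular values d are actually needed for the coefficients *)
    n = 1000;
    A = RandomReal[{-1, 1}, {n, n}];
    d = SingularValueList[A];   (* decreasing order, so d[[1]] is largest *)
    e = (d/d[[1]])^2;           (* eigenvalues of M *)
    alpha = Covariance[e, -e^2]/Variance[e];
    beta = Mean[-e^2] - alpha*Mean[e];

    (* optional sanity check: a direct numerical minimization *)
    (* should return the same {alpha, beta}                   *)
    NMinimize[Total[(e^2 + a*e + b)^2], {a, b}]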

EDIT - It occurred to me that, although getting the singular values of a 1000 x 1000 matrix takes only a few seconds, if you have to do it many times (e.g., inside a loop) then the total time may be prohibitive, so further analysis may help.

Simulations suggest that if the elements of `A` are iid Uniform[-1,1] then the expected value of `e` is close to `Reverse[(#/2 + ArcSin[#]/Pi)^2 & [Range@n/n]]`. This code 

    (* x = approximate e (Reverse omitted: order doesn't affect the fit), *)
    (* y = -e^2; a, b = OLS slope and intercept; Tr@v == Total[v]         *)
    {n = #, y = -(x = (#/2 + ArcSin[#]/Pi)^2 &[Range@n/n])^2;
      N@{a = (n x.y - (Tr@x)(Tr@y))/(n x.x - (Tr@x)^2),
         b = (Tr@y - a*Tr@x)/n}} & /@ {10, 100, 1000}

gives `{alpha,beta}` pairs for `n = {10, 100, 1000}`. 

    {{10, {-0.939996, 0.104015}},
     {100, {-0.764133, 0.0667884}},
     {1000, {-0.743093, 0.0626846}}}

The pair for `n = 1000` should be a good approximation to the actual least-squares coefficients for a sample 1000 x 1000 matrix.
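
One way to check this is to compare the approximation against the exact least-squares coefficients from an actual sample; `exactAB` is just a hypothetical helper name for the recipe above:

    (* exact least-squares coefficients from one n x n uniform sample *)
    exactAB[n_] := Module[{d, e, a},
      d = SingularValueList[RandomReal[{-1, 1}, {n, n}]];
      e = (d/d[[1]])^2;
      a = Covariance[e, -e^2]/Variance[e];
      {a, Mean[-e^2] - a*Mean[e]}];

    exactAB[1000]   (* compare with the approximate {-0.743093, 0.0626846} *)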

EDIT 2 - I get the same results when the elements of `A` are normal or logistic as when they're uniform. Has taking n = 1000 put us in asymptoticland? **Would someone please check these results?**
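
Note that because `e` is normalized by its largest element, the overall scale of the entries cancels, so only the shape of the entry distribution can matter. For anyone who wants to rerun the check, here is a sketch generalizing the helper above to an arbitrary entry distribution (`distAB` is again a hypothetical name):

    (* exact coefficients for iid entries drawn from dist *)
    distAB[dist_, n_] := Module[{d, e, a},
      d = SingularValueList[RandomVariate[dist, {n, n}]];
      e = (d/d[[1]])^2;
      a = Covariance[e, -e^2]/Variance[e];
      {a, Mean[-e^2] - a*Mean[e]}];

    distAB[#, 1000] & /@ {UniformDistribution[{-1, 1}],
      NormalDistribution[0, 1], LogisticDistribution[0, 1]}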