
I'm running a simulation of an electromagnetic system using Radia, an external code that is run from Mathematica. I want to use FindMinimum to determine some optimum parameters for this simulation. So my function definition is something like:

totalFPandRMSXZ[eps1_?NumericQ, eps2_?NumericQ, eps3_?NumericQ, ebs1_?NumericQ, ebs2_?NumericQ] := Total[Flatten[finalPosAndRMSXP[eps1, eps2, eps3, ebs1, ebs2]][[{2, 6, 9, 11, 13, 14}]]]^2

where finalPosAndRMSXP builds a structure, solves it and evaluates it.

So I use FindMinimum to look for the best values for the variables ebs1 and ebs2 (eventually I'll extend to all five variables).

FindMinimum[totalFPandRMSXZ[0, 0, 0, ebs1, ebs2], {ebs1, 8, 7}, {ebs2, 6, 6.5}, AccuracyGoal -> 3] 

This is slow to run, so I don't want too many function calls. And this will be a real thing that gets built, so I only need the best values of ebs1 and ebs2 to one decimal place (these are dimensions in mm). There may be a difference between 8.000 and 8.001, but I'm not interested in it. But when I execute the FindMinimum, I get function calls with these parameters:

{0, 0, 0, 8., 6.}
{0, 0, 0, 7.96875, 6.}
{0, 0, 0, 8.01931, 6.}
{0, 0, 0, 8.01914, 6.}
{0, 0, 0, 8.01183, 6.}
{0, 0, 0, 8.01635, 6.}
{0, 0, 0, 8.01807, 6.}
{0, 0, 0, 8.01873, 6.}
{0, 0, 0, 8.01848, 6.}
{0, 0, 0, 8.01887, 6.}

How can I tell Mathematica to take bigger steps?


3 Answers


I can only speak to Method -> "LevenbergMarquardt", with which I have experience. This method uses the "TrustRegion" step control, whose current implementation in Mathematica has a very significant shortcoming, which I described in this question of mine. The essence of the problem seems to be that FindMinimum does not increase the step size when it should. The option "AcceptableStepRatio" seemingly has no effect in such situations: setting "AcceptableStepRatio" -> 10^-100 along with "MaxScaledStepSize" -> Infinity changes nothing.

The only way to overcome this limitation is to work in FindMinimum sessions: when the step size becomes too small, finish the current session and start a new FindMinimum session from the parameters obtained in the previous one. In the new session FindMinimum starts from the "StartingScaledStepSize" again and, as a result, reaches the minimum MUCH faster than if it were allowed to continue the previous session. In my experience this approach can reach the minimum 10-20 times faster than the default in some situations.

The main rule of thumb is as follows: if the minimization takes more than about 100 iterations (in some cases perhaps 300, and certainly by 1000), the step size by the end has become so small that progress is very slow. So I usually set MaxIterations -> 100 and restart FindMinimum from the parameters obtained in the previous session until it reaches the minimum. This way I get the minimum in, say, 2000 iterations, whereas simply setting MaxIterations -> 10000 gives the same (or a worse) minimum in, say, 8000 iterations. A minimal sketch of the restart strategy follows.
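
Here is a minimal sketch of the sessions idea with a toy objective f; the objective, the session count, and the restart rule are illustrative, not specific to Levenberg-Marquardt:

(* run FindMinimum in short sessions: cap each session at 100 iterations,
   then restart from the best point the previous session reached *)
f[x_?NumericQ, y_?NumericQ] := (x - 3.)^2 + 10. Sin[x y]^2 + (y + 1.)^2;

vars = {x, y};
start = {0., 0.};
Do[
 {min, rules} = Quiet[
   FindMinimum[f @@ vars, Transpose[{vars, start}], MaxIterations -> 100],
   FindMinimum::cvmit];    (* suppress the max-iterations warning *)
 start = vars /. rules,    (* next session starts from here *)
 {5}];                     (* five sessions of at most 100 iterations each *)
{min, Thread[vars -> start]}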


From FindMinimum's point of view, the objective function you are minimizing is a "black-box" function: an oracle that yields values, but no derivatives or Hessians.

If the method chosen by FindMinimum to solve the problem requires a gradient of the objective function at a point, finite differences are used to approximate it.

The multitude of evaluations at 'too-precise' points comes from that; the arguments really are different, just beyond the displayed precision.
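
You can watch those probe evaluations on a toy stand-in for your expensive function (f here is only an illustration; any NumericQ-guarded black box behaves the same way):

(* f is opaque to FindMinimum because of the NumericQ guard, so the
   gradient is approximated by finite differences; the monitor prints
   pairs of very closely spaced arguments around each iterate *)
f[x_?NumericQ] := (x - 2.)^2
FindMinimum[f[x], {x, 0.}, EvaluationMonitor :> Print["x = ", x]]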

The solution is to use a minimization method that is derivative-free, and since your problem is unconstrained, you can use Method -> "PrincipalAxis".
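
Applied to the call from the question, that would look something like the following (untested here, since it needs the Radia model behind totalFPandRMSXZ):

(* Brent's principal-axis method is derivative-free, so it makes no
   closely spaced finite-difference probes; the two starting values
   per variable set the scale of the initial steps *)
FindMinimum[totalFPandRMSXZ[0, 0, 0, ebs1, ebs2],
 {ebs1, 8, 7}, {ebs2, 6, 6.5},
 Method -> "PrincipalAxis", AccuracyGoal -> 3]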

Comment: I thought Principal Axis was the default method when you provide two starting values? (Commented Apr 23, 2016 at 21:16)

Referring back to my answer regarding the "PrincipalAxis" method, and the documentation:

For an $n$-variable problem, take a set of search directions $u_1, u_2, \ldots, u_n$ and a point $x_0$. Take $x_i$ to be the point that minimizes $f$ along the direction $u_i$ from $x_{i-1}$ (i.e., do a line search from $x_{i-1}$), then replace $u_i$ with $u_{i+1}$.

Two distinct starting conditions in each variable are required for this method because these are used to define the magnitudes of the vectors $u_i$.

I think that the first parameter is the starting point $x_0$ and that, combined with the second parameter, it defines the magnitudes of the search directions.

I suggest playing with the magnitude of the second starting value in that case, and perhaps using EvaluationMonitor to investigate the behaviour:

FindMinimum[x^2, {x, 0.5, 1.}, Method -> "PrincipalAxis", EvaluationMonitor :> Print["x = ", x]]
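
Applied to the problem in the question, widening the gap between the two starting values should enlarge the initial search directions; the spreads below are only a guess:

(* a wider spread between the two starting values gives a larger
   initial step scale for each variable *)
FindMinimum[totalFPandRMSXZ[0, 0, 0, ebs1, ebs2],
 {ebs1, 8, 9}, {ebs2, 6, 7},
 Method -> "PrincipalAxis", AccuracyGoal -> 3]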
