Assume that we have $N$ observations $(x_1, y_1), \ldots, (x_N, y_N)$, where the $x_i$ are observable predictors and the $y_i$ are target real variables (i.e., the variables that must be predicted).
We would like to estimate a nonlinear model of the form
$$y_i = f(x_i, \theta) + \varepsilon_i,$$
where $\theta$ is a vector of $k$ unknown real parameters, $f$ is a known function nonlinear in $\theta$, and $\varepsilon_i \sim N(0, \sigma^2)$ for some positive value of $\sigma$.
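For concreteness, here is a minimal sketch in Python/NumPy that simulates data from this setup. The exponential form of $f$, the parameter values, and the noise level $\sigma$ are illustrative assumptions, not part of the model above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical nonlinear model: f(x, theta) = theta_0 * exp(theta_1 * x)
# (an assumed example; any function nonlinear in theta would do).
def f(x, theta):
    return theta[0] * np.exp(theta[1] * x)

N = 100
theta_true = np.array([2.0, -1.5])  # assumed "true" parameters
sigma = 0.1                         # assumed noise standard deviation

x = rng.uniform(0.0, 2.0, size=N)                      # observable predictors
y = f(x, theta_true) + rng.normal(0.0, sigma, size=N)  # targets y_i = f(x_i, theta) + eps_i
```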
Setting
$$\varepsilon_i = y_i - f(x_i, \theta)$$
and assuming that the $N$ observations are independent, the log-likelihood of the model given our $N$ observations is
$$\ell(\theta, \sigma^2) = -\frac{N}{2} \ln\!\left(2 \pi \sigma^2\right) - \frac{1}{2 \sigma^2} \sum_{i=1}^N \big(y_i - f(x_i, \theta)\big)^2.$$
We estimate the model parameters by maximizing the log-likelihood with respect to $\theta$. Since $\theta$ enters the log-likelihood only through the last term, this is equivalent to minimizing the following objective function (the sum of squared residuals):
$$\phi(\theta) = \sum_{i=1}^N \big(y_i - f(x_i, \theta)\big)^2.$$
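Continuing the sketch above, both quantities are straightforward to code; note that $\theta$ appears in the log-likelihood only through the sum of squared residuals, which is why the two optimization problems coincide. The function names are illustrative.

```python
def phi(theta, x, y):
    """Sum of squared residuals: the objective minimized over theta."""
    r = y - f(x, theta)
    return r @ r

def log_likelihood(theta, sigma2, x, y):
    """Gaussian log-likelihood; theta enters only through phi(theta)."""
    n = len(y)
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - phi(theta, x, y) / (2.0 * sigma2)
```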
Start from an initial guess $\theta_0$ and approximate the objective function around the current iterate $\theta_t$ with the following quadratic function:
$$\phi(\theta_t + \delta) \approx \sum_{i=1}^N \big( y_i - f(x_i, \theta_t) - \nabla_\theta f(x_i, \theta_t)^\top \delta \big)^2.$$
Thanks to the objective function's special form, we can calculate a local quadratic approximation by taking the first-order expansion of $f$,
$$f(x_i, \theta_t + \delta) \approx f(x_i, \theta_t) + \nabla_\theta f(x_i, \theta_t)^\top \delta,$$
instead of the second-order expansion of the objective function itself.
Defining for simplicity the residual vector $r \in \mathbb{R}^N$ and the $N \times k$ Jacobian matrix $J$,
$$r_i = y_i - f(x_i, \theta_t), \qquad J_{ij} = \frac{\partial f(x_i, \theta_t)}{\partial \theta_j},$$
so that the quadratic approximation can be written compactly as $\lVert r - J \delta \rVert^2$,
we have that this quadratic approximation reaches its minimum when its gradient with respect to $\delta$ vanishes,
$$-2 J^\top (r - J \delta) = 0,$$
which is satisfied when the displacement $\delta$ solves the following linear system (the normal equations):
$$J^\top J \, \delta = J^\top r.$$
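A minimal Gauss–Newton loop for the illustrative model above might look as follows; the analytic Jacobian matches the assumed exponential $f$, and the starting point and iteration count are arbitrary choices rather than recommendations.

```python
def jacobian(x, theta):
    """N x k Jacobian of f at theta: J[i, j] = d f(x_i, theta) / d theta_j."""
    e = np.exp(theta[1] * x)
    return np.column_stack([e, theta[0] * x * e])

def gauss_newton(x, y, theta0, n_iter=20):
    theta = theta0.astype(float).copy()
    for _ in range(n_iter):
        r = y - f(x, theta)                        # residuals at the current iterate
        J = jacobian(x, theta)
        delta = np.linalg.solve(J.T @ J, J.T @ r)  # normal equations: J'J delta = J'r
        theta += delta                             # move to the next iterate
    return theta

theta_hat = gauss_newton(x, y, np.array([1.0, -1.0]))
```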
Since the $\ell_2$ and $\ell_1$ penalty terms have (sub)gradients
$$\nabla_\theta \big( \lambda \lVert \theta \rVert_2^2 \big) = 2 \lambda \theta, \qquad \partial_\theta \big( \lambda \lVert \theta \rVert_1 \big) = \lambda \, \operatorname{sign}(\theta),$$
the following contributions are added to the gradient of $\phi$ in the two cases:
$$2 \lambda (\theta_t + \delta) \quad (\ell_2), \qquad \lambda \, \operatorname{sign}(\theta_t) \quad (\ell_1).$$
The linear system is now
$$A \, \delta = b,$$
where
$$A = J^\top J + \lambda I, \quad b = J^\top r - \lambda \theta_t \quad (\ell_2), \qquad A = J^\top J, \quad b = J^\top r - \tfrac{\lambda}{2} \operatorname{sign}(\theta_t) \quad (\ell_1).$$
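Under the $\ell_2$ reading of the system above, the only change to the Gauss–Newton loop is the shifted matrix and right-hand side; $\lambda$ here is an arbitrary illustrative value, not a recommendation.

```python
def gauss_newton_l2(x, y, theta0, lam=1e-2, n_iter=20):
    theta = theta0.astype(float).copy()
    for _ in range(n_iter):
        r = y - f(x, theta)
        J = jacobian(x, theta)
        A = J.T @ J + lam * np.eye(len(theta))  # J'J + lambda * I
        b = J.T @ r - lam * theta               # J'r - lambda * theta_t
        theta += np.linalg.solve(A, b)
    return theta
```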
