I'm currently studying Pseudo Maximum Likelihood estimation. I'm trying to fit a GARCH model by Gaussian Pseudo Maximum Likelihood (and later by non-Gaussian PML), but before applying it to actual data I wanted to make sure it works on simulated series. The parametrization I'm using is the one in Newey and Steigerwald (1997):
$$y_t=\epsilon_t=\sigma_0 v_t z_t$$ $$v_t^2=1+\alpha \epsilon_{t-1}^2+\beta v_{t-1}^2$$ where $z_t$ is i.i.d. $D(0,1)$.
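For context, the simulation step is along these lines (a minimal sketch rather than my exact code; the function name `simulate_garch_ns`, the burn-in length, and the start-up value $v_1^2=1$ are illustrative choices):

```matlab
function y = simulate_garch_ns(T, sigma0, alpha, beta)
%SIMULATE_GARCH_NS Sketch of a simulator for the Newey-Steigerwald parametrization
%   y_t = epsilon_t = sigma0*v_t*z_t,  v_t^2 = 1 + alpha*epsilon_{t-1}^2 + beta*v_{t-1}^2
    burnin = 500;                    % discard initial draws to reduce start-up effects
    n = T + burnin;
    z = randn(n,1);                  % z_t iid N(0,1) for the Gaussian case
    vsq  = ones(n,1);                % v_t^2, started at the intercept value 1
    eps_ = zeros(n,1);               % epsilon_t = y_t
    for t = 2:n
        vsq(t)  = 1 + alpha*eps_(t-1)^2 + beta*vsq(t-1);
        eps_(t) = sigma0*sqrt(vsq(t))*z(t);
    end
    y = eps_(burnin+1:end);
end
```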
I then wrote the log-likelihood function in MATLAB:

```matlab
function [logL,gradlogL] = gllik(theta,y)
%GLOGLIKELIHOOD Given a time series, this function calculates the Gaussian log
%likelihood for a GARCH(1,1) process. The notation used is
%   y_t = epsilon_t
%   epsilon_t = sigma0*sigma_t*z_t,  z_t iid(0,1)
%   sigma_t^2 = 1 + alpha*epsilon(t-1)^2 + beta*sigma(t-1)^2
%   theta(1) = sigma0^2;  theta(2) = alpha;  theta(3) = beta
T = size(y,1);
logL = 0;
%initializing innovation
eps = nan(T,1);
eps(1) = 0;
%initializing conditional variance
sigmatsq = nan(T,1);
sigmatsq(1) = var(y);
ztsq = nan(T,1);
ztsq(1) = 0;
dgammavtsq = zeros(T,2);
dgammalt = zeros(T,2);
%gradient declaration
gradlogL = zeros(1,3);
for t = 2:T
    %calculating new means, volatilities and residuals
    sigmatsq(t) = 1 + theta(2)*eps(t-1)^2 + theta(3)*sigmatsq(t-1);
    eps(t) = y(t);
    ztsq(t) = eps(t)^2/(theta(1)*sigmatsq(t));
    %calculating likelihood
    lt = -log(2*pi)/2 - log(sigmatsq(t))/2 - ztsq(t)/2 - log(theta(1))/2;
    logL = logL + lt;
    %calculating gradient
    gradlogL(1) = gradlogL(1) + (2*theta(1))^(-1)*(ztsq(t)-1);
    dgammavtsq(t,:) = [eps(t-1)^2, sigmatsq(t-1)];
    dgammalt(t,:) = (dgammavtsq(t,:)/(2*sigmatsq(t)))*(ztsq(t)-1);
    gradlogL(2:3) = gradlogL(2:3) + dgammalt(t,:);
end
logL = -logL/T;
gradlogL = -gradlogL/T;
end
```

I return minus the (averaged) log-likelihood because fmincon minimizes, while I want to maximize it.
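One sanity check along these lines is to compare the analytic gradient with central finite differences; here is a minimal sketch (the step size `h` and the evaluation point `theta0` are arbitrary choices, and `y` is one simulated series):

```matlab
% Central finite-difference check on the analytic gradient of gllik
theta0 = [0.81; 0.1; 0.85];          % [sigma0^2; alpha; beta] at the true values (0.9^2 = 0.81)
[~, g_analytic] = gllik(theta0, y);  % analytic gradient returned by the function
h = 1e-6;
g_numeric = zeros(1,3);
for k = 1:3
    ek = zeros(3,1);
    ek(k) = h;
    g_numeric(k) = (gllik(theta0+ek,y) - gllik(theta0-ek,y))/(2*h);
end
disp([g_analytic; g_numeric])        % if the analytic gradient is right, the two rows should agree closely
```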
I'm using the following options for the minimizer:

```matlab
options = optimoptions('fmincon','Algorithm','interior-point', ...
    'SpecifyObjectiveGradient',true,'MaxIterations',1000,'MaxFunEvals',500);
```

I also impose that $\alpha+\beta <1$ and that all the parameters are positive (roughly as in the sketch below). The problem is that the minimization gives me completely off estimates. In my latest simulation I used $\alpha=0.1$, $\beta=0.85$, $\sigma_0=0.9$ as true parameters and simulated $N=50$ Gaussian GARCH(1,1) time series. Fmincon returns as averaged estimates $\hat{\sigma}_0=4.3896$, $\hat{\alpha}=0.0330$, $\hat{\beta}=0.5620$ (I tried higher $N$ too, with results that are no better).
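Concretely, the constrained call looks roughly like this (a sketch; the starting values, the bounds, and the small margin on the constraint are illustrative choices, and `y` is one simulated series):

```matlab
% Constrained fmincon call with alpha + beta < 1 and positivity
A  = [0 1 1];                  % linear constraint: alpha + beta <= b
b  = 1 - 1e-6;                 % keep alpha + beta strictly below 1
lb = [1e-6; 1e-6; 1e-6];       % positivity of sigma0^2, alpha, beta
ub = [];
theta_init = [1; 0.05; 0.80];  % starting values for [sigma0^2; alpha; beta]
obj = @(theta) gllik(theta,y); % returns the negative averaged log-likelihood and its gradient
theta_hat = fmincon(obj, theta_init, A, b, [], [], lb, ub, [], options);
```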
It even seems to work slightly better when I simulate a Student-t GARCH and then estimate it by Gaussian PMLE! Also, with the "classical" parametrization everything seems to work fine.
The question: I'm pretty sure there are a thousand ways the code could be optimized (I'm unashamedly guilty of using too many for loops...), but is there something I'm missing, like numerical problems or just a mistake in the code? Thanks in advance for the answers.