
I'm trying to understand Gaussian Processes. Could anyone tell me:

  1. Why do we need to use the log marginal likelihood?
  2. Why, after taking the log, can the marginal likelihood be decomposed into three terms (including a data-fit term and a penalty term)?
  • Can you please add some more details? In any case: the marginal likelihood is generally used as a measure of how well the model fits. You obtain the marginal likelihood of a process by marginalizing over the set of parameters that govern the process. This integral is generally not available and cannot be computed in closed form. However, an approximation can be found as the sum of the complete likelihood and a penalization term, which I suppose is the decomposition you mention in point 2. The likelihood is generally computed on a logarithmic scale for numerical stability reasons. Commented Jul 16, 2014 at 21:21
  • @niandra82 Thank you so much for your comment. Could you please explain why the marginal likelihood is computed on a logarithmic scale for numerical stability reasons? How does such a form increase stability? Thank you for your help once more. Commented Jul 19, 2014 at 11:17
  • I can try to explain. Consider a computer that can store only numbers between 99,000 and 0.001 (only three decimals), plus the sign. If you compute a density and at some point it has the value 0.0023456789, the computer will store it as 0.002, losing part of the real value; if you compute it on a log scale, $\log(0.0023456789) = -6.05518$ will be stored as $-6.055$, losing less than on the original scale. If you multiply many small values, the situation gets worse: consider $0.0023456789^2$, which will be stored as $0$, while $\log(0.0023456789^2) = -12.11036$ is stored as $-12.110$. Commented Jul 19, 2014 at 15:08
  • If you think this satisfies your question, tell me and I will put the comment in an answer. Commented Jul 19, 2014 at 15:10
  • @niandra82 Thanks for your comments; please put them in the answer. By the way, would you mind explaining how cross-validation and Gaussian processes overcome over-fitting problems? Commented Jul 20, 2014 at 9:42

1 Answer


The marginal likelihood is generally used as a measure of how well the model fits. You obtain the marginal likelihood of a process by marginalizing over the set of parameters that govern the process. This integral is generally not available and cannot be computed in closed form. However, an approximation can be found as the sum of the complete likelihood and a penalization term, which I suppose is the decomposition you mention in point 2.
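For reference, and going slightly beyond the answer above: in the special case of GP regression with Gaussian noise (the standard setting in Rasmussen and Williams, Gaussian Processes for Machine Learning), the log marginal likelihood is actually available in closed form, and it splits into exactly the three terms the question asks about. With targets $\mathbf{y}$, kernel matrix $K$ over the $n$ training inputs $X$, and noise variance $\sigma_n^2$:

$$\log p(\mathbf{y} \mid X) = \underbrace{-\tfrac{1}{2}\,\mathbf{y}^\top \left(K + \sigma_n^2 I\right)^{-1} \mathbf{y}}_{\text{data fit}} \;\underbrace{-\,\tfrac{1}{2}\log\left|K + \sigma_n^2 I\right|}_{\text{complexity penalty}} \;\underbrace{-\,\tfrac{n}{2}\log 2\pi}_{\text{normalization}}$$

The first term measures how well the model fits the observed targets, the second penalizes complex (large-determinant) covariance structures, and the third is a constant that does not depend on the kernel hyperparameters. The decomposition into a sum only appears after taking the log, because the underlying Gaussian density is a product of an exponential term, a determinant term, and a constant.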

The likelihood is generally computed on a logarithmic scale for numerical stability reasons: consider a computer that can store only numbers between 99,000 and 0.001 (only three decimals), plus the sign. If you compute a density and at some point it has the value 0.0023456789, the computer will store it as 0.002, losing part of the real value; if you compute it on a log scale, $\log(0.0023456789) = -6.05518$ will be stored as $-6.055$, losing less than on the original scale. If you multiply many small values, the situation gets worse: consider $0.0023456789^2$, which will be stored as $0$, while $\log(0.0023456789^2) = -12.11036$ is stored as $-12.110$.
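The same underflow happens with real double-precision floats once enough small factors are multiplied together, which is exactly the situation when a likelihood is a product over many observations. A minimal sketch (the value 0.0023456789 and the count of 500 observations are just illustrative):

```python
import numpy as np

# Hypothetical per-observation likelihood values, all small.
probs = np.full(500, 0.0023456789)

# Multiplying them directly underflows to exactly 0.0 in double precision,
# since 0.0023456789**500 is far below the smallest representable double.
direct_product = np.prod(probs)

# Summing the logs instead stays comfortably within floating-point range.
log_likelihood = np.sum(np.log(probs))

print(direct_product)   # 0.0 (underflow)
print(log_likelihood)   # about -3027.6
```

This is why GP implementations work with the log marginal likelihood throughout, and only exponentiate (if ever) at the very end.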

