$\begingroup$

In a linear regression model $y = {\bf x}^T {\bf w} + \epsilon$, assuming Gaussian noise ($\epsilon\sim N(0,\sigma_n^2)$) and a Gaussian prior on the weights (${\bf w}\sim N({\bf 0}, \Sigma_p)$), I want to show that the posterior distribution of the weights given data $X,{\bf y}$ (the columns of $X$ are the inputs ${\bf x}$) is also Gaussian: $${\bf w}\,|\,X,{\bf y}\sim N\left(\bar{\bf w}, A^{-1}\right), \qquad \bar{\bf w}=\frac{1}{\sigma_n^2}A^{-1}X{\bf y},$$ where $A = \sigma_n^{-2}X X^T + \Sigma_p^{-1}$.

However, I am stuck on how to derive this mean. This is how far I have got.

My Work

How do I proceed from here to finally obtain $A$?

$\endgroup$

1 Answer

$\begingroup$

Think of the expression inside the exponential as a quadratic in $\mathbf w$, and complete the square! (Here $\mathbf x$ is your data matrix $X$, $\sigma$ is your $\sigma_n$, and $\mathbf \Sigma$ is your $\Sigma_p$.) You should get $$ - \frac{1}{2\sigma^2} (\mathbf y - \mathbf x^T \mathbf w)^T (\mathbf y - \mathbf x^T \mathbf w) - \frac 1 2 \mathbf w^T \mathbf \Sigma^{-1} \mathbf w = -\frac 1 2 \left(\mathbf w - \frac{1}{\sigma^2}\mathbf A^{-1}\mathbf x \mathbf y \right)^T \mathbf A\left(\mathbf w - \frac{1}{\sigma^2}\mathbf A^{-1}\mathbf x \mathbf y \right) + c$$ where $$ \mathbf A = \mathbf \Sigma^{-1} + \frac{1}{\sigma^2}\mathbf x \mathbf x^{T}$$ and $c$ is the "constant term" (i.e. an unimportant number that is independent of $\mathbf w$).

To verify this, just multiply both sides out and check that the terms containing $\mathbf w$ agree: on each side they are $-\frac 1 2 \mathbf w^T \mathbf A \mathbf w + \frac{1}{\sigma^2}\mathbf w^T \mathbf x \mathbf y$. The terms not involving $\mathbf w$ don't matter.

Having completed the square, we get $$ p(\mathbf w | \mathbf x, \mathbf y)\propto \exp\left( -\frac 1 2 \left(\mathbf w - \frac{1}{\sigma^2}\mathbf A^{-1}\mathbf x \mathbf y \right)^T \mathbf A\left(\mathbf w - \frac{1}{\sigma^2}\mathbf A^{-1}\mathbf x \mathbf y \right) \right).$$ Notice how the constant $c$ has been absorbed into the constant of proportionality. This expression is the probability density function of the multivariate Gaussian with mean $\frac{1}{\sigma^2}\mathbf A^{-1}\mathbf x\mathbf y$ and covariance matrix $\mathbf A^{-1}$, which matches the $\bar{\mathbf w}$ in the question.
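If you want to convince yourself numerically, here is a minimal NumPy sketch. Everything in it is an illustrative assumption rather than part of the question: the sizes `D` and `n`, the value of `sigma_n`, the diagonal `Sigma_p`, and the random data. It uses the question's notation ($X$ with the inputs as columns) and checks that the log-posterior exponent and the completed square differ by the same constant for several random $\mathbf w$.

```python
import numpy as np

rng = np.random.default_rng(0)

D, n = 3, 50                          # number of weights, number of observations (assumed)
sigma_n = 0.5                         # noise standard deviation (assumed known)
Sigma_p = np.diag([1.0, 2.0, 0.5])    # prior covariance of w (illustrative choice)

X = rng.normal(size=(D, n))           # columns are the input vectors
w_true = rng.normal(size=D)
y = X.T @ w_true + sigma_n * rng.normal(size=n)

Sigma_p_inv = np.linalg.inv(Sigma_p)
A = X @ X.T / sigma_n**2 + Sigma_p_inv            # posterior precision matrix
w_bar = np.linalg.solve(A, X @ y) / sigma_n**2    # posterior mean
post_cov = np.linalg.inv(A)                       # posterior covariance


def log_post_exponent(w):
    """Exponent of likelihood times prior, before completing the square."""
    r = y - X.T @ w
    return -0.5 * r @ r / sigma_n**2 - 0.5 * w @ Sigma_p_inv @ w


def completed_square(w):
    """Quadratic form centred at w_bar with precision A."""
    d = w - w_bar
    return -0.5 * d @ A @ d


# If the square was completed correctly, the gap is the same constant c for every w.
ws = rng.normal(size=(5, D))
gaps = np.array([log_post_exponent(w) - completed_square(w) for w in ws])
print(np.allclose(gaps, gaps[0]))
```

Using `np.linalg.solve` for the mean avoids forming $A^{-1}$ explicitly; the inverse is only computed because the posterior covariance itself is wanted.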

$\endgroup$
  • $\begingroup$ My main problem is getting to $A$. How did you complete the square? $\endgroup$ Commented Jun 6, 2018 at 23:49
  • $\begingroup$ Have you tried multiplying out both sides of the equation, and verifying that the terms containing $\mathbf w$ agree? $\endgroup$ Commented Jun 6, 2018 at 23:51
  • $\begingroup$ I agree that both sides are the same. But in the future I would want to be able to do this for any two gaussians. So is there a generic way to achieve this? $\endgroup$ Commented Jun 6, 2018 at 23:52
  • $\begingroup$ Yes. If the expression is $w^TAw - w^Tb - b^Tw + c$ (where $A$, $b$ and $c$ are independent of $w$, and $A$ is symmetric and invertible), then the expression after completing the square is $(w - A^{-1}b)^TA(w-A^{-1}b) + c'$, where $c'=c-b^{T}A^{-1} b$ $\endgroup$ Commented Jun 6, 2018 at 23:54
  • $\begingroup$ Thanks mate! This helped. Also $\mathbf{A}$ should be $\Sigma^{-1}+ \frac{\mathbf{xx^T}}{\sigma^2}$. $\endgroup$ Commented Jun 8, 2018 at 3:11
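
For anyone wanting the generic recipe written out, here is a short worked derivation. It assumes $A$ is symmetric and invertible, which holds in this problem because $A$ is the sum of a positive-definite prior precision and a positive semi-definite data term. Expanding the square gives $$(w - A^{-1}b)^T A\,(w - A^{-1}b) = w^TAw - w^Tb - b^Tw + b^TA^{-1}b,$$ so $$w^TAw - w^Tb - b^Tw + c = (w - A^{-1}b)^T A\,(w - A^{-1}b) + c - b^TA^{-1}b.$$ In the regression problem, multiplying the exponent by $-2$ gives $$w^T\left(\Sigma^{-1} + \frac{1}{\sigma^2}xx^T\right)w - \frac{1}{\sigma^2}w^Txy - \frac{1}{\sigma^2}y^Tx^Tw + \frac{1}{\sigma^2}y^Ty,$$ so matching terms gives $A = \Sigma^{-1} + \frac{1}{\sigma^2}xx^T$ and $b = \frac{1}{\sigma^2}xy$, and the mean is $A^{-1}b = \frac{1}{\sigma^2}A^{-1}xy$.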
