$\begingroup$

In this video, at 31:12, the following equality pops up:

$(\mathbf{y}-\mathbf{X}\mathbf{\theta})^T(\mathbf{y}-\mathbf{X}\mathbf{\theta})=\sum\limits_{i=1}^n(y_i-\mathbf{x}_i^T\mathbf{\theta})^2$

From the previous slides in the same video, it looks like $\mathbf{y}$ and $\mathbf{\theta}$ are column vectors. So in order for the equality to make sense, $\mathbf{x}_i$ should be column vectors (so that their transposes are row vectors, which can be multiplied with the column vector $\mathbf{\theta}$).

I tried an example, but the two sides do not come out equal.

$\mathbf{\theta}=\begin{pmatrix} 2 \\ 1 \end{pmatrix}$

$\mathbf{y}=\begin{pmatrix} 1\\ 2\end{pmatrix}$

$\mathbf{X}=\begin{pmatrix} 1&3\\2&4\end{pmatrix}$.

$\mathbf{X\theta}=\begin{pmatrix}5 \\8\end{pmatrix}$

$\mathbf{y}-\mathbf{X\theta}=\begin{pmatrix}-4 \\-6\end{pmatrix}$

So the LHS is equal to $(-4)\times(-4)+(-6)\times(-6)=52$.

\begin{align}(y_1-\mathbf{x}_1^T\mathbf{\theta})^2&=\big(1-\begin{pmatrix}1&2\end{pmatrix}\begin{pmatrix}2\\1\end{pmatrix}\big)^2=9\\ (y_2-\mathbf{x}_2^T\mathbf{\theta})^2&=\big(2-\begin{pmatrix}3&4\end{pmatrix}\begin{pmatrix}2\\1\end{pmatrix}\big)^2=64 \end{align}

So the RHS equals $73$. Why aren't these values equal?

$\endgroup$
  • $\begingroup$ I think the $\mathbf{x}_i^T$'s should actually be the rows of $X$. $\endgroup$ Commented Jan 1, 2016 at 14:42
  • $\begingroup$ So $\mathbf{x}_1$ is actually the transpose of the first row, so that when you take $\mathbf{x}_1^T$ you get the first row? Doesn't look like the most natural notation to me, but I guess I'll get used to it $\endgroup$ Commented Jan 1, 2016 at 14:46
  • $\begingroup$ No, it's not natural notation - don't get used to it. Natural notation is to have $\mathbf{x}_i$ denote a column vector. I think the author has chosen this odd use of notation to ensure the reader (watcher) is aware that the object being considered is a transposed column vector of some sort ... maybe ... I'm just guessing really ... $\endgroup$ Commented Jan 1, 2016 at 14:49

1 Answer

$\begingroup$

Suppose $X$ is an $m \times n$ matrix, and let $\theta \in K^n$ and $y \in K^m$ (we regard the elements of $K^k$ as column vectors). For $1 \leq i \leq m$ let $x_i$ denote the $i$-th row of $X$ (in particular $x_i$ is a row vector and thus $x_i^T \in K^n$). For $v,w \in K^n$ let $$ \langle v,w \rangle = \sum_{j=1}^n v_j w_j = v^T w. $$

For all $1 \leq i \leq m$ we have $$ (X \theta)_i = \sum_{j=1}^n X_{ij} \theta_j = \sum_{j=1}^n (x_i^T)_j \theta_j = \langle x_i^T, \theta \rangle, $$ and therefore $$ (y - X \theta)_i = y_i - (X \theta)_i = y_i - \langle x_i^T, \theta \rangle. $$ Moreover, $$ a^T a = \sum_{i=1}^m a_i a_i = \sum_{i=1}^m a_i^2 \quad\text{for every $a \in K^m$}. $$ Putting this together we have $$ (y - X \theta)^T (y - X \theta) = \sum_{i=1}^m (y - X \theta)_i^2 = \sum_{i=1}^m (y_i - \langle x_i^T, \theta \rangle)^2. $$
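As a rough numerical sanity check of this derivation, here is a minimal Python sketch for a non-square case. The $3 \times 2$ matrix and the vectors are made-up illustrative values, not taken from the question, and the variable names are only assumptions chosen to mirror the symbols above.

```python
# Hypothetical example with m = 3 rows and n = 2 columns (illustrative values).
X = [[1, 0],
     [2, 1],
     [0, 3]]
theta = [1, 2]   # plays the role of theta in K^n
y = [1, 0, 2]    # plays the role of y in K^m

m, n = len(X), len(theta)

# (X theta)_i = sum_j X_ij * theta_j, i.e. the i-th row of X paired with theta.
X_theta = [sum(X[i][j] * theta[j] for j in range(n)) for i in range(m)]

# a = y - X theta, and a^T a is the sum of a_i^2 over the m components.
a = [y[i] - X_theta[i] for i in range(m)]
lhs = sum(a_i * a_i for a_i in a)

# Right-hand side: one squared residual per row of X.
rhs = sum((y[i] - sum(X[i][j] * theta[j] for j in range(n))) ** 2
          for i in range(m))

print(lhs, rhs)  # 32 32
```

Both sums have one term per row of $X$, which is why the index runs over the rows and not over the columns.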

I guess your confusion comes from the fact that instead of $\langle v,w \rangle$ your formula writes $vw$, leading you to believe that $x_i^T$ must be a row vector, while it really is a column vector (i.e. $x_i$ is, as defined above, the $i$-th row of $X$ and not the $i$-th column).

Regarding your calculation, you therefore have to swap two entries (marked in red):

\begin{align*} (y_1 - \langle \mathbf{x}_1^T, \mathbf{\theta} \rangle)^2 &=\left( 1- \begin{pmatrix} 1 & \color{red}{3}\end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} \right)^2 = 16\\ (y_2 - \langle \mathbf{x}_2^T, \mathbf{\theta} \rangle)^2 &=\left( 2- \begin{pmatrix} \color{red}{2} & 4 \end{pmatrix} \begin{pmatrix} 2 \\ 1 \end{pmatrix} \right)^2 =36. \end{align*}

These sum to $16+36=52$, which matches your left-hand side.
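If it helps, the numbers from the question can also be checked with a few lines of plain Python. This is only a sketch: the helper `dot` and the variable names are my own choices, and `X` is stored as a list of rows.

```python
# Data from the question; X is stored as a list of rows.
theta = [2, 1]
y = [1, 2]
X = [[1, 3],
     [2, 4]]

def dot(v, w):
    """Standard scalar product <v, w> = sum_j v_j * w_j."""
    return sum(a * b for a, b in zip(v, w))

# Left-hand side: (y - X theta)^T (y - X theta).
residual = [y_i - dot(row, theta) for y_i, row in zip(y, X)]
lhs = dot(residual, residual)

# Right-hand side using the i-th ROW of X in each term.
rhs_rows = sum((y_i - dot(row, theta)) ** 2 for y_i, row in zip(y, X))

# The question's attempt: using the i-th COLUMN of X in each term.
columns = [list(col) for col in zip(*X)]
rhs_cols = sum((y_i - dot(col, theta)) ** 2 for y_i, col in zip(y, columns))

print(lhs, rhs_rows, rhs_cols)  # 52 52 73
```

Using the rows reproduces $52$ on both sides, while using the columns gives the $73$ from the question.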

$\endgroup$
  • $\begingroup$ But wouldn't (1 3) be a row vector, rather than a column vector? $\endgroup$ Commented Jan 1, 2016 at 14:41
  • $\begingroup$ I didn't distinguish clearly enough between matrix multiplication and the scalar product. I edited my answer and introduced a separate (standard) notation for the scalar product. $\endgroup$ Commented Jan 1, 2016 at 14:44
