Difference of two orthogonal projections is an orthogonal projection

Premise: I have an $n \times q$ matrix $X$ and a $q \times a$ matrix $C$ with $n > q > a$.

I'm interested in the structure of the matrix $$ M = X X^+ - X_0 X_0^+ $$ where the superscript $^+$ indicates the Moore–Penrose pseudoinverse and $$ X_0 = X (I_q - C C^+). $$

I assume that $X$ has full column rank and therefore $X^+ = (X' X)^{-1} X'$ (where $'$ denotes the transpose).
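
For concreteness, this formula can be compared against a generic pseudoinverse routine; the following numpy sketch (dimensions chosen arbitrarily) is only an illustration, not part of the argument:

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 8, 5
X = rng.standard_normal((n, q))  # a generic X has full column rank

# full-column-rank formula vs. the general Moore-Penrose pseudoinverse
assert np.allclose(np.linalg.inv(X.T @ X) @ X.T, np.linalg.pinv(X))
```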

Background: $X$ is the design matrix of a linear model, $C$ is a contrast, $X_0$ is a reduced design matrix, and $M$ occurs in the definition of standard test statistics.

$M$ is the difference of two orthogonal projection matrices, where the second projects onto a subspace of the subspace the first projects onto. This makes the difference itself an orthogonal projection matrix (symmetric and idempotent), which means it has a representation $$ M = X_\Delta X_\Delta^+. $$
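
This can be checked numerically; here is a small sketch with a random instance (the dimensions $n, q, a$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, q, a = 10, 6, 2
X = rng.standard_normal((n, q))   # full column rank, generically
C = rng.standard_normal((q, a))

pinv = np.linalg.pinv
X0 = X @ (np.eye(q) - C @ pinv(C))   # reduced design matrix
M = X @ pinv(X) - X0 @ pinv(X0)      # difference of the two projectors

assert np.allclose(M, M.T)           # symmetric
assert np.allclose(M @ M, M)         # idempotent
```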

Question: How do I obtain $X_\Delta$?

user1551 has correctly pointed out in an answer that $X_\Delta = M$ itself satisfies this equation. However, I'm looking for a "version" of $X$, meaning an $n \times q$ matrix of rank $a$.

My approach: I am guessing that $$ X_\Delta = X - X_0 X_0^+ X, $$ and this seems to be confirmed by numerical tests. But I am unable to come up with a proof, i.e. to show that $$ (X - X_0 X_0^+ X) (X - X_0 X_0^+ X)^+ = X X^+ - X_0 X_0^+. $$
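
By "numerical tests" I mean checks of the following kind, a sketch on one random instance rather than a proof:

```python
import numpy as np

rng = np.random.default_rng(1)
n, q, a = 10, 6, 2
X = rng.standard_normal((n, q))
C = rng.standard_normal((q, a))

pinv = np.linalg.pinv
X0 = X @ (np.eye(q) - C @ pinv(C))
M = X @ pinv(X) - X0 @ pinv(X0)

X_delta = X - X0 @ pinv(X0) @ X      # conjectured X_Delta

assert np.allclose(X_delta @ pinv(X_delta), M)  # the identity to be proved
assert np.linalg.matrix_rank(X_delta) == a      # an n-by-q matrix of rank a
```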

The problem is how to deal with the pseudoinverse of a difference. One can write $$ X_\Delta = (I_n - X_0 X_0^+) X, $$ and according to Wikipedia, when one factor of a product is an orthogonal projection, that projection can be redundantly multiplied onto the opposite side of the pseudoinverse, which here gives $$ X_\Delta^+ = [(I_n - X_0 X_0^+) X]^+ = [(I_n - X_0 X_0^+) X]^+ (I_n - X_0 X_0^+) = X_\Delta^+ (I_n - X_0 X_0^+), $$ but that doesn't seem to help.

I can prove that $M$ is symmetric and idempotent using the relations $$ X X^+ X_0 = X_0 \quad \text{and} \quad X_0 X_0^+ X X^+ = X_0 X_0^+, $$ which follow from the definition of $X_0$ and the properties of the pseudoinverse. I can also show that $$ X X_0^+ = X_0 X_0^+, $$ using the product property for orthogonal projections quoted above. But none of that helps either.
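
These relations, too, survive the same kind of numerical check (again a sketch on a random instance):

```python
import numpy as np

rng = np.random.default_rng(2)
n, q, a = 10, 6, 2
X = rng.standard_normal((n, q))
C = rng.standard_normal((q, a))

pinv = np.linalg.pinv
X0 = X @ (np.eye(q) - C @ pinv(C))
PX, P0 = X @ pinv(X), X0 @ pinv(X0)   # the two orthogonal projectors

assert np.allclose(PX @ X0, X0)       # X X^+ X_0 = X_0
assert np.allclose(P0 @ PX, P0)       # X_0 X_0^+ X X^+ = X_0 X_0^+
assert np.allclose(X @ pinv(X0), P0)  # X X_0^+ = X_0 X_0^+
```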
