Role of $\sigma(X)$ in the theorem that the conditional expectation $\mathbb{E}[Y|X]$ minimizes mean squared error

Question

It is well known that the conditional expectation minimizes the mean squared error. That is, consider a probability space $(\Omega, \mathcal{A}, \mathbb{P})$ and random variables $X:(\Omega, \mathcal{A})\rightarrow (\mathcal{X}, \mathcal{F})$ and $Y:(\Omega, \mathcal{A})\rightarrow (\mathcal{Y}, \mathcal{G})$, and let $f:(\mathcal{X}, \mathcal{F}) \rightarrow (\mathcal{Y}, \mathcal{G})$ be measurable. The statement is then: $$\mathbb{E}[Y|X]\in \arg \min_{f}\mathbb{E}[(Y-f(X))^2] $$ The following statement is also true: $$ \mathbb{E}[Y|\mathcal{H}]\in \arg \min_{\{Z:(\Omega, \mathcal{H})\rightarrow (\mathcal{Y}, \mathcal{G})\}}\mathbb{E}[(Y-Z)^2],$$ for a sub-sigma-algebra $\mathcal{H}\subseteq\mathcal{A}$.

I cannot reconcile these two statements. My confusion comes from the fact that $\mathbb{E}[Y|X]:= \mathbb{E}[Y|\sigma(X)]$, where $\sigma(X)$ is the sigma-algebra generated by $X$, with $\sigma(X)\subseteq\mathcal{A}$. Hence, I would assume that $f\circ X $ must be a mapping $(\Omega, \sigma(X)) \rightarrow (\mathcal{Y}, \mathcal{G})$. But this does not seem to be the case, since $X:(\Omega, \mathcal{A})\rightarrow (\mathcal{X}, \mathcal{F})$. Can someone help?

EDIT: Thanks to the hint by Pantelis Tassopoulos I realized that I need the factorization lemma here: https://en.wikipedia.org/wiki/Doob%E2%80%93Dynkin_lemma

This lemma says, for $(\mathcal{X}, \mathcal{F})\subseteq(\bar{\mathbb{R}}, \mathcal{B}(\bar{\mathbb{R}}))$ and $(\mathcal{Y}, \mathcal{G})\subseteq(\bar{\mathbb{R}}, \mathcal{B}(\bar{\mathbb{R}}))$, that a random variable $Z$ is $\sigma(X)-\mathcal{G}$-measurable if and only if there exists an $\mathcal{F}-\mathcal{G}$-measurable function $f$ such that $Z=f(X)$. Now, of course, $X$ is always $\sigma(X)-\mathcal{F}$-measurable. But since here we have even assumed that $X$ is $\mathcal{A}-\mathcal{F}$-measurable (which is not required for the lemma), we also know that $\sigma(X)\subseteq \mathcal{A}$. Hence, due to the lemma, we know that $Z=f(X)$ is $\sigma(X)-\mathcal{G}$-measurable. In fact, we wouldn't even need the factorization lemma for this direction, only the fact that the composition of measurable maps is again measurable, which in this case gives that $Z=f(X)$ is $\sigma(X)-\mathcal{G}$-measurable.

BUT: $Z=f(X)$ is also $\mathcal{A}-\mathcal{G}$-measurable. In that case we would get as the argmin $\mathbb{E}[Y|\mathcal{A}] = \mathbb{E}[Y]$ (Is this equality actually true?). How do I know that I should be treating $f(X)$ as a $\sigma(X)-\mathcal{G}$-measurable function instead of an $\mathcal{A}-\mathcal{G}$-measurable one here? I.e., if I just look at the problem $$\arg \min_{f}\mathbb{E}[(Y-f(X))^2] =?$$ I would right away treat $f(X)$ as an $\mathcal{A}-\mathcal{G}$-measurable function, and hence I would get the wrong result.

This is well known. In fact, some authors define conditional expectation in terms of orthogonal projections. Here is a posting along those lines.
– Mittens, Commented Apr 23 at 15:00
@Mittens Thanks for your comment. I have updated my question to make it clearer. Maybe you could say something about that?
– guest1, Commented Apr 24 at 12:04

Pantelis Tassopoulos · Accepted Answer · 2025-04-24 15:30:24Z

The thing to observe is that (real-valued, say) $\sigma(X)$-measurable random variables are exactly the compositions of $X$ with some Borel measurable $f:\mathbb{R} \to \mathbb{R}$ (see this post and this Wikipedia page).

This would give $$\mathrm{argmin}_{f}\mathbb{E}[(Y-f(X))^2] = \mathrm{argmin}_{Z:(\Omega, \sigma(X))\to (\mathcal{Y}, \mathcal{G})}\mathbb{E}[(Y-Z)^2]\,.$$
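
To see why the right-hand side is attained by $\mathbb{E}[Y|\sigma(X)]$, here is a sketch of the standard orthogonality argument. It assumes (this is not stated in the question) that $Y$ is real-valued and square-integrable, and that $Z$ ranges over square-integrable $\sigma(X)$-measurable random variables. Writing $\hat{Y} := \mathbb{E}[Y|\sigma(X)]$ as shorthand,

$$\begin{aligned}
\mathbb{E}[(Y-Z)^2] &= \mathbb{E}[(Y-\hat{Y})^2] + 2\,\mathbb{E}[(Y-\hat{Y})(\hat{Y}-Z)] + \mathbb{E}[(\hat{Y}-Z)^2] \\
&= \mathbb{E}[(Y-\hat{Y})^2] + \mathbb{E}[(\hat{Y}-Z)^2] \\
&\geq \mathbb{E}[(Y-\hat{Y})^2].
\end{aligned}$$

The cross term vanishes because $W := \hat{Y}-Z$ is $\sigma(X)$-measurable, so $\mathbb{E}[(Y-\hat{Y})W] = \mathbb{E}\big[W\,\mathbb{E}[Y-\hat{Y}\,|\,\sigma(X)]\big] = 0$. Equality holds exactly when $Z=\hat{Y}$ almost surely. Note that the step for the cross term uses the $\sigma(X)$-measurability of $Z$; it is not available for a general $\mathcal{A}$-measurable $Z$.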

If you used $f$ as an $\mathcal{A}-\mathcal{G}$-measurable function, then to obtain the conditional expectation with respect to $\mathcal{A}$, you would have to use $\mathrm{argmin}_{Z:(\Omega, \mathcal{A})\to (\mathcal{Y}, \mathcal{G})}\mathbb{E}[(Y-Z)^2]$ instead of $\mathrm{argmin}_{f}\mathbb{E}[(Y-f(X))^2]$, since in general the set of $\mathcal{A}-\mathcal{G}$-measurable functions is at least as large as the set of $\sigma(X)-\mathcal{G}$-measurable ones.
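
A small numerical check may make the difference between the two minimization problems concrete. The following is only a sketch on a made-up six-point probability space; the space, the choices of `X` and `Y`, and all function names are illustrative assumptions, not taken from the question. It compares the best predictor of the form $f(X)$ with an arbitrary $\mathcal{A}$-measurable $Z$.

```python
from fractions import Fraction

# Hypothetical finite probability space: six equally likely outcomes,
# with A taken to be the power set, so every map on omega is A-measurable.
omega = range(6)
prob = {w: Fraction(1, 6) for w in omega}

# Illustrative random variables (assumptions for this sketch).
X = {w: w % 2 for w in omega}        # X reveals only the parity of w
Y = {w: Fraction(w) for w in omega}  # Y reveals w itself


def expectation(Z):
    """E[Z] on the finite space."""
    return sum(prob[w] * Z[w] for w in omega)


def mse(Z):
    """E[(Y - Z)^2]."""
    return expectation({w: (Y[w] - Z[w]) ** 2 for w in omega})


def cond_exp_given_X():
    """E[Y | sigma(X)]: average Y over each level set {X = x}."""
    result = {}
    for x in set(X.values()):
        block = [w for w in omega if X[w] == x]
        mass = sum(prob[w] for w in block)
        mean = sum(prob[w] * Y[w] for w in block) / mass
        for w in block:
            result[w] = mean
    return result


def is_function_of_X(Z):
    """On a finite space, Z = f(X) for some f iff Z is constant on each {X = x}."""
    seen = {}
    for w in omega:
        if seen.setdefault(X[w], Z[w]) != Z[w]:
            return False
    return True


best = cond_exp_given_X()

print(mse(best))               # 8/3: smallest error among predictors f(X)
print(mse(Y))                  # 0: Z = Y is A-measurable and has zero error
print(is_function_of_X(best))  # True
print(is_function_of_X(Y))     # False: Y is not sigma(X)-measurable
```

In this toy example, minimizing over all $\mathcal{A}$-measurable $Z$ returns $Y$ itself, which is not of the form $f(X)$, whereas minimizing over $f$ returns the level-set averages of $Y$, i.e. $\mathbb{E}[Y|\sigma(X)]$.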

Thank you. So you are referring to the factorization lemma by Doob?
– guest1, Commented Apr 24 at 10:02
I have edited my question. Could you maybe look at my updated question?
– guest1, Commented Apr 24 at 11:18
@guest1, I have edited my answer. I hope this clears up any confusion.
– Pantelis Tassopoulos, Commented Apr 24 at 15:30
Thanks for the update! What I do not understand here is the following: You write that if we take $f$ as an $\mathcal{A}-\mathcal{G}$-measurable function, then we have to use $\arg \min_{Z:(\Omega, \mathcal{A})\rightarrow (\mathcal{Y}, \mathcal{G})}\mathbb{E}[(Y-Z)^2]$. So the question is: Can we not write $Z$ in that case as a function of $X$? I.e., does $Z=f(X)$ not hold here?
– guest1, Commented Apr 25 at 7:47
