Definition: For random variables $X\in\mathbb R^{d_1}$ and $Y\in\mathbb R^{d_2}$, we define the conditional expectation of $X$ given $Y$ to be any random variable $Z$ satisfying:
- there exists $g:\mathbb R^{d_2}\rightarrow\mathbb R^{d_1}$ such that $Z=g(Y)$ and
- $\mathbb E\left[Z\unicode{x1D7D9}_{\{Y\in A\}}\right]=\mathbb E\left[X\unicode{x1D7D9}_{\{Y\in A\}}\right]$ for all measurable $A\subseteq \mathbb R^{d_2}$.
To be honest, I don't understand this definition. Specifically:
- the reason for requiring $\mathbb E[X|Y]$ to be a function of $Y$;
- why $\mathbb E\left[Z\unicode{x1D7D9}_{\{Y\in A\}}\right]=\mathbb E\left[X\unicode{x1D7D9}_{\{Y\in A\}}\right]$ is needed for all $A\subseteq \mathbb R^{d_2}$.
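For what it's worth, the only case where I can see the second condition doing something is when $Y$ is discrete: taking $A=\{y\}$ for a value $y$ with $\mathbb P(Y=y)>0$, and using $Z=g(Y)$, I get
$$\mathbb E\left[Z\unicode{x1D7D9}_{\{Y=y\}}\right]=g(y)\,\mathbb P(Y=y)=\mathbb E\left[X\unicode{x1D7D9}_{\{Y=y\}}\right],$$
so the condition forces $g(y)=\mathbb E\left[X\unicode{x1D7D9}_{\{Y=y\}}\right]/\mathbb P(Y=y)$ — but I don't see what requiring it for *all* $A$ buys in general.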
Here is one example they mentioned:
$\Omega=[-1,1]$ and $\mathbb P$ is the uniform distribution. Define $$\begin{align}X(\omega)&=-\frac12+\unicode{x1D7D9}_{\{\omega\in[-1,-1/2]\cup[0,1/2]\}}+2\unicode{x1D7D9}_{\{\omega\in[-1/2,0]\}}\\Y(\omega)&=\unicode{x1D7D9}_{\{\omega\geq0\}}\\Z(\omega)&=1-Y(\omega)\end{align}$$ Then $\mathbb E[X|Y]=Z$ and $\mathbb P(X=Z)=0$.
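To convince myself that this $Z$ really satisfies the second condition, I ran a quick Monte Carlo check (my own sketch, not from the source; since $Y$ only takes the values $0$ and $1$, the events $\{Y\in A\}$ reduce to subsets of $\{0,1\}$):

```python
import random

random.seed(0)
N = 500_000
samples = [random.uniform(-1.0, 1.0) for _ in range(N)]  # P = uniform on [-1, 1]

def X(w):
    # X(ω) = -1/2 + 1_{ω ∈ [-1,-1/2] ∪ [0,1/2]} + 2·1_{ω ∈ [-1/2,0]}
    return -0.5 + ((-1 <= w <= -0.5) or (0 <= w <= 0.5)) + 2 * (-0.5 <= w <= 0)

def Y(w):
    return 1 if w >= 0 else 0

def Z(w):
    return 1 - Y(w)

# Compare E[X·1_{Y∈A}] with E[Z·1_{Y∈A}] for every subset A of Y's range {0, 1}
for A in [{0}, {1}, {0, 1}]:
    ex = sum(X(w) for w in samples if Y(w) in A) / N
    ez = sum(Z(w) for w in samples if Y(w) in A) / N
    print(f"A={A}: E[X·1]≈{ex:.3f}, E[Z·1]≈{ez:.3f}")

# ...and yet Z essentially never equals X pointwise
print("P(X = Z) ≈", sum(X(w) == Z(w) for w in samples) / N)
```

The two columns do agree for every $A$, even though $X$ and $Z$ never coincide pointwise — which I suppose is the point of the example, but it still doesn't tell me how one *finds* $Z$ from the definition.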
I didn't get how to compute the conditional expectation using the above definition.
Here is another definition, from *A First Look at Rigorous Probability Theory* by Jeffrey S. Rosenthal:
Definition: If $Y$ is a random variable and $B$ is an event with $\mathbb P(B)>0$, and if we define $v$ by $v(S)=\mathbb P(Y\in S\mid B)=\mathbb P(Y\in S,B)/\mathbb P(B)$, then $v=\mathcal L(Y|B)$ is a probability measure, called the conditional distribution of $Y$ given $B$. $\mathcal L(Y\unicode{x1D7D9}_{B})=\mathbb P(B)\,\mathcal L(Y|B)+\mathbb P(B^c)\,\delta_0$, so taking expectations and re-arranging, $$\mathbb E(Y|B)=\mathbb E(Y\unicode{x1D7D9}_{B})/\mathbb P(B)$$
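If I try to fill in the "taking expectations and re-arranging" step myself: the mean of $\delta_0$ is $0$, so taking the mean on both sides of the identity for $\mathcal L(Y\unicode{x1D7D9}_{B})$ should give
$$\mathbb E(Y\unicode{x1D7D9}_{B})=\mathbb P(B)\,\mathbb E(Y|B)+\mathbb P(B^c)\cdot 0,$$
and dividing by $\mathbb P(B)$ yields the displayed formula — assuming that's the intended reading.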
Here, too, I can't understand the role of $v$, or how this definition connects to the first one above.