I've been trying to understand $\sigma$-algebras and how they encode information in the context of filtrations. While certain parts seem clear and logical, I can't say I get the whole picture.
I'll try to explain my confusion using the classical example of two coin tosses: the sample space $\Omega = \{ HH, HT, TH, TT \}$ and a random variable $X(\omega)$ equal to the number of heads.
At times $0$, $1$ and $2$ the available information is represented by the $\sigma$-algebras $\mathcal{F}_0=\{\emptyset,\Omega\}$, $\mathcal{F}_1=\{\emptyset, \Omega, \{HH,HT\},\{TH,TT\}\}$ and $\mathcal{F}_2=2^\Omega$ (all sixteen subsets of $\Omega$; listing only the singletons together with $\{HH,HT\}$ and $\{TH,TT\}$ would not give a $\sigma$-algebra, since such a collection is not closed under unions).
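Since everything here is finite, the $\sigma$-algebra axioms can be checked mechanically. Below is a minimal Python sketch of that check (`powerset` and `is_sigma_algebra` are illustrative helper names, not from any library):

```python
from itertools import chain, combinations

# Sample space for two coin tosses.
OMEGA = frozenset({"HH", "HT", "TH", "TT"})

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

F0 = {frozenset(), OMEGA}
F1 = F0 | {frozenset({"HH", "HT"}), frozenset({"TH", "TT"})}
F2 = powerset(OMEGA)  # 2^Omega: all 16 subsets

def is_sigma_algebra(F, omega=OMEGA):
    """Finite case: contains omega, closed under complement and union."""
    return (omega in F
            and all(omega - A in F for A in F)
            and all(A | B in F for A in F for B in F))

for name, F in [("F0", F0), ("F1", F1), ("F2", F2)]:
    print(name, is_sigma_algebra(F))  # prints True for each
```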
One can notice that $X(\omega)$ is not measurable with respect to $\mathcal{F}_0$ or $\mathcal{F}_1$: for instance $X^{-1}\big((\tfrac{3}{2}, +\infty)\big)=\{HH\}$, which belongs to neither. To me this is quite surprising: intuitively $X$ makes perfect sense at all times. In particular it has an expected value at time $0$, which I interpret as meaning that the probabilities and values of all outcomes $\{\omega\}$ can be computed. How should I think of a non-measurable function?
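To make the non-measurability concrete: for a finite-valued $X$ it suffices to test whether each level set $\{X = x\}$ lies in $\mathcal{F}_t$, since every preimage $X^{-1}(B)$ is a finite union of level sets. A self-contained sketch of that test (`is_measurable` is again just an illustrative name):

```python
from itertools import chain, combinations

OMEGA = frozenset({"HH", "HT", "TH", "TT"})
F0 = {frozenset(), OMEGA}
F1 = F0 | {frozenset({"HH", "HT"}), frozenset({"TH", "TT"})}
F2 = {frozenset(c) for c in chain.from_iterable(
    combinations(sorted(OMEGA), r) for r in range(5))}  # all 16 subsets

X = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}  # number of heads

def is_measurable(X, F):
    """X is F-measurable iff every level set {X = x} is in F; any
    preimage X^{-1}(B) is then a finite union of such level sets."""
    return all(
        frozenset(w for w, v in X.items() if v == x) in F
        for x in set(X.values()))

print(is_measurable(X, F0))  # False: {X = 2} = {HH} is not in F0
print(is_measurable(X, F1))  # False: {HH} is not in F1 either
print(is_measurable(X, F2))  # True
```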
Here's another way of expressing the same confusion. The most natural choice of $\sigma$-algebra in the finite discrete case is $\mathcal{F}=2^\Omega$, and it is implicitly used in all elementary probability problems. However, this choice of $\mathcal{F}$ does not reflect which information is known or unknown; conditional probability does. Does this mean that the statement "a $\sigma$-algebra represents known information" makes sense only when conditioning? Why is it convenient then?
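For comparison, here is how conditioning interacts with the filtration in this example, assuming fair, independent tosses (not stated above, but standard for this example). On a finite space, $E[X \mid \mathcal{F}_t]$ averages $X$ over each atom of $\mathcal{F}_t$, producing an $\mathcal{F}_t$-measurable random variable; `cond_exp` is an illustrative helper:

```python
P = {"HH": 0.25, "HT": 0.25, "TH": 0.25, "TT": 0.25}  # fair, independent
X = {"HH": 2, "HT": 1, "TH": 1, "TT": 0}              # number of heads

# Atoms (minimal nonempty sets) of F0 and F1 from the question.
atoms = {
    "F0": [{"HH", "HT", "TH", "TT"}],
    "F1": [{"HH", "HT"}, {"TH", "TT"}],
}

def cond_exp(X, P, atoms_of_F):
    """E[X | F] for a finite sigma-algebra given by its atoms: constant
    on each atom, equal to the P-weighted average of X over that atom."""
    E = {}
    for A in atoms_of_F:
        pA = sum(P[w] for w in A)
        avg = sum(P[w] * X[w] for w in A) / pA
        E.update({w: avg for w in A})
    return E

print(cond_exp(X, P, atoms["F0"]))  # 1.0 everywhere: the plain E[X]
print(cond_exp(X, P, atoms["F1"]))  # 1.5 on {HH,HT}, 0.5 on {TH,TT}
```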