Let \((\Omega,F,P)\) be a probability space and \(G\subseteq F\) a sub-σ-algebra. Assume furthermore that \(E[X^2]<\infty\).
(i) Show that \(E[(X-Y)^2]=E[(X-E[X\mid G])^2]+E[(E[X\mid G]-Y)^2]\) holds for all \(Y:(\Omega,G)\to(\mathbb{R},B(\mathbb{R}))\) with \(E[Y^2]<\infty\) (i.e. all square-integrable, \(G\)-\(B(\mathbb{R})\)-measurable (!) random variables \(Y\)).
(ii) Conclude that the expected square distance to \(X\) is minimized among the class of square-integrable, \(G\)-\(B(\mathbb{R})\)-measurable random variables \(Y\) by the choice \(Y=E[X\mid G]\). What does this result mean?

User Rolele
by
7.6k points

1 Answer

Final Answer

For (i), the identity \(E[(X-Y)^2]=E[(X-E[X\mid G])^2]+E[(E[X\mid G]-Y)^2]\) holds for all square-integrable, \(G\)-\(B(\mathbb{R})\)-measurable random variables \(Y\). For (ii), the choice \(Y=E[X\mid G]\) minimizes the expected square distance to \(X\) among square-integrable, \(G\)-\(B(\mathbb{R})\)-measurable random variables \(Y\).

Step-by-step explanation

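The equation \(E[(X-Y)^2]=E[(X-E[X\mid G])^2]+E[(E[X\mid G]-Y)^2]\) in (i) is a Pythagorean (orthogonal) decomposition in \(L^2\); the law of total variance is its special case \(Y=E[X]\). It says that the expected squared difference between \(X\) and \(Y\) is the sum of two parts: the expected squared difference between \(X\) and its conditional expectation given \(G\), and the expected squared difference between that conditional expectation and \(Y\). The identity holds because the error \(X-E[X\mid G]\) is orthogonal in \(L^2\) to every square-integrable, \(G\)-measurable random variable, so the cross term in the expansion of \((X-Y)^2\) has expectation zero. This shows how the mean-square distance from \(X\) splits according to the information available in \(G\).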
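
A short derivation of (i) may make the decomposition more concrete. This is a sketch, with the shorthand \(Z=E[X\mid G]\) introduced here only for readability; it uses that \(X\), \(Y\) and \(Z\) are square-integrable, so every product below is integrable by the Cauchy-Schwarz inequality.

\[
\begin{aligned}
E[(X-Y)^2] &= E\big[((X-Z)+(Z-Y))^2\big] \\
&= E[(X-Z)^2] + 2\,E[(X-Z)(Z-Y)] + E[(Z-Y)^2], \\
E[(X-Z)(Z-Y)] &= E\big[\,E[(X-Z)(Z-Y)\mid G]\,\big] \\
&= E\big[(Z-Y)\,E[X-Z\mid G]\big] \\
&= E\big[(Z-Y)\,(E[X\mid G]-Z)\big] = 0.
\end{aligned}
\]

The third line is the tower property, and the fourth uses that \(Z-Y\) is \(G\)-measurable and can therefore be pulled out of the conditional expectation. With the cross term gone, the first two lines give exactly the identity in (i).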

The choice \(Y=E[X\mid G]\) in (ii) is admissible: \(E[X\mid G]\) is \(G\)-measurable, and it is square-integrable because the conditional Jensen inequality gives \(E[E[X\mid G]^2]\le E[X^2]<\infty\). It attains the minimum expected square distance to \(X\) among square-integrable, \(G\)-\(B(\mathbb{R})\)-measurable random variables \(Y\), because in (i) the first term on the right-hand side does not depend on \(Y\) and the second term is nonnegative. Hence \(E[X\mid G]\) is the best predictor of \(X\), in the mean-square sense, among all predictors that use only the information captured by \(G\).
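
Written as a formula, the minimization in (ii) follows from (i) in one line. As a sketch, for any square-integrable, \(G\)-measurable \(Y\):

\[
E[(X-Y)^2] = E[(X-E[X\mid G])^2] + E[(E[X\mid G]-Y)^2] \;\ge\; E[(X-E[X\mid G])^2],
\]

with equality if and only if \(E[(E[X\mid G]-Y)^2]=0\), that is, if and only if \(Y=E[X\mid G]\) almost surely. The minimizer is therefore unique up to null sets.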

This result means that, given the information available in \(G\), the best approximation or prediction of \(X\) in the mean-square sense is \(E[X\mid G]\). Geometrically, \(E[X\mid G]\) is the orthogonal projection of \(X\) onto the subspace of square-integrable, \(G\)-measurable random variables in \(L^2\), which is why the conditional expectation plays a central role in statistical estimation.
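
The decomposition can also be checked numerically. The following is a minimal sketch, not part of the original answer, under illustrative assumptions: \(G\) is generated by a discrete variable `k` with four values, `x` is `k**2` plus Gaussian noise, and the probability measure is the empirical measure of the sample, so that \(E[X\mid G]\) is simply the mean of `x` within each group. All names (`k`, `x`, `cond_exp`, `check_decomposition`, `candidates`) are hypothetical and chosen for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# k generates the sub-sigma-algebra G: a G-measurable variable is a function of k.
k = rng.integers(0, 4, size=n)
x = k**2 + rng.normal(size=n)  # square-integrable X that is not G-measurable

# Conditional expectation under the empirical measure:
# the mean of x on each atom {k = j}, mapped back to every sample point.
group_means = np.array([x[k == j].mean() for j in range(4)])
cond_exp = group_means[k]


def check_decomposition(y):
    """Return both sides of the identity in (i) for a G-measurable y."""
    lhs = np.mean((x - y) ** 2)
    rhs = np.mean((x - cond_exp) ** 2) + np.mean((cond_exp - y) ** 2)
    return lhs, rhs


# Two arbitrary G-measurable candidates (functions of k only).
candidates = {
    "linear in k": 2.0 * k - 1.0,
    "constant": np.full(n, x.mean()),
}

# Mean squared distance attained by the conditional expectation itself.
best = np.mean((x - cond_exp) ** 2)

for name, y in candidates.items():
    lhs, rhs = check_decomposition(y)
    print(f"{name}: lhs={lhs:.6f}, rhs={rhs:.6f}, best={best:.6f}")
```

Because the group means are the exact conditional expectation under the empirical measure, the two sides should agree up to floating-point error for any candidate that depends on `k` alone, and no such candidate should beat `best`. A candidate that also depends on the noise is not \(G\)-measurable, and the identity is not expected to hold for it.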

User Akshar Gupta
by
7.6k points