I am going through the book The Elements of Statistical Learning: http://statweb.stanford.edu/~tibs/ElemStatLearn/printings/ESLII_print10.pdf
In chapter 3 on linear regression (page 44), the authors mention that the least squares criterion is valid, from a statistical point of view, if:
- The training observations $(x_i, y_i)$ represent independent random draws.
Or
- The $y_i$'s are conditionally independent given the inputs $x_i$.
I don't understand this requirement. The criterion seems valid to me no matter what: all it does is measure the goodness of fit in linear terms.
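For reference, the criterion I have in mind is the residual sum of squares. Written out as I understand it (the notation here is my own assumption: $N$ observations, $p$ inputs, $x_{ij}$ the $j$-th input of observation $i$, and $\beta$ the coefficients):

$$\mathrm{RSS}(\beta) = \sum_{i=1}^{N} \Big( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Big)^2$$

Nothing in this expression appears to refer to how the pairs $(x_i, y_i)$ were generated, which is why the independence conditions puzzle me.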
Can anyone explain these two requirements to me?