amoeba

The issues are indeed subtle. But it is definitely not true that LOOCV has larger variance in general. A recent paper discusses some key aspects and addresses several seemingly widespread misconceptions on cross-validation.

Yongli Zhang and Yuhong Yang (2015). Cross-validation for selecting a model selection procedure. Journal of Econometrics, vol. 187, 95-112.

The following misconceptions are frequently seen in the literature, even up to now:

"Leave-one-out (LOO) CV has smaller bias but larger variance than leave-more-out CV"

This view is quite popular. For instance, Kohavi (1995, Section 1) states: "For example, leave-one-out is almost unbiased, but it has high variance, leading to unreliable estimates". The statement, however, is not generally true.

In more detail:

In the literature, even including recent publications, there are overly taken recommendations. The general suggestion of Kohavi (1995) to use 10-fold CV has been widely accepted. For instance, Krstajic et al (2014, page 11) state: “Kohavi [6] and Hastie et al [4] empirically show that V-fold cross-validation compared to leave-one-out cross-validation has lower variance”. They consequently take the recommendation of 10-fold CV (with repetition) for all their numerical investigations. In our view, such a practice may be misleading. First, there should not be any general recommendation that does not take into account of the goal of the use of CV. In particular, examination of bias and variance of CV accuracy estimation of a candidate model/modeling procedure can be a very different matter from optimal model selection (with either of the two goals of model selection stated earlier). Second, even limited to the accuracy estimation context, the statement is not generally correct. For models/modeling procedures with low instability, LOO often has the smallest variability. We have also demonstrated that for highly unstable procedures (e.g., LASSO with p_n much larger than n), the 10-fold or 5-fold CVs, while reducing variability, can have significantly larger MSE than LOO due to even worse bias increase.

Overall, from Figures 3-4, LOO and repeated 50- and 20-fold CVs are the best here, 10-fold is significantly worse, and k ≤ 5 is clearly poor. For predictive performance estimation, we tend to believe that LOO is typically the best or among the best for a fixed model or a very stable modeling procedure (such as BIC in our context) in both bias and variance, or quite close to the best in MSE for a more unstable procedure (such as AIC or even LASSO with p ≫ n). While 10-fold CV (with repetitions) certainly can be the best sometimes, but more frequently, it is in an awkward position: it is riskier than LOO (due to the bias problem) for prediction error estimation and it is usually worse than delete-n/2 CV for identifying the best candidate.
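To see what "variability of the CV estimate" means in practice, here is a small self-contained simulation sketch of my own (it is not the simulation design from Zhang & Yang's paper): for a stable procedure (OLS on a fixed linear model), it compares how much the LOO and 10-fold CV estimates of prediction error fluctuate across repeated datasets. All names and settings below are illustrative assumptions.

```python
# Illustrative sketch only -- compares the spread of LOO vs 10-fold CV
# error estimates for a stable procedure (OLS). Settings are assumptions
# for demonstration, not the design used in Zhang & Yang (2015).
import numpy as np

rng = np.random.default_rng(0)

def cv_mse(X, y, k):
    """k-fold CV estimate of mean squared prediction error for OLS.

    k = n gives leave-one-out CV.
    """
    n = len(y)
    idx = rng.permutation(n)
    folds = np.array_split(idx, k)
    sq_errors = []
    for fold in folds:
        train = np.ones(n, dtype=bool)
        train[fold] = False
        # Fit OLS on the training folds, predict the held-out fold
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        sq_errors.extend((y[fold] - X[fold] @ beta) ** 2)
    return float(np.mean(sq_errors))

n, p, reps = 100, 3, 200
beta_true = np.array([1.0, 2.0, -1.0, 0.5])  # intercept + p slopes
loo_est, tenfold_est = [], []
for _ in range(reps):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])
    y = X @ beta_true + rng.normal(size=n)   # noise variance = 1
    loo_est.append(cv_mse(X, y, k=n))        # LOO = n-fold CV
    tenfold_est.append(cv_mse(X, y, k=10))

print("mean / sd of LOO estimate:    ",
      np.mean(loo_est), np.std(loo_est))
print("mean / sd of 10-fold estimate:",
      np.mean(tenfold_est), np.std(tenfold_est))
```

Both estimates should hover near the true expected prediction error (slightly above the noise variance of 1 here); the point of the exercise is that for a stable procedure like OLS, the spread of the LOO estimate is not systematically larger than that of 10-fold CV, contrary to the misconception quoted above.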

Zack
