  • Personally, I would not use the phrase "external cross validation", as I would see cross validation as the repeated splitting off of different validation sets from the training set for model selection and tuning purposes. You cannot meaningfully do this repeatedly with the test set, as that is a one-off proxy for future, as-yet-unknown data used to judge the performance of the final model. Commented Aug 26, 2015 at 10:11
  • Henry, I don't think you are understanding external cross validation. You can "do this repeatedly with the test set," repeatedly holding out some portion of your full data for test purposes while executing your full training procedure on the rest (which may include internal cross validation). External cross validation is still typically done in folds, and allows all of the original data to be in the test set at some point. Commented Aug 26, 2015 at 11:33
  • @jlimahaverford, why does the internal CV not produce a good measure of the actual algorithm's performance? I think mathematically it does: when you fit your model, say LASSO with a specific hyperparameter $\alpha$, and then apply it to another piece of the data (one you did not use to fit the model) in the internal CV process, that error rate should be an unbiased estimator of the true prediction error rate for your specific $\alpha$, shouldn't it? Commented Mar 27, 2017 at 0:55
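
For concreteness, here is a minimal sketch of the nested ("external" plus "internal") cross-validation procedure discussed in these comments, assuming scikit-learn and a synthetic regression problem; the dataset, the grid of $\alpha$ values, and the fold counts are illustrative only. The inner loop tunes the LASSO penalty $\alpha$ on each outer training fold, while the outer loop holds out data that the tuning never sees, so the averaged outer score estimates the performance of the whole training-plus-tuning procedure rather than of a single fixed $\alpha$.

```python
# Minimal sketch of nested (external + internal) cross-validation with LASSO.
# All names and parameter values here are illustrative, not from the original post.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, GridSearchCV, cross_val_score

# Synthetic data standing in for the real dataset.
X, y = make_regression(n_samples=200, n_features=50, noise=1.0, random_state=0)

# Internal CV: selects the LASSO penalty alpha on each outer training fold.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=1)
tuned_lasso = GridSearchCV(
    estimator=Lasso(max_iter=10000),
    param_grid={"alpha": np.logspace(-3, 1, 20)},
    cv=inner_cv,
    scoring="neg_mean_squared_error",
)

# External CV: each outer test fold is never touched by the tuning above,
# so the averaged score estimates the error of the full fit-and-tune pipeline.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=2)
outer_scores = cross_val_score(
    tuned_lasso, X, y, cv=outer_cv, scoring="neg_mean_squared_error"
)
print("Estimated generalization MSE: %.3f" % -outer_scores.mean())
```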