- One may be tempted to use the entire training data to select the “optimal” classifier, then estimate the error rate.
- This naïve approach has two fundamental problems:
  - The final model will normally overfit the training data: it will not be able to generalize to new data.
    - The problem of overfitting is more pronounced with models that have a large number of parameters.
  - The error rate estimate will be overly optimistic (lower than the true error rate).
    - In fact, it is not uncommon to have 100% correct classification on training data.
- The techniques presented in this lecture will allow you to make the best use of your (limited) data for:
  - Training
  - Model selection
  - Performance estimation
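The optimistic bias described above can be seen in a small experiment. The following is a minimal sketch, not part of the lecture material: it assumes a 1-nearest-neighbor classifier and synthetic data drawn from two overlapping Gaussian classes, and every name in it (`make_data`, `nearest_neighbor_predict`, `accuracy`) is hypothetical. Because each training point is its own nearest neighbor, accuracy on the training data comes out at 100%, while accuracy on data the classifier has not seen is lower.

```python
import random


def nearest_neighbor_predict(train, point):
    """Return the label of the training example closest to `point` (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train, key=lambda example: sq_dist(example[0], point))
    return label


def accuracy(train, examples):
    """Fraction of `examples` that a 1-NN classifier built on `train` labels correctly."""
    correct = sum(nearest_neighbor_predict(train, x) == y for x, y in examples)
    return correct / len(examples)


def make_data(n, rng):
    """Synthetic data (an assumption for illustration): two overlapping 2-D Gaussian classes."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        point = (rng.gauss(label, 1.0), rng.gauss(label, 1.0))
        data.append((point, label))
    return data


rng = random.Random(0)
train_set = make_data(200, rng)
held_out = make_data(200, rng)  # data never used to build the classifier

# Naive estimate: evaluate on the same data the classifier memorized.
train_acc = accuracy(train_set, train_set)
# Honest estimate: evaluate on data the classifier has not seen.
held_out_acc = accuracy(train_set, held_out)

print(f"accuracy on training data: {train_acc:.2f}")
print(f"accuracy on held-out data: {held_out_acc:.2f}")
```

The gap between the two printed numbers is the optimism of the naïve estimate. The classes overlap, so no classifier can label all unseen points correctly, yet the training-data estimate reports a perfect score.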