Timeline for Cross-validation for (hyper)parameter tuning to be performed in validation set or training set?
Current License: CC BY-SA 4.0
15 events
| when | what | by | license | comment |
|---|---|---|---|---|
| Sep 21, 2021 at 9:04 | history bumped | CommunityBot | | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. |
| Jul 19, 2021 at 18:48 | answer added | Atul Mishra | | timeline score: 2 |
| Jul 19, 2021 at 18:00 | history bumped | CommunityBot | | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. |
| Jul 2, 2021 at 0:00 | history tweeted | | | twitter.com/StackStats/status/1410750117750050822 |
| Mar 17, 2021 at 20:11 | comment added | Firebug | | Option #2 is the most correct, but there's a catch: you train the "test model" (the one that will be applied to test data) on the validation + training sets. The final model (the production model that will be used on unseen data) is trained on the full dataset (training + validation + testing). It's a tiered approach: the hyperparameters are defined in an inner loop, the expected generalization performance is estimated in an outer loop, and the final model comes from all data. |
| Mar 17, 2021 at 20:04 | history bumped | CommunityBot | | This question has answers that may be good or bad; the system has marked it active so that they can be reviewed. |
| S Feb 9, 2021 at 2:28 | history suggested | Shayan Shafiq | | Edited tags |
| Feb 8, 2021 at 18:30 | review Suggested edits | | | S Feb 9, 2021 at 2:28 |
| Sep 25, 2019 at 20:52 | answer added | desertnaut | | timeline score: 0 |
| Sep 18, 2019 at 10:53 | comment added | user2974951 | | That is a good question. Have a look at stats.stackexchange.com/questions/19048/… |
| Sep 17, 2019 at 14:53 | comment added | KubiK888 | | Aside from the data scarcity issue, what about overfitting the parameters based on the validation data? Why is it better to overfit to your validation data than to your training data? |
| Sep 17, 2019 at 11:30 | comment added | user2974951 | | Yes, that is of course a problem, hence you would do that only if you can afford it, i.e. if you have a large enough sample; otherwise you may be better off with only 2 splits or none. |
| Sep 16, 2019 at 17:10 | comment added | KubiK888 | | Could you explain why? If doing parameter tuning in the training set overfits the training set, wouldn't doing parameter tuning in the validation set overfit the validation set? And if you then use this best (from the validation set) but potentially overfit parameter set to train on the training set, wouldn't this still be a (or even a bigger) problem? |
| Sep 16, 2019 at 7:06 | comment added | user2974951 | | Yes, option 2 is better, but it does require more data. So do it if you can afford it. |
| Sep 14, 2019 at 22:22 | history asked | KubiK888 | CC BY-SA 4.0 | |
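Firebug's comment above describes a tiered (nested) setup: hyperparameters are chosen in an inner cross-validation loop, the expected generalization performance is estimated in an outer loop, and the final production model is refit on all data. A minimal sketch of that idea, assuming scikit-learn; the SVM, parameter grid, and variable names are illustrative and do not come from the question:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

# Stand-in data; in practice this would be the full available dataset.
X, y = make_classification(n_samples=500, random_state=0)

# Illustrative grid, not taken from the original discussion.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# Inner loop: hyperparameters are selected by cross-validation.
inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVC(), param_grid, cv=inner_cv)

# Outer loop: estimates the generalization performance of the whole
# tuning-plus-fitting procedure, not of one fixed parameter setting.
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)
outer_scores = cross_val_score(search, X, y, cv=outer_cv)
print("estimated generalization accuracy: %.3f +/- %.3f"
      % (outer_scores.mean(), outer_scores.std()))

# Final (production) model: rerun the tuning on all available data and
# keep the refitted estimator, as described in the comment above.
final_model = search.fit(X, y).best_estimator_
```

Note that the outer score evaluates the tuning procedure itself, which is why it can be reported as an estimate of generalization performance while the final model is still trained on every observation.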