Early stopping together with hyperparameter tuning in neural networks

Similar to this question (hyperparameter tuning in neural networks), I have a neural network with a comparable list of hyperparameters:

  • Learning rate: $[0.001, 0.01, 0.1]$
  • $L_1$ penalty: $[0.01, 0.05, 0.1, 0.5]$
  • Early stopping tolerance: $[0.0001, 0.001, 0.01]$

The paper I'm replicating didn't use dropout, but it also didn't specify exactly how the hyperparameters were tuned. So I've reserved a portion of the data for choosing the learning rate and $L_1$ penalty, but for how many epochs do I train?

This is where early stopping comes in. I can either further split my training data and use a smaller portion just for early stopping, or I can use my larger validation set for early stopping and also use the validation error at the stopping point to choose my hyperparameters. Conceptually, I would train my model solely on the training set and choose hyperparameters using the validation set, but stopping training early and choosing hyperparameters on the same data seems to let the supposedly "unseen" data influence training. Which method should I use?
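
For concreteness, here is a minimal sketch of the first option: carve a separate early-stopping split out of the training data, so the validation set is touched only once per configuration, for hyperparameter selection. It uses PyTorch on synthetic data; the grid values are the ones above, but the architecture, patience, epoch budget, and split proportions are my own illustrative assumptions, not anything from the paper.

```python
# Sketch of option 1: three-way split (fit / early-stop / validation).
# Architecture, patience, max_epochs, and the synthetic data are assumptions.
import itertools
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the real dataset (assumed binary classification).
X = torch.randn(1000, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()

X_fit, y_fit = X[:600], y[:600]        # used for gradient updates
X_es,  y_es  = X[600:800], y[600:800]  # used only to decide when to stop
X_val, y_val = X[800:], y[800:]        # used only to pick hyperparameters

def heldout_loss(model, Xs, ys):
    with torch.no_grad():
        return nn.functional.cross_entropy(model(Xs), ys).item()

def train(lr, l1, tol, max_epochs=200, patience=10):
    model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    best, since_best = float("inf"), 0
    for epoch in range(max_epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X_fit), y_fit)
        # L1 penalty added to the training loss.
        loss = loss + l1 * sum(p.abs().sum() for p in model.parameters())
        loss.backward()
        opt.step()
        es = heldout_loss(model, X_es, y_es)
        if es < best - tol:   # early-stopping tolerance from the grid
            best, since_best = es, 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    return model

# Grid search: early stopping sees only the early-stop split; the
# validation set is used once per configuration, to score it.
grid = itertools.product([0.001, 0.01, 0.1],      # learning rate
                         [0.01, 0.05, 0.1, 0.5],  # L1 penalty
                         [0.0001, 0.001, 0.01])   # early-stopping tolerance
best_cfg, best_score = None, float("inf")
for lr, l1, tol in grid:
    score = heldout_loss(train(lr, l1, tol), X_val, y_val)
    if score < best_score:
        best_cfg, best_score = (lr, l1, tol), score
print("selected:", best_cfg, "val loss:", best_score)
```

Under this protocol the early-stopping split plays the role of training-time "unseen" data, and the validation set is reserved for model selection; the second option, as I understand it, would merge `X_es` into `X_val` and reuse that set for both decisions.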