Early stopping together with hyperparameter tuning in neural networks

Similar to this question (hyperparameter tuning in neural networks), I have a neural network with a comparable list of hyperparameters:

  • Learning rate: $[0.001, 0.01, 0.1]$
  • $L_1$ penalty: $[0.01, 0.05, 0.1, 0.5]$
  • Early stopping tolerance: $[0.0001, 0.001, 0.01]$

The paper I'm replicating didn't use dropout, but it also didn't specify exactly how the hyperparameters were tuned. So I've reserved a portion of the data for choosing the learning rate and $L_1$ penalty, but for how many epochs do I train?

This is where early stopping comes in. I can either further split my training data and use a smaller portion just for early stopping, or I can use my larger validation set for early stopping and also use the validation error at the stopping point to choose my hyperparameters. Conceptually, I would train my model solely on the training set and choose hyperparameters using the validation set, but stopping training early and choosing hyperparameters on the same data seems to let the supposedly "unseen" data influence training. Which method should I use?
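
For concreteness, here is a minimal sketch of the first option: carve a separate early-stopping split out of the training data, so the validation set is touched only once per configuration, for hyperparameter selection. It uses PyTorch on synthetic data; the grid values are the ones above, but the architecture, patience, epoch budget, and split proportions are my own illustrative assumptions, not anything from the paper.

```python
# Sketch of option 1: three-way split (fit / early-stop / validation).
# Architecture, patience, max_epochs, and the synthetic data are assumptions.
import itertools
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in for the real dataset (assumed binary classification).
X = torch.randn(1000, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()

X_fit, y_fit = X[:600], y[:600]        # used for gradient updates
X_es,  y_es  = X[600:800], y[600:800]  # used only to decide when to stop
X_val, y_val = X[800:], y[800:]        # used only to pick hyperparameters

def heldout_loss(model, Xs, ys):
    with torch.no_grad():
        return nn.functional.cross_entropy(model(Xs), ys).item()

def train(lr, l1, tol, max_epochs=200, patience=10):
    model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    best, since_best = float("inf"), 0
    for epoch in range(max_epochs):
        opt.zero_grad()
        loss = nn.functional.cross_entropy(model(X_fit), y_fit)
        # L1 penalty added to the training loss.
        loss = loss + l1 * sum(p.abs().sum() for p in model.parameters())
        loss.backward()
        opt.step()
        es = heldout_loss(model, X_es, y_es)
        if es < best - tol:   # early-stopping tolerance from the grid
            best, since_best = es, 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    return model

# Grid search: early stopping sees only the early-stop split; the
# validation set is used once per configuration, to score it.
grid = itertools.product([0.001, 0.01, 0.1],      # learning rate
                         [0.01, 0.05, 0.1, 0.5],  # L1 penalty
                         [0.0001, 0.001, 0.01])   # early-stopping tolerance
best_cfg, best_score = None, float("inf")
for lr, l1, tol in grid:
    score = heldout_loss(train(lr, l1, tol), X_val, y_val)
    if score < best_score:
        best_cfg, best_score = (lr, l1, tol), score
print("selected:", best_cfg, "val loss:", best_score)
```

Under this protocol the early-stopping split plays the role of training-time "unseen" data, and the validation set is reserved for model selection; the second option, as I understand it, would merge `X_es` into `X_val` and reuse that set for both decisions.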