Questions tagged [training-error]
The training-error tag has no summary.
76 questions
0 votes
0 answers
47 views
How do you pick the best model out of a set of models if you know both the training error and the validation error of each model?
An interesting question I stumbled upon today is this: Suppose I train models $m_1, m_2, m_3, \ldots, m_N$, where each $m_i$, $i = 1, \ldots, N$, is associated with a hyperparameter $i$. All models are ...
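A minimal sketch of the standard selection rule, with made-up error values (the `train_err`/`val_err` arrays below are hypothetical): the training error is ignored for selection, and the model with the lowest validation error wins.

```python
import numpy as np

# Hypothetical errors for N = 5 models, model i using hyperparameter i.
train_err = np.array([0.40, 0.25, 0.15, 0.08, 0.03])
val_err = np.array([0.42, 0.30, 0.22, 0.24, 0.35])

# Standard rule: training error is not a selection criterion; pick the
# model whose *validation* error is smallest.
best = int(np.argmin(val_err))
print(f"selected m_{best + 1} with validation error {val_err[best]:.2f}")
```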
1 vote
1 answer
135 views
What is the current consensus on "using test set as training set, post testing"? [duplicate]
This question is inspired by a blog post (https://www.argmin.net/p/in-defense-of-typing-monkeys) and several rumors I've heard from other people who work in machine learning. The gist of it is that ...
0 votes
0 answers
80 views
Is it wrong to use the TEST set to calculate the optimal threshold for binary classification and then calculate the accuracy on the same test set?
I have a dataset that has been split into two parts, a training set and a test set. After training a model on the training set to classify between class 0 and class 1, I used sklearn's roc_curve to calculate the ...
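A common fix, sketched below on assumed data: tune the threshold on a validation split (here via Youden's J statistic from `roc_curve`) and only then score the untouched test set. All arrays are made up for illustration.

```python
import numpy as np
from sklearn.metrics import roc_curve, accuracy_score

# Hypothetical validation labels/scores; the threshold is tuned here,
# never on the test set.
y_val = np.array([0, 0, 1, 1, 0, 1])
p_val = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

fpr, tpr, thresholds = roc_curve(y_val, p_val)
threshold = thresholds[np.argmax(tpr - fpr)]  # Youden's J: max(tpr - fpr)

# The frozen threshold is applied once to the untouched test split.
y_test = np.array([0, 1, 1, 0])
p_test = np.array([0.3, 0.6, 0.9, 0.15])
print(accuracy_score(y_test, (p_test >= threshold).astype(int)))
```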
0 votes
0 answers
47 views
How to Choose Regularization Hyperparameters Based on Consistency vs. Accuracy in Model Training?
I’m training a machine learning model and optimizing the regularization hyperparameters to ensure the model generalizes well. During training, I include regularization terms in the loss function to ...
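One minimal version of this trade-off, sketched with synthetic data and ridge regression (an assumption; the question does not name the model): scan a grid of penalties and compare how both the training error and the fitted coefficients move.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.5, 3.0]) + rng.normal(scale=0.5, size=50)

# Closed-form ridge fit over a grid of penalties. One pragmatic recipe:
# pick the lambda with the best held-out error, then check that nearby
# lambdas yield similar coefficients (consistency) before committing.
for lam in [0.01, 0.1, 1.0, 10.0]:
    beta = np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)
    mse = np.mean((y - X @ beta) ** 2)
    print(f"lambda={lam:5.2f}  train MSE={mse:.3f}  ||beta||={np.linalg.norm(beta):.3f}")
```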
4 votes
1 answer
140 views
How do machine learning topics fit into a traditional undergraduate statistics course on estimation?
I'm currently teaching an undergraduate introduction to statistics course but, as required by the program director, I need to add some machine learning material to it. I'm wondering what is the appropriate ...
0 votes
0 answers
50 views
Model training loss always converges to 1.35
I'm trying to create a multi-class classification model using RNNs. The input data has a sequence length of 90 and consists of 5 features, normalized to the [0,1] range. Here's the network ...
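One diagnostic worth checking first: a loss that plateaus at roughly $\ln K$ for $K$ classes means the softmax is outputting near-uniform probabilities, i.e. the network has stopped learning; $e^{1.35} \approx 3.9$, so with about four classes this is the prime suspect. A minimal sketch matching the stated shapes (the class count and hidden size are assumptions):

```python
import torch
import torch.nn as nn

# Sketch matching the stated input: sequences of length 90 with 5
# features; 4 classes and hidden size 64 are assumed for illustration.
class SeqClassifier(nn.Module):
    def __init__(self, n_features=5, hidden=64, n_classes=4):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):        # x: (batch, 90, 5)
        _, h = self.rnn(x)       # h: (num_layers, batch, hidden)
        return self.head(h[-1])  # logits: (batch, n_classes)

model = SeqClassifier()
x = torch.randn(8, 90, 5)
logits = model(x)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 4, (8,)))
print(loss.item())  # an untrained 4-class model sits near ln(4) ~ 1.386
```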
2 votes
1 answer
109 views
Training loss reaches zero, then suddenly increases, then decreases back to zero
I get the following loss behavior when training a multilayer perceptron with mean squared error loss on some synthetic data, using default Adam with the default learning rate. (I am working with one-dimensional data.) I ...
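Two standard remedies for this spike pattern, sketched below on assumed one-dimensional data: drop the learning rate below Adam's 1e-3 default, and clip gradients so a single bad batch cannot throw the weights far from the minimum.

```python
import torch
import torch.nn as nn

# Assumed 1-D regression setup for illustration.
model = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-4)  # below the 1e-3 default

x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(3 * x)

for step in range(1000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    # Cap the gradient norm so one large update cannot eject the
    # parameters from a near-zero-loss region.
    torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    opt.step()
print(loss.item())
```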
1 vote
0 answers
34 views
Reference Request: Rate at which Training Error goes to 0
Looking for references on the rate at which the training error goes to 0. As an example, suppose that we have a linear model $y = X \beta_0 + \epsilon,$ where $\beta_0$ is bounded and each row of $X$ is generated ...
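One standard reference point, assuming a full-rank design with $n > p$ and i.i.d. noise of variance $\sigma^2$: the expected training error of OLS is
$$\mathbb{E}\left[\tfrac{1}{n}\lVert y - X\hat{\beta}\rVert^2\right] = \sigma^2\left(1 - \tfrac{p}{n}\right),$$
so the training MSE falls linearly in $p/n$ and reaches $0$ exactly at the interpolation threshold $p = n$; any finer rate depends on how $p$ is allowed to grow with $n$.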
1 vote
1 answer
113 views
Why does the performance on the training set go down as the number of samples increases?
To my knowledge there are two types of learning curves: those that show how performance progresses as the number of epochs increases, and those that show how performance progresses as the ...
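The second kind of curve is easy to reproduce; a minimal sketch with scikit-learn's `learning_curve` on synthetic data (dataset and model choices are assumptions): with few samples a fixed-capacity model can nearly memorize the training set, and as $n$ grows the training score falls while the validation score rises toward it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=600, random_state=0)

sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

# Train score typically decreases with n; validation score increases.
for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n={n:4d}  train={tr:.3f}  val={va:.3f}")
```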
1 vote
0 answers
178 views
What is a good way to understand the difference between training RSS/MSE and test RSS for polynomial models?
I am trying to understand the difference between training RSS and test RSS for smoothing splines, e.g. $$\hat{g}=\arg \min_g \left(\sum_{i=1}^{n}(y_{i} - g(x_i))^2 + \lambda \int [g^{(m)}(x)]^{2}\,dx\right)$$ ...
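The same train/test gap can be seen with plain polynomials instead of splines (an illustrative substitution; all data below is synthetic): training RSS can only decrease as the degree grows, while test RSS eventually turns back up once the fit starts chasing noise.

```python
import numpy as np

rng = np.random.default_rng(1)
def sample(n):
    x = rng.uniform(-1, 1, n)
    return x, np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

x_tr, y_tr = sample(30)   # small training set
x_te, y_te = sample(200)  # independent test set

for deg in [1, 3, 5, 9]:
    coef = np.polyfit(x_tr, y_tr, deg)
    def rss(x, y):
        return np.sum((y - np.polyval(coef, x)) ** 2)
    print(f"deg={deg}  train RSS={rss(x_tr, y_tr):7.3f}  test RSS={rss(x_te, y_te):8.3f}")
```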
1 vote
0 answers
89 views
Peaks in error during training - Regression problem with DL model (LSTM)
During training I get unusual behavior of my model: peaks that I cannot explain show up in both the validation and training errors. I use MSE as the loss, and similar behavior appears in other ...
3 votes
0 answers
504 views
Shuffling data significantly decreases the performance of linear regression
I'm trying to build a simple linear regression model $y=ax+b$ using pytorch, where $y$ is the increase in the number of cells on Day $n$ and $x$ is the number of cells ...
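For context, a minimal version of that setup (synthetic numbers, not the asker's data): with full-batch gradient descent, shuffling only permutes the rows inside one loss computation and cannot change the result, so order can matter only when updates are made per mini-batch, or when the "shuffle" accidentally breaks the pairing between $x$ and $y$.

```python
import torch

# Fit y = a*x + b by full-batch gradient descent on permuted rows; the
# permutation is applied to x and y *jointly*, so the fit is unchanged.
x = torch.linspace(-1, 1, 20).unsqueeze(1)
y = 3.0 * x + 2.0 + 0.1 * torch.randn_like(x)

model = torch.nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
perm = torch.randperm(len(x))  # same (x, y) pairs, new order
for _ in range(500):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x[perm]), y[perm])
    loss.backward()
    opt.step()
print([p.item() for p in model.parameters()])  # approaches a ~ 3, b ~ 2
```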
4 votes
1 answer
3k views
Is it possible to have a higher train error than a test error in machine learning?
Usually it is called over-fitting when the test error is higher than the training error. Does that imply that it is called under-fitting when the training error is higher than the test error? Also ...
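One concrete way this happens, sketched below: if the training loss is logged with regularization noise such as dropout active, it is computed on a perturbed network, while test-time evaluation uses the clean one, so training error above test error need not indicate a bug. The model and data here are made up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                      nn.Dropout(p=0.7), nn.Linear(64, 1))
x, y = torch.randn(256, 10), torch.randn(256, 1)

model.train()  # dropout active, as when the training loss is logged
train_mode_loss = nn.functional.mse_loss(model(x), y).item()
model.eval()   # dropout off, as at evaluation time
eval_mode_loss = nn.functional.mse_loss(model(x), y).item()
print(train_mode_loss, eval_mode_loss)  # same data; train-mode loss is typically higher
```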
0 votes
0 answers
102 views
PRESS statistic and k-fold cross-validation
How is the PRESS statistic calculated in a k-fold cross-validation? I know how it is done in the leave-one-out scenario. Is it still summed over all training samples, just that there are now k-many ...
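One natural k-fold analogue, sketched below with assumed data: every sample is still held out exactly once, so PRESS remains a sum of held-out squared residuals over all n training samples, just computed from k refits instead of n.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=40)

# Sum squared prediction errors over held-out folds; with k = n this
# reduces to the classical leave-one-out PRESS.
press = 0.0
for tr, te in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    fit = LinearRegression().fit(X[tr], y[tr])
    press += np.sum((y[te] - fit.predict(X[te])) ** 2)
print(press)
```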
1 vote
0 answers
183 views
How to handle the problem of different random seeds giving drastically different test scores in a machine learning model?
For a rigorous empirical analysis, I am training a model with three different seeds: 0, 1, and 2. In each case, I found that the model obtained through early stopping (lowest validation loss) had an ...
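The usual remedy is to treat the seed as a source of variance and report the spread rather than any single run; a minimal sketch with hypothetical scores (the numbers below are made up):

```python
import numpy as np

# Hypothetical test scores from three seeded runs.
scores = {0: 0.81, 1: 0.74, 2: 0.88}
vals = np.array(list(scores.values()))

# Report mean and sample standard deviation; if the interval is too wide
# to support the claim, the honest fix is more seeds, not a better one.
print(f"test score: {vals.mean():.3f} +/- {vals.std(ddof=1):.3f} "
      f"over {len(vals)} seeds")
```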