
Questions tagged [overfitting]

Modeling error (especially sampling error) instead of replicable and informative relationships among variables improves model fit statistics, but reduces parsimony, and worsens explanatory and predictive validity.
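The tag description above can be made concrete with a minimal sketch (all data and settings here are illustrative, not from any question below): fitting polynomials of increasing degree to a noisy sine curve, where training error keeps falling as the degree grows while held-out error eventually rises.

```python
import numpy as np

# Illustrative overfitting demo: training error always falls as model
# capacity grows, but generalization error eventually worsens.
rng = np.random.default_rng(0)
x_train = np.linspace(0.0, 1.0, 15)
x_test = np.linspace(0.02, 0.98, 200)
true_f = lambda x: np.sin(2 * np.pi * x)
y_train = true_f(x_train) + rng.normal(0.0, 0.3, x_train.size)
y_test = true_f(x_test)  # noise-free targets to measure generalization

def errors(degree):
    # Least-squares polynomial fit of the given degree on the training set
    p = np.poly1d(np.polyfit(x_train, y_train, degree))
    train_mse = np.mean((p(x_train) - y_train) ** 2)
    test_mse = np.mean((p(x_test) - y_test) ** 2)
    return train_mse, test_mse

for deg in (1, 3, 14):
    tr, te = errors(deg)
    print(f"degree {deg:2d}: train MSE {tr:.4f}  test MSE {te:.4f}")
```

The degree-14 fit interpolates the 15 noisy points almost exactly (near-zero training error) yet oscillates wildly between them, which is exactly the "fitting sampling error" the tag description refers to.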

1 vote · 0 answers · 47 views

Neural network beginner here. I am currently implementing a CNN in PyTorch to recognize Japanese handwritten characters, with 46 output classes. I found a dataset on Kaggle https://www.kaggle....
asked by Krish Thyagarajan
0 votes · 0 answers · 51 views

There is something I have an intuition about, but my numerical toy examples do not confirm it, and I really want to understand where my mistake is. Suppose that I have a random vector $X = (X_1, \cdots, ...
asked by arthur_elbrdn
3 votes · 3 answers · 294 views

The title is perhaps purposely provocative, but still reflects my ignorance. I am trying to understand carefully why, despite a very nice Bayesian interpretation, softmax might overfit, since I've ...
asked by Chris
1 vote · 0 answers · 28 views

How accurate are the estimates of an ordered logit model with only 51 observations? Here is my Stata output from the model:
asked by Oindrila Roy
0 votes · 1 answer · 52 views

I'm learning about EM algorithms and finite mixture models, and I've run into a particularly unintuitive problem. I'm trying to fit a finite mixture regression model on simulated data, where the true ...
asked by dancing_monkeys
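For readers landing here from the EM question above: a minimal EM sketch for a two-component univariate Gaussian mixture on simulated data (the mixture-of-regressions setting in the question is analogous but adds covariates; all values here are illustrative).

```python
import numpy as np

# Simulated data: two well-separated Gaussian components
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3.0, 1.0, 200), rng.normal(3.0, 1.0, 200)])

mu = np.array([-1.0, 1.0])   # initial component means
sd = np.array([1.0, 1.0])    # initial component standard deviations
w = np.array([0.5, 0.5])     # initial mixing weights

for _ in range(100):
    # E-step: posterior responsibility of each component for each point
    dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted maximum-likelihood updates
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    w = nk / x.size

print("estimated means:", np.sort(mu).round(2))
```

With well-separated components this converges close to the true means of -3 and 3; mixture likelihoods are multimodal, though, so results can depend on initialization.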
1 vote · 0 answers · 60 views

I have a school project to train a CNN with our own architecture to classify marine mammals with a minimum accuracy of 0.82. I have been trying a lot of things and different ways ...
asked by erodrigu
2 votes · 0 answers · 80 views

Can AUC be used for model selection, and how can an excessive number of features/parameters be penalized in this case? In the frequentist framework we have various model selection criteria, like AIC, BIC,...
asked by Roger V.
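As background for the AUC question above: AUC has no built-in complexity penalty, so the usual remedy is to compare models on cross-validated (held-out) AUC rather than penalizing parameter counts directly. A minimal sketch of the rank-statistic definition of AUC (the example labels and scores are hypothetical):

```python
import numpy as np

def auc(y_true, scores):
    """AUC as the Mann-Whitney statistic: the probability that a randomly
    chosen positive is scored above a randomly chosen negative (ties
    count one half)."""
    y_true, scores = np.asarray(y_true), np.asarray(scores)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

# Unlike AIC/BIC, this number never penalizes model size, so comparing
# models on *training* AUC favors the most flexible one; compare
# held-out or cross-validated AUC instead.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```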
1 vote · 1 answer · 78 views

I am using GridSearchCV to optimize some hyperparameters of an XGBoost model. However, although the logloss (the metric I am optimizing for) seems alright according to domain knowledge, the learning ...
asked by user54565
1 vote · 1 answer · 118 views

I'm working on fitting a random forest model using the caret library in R with a repeated cross-validation design to select hyperparameters. I've also experimented with adjusting the number of trees (...
asked by Mdhale
1 vote · 0 answers · 54 views

Assume you have training data $(x_1,y_1), \ldots, (x_n,y_n)$ and a relationship $y_i=f(x_i)+\epsilon_i$, where the $\epsilon_i$ are random variables. Assume you approximate $f$ with $\hat{f}$ using the ...
asked by user394334
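The setup in the question above leads to the standard bias-variance decomposition. Under the usual additional assumptions that $\epsilon_i$ has mean zero and variance $\sigma^2$ and is independent of the data used to fit $\hat{f}$, the expected squared error at a point $x$ splits as:

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big] \;=\; \underbrace{\big(f(x) - \mathbb{E}[\hat{f}(x)]\big)^2}_{\text{bias}^2} \;+\; \underbrace{\operatorname{Var}\big(\hat{f}(x)\big)}_{\text{variance}} \;+\; \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

Overfitting is the regime where added flexibility shrinks the bias term more slowly than it inflates the variance term.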
2 votes · 1 answer · 202 views

This is sort-of a follow-up from my last question, except purely based on curiosity. I found different versions of similar bs="sz" models in ...
asked by Nate
1 vote · 0 answers · 57 views

I've been thinking about the use of cross-validation and hold-out sets and I don't really see the use of a randomly selected hold-out test set. I have to say, though, that when the hold-out is not ...
asked by adriavc00
4 votes · 1 answer · 111 views

Suppose I have a family of $N$ models for the same data, indexed by $n\in\{1,\dots,N\}$. And suppose that model $n\in\{1,\dots,N\}$ has log-likelihood given by: $$L(X_n \theta_n),$$ where $L:\mathbb{R}...
asked by cfp
0 votes · 0 answers · 90 views

I am training an MLP on a dataset where the number of features >> the number of samples. For certain reasons, an MLP with at least one hidden layer is the only architecture I am considering. ...
asked by dkolobok
1 vote · 0 answers · 50 views

I have built an XGBoost model that performs rather weirdly across months... I trained the model on a heavily imbalanced dataset (1:40 000), which I undersampled to (1:500). The model performance (...
asked by user24758287
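One known contributor to the drift described in the question above: after heavy undersampling, predicted probabilities are calibrated to the resampled base rate, not the original one. A common fix is Elkan's prior-correction formula, where beta is the fraction of majority-class examples kept. The beta value below is an assumption derived from the 1:40,000 to 1:500 ratios mentioned in the question.

```python
def recalibrate(p_s, beta):
    """Map a probability predicted on undersampled data back to the
    original base rate (Elkan's correction), where beta is the fraction
    of majority-class (negative) examples kept during undersampling."""
    return beta * p_s / (beta * p_s - p_s + 1.0)

# Going from roughly 1:40,000 to 1:500 keeps about
# beta = 500 / 40_000 = 0.0125 of the negatives (assumed here).
beta = 500 / 40_000
print(round(recalibrate(0.5, beta), 4))  # a "coin-flip" score maps to ~0.0123
```

Note this fixes calibration only; if the class ratio itself shifts across months, the ranking performance can still degrade.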
