
All Questions

0 votes
0 answers
47 views

I am fitting a simple model for a dataset where the outcome is binary (1 or 0). ...
Eagle Hawk's user avatar
1 vote
0 answers
34 views

I have a nice multiclass random forest model in R (using the packages ranger and caret) but I think this question applies to any random forest logic. When I use my RF to label unknown data I want to ...
Dr Egg's user avatar
  • 11
4 votes
2 answers
374 views

I am relatively new to machine learning. I see many examples of practices where people include variables that are only available after the outcome variable (Y) to make predictions. An example of this ...
Abdullah Abdelaziz's user avatar
1 vote
0 answers
18 views

I’m trying to project TPES (Total Primary Energy Supply) by country in Africa up to the year 2100 under different SSP (Shared Socioeconomic Pathways) scenarios, the same framework used in the latest ...
grégoire david's user avatar
2 votes
1 answer
101 views

Suppose I have two metrics, x and y. I have measures for a few dozen units on both metrics, at time 1 and at time 2. I want to validate metric y, so that future users can use it as a substitute for ...
Clara's user avatar
  • 123
0 votes
1 answer
116 views

I understand orthogonal polynomials (perhaps not the discrete ones?) but I don't understand how exactly predict handles polynomials with a different number of data points, i.e. different x-values and ...
Christoph's user avatar
  • 435
7 votes
1 answer
167 views

The vast majority of statistical literature involves having a dataset which can be partitioned into $n$ data points, $\mathbf{x} = \{x_1,...,x_n\}$ constructing a model for the individual data ...
jms's user avatar
  • 121
11 votes
3 answers
728 views

Background: I trained an XGBoost model to predict a dichotomous outcome, which has a base rate of about 55% in the overall sample. This model will not be used to classify, however: It will be used ...
Mark White's user avatar
  • 11.7k
0 votes
0 answers
42 views

I want to model the probability of an event occurring, given that a string has occurred. Or, in other words, predict which event is more likely to happen, given that the string was observed. These are ...
Ricardo Antunes's user avatar
1 vote
1 answer
68 views

I have a database of many employees, and I want to estimate how many are going to retire next year, based on how many retired last year. So I thought about a logistic model like glm(retire ~ age2025 + ...
FloLe's user avatar
  • 33
1 vote
1 answer
82 views

I was reading through the company white paper for AncestryDNA, which gives DNA ancestry estimates to individuals who are willing to send them a saliva sample. In their 2024 white paper they list the ...
H_1317's user avatar
  • 141
7 votes
2 answers
446 views

I ran this lognormal hurdle GLMM using the R package glmmTMB: ...
Michaela's user avatar
  • 229
0 votes
0 answers
89 views

TL;DR: A confusion matrix is used to validate a model. But I also want to make predictions using my models. Can I use the confusion matrix to make predictions? I don't see any other way to do it, but I ...
Siva Kg's user avatar
  • 23
2 votes
0 answers
133 views

I am trying to decide on the best method for producing model predictions (for graphing) from my generalized linear mixed effects model. I am interested in getting marginal predictions (i.e., what the ...
Stephanie Rivest's user avatar
2 votes
1 answer
82 views

I’m currently conducting a statistical study to evaluate whether a given factor has predictive power over another variable—such as future returns. As part of this, I’ve been analyzing the mean and ...
user73016's user avatar
