All Questions
Tagged with prediction or predictive-models
4,835 questions
0 votes
0 answers
47 views
Interpreting the predicted values from family = poisson(link="log"), binary outcome
I am fitting a simple model for a dataset where the outcome is binary (1 or 0). ...
1 vote
0 answers
34 views
Confidence threshold for random forest type = "prob" new data
I have a nice multiclass random forest model in R (using the packages ranger and caret) but I think this question applies to any random forest logic. When I use my RF to label unknown data I want to ...
4 votes
2 answers
374 views
Is it okay in prediction problems to put post-outcome features in the model?
I am relatively new to machine learning. I see many examples of practices where people include variables that are only available after the outcome variable (Y) to make predictions. An example of this ...
1 vote
0 answers
18 views
How to use a hierarchical Bayesian model to combine regional and country-level data for TPES projections?
I’m trying to project TPES (Total Primary Energy Supply) by country in Africa up to the year 2100 under different SSP (Shared Socioeconomic Pathways) scenarios, the same framework used in the latest ...
2 votes
1 answer
101 views
Validating a new metric using two-period panel data
Suppose I have two metrics, x and y. I have measures for a few dozen units on both metrics, at time 1 and at time 2. I want to validate metric y, so that future users can use it as a substitute for ...
0 votes
1 answer
116 views
How does the math of predict work for `lm` with `poly`?
I understand orthogonal polynomials (perhaps not the discrete ones?), but I don't understand exactly how predict handles polynomials with different numbers of data points, i.e., different x-values and ...
7 votes
1 answer
167 views
Statistical modeling with only a single data point
The vast majority of statistical literature involves having a dataset which can be partitioned into $n$ data points, $\mathbf{x} = \{x_1,...,x_n\}$, and constructing a model for the individual data ...
11 votes
3 answers
728 views
How can calibration plots for my model's predictions look good while the standard metrics (ROC AUC, F-score, etc.) look poor?
Background: I trained an XGBoost model to predict a dichotomous outcome, which has a base rate of about 55% in the overall sample. This model will not be used to classify, however: it will be used ...
0 votes
0 answers
42 views
Predicting occurrence of event following observation of string
I want to model the probability of an event occurring, given that a string has occurred. Or, in other words, predict which event is more likely to happen, given that the string was observed. These are ...
1 vote
1 answer
68 views
Predicting global outcomes with logistic model
I have a database of many employees, and I want to estimate how many are going to retire next year, based on how many retired last year. So I thought about a logistic model like glm(retire ~ age2025 + ...
1 vote
1 answer
82 views
Recall in AncestryDNA white paper
I was reading through the company white paper for AncestryDNA, which gives DNA ancestry estimates to individuals who are willing to send them a saliva sample. In their 2024 white paper they list the ...
7 votes
2 answers
446 views
Confidence intervals for predictions in ggeffects are outside the possible range of probabilities
I ran this lognormal hurdle GLMM using the R package glmmTMB: ...
0 votes
0 answers
89 views
Can I use confusion matrix for prediction?
TL;DR: A confusion matrix is used to validate a model, but I also want to make predictions using my models. Can I use the confusion matrix to make predictions? I don't see any other way to do it, but I ...
2 votes
0 answers
133 views
Prediction for GLMM (correcting for bias due to Jensen's inequality?)
I am trying to decide on the best method for producing model predictions (for graphing) from my generalized linear mixed effects model. I am interested in getting marginal predictions (i.e., what the ...
2 votes
1 answer
82 views
Quantile-Based Analysis for Predictive Power Study
I’m currently conducting a statistical study to evaluate whether a given factor has predictive power over another variable—such as future returns. As part of this, I’ve been analyzing the mean and ...