All Questions

Question 1

I am fitting a simple model for a dataset where the outcome is binary (1 or 0). ...

Question 2

I have a nice multiclass random forest model in R (using the packages ranger and caret) but I think this question applies to any random forest logic. When I use my RF to label unknown data I want to ...

Question 3

I am relatively new to machine learning. I see many examples of practices where people include variables that are only available after the outcome variable (Y) to make predictions. An example of this ...

Question 4

I’m trying to project TPES (Total Primary Energy Supply) by country in Africa up to the year 2100 under different SSP (Shared Socioeconomic Pathways) scenarios, the same framework used in the latest ...

Question 5

Suppose I have two metrics, x and y. I have measures for a few dozen units on both metrics, at time 1 and at time 2. I want to validate metric y, so that future users can use it as a substitute for ...

Question 6

I understand orthogonal polynomials (perhaps not the discrete ones?), but I don't understand exactly how `predict` handles polynomials with a different number of data points, i.e. different x-values and ...

Question 7

The vast majority of statistical literature involves having a dataset which can be partitioned into $n$ data points, $\mathbf{x} = \{x_1,...,x_n\}$, constructing a model for the individual data ...

Question 8

Background: I trained an XGBoost model to predict a dichotomous outcome, which has a base rate of about 55% in the overall sample. This model will not be used to classify, however: it will be used ...

Question 9

I want to model the probability of an event occurring, given that a string has occurred. Or, in other words, predict which event is more likely to happen, given that the string was observed. These are ...

Question 10

I have a database of many employees, and I want to estimate how many are going to retire next year, based on how many retired last year. So I thought about a logistic model like glm(retire ~ age2025 + ...

Question 11

I was reading through the company white paper for AncestryDNA, which gives DNA ancestry estimates to individuals who are willing to send them a saliva sample. In their 2024 white paper they list the ...

Question 12

I ran this lognormal hurdle GLMM using the R package glmmTMB: ...

Question 13

TL;DR: a confusion matrix is used to validate a model. But I also want to make predictions using my models. Can I use the confusion matrix to make predictions? I don't see any other way to do it, but I ...

Question 14

I am trying to decide on the best method for producing model predictions (for graphing) from my generalized linear mixed effects model. I am interested in getting marginal predictions (i.e., what the ...

Question 15

I’m currently conducting a statistical study to evaluate whether a given factor has predictive power over another variable—such as future returns. As part of this, I’ve been analyzing the mean and ...

Interpreting the predicted values from family = poisson(link="log") , binary outcome

Confidence threshold for random forest type = "prob" new data

Is it okay in prediction problems to put post-outcome features in the model?

How to use a hierarchical Bayesian model to combine regional and country-level data for TPES projections?

Validating a new metric using two-period panel data

How does the math of predict work for `lm` with `poly`?

Statistical modeling with only a single data point

How can calibration plots for my model's predictions look good while the standard metrics (ROC AUC, F-score, etc.) look poor?

Predicting occurrence of event following observation of string

Predicting global outcomes with logistic model

Recall in AncestryDNA white paper

Confidence intervals for predictions in ggeffects are outside the possible range of probabilities

Can I use confusion matrix for prediction?

Prediction for GLMM (correcting for bias due to Jensen's inequality?)

Quantile-Based Analysis for Predictive Power Study
