Linked Questions

0 votes
2 answers
5k views

I was wondering whether it is necessary to mean-center and scale to unit standard deviation both my xs and ys in linear regression, or whether doing that to just the xs is enough. Let's say I use a different model, say neural ...
user34790 • 6,927
421 votes
7 answers
433k views

In some literature, I have read that a regression with multiple explanatory variables, if they are in different units, needs to be standardized. (Standardizing consists of subtracting the mean and dividing ...
mathieu_r • 4,621
69 votes
2 answers
102k views

What are the best (recommended) pre-processing steps before performing k-means?
pedrosaurio • 1,373
38 votes
3 answers
61k views

I am working with many algorithms: RandomForest, DecisionTrees, NaiveBayes, SVM (kernel=linear and rbf), KNN, LDA and XGBoost. All of them were pretty fast except for SVM. That is when I got to know ...
Aizzaac • 1,179
32 votes
2 answers
99k views

Under what circumstances should the data be normalized/standardized when building a regression model? When I asked this question to a stats major, he gave me an ambiguous answer: "depends on the data". ...
Raj • 963
46 votes
1 answer
49k views

I have 2 simple questions about linear regression: When is it advised to standardize the explanatory variables? Once estimation is carried out with standardized values, how can one predict with new ...
teucer • 2,071
24 votes
3 answers
34k views

I was reading the following justification (from cs229 course notes) on why we divide the raw data by its standard deviation: Even though I understand what the explanation is saying, it is not clear to ...
Charlie Parker
10 votes
3 answers
2k views

I'm curious how an online dating system might use survey data to determine matches. Suppose they have outcome data from past matches (e.g., 1 = happily married, 0 = no 2nd date). Next, let's ...
d_a_c321 • 1,269
4 votes
1 answer
9k views

When the dependent variable is standardized, how does one interpret the regression coefficients of continuous or categorical independent variables? For instance, if we have $K$ groups in the data and ...
karsha • 158
7 votes
1 answer
14k views

Related reading: When conducting multiple regression, when should you center your predictor variables & when should you standardize them? When and how to use standardized explanatory variables in ...
kingledion
6 votes
2 answers
3k views

I am planning to predict a binomial variable (1/0, a used point by an animal or point available to an animal in its range) using several continuous, distance-based predictor variables (distance to ...
Nova • 605
6 votes
2 answers
7k views

I'm working in R, using glm.nb (of the MASS package) to model count data with a negative binomial regression model. I'd like to compare the relative importance of each of my predictor variables ...
CJH • 291
3 votes
1 answer
2k views

I've got a question concerning leave-one-subject-out cross-validation of a classifier and correct outlier handling in this case. Let's suppose I've got 5 subjects. Within each subject the features ...
2 votes
1 answer
3k views

The classifier is KNN or RBF-SVM. After dimension reduction (e.g., PCA, LDA or KPCA, KLDA), is normalization needed before classification? In LIBSVM ...
mining • 1,049
5 votes
0 answers
2k views

I'm reading an article about Direct Linear Transformation which processes data using SVD, and the data set is standardized so that it has zero mean and unit standard deviation (n.b., some people call ...
avocado • 3,703
