Naming explanatory variables in regression output

Question

Each one of my variables is a list on its own.

I am using a method found on another thread here.

    import numpy as np
    import statsmodels.api as sm

    y = [1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4]

    x = [
        [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
        [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
        [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4]
    ]

    def reg_m(y, x):
        ones = np.ones(len(x[0]))
        X = sm.add_constant(np.column_stack((x[0], ones)))
        for ele in x[1:]:
            X = sm.add_constant(np.column_stack((ele, X)))
        results = sm.OLS(y, X).fit()
        return results

My only problem is that the explanatory variables in my regression output are labelled x1, x2, x3, etc. Is it possible to change these to more meaningful names?

Thanks

You are probably looking for pandas: stackoverflow.com/questions/19991445/…
– Akavall, Commented Apr 12, 2016 at 1:42
Thanks! This was quite useful; I should probably learn how to use it.
– aspiringcoderzzz, Commented Apr 12, 2016 at 23:40
The code in the question comes from the answer here: stackoverflow.com/questions/11479064/…; you should probably reference that.
– Akavall, Commented Apr 15, 2016 at 17:23

Gerrat · Accepted Answer · 2023-08-22 18:50:20Z

Searching through the source, it appears the summary() method does support using your own names for explanatory variables. So:

    results = sm.OLS(y, X).fit()
    print(results.summary(xname=['Fred', 'Mary', 'Ethel', 'Bob']))

gives us:

                                OLS Regression Results
    ==============================================================================
    Dep. Variable:                      y   R-squared:                       0.535
    Model:                            OLS   Adj. R-squared:                  0.461
    Method:                 Least Squares   F-statistic:                     7.281
    Date:                Mon, 11 Apr 2016   Prob (F-statistic):            0.00191
    Time:                        22:22:47   Log-Likelihood:                -26.025
    No. Observations:                  23   AIC:                             60.05
    Df Residuals:                      19   BIC:                             64.59
    Df Model:                           3
    Covariance Type:            nonrobust
    ==============================================================================
                     coef    std err          t      P>|t|      [95.0% Conf. Int.]
    ------------------------------------------------------------------------------
    Fred           0.2424      0.139      1.739      0.098        -0.049     0.534
    Mary           0.2360      0.149      1.587      0.129        -0.075     0.547
    Ethel         -0.0618      0.145     -0.427      0.674        -0.365     0.241
    Bob            1.5704      0.633      2.481      0.023         0.245     2.895
    ==============================================================================
    Omnibus:                        6.904   Durbin-Watson:                   1.905
    Prob(Omnibus):                  0.032   Jarque-Bera (JB):                4.708
    Skew:                          -0.849   Prob(JB):                       0.0950
    Kurtosis:                       4.426   Cond. No.                         38.6
    ==============================================================================

    Warnings:
    [1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Josef · Accepted Answer · 2016-04-12 02:30:31Z

There are several ways to adjust the names of the parameters.

summary has an xname keyword that should work; it changes only the names shown in the summary table: http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.RegressionResults.summary.html

When the model is created with a formula, the parameter names are stored internally in the data attribute of the model, model.data.xnames, and can be accessed through model.exog_names.

There is no proper setter method and it's not "officially" (*) supported, but AFAIK model.data.xnames can be overwritten, i.e. assigned a new list of strings. The list model.exog_names should only be changed in place, because it is just another reference to model.data.xnames. These changes are permanent and affect all uses of the parameter names.

(*) AFAIK: There are no unit tests for changing exog_names or xnames. Some models need to change the names depending on extra parameters that need to be estimated. The internal refactoring is moving toward using param_names so that the names of the parameters can be separated from the names of the explanatory variables. This separation is needed in several newer models but is not relevant for OLS and many other traditional models.
