I have some data like this:

arr = [
    [30.0, 0.0257], [30.0, 0.0261], [30.0, 0.0261], [30.0, 0.026], [30.0, 0.026],
    [35.0, 0.0387], [35.0, 0.0388], [35.0, 0.0387], [35.0, 0.0388], [35.0, 0.0388],
    [40.0, 0.0502], [40.0, 0.0503], [40.0, 0.0502], [40.0, 0.0498], [40.0, 0.0502],
    [45.0, 0.0582], [45.0, 0.0574], [45.0, 0.058], [45.0, 0.058], [45.0, 0.058],
    [50.0, 0.0702], [50.0, 0.0702], [50.0, 0.0698], [50.0, 0.0704], [50.0, 0.0703],
    [55.0, 0.0796], [55.0, 0.0808], [55.0, 0.0803], [55.0, 0.0805], [55.0, 0.0806],
]

which is plotted with the Google Charts API.

I am trying to do linear regression on this, i.e. trying to find the slope and the (y-) intercept of the trend line, and also the uncertainty in slope and uncertainty in intercept.

The Google Charts API already finds the slope and the intercept value when I draw the trend line, but I am not sure how to find the uncertainties.

I have been doing this with the LINEST function in Excel, but I find that very cumbersome, since all my data are in Python.

So my question is, how can I find the two uncertainty values that I get in LINEST using Python?

I apologize for asking an elementary question like this.

I am pretty good at Python and JavaScript, but I am very poor at regression analysis, so when I tried to look this up in the documentation, the difficult terminology left me very confused.

I hope to use some well-known Python library, although it would be ideal if I could do this within Google Charts API.

  • I think this might help you stackoverflow.com/questions/11479064/… Commented Sep 26, 2014 at 1:58
  • I am an absolute novice when it comes to regression or any statistical methods. Unfortunately, the link does not help. Sorry. Commented Sep 26, 2014 at 2:08

1 Answer

It could be done using statsmodels like this:

import statsmodels.api as sm
import numpy as np

y = []
x = []
for item in arr:
    x.append(item[0])
    y.append(item[1])

# include constant in ols models, which is not done by default
x = sm.add_constant(x)

model = sm.OLS(y, x)
results = model.fit()

You could then access the values you require as follows. The intercept and the slope are given by:

results.params  # linear coefficients
# array([-0.036924 , 0.0021368])

I assume that by uncertainty you mean the standard errors; they can be accessed like this:

results.bse  # standard errors of the parameter estimates
# array([ 1.03372221e-03, 2.38463106e-05])
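If installing statsmodels is not an option, the same two standard errors can probably be obtained from NumPy alone. The following is a minimal sketch, not part of the original answer: it assumes NumPy 1.16 or later (where, as far as I recall, `np.polyfit` scales the covariance matrix by the residual variance with n - 2 degrees of freedom for a straight line), and the names `slope_err` and `intercept_err` are only illustrative.

```python
import numpy as np

# Data from the question: [x, y] pairs.
arr = [
    [30.0, 0.0257], [30.0, 0.0261], [30.0, 0.0261], [30.0, 0.026], [30.0, 0.026],
    [35.0, 0.0387], [35.0, 0.0388], [35.0, 0.0387], [35.0, 0.0388], [35.0, 0.0388],
    [40.0, 0.0502], [40.0, 0.0503], [40.0, 0.0502], [40.0, 0.0498], [40.0, 0.0502],
    [45.0, 0.0582], [45.0, 0.0574], [45.0, 0.058], [45.0, 0.058], [45.0, 0.058],
    [50.0, 0.0702], [50.0, 0.0702], [50.0, 0.0698], [50.0, 0.0704], [50.0, 0.0703],
    [55.0, 0.0796], [55.0, 0.0808], [55.0, 0.0803], [55.0, 0.0805], [55.0, 0.0806],
]

data = np.asarray(arr)
x, y = data[:, 0], data[:, 1]

# Degree-1 polynomial fit; cov=True also returns the covariance matrix
# of the fitted coefficients.
coeffs, cov = np.polyfit(x, y, 1, cov=True)

# polyfit orders coefficients from the highest power down: slope first.
slope, intercept = coeffs

# Standard errors are the square roots of the diagonal of the covariance matrix.
slope_err, intercept_err = np.sqrt(np.diag(cov))

print("slope     = %.6g +/- %.3g" % (slope, slope_err))
print("intercept = %.6g +/- %.3g" % (intercept, intercept_err))
```

Note that the order here is slope first, then intercept, which is the reverse of `results.params` and `results.bse` above.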

An overview can be obtained by running

>>> print(results.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.997
Model:                            OLS   Adj. R-squared:                  0.996
Method:                 Least Squares   F-statistic:                     8029.
Date:                Fri, 26 Sep 2014   Prob (F-statistic):           5.61e-36
Time:                        05:47:08   Log-Likelihood:                 162.43
No. Observations:                  30   AIC:                            -320.9
Df Residuals:                      28   BIC:                            -318.0
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const         -0.0369      0.001    -35.719      0.000        -0.039    -0.035
x1             0.0021   2.38e-05     89.607      0.000         0.002     0.002
==============================================================================
Omnibus:                        7.378   Durbin-Watson:                   0.569
Prob(Omnibus):                  0.025   Jarque-Bera (JB):                2.079
Skew:                           0.048   Prob(JB):                        0.354
Kurtosis:                       1.714   Cond. No.                         220.
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

This might also be of interest for a summary of the properties of the resulting model.

I did not compare this to LINEST in Excel. I also don't know whether this is possible using only the Google Charts API.
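For a cross-check against LINEST without any third-party library, the textbook closed-form expressions for a straight-line fit can be evaluated directly. This is only a sketch: it assumes that LINEST reports ordinary least-squares standard errors for the slope and intercept (my understanding of its documentation, not something verified here), and all variable names are illustrative.

```python
import math

# Data from the question: [x, y] pairs.
arr = [
    [30.0, 0.0257], [30.0, 0.0261], [30.0, 0.0261], [30.0, 0.026], [30.0, 0.026],
    [35.0, 0.0387], [35.0, 0.0388], [35.0, 0.0387], [35.0, 0.0388], [35.0, 0.0388],
    [40.0, 0.0502], [40.0, 0.0503], [40.0, 0.0502], [40.0, 0.0498], [40.0, 0.0502],
    [45.0, 0.0582], [45.0, 0.0574], [45.0, 0.058], [45.0, 0.058], [45.0, 0.058],
    [50.0, 0.0702], [50.0, 0.0702], [50.0, 0.0698], [50.0, 0.0704], [50.0, 0.0703],
    [55.0, 0.0796], [55.0, 0.0808], [55.0, 0.0803], [55.0, 0.0805], [55.0, 0.0806],
]

xs = [point[0] for point in arr]
ys = [point[1] for point in arr]
n = len(xs)

x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_mean) ** 2 for x in xs)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

# Least-squares estimates.
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residual variance with n - 2 degrees of freedom (two fitted parameters).
rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
s2 = rss / (n - 2)

# Standard errors of the slope and the intercept.
slope_se = math.sqrt(s2 / sxx)
intercept_se = math.sqrt(s2 * (1.0 / n + x_mean ** 2 / sxx))

print("slope     = %.6g +/- %.3g" % (slope, slope_se))
print("intercept = %.6g +/- %.3g" % (intercept, intercept_se))
```

These values should agree with `results.params` and `results.bse` above, since statsmodels' default (nonrobust) covariance uses the same n - 2 degrees of freedom.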
