I am trying to do some Newey-West OLS with statsmodels on my data to estimate my parameters, and the following is my code for doing so:
    from __future__ import print_function, division
    import xlrd as xl
    import numpy as np
    import scipy as sp
    import pandas as pd
    import statsmodels.formula.api as smf
    import statsmodels.api as sm

    file_loc = "/Python/dataset_3.xlsx"
    workbook = xl.open_workbook(file_loc)
    sheet = workbook.sheet_by_index(0)
    tot = sheet.nrows

    data = [[sheet.cell_value(r, c) for c in range(sheet.ncols)] for r in range(sheet.nrows)]

    rv1 = []
    rv5 = []
    rv22 = []
    rv1fcast = []
    T = []
    price = []
    time = []
    retnor = []

    for i in range(1, tot):
        t = data[i][0]
        ret = data[i][1]
        ret5 = data[i][2]
        ret22 = data[i][3]
        ret1_1 = data[i][4]
        retn = data[i][5]
        t = xl.xldate_as_tuple(t, 0)
        rv1.append(ret)
        rv5.append(ret5)
        rv22.append(ret22)
        rv1fcast.append(ret1_1)
        retnor.append(retn)
        T.append(t)

    df = pd.DataFrame({'RVFCAST':rv1fcast, 'RV1':rv1, 'RV5':rv5, 'RV22':rv22,})
    df = df[df.RV1.notnull()]
    model = smf.OLS(formula = 'df.RVFCAST ~ df.RV1 + df.RV5 + df.RV22', data = df)

Everything looks just fine when I look at the arrays or my dataframe, but it returns just:

    TypeError: __init__() takes at least 2 arguments (1 given)
I have tried a bunch of different methods and I cannot see what I am missing.
When I run it, the following error message is shown:
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    /Python/harrv.py in <module>()
         41 df = df[df.RV1.notnull()]
         42
    ---> 43 model = smf.OLS(formula = 'df.RVFCAST ~ df.RV1 + df.RV5 + df.RV22', data = df)
         44
         45 #mdl = model.get_robustcov_results(cov_type='HAC',maxlags=1)

    TypeError: __init__() takes at least 2 arguments (1 given)

Printing rv1 gives:
    Out[318]:
    [0.015538008996147568,
     0.008881670570720125,
     0.010421778063375802,
     .....
     0.003151044550868834,
     0.0029676428110974166,
     0.005236329928710288,
     0.004838460533164701,
     '']

The other rv lists contain similar floating-point numbers. The df just assembles them the way `pd.DataFrame` does, which according to the documentation is supported (http://statsmodels.sourceforge.net/devel/example_formulas.html).
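One detail in the printed list may matter independently of the TypeError: the last entry is an empty string `''`, not a number. A minimal sketch of what that does to a column, using a short hypothetical list in place of the real spreadsheet data: `notnull()` does not treat `''` as missing, so the row survives the filter, whereas `pd.to_numeric(..., errors='coerce')` turns it into NaN so that it can be dropped.

```python
import pandas as pd

# Hypothetical stand-in for one spreadsheet column: a few floats followed by
# a trailing empty string, mirroring the end of the printed rv1 list.
rv1 = [0.015538008996147568, 0.008881670570720125, 0.010421778063375802, '']

df = pd.DataFrame({'RV1': rv1})

# notnull() only flags None/NaN, so the '' row is counted as valid here.
kept_by_notnull = int(df.RV1.notnull().sum())

# Coerce anything non-numeric to NaN, then drop those rows.
df['RV1'] = pd.to_numeric(df['RV1'], errors='coerce')
clean = df.dropna(subset=['RV1'])

print(kept_by_notnull, len(clean), clean['RV1'].dtype)
```

Whether the empty cell is actually a problem for the regression depends on the rest of the data, so this is only something to check, not a confirmed cause.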
`df.` for the formula argument is clearly wrong. (Compare with statsmodels.sourceforge.net/devel/example_formulas.html). But I admit that I don't see a correlation between the error message and your arguments. It would help if you could make the example self-contained, e.g. add `import` statements and example data.
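A possible reading of the traceback, offered as a sketch rather than a confirmed diagnosis: uppercase `OLS` is the model class, whose `__init__` expects `endog` as a positional argument, so passing only `formula=` and `data=` leaves it one argument short. The formula interface is the lowercase `smf.ols`, and the formula then refers to column names directly, without the `df.` prefix. The example below is self-contained and uses synthetic data (the coefficients, sample size and noise level are made up for illustration); the HAC covariance with `maxlags=1` follows the commented-out line in the traceback.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data standing in for the spreadsheet columns (an assumption;
# the real values come from dataset_3.xlsx).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    'RV1': rng.random(n),
    'RV5': rng.random(n),
    'RV22': rng.random(n),
})
df['RVFCAST'] = (0.1 + 0.5 * df.RV1 + 0.3 * df.RV5 + 0.1 * df.RV22
                 + 0.01 * rng.standard_normal(n))

# Lowercase ols builds the model from a formula; the names in the formula
# are looked up as columns of the DataFrame passed via data=.
model = smf.ols(formula='RVFCAST ~ RV1 + RV5 + RV22', data=df)

# Newey-West (HAC) standard errors are requested at fit time.
results = model.fit(cov_type='HAC', cov_kwds={'maxlags': 1})

print(results.summary())
```

The point estimates are the same as in a plain OLS fit; only the standard errors change with the HAC covariance.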