
I am using the Indicators package and its sma function. The sma function is set up like this:

function sma(x::Array{Float64}; n::Int64=10)::Array{Float64}
    return runmean(x, n=n, cumulative=false)
end

Its input is an Array{Float64}.

So I load my data into a df with these types:

julia> showcols(df)
6258×7 DataFrames.DataFrame
│ Col # │ Name      │ Eltype  │ Missing │
├───────┼───────────┼─────────┼─────────┤
│ 1     │ Date      │ Date    │ 0       │
│ 2     │ Open      │ Float64 │ 0       │
│ 3     │ High      │ Float64 │ 0       │
│ 4     │ Low       │ Float64 │ 0       │
│ 5     │ Close     │ Float64 │ 0       │
│ 6     │ Adj_Close │ Float64 │ 0       │
│ 7     │ Volume    │ Int64   │ 0       │

I then try to run the sma function directly over a data frame column, like below:

df[:Close_200sma] = sma(df[:Close],n=200) 

It reports back:

MethodError: no method matching sma(::DataArrays.DataArray{Float64,1}; n=200)

I see the type is:

6258-element DataArrays.DataArray{Float64,1} 

A DataArray is a data structure that allows missing values, as I read here:

https://github.com/JuliaStats/DataArrays.jl

I imported the data with:

df = readtable("SPY.csv", header=true) 

So I am not sure how the column was converted to a DataArray.

When I pull the data frame column to a vector and use convert() to an Array:

Close = Float64[]
Close = vec(df[:Close]) # 6258-element DataArrays.DataArray{Float64,1}
# Convert to a plain Array to drop the DataArray wrapper:
Close = convert(Array, Close) # Float64[6258]

I can run this just fine through the sma function:

sma(Close,n=200) 

When I check showcols(df):

julia> showcols(df)
6258×7 DataFrames.DataFrame
│ Col # │ Name      │ Eltype  │ Missing │
├───────┼───────────┼─────────┼─────────┤
│ 1     │ Date      │ Date    │ 0       │
│ 2     │ Open      │ Float64 │ 0       │
│ 3     │ High      │ Float64 │ 0       │
│ 4     │ Low       │ Float64 │ 0       │
│ 5     │ Close     │ Float64 │ 0       │
│ 6     │ Adj_Close │ Float64 │ 0       │
│ 7     │ Volume    │ Int64   │ 0       │

The eltype is Float64. Because the column is wrapped in the DataArray structure, I am unable to pass it to the sma function, which is set up for Array{Float64} only.

Am I correct in saying that it fails because of the DataArray structure, and that this is why I cannot apply it directly to the data frame column?

I had this call working fine when I used CSV.read() from the CSV package. However, it started to throw the null error and was overwriting other files, so I dropped the CSV package for now.

dt = CSV.read("SPY.csv", types=[String; fill(Float64, 5); Int]) 

Here I was able to specify the column types, and I could pass the df column to the sma() function.

1 Answer

There's a lot going on in your question, but I believe it boils down to: why can't you call your sma function you defined with a DataArray?

Well, it is because you demand that your sma function only accepts Array{Float64} and only returns Array{Float64}. DataArray{Float64}, as you've discovered, is not an Array{Float64}. It's another kind of array (array with a lowercase "a"). It is, however, an AbstractArray{Float64}. Lots of custom array types have been implemented that look, act, and behave just like the builtin Array but have special properties. In this case, the special property is specialized handling for missing values.
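
To make the distinction concrete, here is a small sketch. It assumes a recent Julia version and uses a view (a SubArray) as a stand-in for DataArray, since both are AbstractArrays that are not Arrays. The function name strict is made up for illustration.

```julia
# A view (SubArray) stands in for a DataArray here:
# it is an AbstractArray{Float64}, but it is not an Array{Float64}.
v = view([1.0, 2.0, 3.0, 4.0], 1:3)

# Hypothetical function restricted to the concrete Array type,
# in the same way the sma signature in the question is.
strict(x::Array{Float64}) = sum(x)

println(v isa Array{Float64})          # false
println(v isa AbstractArray{Float64})  # true
println(applicable(strict, v))         # false: no matching method, hence the MethodError
```

Calling strict(v) would raise the same kind of MethodError as in the question, because dispatch finds no method for the wrapper type.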

So you have two options:

  • You can implement your methods to accept, and possibly return, any AbstractArray{Float64}. This is generally considered good style if you're not relying on any special internal behavior and are only using indexing as the API into the array.
  • Or you can explicitly convert your DataArray to an Array before you call them. You can do this with convert(Array{Float64}, A), but note that it'll throw an error if any of the elements are missing.
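
Both options can be sketched as follows. This is a minimal illustration, assuming a recent Julia version: sma_any and sma_strict are hypothetical names, the moving average is a simple hand-written one rather than the runmean function from Indicators, and a view again stands in for the DataArray.

```julia
# Option 1: accept any AbstractVector{Float64} (hypothetical name sma_any).
# Assumes ordinary 1-based indexing; uses only indexing and length.
function sma_any(x::AbstractVector{Float64}; n::Int=10)
    out = fill(NaN, length(x))           # positions without a full window stay NaN
    for i in n:length(x)
        out[i] = sum(x[i-n+1:i]) / n     # mean of the trailing n values
    end
    return out
end

data = collect(1.0:6.0)
v = view(data, 1:6)                      # non-Array stand-in for a DataArray

a = sma_any(v, n=3)                      # works directly on the non-Array input

# Option 2: keep an Array-only method and convert before calling it.
sma_strict(x::Array{Float64}; n::Int=10) = sma_any(x, n=n)
b = sma_strict(convert(Array{Float64}, v), n=3)

println(a[3:end] == b[3:end])            # both routes give the same averages
```

The first option avoids a copy and keeps working if the column type changes; the second leaves the existing function untouched at the cost of an explicit conversion at each call site.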