How to compute the correlation coefficient between two columns from a data frame?

Question

I want to compute the correlation between two different columns from the same data frame. This is the code I use:

Correlation_unemp_demvote = np.corrcoef(New_table['unemp'], New_table['demVote'])
Correlation_unemp_demvote

The output is as follows:

array([[ 1.        ,  0.34167764],
       [ 0.34167764,  1.        ]])

I was expecting a single value between -1 and 1, as the definition of the correlation coefficient states. Could you explain the result I got? I've also seen several correlation-related functions, such as corr() and correlate(). Which one should I use?

Thanks,

piRSquared · Accepted Answer · 2017-01-31 18:23:28Z

pd.Series.corr is what you want.
Do this instead:

Correlation_unemp_demvote = New_table['unemp'].corr(New_table['demVote'])

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 2), columns=list('AB'))
df.A.corr(df.B)
-0.1814956009745472
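To connect this with the output in the question: np.corrcoef returns the full correlation matrix, so the 1s on the diagonal are each column correlated with itself, and the off-diagonal entry (0.34167764 in the question) is the coefficient being asked about. The sketch below uses a small made-up table standing in for New_table (the values are hypothetical, not the asker's data) to show that indexing the matrix, Series.corr, and DataFrame.corr should all agree.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for New_table; the values are made up.
table = pd.DataFrame({
    "unemp": [4.1, 5.3, 6.0, 7.2, 8.5, 9.1],
    "demVote": [41.0, 47.5, 44.2, 52.3, 50.1, 55.8],
})

# np.corrcoef returns the full 2x2 correlation matrix:
# the diagonal holds each variable against itself (always 1),
# the off-diagonal entries hold the coefficient between the two variables.
matrix = np.corrcoef(table["unemp"], table["demVote"])
r_numpy = matrix[0, 1]

# Series.corr returns only the scalar (Pearson by default).
r_pandas = table["unemp"].corr(table["demVote"])

# DataFrame.corr builds the same matrix, labelled by column name.
r_frame = table.corr().loc["unemp", "demVote"]

print(r_numpy, r_pandas, r_frame)
```

As for correlate(): np.correlate computes the cross-correlation of two sequences, as used in signal processing, and does not return a normalized coefficient between -1 and 1, so it is not the right tool for this question.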
