
There are two formulas in play here for assessing the statistical significance of the correlation coefficient. The first is the large-sample sampling variance of the correlation coefficient $r$ computed from $n$ pairs of bivariate normal random variables $X$ and $Y$ with true correlation coefficient $\rho$:\begin{equation}\text{var}\left(r\right)=\frac{\left(1-\rho^2\right)^2}{n},\end{equation} and the second is the t-statistic associated with the null hypothesis that, in the linear regression of $Y$ on $X$, the coefficient of $X$ is zero:\begin{equation}t=r\sqrt{\frac{n-2}{1-r^2}}.\end{equation}Whence the standard error of $r$ mentioned by the OP: $\text{se}\left(r\right)=\sqrt{\frac{1-r^2}{n-2}}$. These can be derived in the following ways.
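As a quick numerical check of the second formula (and of the standard error quoted by the OP), here is a short Python sketch; it is only an illustration, and the sample size, true correlation and seed are arbitrary choices of mine. It computes $t=r\sqrt{(n-2)/(1-r^2)}$ and the corresponding two-sided p-value from a $t_{n-2}$ distribution, and compares the result with `scipy.stats.pearsonr`, which carries out the same test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulate n pairs from a bivariate normal with true correlation rho
# (both values are arbitrary choices for the illustration).
n, rho = 50, 0.3
x, y = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n).T

# Sample correlation coefficient.
r = np.corrcoef(x, y)[0, 1]

# t-statistic for the null hypothesis rho = 0, the OP's standard error of r,
# and the two-sided p-value from the t distribution with n - 2 degrees of freedom.
t = r * np.sqrt((n - 2) / (1 - r**2))
se_r = np.sqrt((1 - r**2) / (n - 2))
p = 2 * stats.t.sf(abs(t), df=n - 2)

# scipy.stats.pearsonr performs the same test, so the p-values should agree.
r_check, p_check = stats.pearsonr(x, y)
print(t, se_r, p, p_check)
```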

Pearson derived the first of these as \begin{equation}\text{var}\left(r\right)=\frac{\left(1-\rho^2\right)^2}{n\left(1+\rho^2\right)}\approx\frac{1-3\rho^2}{n}\end{equation} (the approximation holding for small $\rho$) in "VII. Mathematical contributions to the theory of evolution.—III. Regression, heredity, and panmixia", by expanding the joint distribution of $n$ pairs of bivariate normal variables about the true value of $\rho$. We can derive it using expectations as the OP asks. First, assume $X$ and $Y$ are mean-subtracted normal variables with standard deviations $\sigma_X=\sigma_Y=1$; the symbols $\sigma_X$ and $\sigma_Y$ are kept below to make the terms easier to track. Then the correlation coefficient is defined by \begin{equation}\mathbb{E}\left(XY\right)=\rho\sigma_X\sigma_Y.\end{equation}Expressed in terms of the density $f\left(x,y\right)$ of the bivariate normal, this becomes \begin{align}\mathbb{E}\left(XY\right)=&\frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}xy\exp{\left(\frac{-x^2}{2\sigma_X^2\left(1-\rho^2\right)}\right)}\exp{\left(\frac{-y^2}{2\sigma_Y^2\left(1-\rho^2\right)}\right)}\exp{\left(\frac{xy\rho}{\sigma_X\sigma_Y\left(1-\rho^2\right)}\right)}\,dx\,dy=\rho\sigma_X\sigma_Y.\tag{1}\end{align}Differentiate each side once with respect to $\rho$ to get\begin{align}\frac{\rho}{1-\rho^2}\mathbb{E}\left(XY\right)-\frac{\rho}{\sigma_X^2\left(1-\rho^2\right)^2}\mathbb{E}\left(X^3Y\right)-\frac{\rho}{\sigma_Y^2\left(1-\rho^2\right)^2}\mathbb{E}\left(XY^3\right)+\frac{1}{\sigma_X\sigma_Y\left(1-\rho^2\right)}\left(1+\frac{2\rho^2}{1-\rho^2}\right)\mathbb{E}\left(X^2Y^2\right)=\sigma_X\sigma_Y,\end{align}or, multiplying through by $1-\rho^2$ and using $\mathbb{E}\left(XY\right)=\rho\sigma_X\sigma_Y$, \begin{align}\frac{1}{\sigma_X\sigma_Y}\left(1+\frac{2\rho^2}{1-\rho^2}\right)\mathbb{E}\left(X^2Y^2\right)-\frac{\rho}{\sigma_X^2\left(1-\rho^2\right)}\mathbb{E}\left(X^3Y\right)-\frac{\rho}{\sigma_Y^2\left(1-\rho^2\right)}\mathbb{E}\left(XY^3\right)=\sigma_X\sigma_Y\left(1-2\rho^2\right).\tag{2}\end{align}

To eliminate $\mathbb{E}\left(X^3Y\right)$ and $\mathbb{E}\left(XY^3\right)$ from this expression, differentiate the identity $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f\left(x,y\right)\,dx\,dy=1$ with respect to $\rho$ to find\begin{align}\frac{\rho}{1-\rho^2}\mathbb{E}\left(1\right)-\frac{\rho}{\sigma_X^2\left(1-\rho^2\right)^2}\mathbb{E}\left(X^2\right)+\frac{1}{\sigma_X\sigma_Y\left(1-\rho^2\right)}\left(1+\frac{2\rho^2}{1-\rho^2}\right)\mathbb{E}\left(XY\right)-\frac{\rho}{\sigma_Y^2\left(1-\rho^2\right)^2}\mathbb{E}\left(Y^2\right)=0,\end{align}or, again multiplying through by $1-\rho^2$,\begin{align}\rho=\frac{\rho}{\sigma_X^2\left(1-\rho^2\right)}\mathbb{E}\left(X^2\right)+\frac{\rho}{\sigma_Y^2\left(1-\rho^2\right)}\mathbb{E}\left(Y^2\right)-\frac{1}{\sigma_X\sigma_Y}\left(1+\frac{2\rho^2}{1-\rho^2}\right)\mathbb{E}\left(XY\right).\end{align}Since $\rho=\mathbb{E}\left(XY\right)/\sigma_X\sigma_Y$, the only way this expression can be true for all $\rho$ is if $\mathbb{E}\left(X^2\right)=\frac{\rho\sigma_X}{\sigma_Y}\mathbb{E}\left(XY\right)$ and $\mathbb{E}\left(Y^2\right)=\frac{\rho\sigma_Y}{\sigma_X}\mathbb{E}\left(XY\right)$, whence it follows that\begin{align}\mathbb{E}\left(XY\cdot X^2\right)&=\frac{\rho\sigma_X}{\sigma_Y}\mathbb{E}\left(XY\cdot XY\right)\\\mathbb{E}\left(X^3Y\right)&=\frac{\rho\sigma_X}{\sigma_Y}\mathbb{E}\left(X^2Y^2\right)\end{align}and\begin{align}\mathbb{E}\left(XY^3\right)&=\frac{\rho\sigma_Y}{\sigma_X}\mathbb{E}\left(X^2Y^2\right)\end{align}because the expectations on both sides are taken with respect to the same distribution $f$.
Hence we can substitute these into equation $(2)$: recalling that $\sigma_X=\sigma_Y=1$, we get \begin{align}\frac{1}{\sigma_X\sigma_Y}\left(1+\frac{2\rho^2}{1-\rho^2}-\frac{2\rho^2}{1-\rho^2}\right)\mathbb{E}\left(X^2Y^2\right)=\mathbb{E}\left(X^2Y^2\right)=\sigma_X\sigma_Y\left(1-2\rho^2\right),\end{align} giving the variance of $XY$ as \begin{equation}\mathbb{E}\left(X^2Y^2\right)-\mathbb{E}\left(XY\right)^2=1-3\rho^2.\end{equation}
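Before moving on, the defining identity $(1)$, on which the differentiations above rest, can be checked numerically. The sketch below is my own (it is not part of Pearson's argument): it simply integrates $xy\,f\left(x,y\right)$ over a wide rectangle for one choice of $\rho$, $\sigma_X$, $\sigma_Y$ and compares the result with $\rho\sigma_X\sigma_Y$.

```python
import numpy as np
from scipy.integrate import dblquad

# Parameters of the bivariate normal density f(x, y) used in equation (1)
# (arbitrary illustrative values; the text takes sx = sy = 1).
rho, sx, sy = 0.4, 1.0, 1.0

def f(x, y):
    """Bivariate normal density with correlation rho and standard deviations sx, sy."""
    norm = 1.0 / (2 * np.pi * sx * sy * np.sqrt(1 - rho**2))
    quad_form = (x / sx) ** 2 - 2 * rho * x * y / (sx * sy) + (y / sy) ** 2
    return norm * np.exp(-quad_form / (2 * (1 - rho**2)))

# E(XY) as the double integral of x*y*f(x, y); the limits +/-8 are wide enough
# to capture essentially all of the probability mass for unit standard deviations.
# dblquad passes the inner variable (y) first, hence the argument order below.
exy, _ = dblquad(lambda y, x: x * y * f(x, y), -8, 8, -8, 8)
print(exy, rho * sx * sy)  # the two numbers should agree closely
```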

Now the sample correlation coefficient of two mean-subtracted random variables is \begin{equation}r=\frac{\sum_i\left(X_iY_i-\overline{X}\overline{Y}\right)}{\sqrt{\sum_i\left(X_i-\overline{X}\right)^2\sum_i\left(Y_i-\overline{Y}\right)^2}}=\frac{\sum_iX_iY_i}{n\sigma_X\sigma_Y}\end{equation}because $\overline{X}=\overline{Y}=0$ and the sample standard deviations in the denominator are approximated by their population values $\sigma_X$ and $\sigma_Y$. Thus \begin{align}\sigma_X\sigma_Yr&=\frac{1}{n}\sum_iX_iY_i,\end{align}a mean of $n$ terms each distributed like $XY$, which has expected value\begin{equation}\sigma_X\sigma_Y\mathbb{E}\left(r\right)=\mathbb{E}\left(XY\right)=\rho\sigma_X\sigma_Y.\end{equation}Similarly,\begin{align}\sigma_X^2\sigma_Y^2n^2r^2&=\left(\sum_iX_iY_i\right)^2=\sum_iX_i^2Y_i^2+\sum_{i\neq j}X_iY_iX_jY_j,\end{align}a sum of $n$ squared terms distributed like $X^2Y^2$ and $n\left(n-1\right)$ cross terms distributed like $X_1Y_1X_2Y_2$; since distinct pairs are independent, each cross term has expectation $\mathbb{E}\left(XY\right)^2$, so\begin{align}\sigma_X^2\sigma_Y^2n^2\mathbb{E}\left(r^2\right)&=n\mathbb{E}\left(X^2Y^2\right)+n\left(n-1\right)\mathbb{E}\left(XY\right)^2.\end{align}Hence, with $\sigma_X=\sigma_Y=1$, the sampling variance of $r$ is approximately\begin{align}\mathbb{E}\left(r^2\right)-\mathbb{E}\left(r\right)^2=\frac{1-2\rho^2+\left(n-1\right)\rho^2}{n}-\rho^2=\frac{1-3\rho^2}{n},\end{align}which agrees with the small-$\rho$ approximation to Pearson's expression.
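The small-$\rho$ behaviour can also be examined by simulation. The following rough Monte Carlo sketch is mine, not part of the derivation; for a small true correlation (here $\rho=0.1$, an arbitrary choice) the expressions $\left(1-\rho^2\right)^2/n$, $\left(1-\rho^2\right)^2/\left(n\left(1+\rho^2\right)\right)$ and $\left(1-3\rho^2\right)/n$ are all close to one another, and the empirical sampling variance of $r$ can be compared with them.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small true correlation and a moderate sample size (arbitrary illustrative values).
n, rho, reps = 50, 0.1, 10_000
cov = [[1, rho], [rho, 1]]

# Empirical sampling variance of r over many simulated bivariate normal samples.
rs = np.empty(reps)
for i in range(reps):
    x, y = rng.multivariate_normal([0, 0], cov, size=n).T
    rs[i] = np.corrcoef(x, y)[0, 1]

print("simulated var(r):               ", rs.var())
print("(1 - rho^2)^2 / n:              ", (1 - rho**2) ** 2 / n)
print("(1 - rho^2)^2 / (n (1 + rho^2)):", (1 - rho**2) ** 2 / (n * (1 + rho**2)))
print("(1 - 3 rho^2) / n:              ", (1 - 3 * rho**2) / n)
```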

For the second form of the statistic, let's drop the assumption of bivariate normality and consider the regression of $Y$ on $X$. The least-squares line of best fit is $Y=\alpha+\beta X$. Under the null hypothesis that $\beta=0$ the t-statistic is \begin{equation}t=\frac{\hat{\beta}}{\text{se}\left(\hat{\beta}\right)/\sqrt{n-2}},\end{equation}the two degrees of freedom being lost to the estimation of $\hat{\alpha}$ and $\hat{\beta}$. Now the residual sum of squares is\begin{equation}\sum_i\left(Y_i-\hat{\alpha}-\hat{\beta}X_i\right)^2=s_Y^2-\hat{\beta}^2s_X^2=s_Y^2-r^2s_Y^2,\end{equation}where $s_X^2=\sum_i\left(X_i-\overline{X}\right)^2$ and $s_Y^2=\sum_i\left(Y_i-\overline{Y}\right)^2$ are the sums of squared deviations and we have used the relationship\begin{equation}\hat{\beta}=\frac{\sum_i\left(X_i-\overline{X}\right)\left(Y_i-\overline{Y}\right)}{\sum_i\left(X_i-\overline{X}\right)^2}=\frac{\sum_i\left(X_i-\overline{X}\right)\left(Y_i-\overline{Y}\right)}{\sqrt{\sum_i\left(X_i-\overline{X}\right)^2\sum_i\left(Y_i-\overline{Y}\right)^2}}\frac{s_Y}{s_X}=r\frac{s_Y}{s_X}\end{equation} between the estimated slope and the sample correlation coefficient. Since the variance of $\hat{\beta}$ is given by \begin{equation}\text{var}\left(\hat{\beta}\right)=\frac{\sum_i\left(Y_i-\hat{\alpha}-\hat{\beta}X_i\right)^2}{\sum_i\left(X_i-\overline{X}\right)^2}\end{equation}(the division by $n-2$ having been moved into the t-statistic above), we can substitute from the preceding two equations to get\begin{equation}\text{var}\left(\hat{\beta}\right)=\frac{s_Y^2\left(1-r^2\right)}{s_X^2}=\frac{1-r^2}{r^2/\hat{\beta}^2},\end{equation}whence the standard error of $\hat{\beta}$ is $\frac{\hat{\beta}\sqrt{1-r^2}}{r}$. Thus our t-statistic becomes \begin{equation}t=\frac{\hat{\beta}}{\text{se}\left(\hat{\beta}\right)/\sqrt{n-2}}=r\sqrt{\frac{n-2}{1-r^2}}.\end{equation}

The foregoing is a summary of pp. 193-199 of The Analysis of Physical Measurements by Pugh and Winslow, cited in "A brief note on the standard error of the Pearson correlation". The idea is that testing the hypothesis that $\beta$ is zero is equivalent to testing the hypothesis that $r$ is zero, so this statistic is only really valid for values of the correlation coefficient near zero.
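To see this equivalence numerically, here is a short Python sketch of mine (not taken from Pugh and Winslow; the simulated data and all names are arbitrary choices). It computes $\hat{\beta}$, its standard error as defined above, and the resulting t-statistic directly from the sums of squares, and confirms that it coincides with $r\sqrt{(n-2)/(1-r^2)}$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# An arbitrary linear relationship with noise.
n = 40
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

xc, yc = x - x.mean(), y - y.mean()

# Least-squares slope and sample correlation coefficient.
beta = (xc @ yc) / (xc @ xc)
r = (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

# Residual sum of squares and the slope's t-statistic, following the text:
# var(beta_hat) = RSS / sum((x - xbar)^2), then divide the se by sqrt(n - 2).
rss = ((yc - beta * xc) ** 2).sum()
se_beta = np.sqrt(rss / (xc @ xc)) / np.sqrt(n - 2)
t_beta = beta / se_beta

# The same statistic written purely in terms of r.
t_r = r * np.sqrt((n - 2) / (1 - r**2))

print(t_beta, t_r)                         # these agree
print(2 * stats.t.sf(abs(t_beta), n - 2))  # two-sided p-value
```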