1
$\begingroup$

I am working on a problem to predict a stock's returns with a certain set of features.

The problem I am facing is that when I try to predict the stock price itself, the models capture the trend very well and the R-squared scores are close to 1.

But when I try to predict the daily returns of the stock (a pct_change on the stock price), my predictions fail to capture the trend, and the R-squared scores go negative (sometimes even below -1).

I know that a workaround would be simply to predict the stock prices and then do a pct_change() on the predicted price (which captures the trends in daily returns accurately). However, I want to know what I could do to improve my current models with the dependent variable being the returns themselves, as opposed to the stock price.

My hunch is that this could be because the daily returns themselves have much smaller values (the average return for my stock turns out to be about 0.0002), whereas the stock prices have comparatively large values (average stock price over 90 days ~ $30).

But, I believe that the models should still work well enough. My independent variables also have a range between -1 and 1, and they have a positive correlation with the stock's returns and the stock's price.

What could I do to overcome this issue?

Thanks!

$\endgroup$
2
  • $\begingroup$ Sorry to say, but this will not ever work the way you want it. If you could predict the stock with such a simple model, we would all be rich by now :-) $\endgroup$ Commented Jul 30, 2020 at 20:27
  • $\begingroup$ @Repmat I understand this. I am working on this as an educational exercise because I have encountered this problem before where my models predict the stock price well enough, however, if I try to compute the returns of the price, the models fail to capture the trend. $\endgroup$ Commented Jul 30, 2020 at 20:33

1 Answer

2
$\begingroup$

Lots more data and trading capital.

If I understand you right, you want to predict tomorrow's price, say P(t+1), based on today's price P(t). The problem is that in any time series like stock prices there is a natural correlation between today's price and tomorrow's: much like the weather, the best estimate for tomorrow's price is today's price, likely with a bit of drift.
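A small simulation may make this concrete. The sketch below assumes a synthetic price series built from independent daily returns (not the asker's data) and a naive forecast of "tomorrow's price equals today's price"; the helper `r_squared`, the 1% daily volatility, and the series length are all illustrative assumptions.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(0)

# Assumed synthetic data: independent daily returns with a small drift,
# compounded into a price series that starts at $30.
n_days = 1000
returns = rng.normal(loc=0.0002, scale=0.01, size=n_days)
prices = 30.0 * np.cumprod(1.0 + returns)

# Naive forecast: tomorrow's price equals today's price.
price_true = prices[1:]
price_pred = prices[:-1]
r2_price = r_squared(price_true, price_pred)

# The same forecast expressed as a return: predicting "no change in price"
# is the same as predicting a return of zero every day.
return_true = price_true / prices[:-1] - 1.0
return_pred = price_pred / prices[:-1] - 1.0  # identically zero
r2_return = r_squared(return_true, return_pred)

print(f"R^2 on prices:  {r2_price:.4f}")
print(f"R^2 on returns: {r2_return:.4f}")
```

The forecast contains no information beyond the previous price, yet it scores a high R-squared on the price level because the level is so persistent. Scored on returns, the same forecast cannot do better than zero, since it is a constant prediction that is no closer to the data than their own mean.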

The problem with this is that it gives a false sense of security. In cross-sectional data, a high R-squared gives you a sense of what you have accomplished. In time series, one usually looks at an "adjusted" R-squared, which is, loosely, the R-squared once that serial correlation is removed or accounted for (this is not the textbook adjusted R-squared, which only penalizes the number of regressors). Hence, we try to predict the amount by which tomorrow's price differs from today's, and that ideally captures the "real" R-squared, not one boosted by serial correlation. One would also look at the t-statistic on the "growth per day" to make sure it is statistically sound.
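As a rough illustration of the t-statistic check, the sketch below tests whether the mean daily return differs from zero. It uses the figures from the question (mean return about 0.0002, 90 days) together with an assumed daily volatility of 1%, which is not given in the question; `drift_t_stat` is a hypothetical helper name.

```python
import numpy as np

def drift_t_stat(returns):
    """t-statistic for H0: mean daily return is zero (mean / standard error)."""
    n = len(returns)
    std_err = np.std(returns, ddof=1) / np.sqrt(n)
    return np.mean(returns) / std_err

rng = np.random.default_rng(0)

# Assumed synthetic sample: 90 daily returns, drift 0.0002, volatility 1%.
sample = rng.normal(loc=0.0002, scale=0.01, size=90)
print(f"t-statistic on daily drift: {drift_t_stat(sample):.2f}")
```

Under these assumptions the standard error of the mean is roughly 0.01 / sqrt(90), about 0.001, which is several times larger than the drift itself, so a sample of this size would typically not distinguish the drift from zero.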

One thing that should not matter is the size of the typical variables. You could laser-measure movements in the Hoover Dam in thousandths of a millimeter and relate them to the weight of the water in the lake measured in quadrillions of pounds. The scaling should be irrelevant to the fit; at most, a sensible choice of units gives you more numerically accurate or intuitive results.
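The scale invariance can be checked directly for a least-squares fit. The sketch below is a minimal illustration on made-up data (the slope, noise level, and scaling factors are arbitrary assumptions, and `ols_r_squared` is a hypothetical helper): it fits the same linear regression before and after rescaling both variables and compares the R-squared values.

```python
import numpy as np

def ols_r_squared(x, y):
    """Fit y = a + b*x by least squares and return the in-sample R^2."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)

# Assumed synthetic data: a feature in [-1, 1] and a noisy linear target.
x = rng.uniform(-1.0, 1.0, size=500)
y = 0.5 * x + rng.normal(scale=1.0, size=500)

r2_original = ols_r_squared(x, y)
# Rescale the feature up and the target down by arbitrary factors.
r2_rescaled = ols_r_squared(1000.0 * x, 0.0002 * y)

print(f"R^2 original: {r2_original:.6f}")
print(f"R^2 rescaled: {r2_rescaled:.6f}")
```

The two values agree up to floating-point error, because rescaling the target multiplies both the residual and total sums of squares by the same factor. This holds for ordinary least squares with an intercept; models with regularization or gradient-based training can be sensitive to scale.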

$\endgroup$
3
  • $\begingroup$ Ah, yes. I agree on the part about the false sense of security. I am using some lagged variables as well in my predictive model for the returns (not the stock price), and based on the coefficients, I do get how that could be misleading. That being said, I want to know why the same model, with the same independent variables and the same degree of correlation with both stock price and stock returns, produces such a different trend for each of them. Is there a way I could at least predict the trends well enough (not the magnitude of the returns, just the trend)? $\endgroup$ Commented Jul 30, 2020 at 20:37
  • $\begingroup$ I'm not sure I follow, but think of it this way. Suppose prices keep going up in general. The easiest case to think about is a constant .01% daily increase. Now, this very simple return process is easily predictable - it's +.01% daily. But the price itself - which is pretty much a straight line going up - has a lot of volatility, since the price keeps changing from day to day. The best estimate for tomorrow's price is today's price. That's why returns are looked at. $\endgroup$ Commented Jul 30, 2020 at 21:43
  • $\begingroup$ Ah, got it. Apologies for not getting back earlier - I have been thinking about what you have said and this makes sense to me now. Thanks a ton! $\endgroup$ Commented Aug 10, 2020 at 1:59
