Should Geometric Brownian Motion Prediction Use Rolling Window or Train-Test Split?

Question

I'm currently working on a stock market prediction project using Geometric Brownian Motion (GBM). My goal is to showcase that GBM, as a stochastic model, does not require a large dataset compared to models like LSTM or ARIMA. This makes GBM suitable for applications involving new stocks, indices, or assets like cryptocurrencies, where historical data is limited.

Given the stochastic nature of the model, I'm trying to decide the best way to evaluate its performance on stock price prediction. I have some thoughts and questions I'd like input on:

1. Rolling Window vs. Train-Test Split

Should I evaluate GBM predictions using a rolling window approach or a traditional train-test split (e.g., 80-20)?
My concern with a train-test split is that it assumes stationarity in the data, which might not hold for financial time series. Rolling windows, on the other hand, let the model adapt to changing data, but I am unsure whether the results are directly comparable to those from a train-test split.

2. Evaluation of Rolling Window Results

How should I evaluate the results from a rolling window? Should I calculate metrics (e.g., RMSE, MAE) across all prediction points and compare them to a single train-test split metric?
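
One way to make the two schemes comparable is to score both on the same held-out points. The sketch below is a minimal illustration under stated assumptions, not a definitive implementation: the helper names (`fit_gbm`, `one_step_forecast`, `rolling_errors`, `split_errors`), the 20-observation window, the 80-20 split, and the synthetic price series are all made up for the example. It uses the GBM conditional mean as the one-step-ahead point forecast.

```python
import numpy as np

def fit_gbm(prices):
    """Per-step mean and variance of log returns (hypothetical helper)."""
    log_ret = np.diff(np.log(prices))
    return log_ret.mean(), log_ret.var(ddof=1)

def one_step_forecast(last_price, m, s2):
    """GBM conditional mean: E[P_t | P_{t-1}] = P_{t-1} * exp(m + s2 / 2)."""
    return last_price * np.exp(m + 0.5 * s2)

def rolling_errors(prices, window, start):
    """Re-estimate on the trailing `window` prices before each forecast."""
    errs = []
    for t in range(start, len(prices)):
        m, s2 = fit_gbm(prices[t - window:t])
        errs.append(prices[t] - one_step_forecast(prices[t - 1], m, s2))
    return np.array(errs)

def split_errors(prices, n_train):
    """Estimate once on the first `n_train` prices, then forecast the rest."""
    m, s2 = fit_gbm(prices[:n_train])
    preds = one_step_forecast(prices[n_train - 1:-1], m, s2)
    return prices[n_train:] - preds

def rmse(e):
    return float(np.sqrt(np.mean(e ** 2)))

def mae(e):
    return float(np.mean(np.abs(e)))

# Synthetic daily prices, used only so the example runs (an assumption).
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0005, 0.02, size=500)))

n_train = int(0.8 * len(prices))
# Both schemes are scored on the same last 20% of the series.
rolling = rolling_errors(prices, window=20, start=n_train)
split = split_errors(prices, n_train)

print("rolling window: RMSE", rmse(rolling), "MAE", mae(rolling))
print("train-test    : RMSE", rmse(split), "MAE", mae(split))
```

Because both error vectors cover the same dates, the pooled RMSE and MAE can be compared directly. Scoring the rolling window over the whole series and the split over only its last 20% would mix different periods.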

3. Minimal Data Requirement

Are there any studies or research articles suggesting that GBM performs well with less data?
What is the minimum amount of data recommended to estimate $\mu$ and $\sigma$ reliably? I hypothesize that it depends on the time scale of the data (daily, weekly, etc.), but I haven't found specific guidance.
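
To make the reliability question concrete, here is a minimal sketch of the usual log-return estimators with approximate standard errors. It assumes i.i.d. normal log returns (the GBM assumption itself), 252 trading days per year, and large-sample approximations. The function name `estimate_gbm` and the simulated sample of about 20 daily returns are hypothetical choices for illustration.

```python
import numpy as np

def estimate_gbm(prices, periods_per_year=252):
    """Annualised GBM estimates with approximate standard errors.

    Assumes equally spaced prices and i.i.d. normal log returns.
    """
    prices = np.asarray(prices, dtype=float)
    log_ret = np.diff(np.log(prices))
    n = log_ret.size
    dt = 1.0 / periods_per_year

    sigma = log_ret.std(ddof=1) / np.sqrt(dt)        # annualised volatility
    mu = log_ret.mean() / dt + 0.5 * sigma ** 2      # annualised drift of the price

    span_years = n * dt
    se_mu = sigma / np.sqrt(span_years)   # approximate; ignores uncertainty in sigma
    se_sigma = sigma / np.sqrt(2 * n)     # large-sample approximation
    return {"mu": mu, "sigma": sigma, "se_mu": se_mu,
            "se_sigma": se_sigma, "n_returns": n}

# Simulated sample: 20 daily returns with assumed true mu = 0.10, sigma = 0.30.
rng = np.random.default_rng(1)
dt = 1.0 / 252
true_mu, true_sigma = 0.10, 0.30
rets = rng.normal((true_mu - 0.5 * true_sigma ** 2) * dt,
                  true_sigma * np.sqrt(dt), size=20)
prices = 100.0 * np.exp(np.concatenate(([0.0], np.cumsum(rets))))

est = estimate_gbm(prices)
print(est)
```

Under these assumptions the standard error of $\mu$ depends on the calendar span of the sample rather than on the number of observations, so sampling more often sharpens the estimate of $\sigma$ but not of $\mu$.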

4. My Current Thoughts

I believe that rolling windows might be better suited for time-varying systems like financial markets, as they allow the model to adapt to new patterns. However, I'm concerned that rolling windows might introduce more noise due to smaller training sets for each window.
For minimal data, I think GBM might work well with as little as 1 month of daily data (~20 points) for short-term predictions. However, this is just a hypothesis, and I’d like confirmation or suggestions from experts.

Any insights, references to relevant papers, or suggestions for best practices would be greatly appreciated!

Hi: GBM implies that the log price is a Brownian motion with a correction for the drift term. So the GBM model assumes that the log price is not predictable (it is also continuous, so you would need to discretize it). Therefore, hopefully you won't see much predictability when you use it for short-term predictions. I'm not familiar with LSTMs, but GBM should not show improved predictability compared to ARIMA models, regardless of the length of the series. A correctly estimated ARIMA model (try an ARIMA(0,1,1) on the log price) should probably be a better model.
– mark leeds, Commented Jan 25 at 13:21
I reread the above and it could be confusing. What I should have added is that, in discrete time, GBM is a random walk of the log price, so $\log(P_t) = \log(P_{t-1}) + \text{possible drift} + \epsilon_t$. Clearly the model has zero predictive power.
– mark leeds, Commented Jan 25 at 14:46
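
The discrete-time form in the comment above can be sketched as follows. This is only an illustration under assumed values: the function names, the per-step drift of 0.0005, the per-step volatility of 0.02, and the starting price of 100 are hypothetical. It simulates the random walk of the log price, checks that successive increments are close to uncorrelated, and shows that the point forecast is just the last log price plus the accumulated drift.

```python
import numpy as np

def simulate_gbm_log_prices(p0, drift, sigma, n_steps, rng):
    """log P_t = log P_{t-1} + drift + eps_t, with eps_t ~ N(0, sigma^2)."""
    eps = rng.normal(0.0, sigma, size=n_steps)
    steps = np.concatenate(([0.0], np.cumsum(drift + eps)))
    return np.log(p0) + steps

def forecast_log_price(last_log_price, drift, horizon):
    """Conditional mean of the log price `horizon` steps ahead."""
    return last_log_price + drift * horizon

rng = np.random.default_rng(42)
log_p = simulate_gbm_log_prices(100.0, 0.0005, 0.02, 1000, rng)

# Increments are i.i.d. by construction, so their lag-1 sample
# autocorrelation should be small.
increments = np.diff(log_p)
lag1_autocorr = np.corrcoef(increments[:-1], increments[1:])[0, 1]
print("lag-1 autocorrelation of log returns:", lag1_autocorr)

# The forecast uses only the last observation and the drift.
print("5-step-ahead log-price forecast:",
      forecast_log_price(log_p[-1], 0.0005, 5))
```

Since past increments carry no information about future ones in this model, any forecast skill beyond the drift term would have to come from something GBM does not describe.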
