$\begingroup$

I have applied simple forecasting models (naive forecast, moving average, simple exponential smoothing, and Holt's linear trend model) to a salesperson's 2018 sales data.

All of the models produce a prediction line that flattens out at zero. Could this be an issue with the data, since most of it is flat at zero?

[Figure: ARIMA fitted plot]

[Figure: Difference between the given and predicted data sets]

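    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    # Legacy ARIMA API (not available in recent statsmodels releases);
    # with d=1 its fitted values are differences of the log series.
    from statsmodels.tsa.arima_model import ARIMA

    model = ARIMA(train_log, order=(0, 1, 2))
    output = model.fit(disp=-1)

    # Convert the fitted values into a Series
    output_series = pd.Series(output.fittedvalues, copy=True)
    print(output_series.head())

    # Cumulative sum of the fitted differences
    output_series_cumsum = output_series.cumsum()
    print(output_series_cumsum.head())

    # Add the cumulative differences to the first log value
    output_tr_log = pd.Series(train_log.iloc[0], index=train_log.index)
    output_tr_log = output_tr_log.add(output_series_cumsum, fill_value=0)
    output_tr_log.head()

    # Convert the fitted ARIMA values back to the original scale
    convert_output = np.exp(output_tr_log)
    # RMSE against the training data on the original scale
    plt.title('RMSE: %.4f' % np.sqrt(np.mean((convert_output - np.exp(train_log)) ** 2)))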

    Date        Sales
    ----------  -----
    2018-01-27      1
    2018-01-30     60
    2018-01-31     22
    2018-02-01    490
    2018-02-04     53
    2018-02-05     30
    2018-02-06    204
    2018-02-07    234
    2018-02-08     64
    2018-02-10     70
    2018-02-11     81
    2018-02-12     10
    2018-06-01     40
    2018-06-02    669
    2018-06-06   1188
    2018-06-07   1250
    2018-06-10   3861
    2018-06-14     40
    2018-06-21     44
$\endgroup$

1 Answer

$\begingroup$

The naive forecast (a random walk with no drift) always gives a flat, straight-line forecast: the last observed data point in your time series, extended n steps ahead.
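
As a minimal sketch of that idea, the snippet below repeats the last observation over the forecast horizon. The `sales` values are copied from the sample in the question; the helper name `naive_forecast` is made up for illustration.

```python
import numpy as np

# Sales values copied from the sample in the question.
sales = np.array([1, 60, 22, 490, 53, 30, 204, 234, 64, 70,
                  81, 10, 40, 669, 1188, 1250, 3861, 40, 44], dtype=float)

def naive_forecast(history, horizon):
    """Repeat the last observed value `horizon` times (hypothetical helper)."""
    return np.full(horizon, history[-1])

print(naive_forecast(sales, 5))  # [44. 44. 44. 44. 44.]
```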

The moving average model with m = 50 forecasts the average of the 50 most recent data points. As we can see, your time series is mostly flat with some erratic peaks, and these have been smoothed away by the relatively large window (50) that you have chosen. In general, the larger the window, the flatter and smoother your forecasts. Try a smaller window and you will probably see more erratic behavior in your forecasts.
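
The window effect can be checked with a rough sketch like the one below. It uses the sample values from the question (only 19 points, so the windows of 3 and 10 are illustrative assumptions rather than the 50 used in the plots) and compares how much the smoothed series still varies.

```python
import pandas as pd

# Sales values copied from the sample in the question.
sales = pd.Series([1, 60, 22, 490, 53, 30, 204, 234, 64, 70,
                   81, 10, 40, 669, 1188, 1250, 3861, 40, 44], dtype=float)

for window in (3, 10):
    smoothed = sales.rolling(window).mean()   # NaN until `window` points are available
    forecast = sales.iloc[-window:].mean()    # flat forecast used for every future step
    print(f"window={window:2d}  std of smoothed series={smoothed.std():7.1f}  "
          f"forecast={forecast:7.1f}")
```

The smoothed series for the larger window varies less, which is the flattening described above.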

Simple exponential smoothing always gives a flat forecast, since every forecasted value equals the first forecasted value (i.e. y(t+k) = y(t+k-1) = ... = y(t+1) for all k > 1). This can be proven quite easily by induction.
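
The flatness can also be read off the usual component form of simple exponential smoothing. The notation here ($\ell_t$ for the level, $\alpha$ for the smoothing parameter, $\hat{y}_{t+h\mid t}$ for the h-step-ahead forecast) is standard textbook notation and an assumption on my part, not something taken from the question:

```latex
\begin{aligned}
\ell_t &= \alpha\, y_t + (1-\alpha)\,\ell_{t-1}, \qquad 0 \le \alpha \le 1,\\
\hat{y}_{t+h\mid t} &= \ell_t \quad \text{for all } h \ge 1 .
\end{aligned}
```

No new observations arrive after time t, so the level is never updated again and every forecast equals $\ell_t$.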

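Holt's linear trend model breaks the forecast into two components, the level and the trend; seasonality only enters in the Holt-Winters extension. Its forecast is a straight line that starts at the last estimated level and has a slope equal to the last estimated trend. If your time series shows no sustained upward or downward movement, the estimated trend is close to zero and you are left with a flat line, as in your case.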
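
A hand-rolled sketch of Holt's linear trend recursions, to show that the forecast is always a straight line. The function name `holt_forecast`, the initialization (first value as level, first difference as trend) and the smoothing parameters 0.3 and 0.1 are all illustrative assumptions, not fitted values and not a library API.

```python
import numpy as np

# Sales values copied from the sample in the question.
sales = np.array([1, 60, 22, 490, 53, 30, 204, 234, 64, 70,
                  81, 10, 40, 669, 1188, 1250, 3861, 40, 44], dtype=float)

def holt_forecast(history, alpha, beta, horizon):
    """Holt's linear trend method with a simple initialization (hypothetical helper)."""
    level, trend = history[0], history[1] - history[0]
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)    # update the level
        trend = beta * (level - prev_level) + (1 - beta) * trend  # update the trend
    steps = np.arange(1, horizon + 1)
    return level + steps * trend  # straight line: last level plus h times last trend

forecast = holt_forecast(sales, alpha=0.3, beta=0.1, horizon=5)
print(forecast)
print(np.diff(forecast))  # constant step size, i.e. the last estimated trend
```

When the last estimated trend is close to zero, that straight line is effectively flat.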

Your time series looks like a flat line with unpredictable random fluctuations. Hence, in terms of forecast error, I am not too surprised that a flat line may give the best forecasts. Try calculating RMSE, MAE, or MASE (which compares directly against the naive method) and see for yourself. Often a flat line is the best forecast, especially for financial time series.
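
A rough sketch of those three error measures, written with plain NumPy. The train/test split (first 15 points versus last 4) is an arbitrary assumption for illustration, the irregular spacing of the dates is ignored, and the helper names are made up; MASE here scales the forecast MAE by the in-sample MAE of the one-step naive forecast.

```python
import numpy as np

# Sales values copied from the sample in the question.
sales = np.array([1, 60, 22, 490, 53, 30, 204, 234, 64, 70,
                  81, 10, 40, 669, 1188, 1250, 3861, 40, 44], dtype=float)
train, test = sales[:15], sales[15:]  # illustrative split, not from the question

def rmse(actual, forecast):
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

def mae(actual, forecast):
    return float(np.mean(np.abs(actual - forecast)))

def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    naive_in_sample_mae = np.mean(np.abs(np.diff(train)))
    return mae(actual, forecast) / naive_in_sample_mae

# Two flat forecasts to compare against any other model's output.
flat_forecasts = {
    "naive (last value)": np.full(len(test), train[-1]),
    "mean of training data": np.full(len(test), train.mean()),
}
for name, fc in flat_forecasts.items():
    print(f"{name}: RMSE={rmse(test, fc):.1f}  MAE={mae(test, fc):.1f}  "
          f"MASE={mase(test, fc, train):.2f}")
```

A MASE below 1 means the forecast beats the in-sample naive benchmark on average; above 1 means it does worse.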

$\endgroup$
  • $\begingroup$ aranglol - sure, I will try out those options. Before that, I need one clarification. My TS data is a flat line with random fluctuations, so can it be said to have a trend, or is it a cyclical type of time series? $\endgroup$ Commented May 22, 2019 at 7:43
  • $\begingroup$ Cyclical, I think. This is sales data, right? I would think that there would be consistent cyclical patterns. Namely, what do you think is causing the spikes in June and in the last quarter of the year? June, maybe a sale or the start of summer? Last quarter of the year, maybe the holiday season? You could add exogenous variables that account for this to time series models that support them. $\endgroup$ Commented May 22, 2019 at 14:06
  • $\begingroup$ In other words, an ARMA model could be useful if you believe that certain patterns should be consistent year to year. $\endgroup$ Commented May 22, 2019 at 14:15
  • $\begingroup$ aranglol - yes, it's sales data, and all your predictions are absolutely right. I have tried the ARIMA model; in fact I am a bit confused about the values to pass to the ARIMA predict function, specifically the start and end fields. I tried the predict function with the test data set range and got a slicing error, which I posted on SO [stackoverflow.com/questions/56240139/], no reply yet. So I then tried the length of the train data set range and it worked, but I am having trouble interpreting the graphs. I have attached screenshots, please suggest. $\endgroup$ Commented May 22, 2019 at 17:13
  • $\begingroup$ Also, I see NaNs populated (except in the coef column) for the fields "ar.L2.D.Volume", "ma.L1.D.Volume" and "ma.L2.D.Volume" in the ARIMA model results summary. What could be the reason? $\endgroup$ Commented May 22, 2019 at 17:19
