I think I don't understand the drift term in ARIMA models. Here is what I did and my current understanding:
I have a time series $Y_t$ of $n$ observations with mean $\bar Y \approx 7.15$. I wanted to test its stationarity (around a possible trend). I used the ur.df implementation of the ADF test (with drift, i.e. type = "drift") from the urca package in R. The ADF test fits the following linear regression: $$\Delta y_{t} = c + \gamma y_{t-1} + \beta \Delta y_{t-1} + \epsilon_t$$ to determine whether there is a unit root (i.e. whether $\gamma = 0$). The coefficients of this equation can be translated into the coefficients of the original model for $y_t$ as follows: $$y_{t} = c + (\gamma + 1) y_{t-1} + \beta \Delta y_{t-1} + \epsilon_t.$$ Based on the output of ur.df: $c = 1.38$, $\gamma = -0.2$, and $\beta$ is not significant. This means my original model should be $$y_t = 1.38 + (1 - 0.2)\,y_{t-1} + \epsilon_t = 1.38 + 0.8\, y_{t-1} + \epsilon_t.$$ I know that in an ARMA model the drift term $c$ should equal $\bar Y(1 - \sum_i \delta_i)$, where the $\delta_i$ are the coefficients on the lagged terms (here there is only one). Since $1.38 \approx 7.15 \times (1 - 0.8)$, everything seems fine.
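This sanity check can be reproduced in simulation. Below is a Python/NumPy sketch (not my original R workflow) with hypothetical parameters $\phi = 0.8$ and mean $7.15$ chosen to mimic the numbers above: it simulates an AR(1) with an intercept, runs the Dickey-Fuller regression by plain OLS (dropping the insignificant $\beta$ lag), and confirms that $\hat\gamma \approx \phi - 1$ and $\hat c \approx \bar y\,(1 - (1 + \hat\gamma)) = -\bar y\,\hat\gamma$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a stationary AR(1) with intercept: y_t = c + phi*y_{t-1} + e_t.
# Hypothetical parameters chosen to mimic the post: phi = 0.8, mean = 7.15,
# so the intercept is c = 7.15 * (1 - 0.8).
phi = 0.8
c_true = 7.15 * (1 - phi)
n = 5000
y = np.empty(n)
y[0] = 7.15
for t in range(1, n):
    y[t] = c_true + phi * y[t - 1] + rng.standard_normal()

# Dickey-Fuller regression (no augmentation lag, since beta was insignificant):
# dy_t = c + gamma * y_{t-1} + e_t, estimated by ordinary least squares.
dy = np.diff(y)
X = np.column_stack([np.ones(n - 1), y[:-1]])
c_hat, gamma_hat = np.linalg.lstsq(X, dy, rcond=None)[0]

print(f"gamma_hat = {gamma_hat:.3f}   (true phi - 1 = {phi - 1:.3f})")
print(f"c_hat     = {c_hat:.3f}   vs  -ybar * gamma_hat = {-y.mean() * gamma_hat:.3f}")
```

With a long enough simulated series the two estimates line up with the relation $c = \bar Y(1 - \sum_i \delta_i)$ from the post.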
In the next step I fitted an AR(1) model with a non-zero mean using the arima function, setting p = 1, d = 0, q = 0 with include.mean = TRUE, which allows for an intercept. The fitted coefficients were similar to the ones from the ur.df regression.
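A rough way to replicate this step outside R is conditional least squares, i.e. regressing $y_t$ on $(1, y_{t-1})$. Here is a Python/NumPy sketch with the same hypothetical parameters ($\phi = 0.8$, mean $7.15$); one caveat worth noting is that, as far as I know, R's arima() reports the estimated *mean* under the label "intercept", which relates to the regression intercept via $\mu = c/(1 - \phi)$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a stationary AR(1) with hypothetical parameters phi = 0.8, mean 7.15
# (mimicking the numbers in the post).
phi, mu, n = 0.8, 7.15, 5000
y = np.empty(n)
y[0] = mu
for t in range(1, n):
    y[t] = mu * (1 - phi) + phi * y[t - 1] + rng.standard_normal()

# Conditional least-squares AR(1) fit: y_t = c + phi*y_{t-1} + e_t.
X = np.column_stack([np.ones(n - 1), y[:-1]])
c_hat, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# The regression intercept c and the process mean mu are linked by
# mu = c / (1 - phi); R's arima() reports mu as "intercept".
print(f"phi_hat = {phi_hat:.3f}, c_hat = {c_hat:.3f}, "
      f"implied mean = {c_hat / (1 - phi_hat):.3f}")
```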
Then I subtracted the mean from $Y_t$: let $Z_t = Y_t - \bar Y$. An ADF test without drift (which fits the same equation as the "with drift" version, just without the intercept) showed that the time series $Z_t$ is stationary, with nearly the same estimate of $\gamma$.
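The effect of demeaning can also be checked in simulation. In the Python/NumPy sketch below (hypothetical $\phi = 0.8$, mean $7.15$ again), the no-intercept Dickey-Fuller regression on the demeaned series recovers essentially the same $\hat\gamma$ as the with-drift regression on the raw series:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulate a stationary AR(1) with hypothetical parameters phi = 0.8, mean 7.15.
phi, mu, n = 0.8, 7.15, 5000
y = np.empty(n)
y[0] = mu
for t in range(1, n):
    y[t] = mu * (1 - phi) + phi * y[t - 1] + rng.standard_normal()

# Demean, then run the Dickey-Fuller regression *without* an intercept:
# dz_t = gamma * z_{t-1} + e_t (closed-form no-intercept OLS slope).
z = y - y.mean()
dz = np.diff(z)
gamma_demeaned = (z[:-1] @ dz) / (z[:-1] @ z[:-1])

# Compare with the with-drift regression on the raw series.
X = np.column_stack([np.ones(n - 1), y[:-1]])
c_hat, gamma_raw = np.linalg.lstsq(X, np.diff(y), rcond=None)[0]

print(f"gamma (demeaned, no intercept) = {gamma_demeaned:.4f}")
print(f"gamma (raw, with drift term)   = {gamma_raw:.4f}")
```

Demeaning absorbs the intercept into the series, so only $\gamma$ is left to estimate; the two estimates differ only by a term that vanishes as $n$ grows.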
What I don't understand:
- Is $Y_t$ stationary "around a trend"? If so, what exactly does that mean?
- Do I understand correctly that a "drift" means there is a linear trend in the time series (causing it to "drift away" from zero)?
- By definition, an AR(1) model with a drift can't be stationary even without a unit root (this can be shown by deriving the formulas for its mean and variance).
- But if a drift corresponds to a linear trend in the time series, then subtracting a constant mean shouldn't make it stationary. What am I missing here?
- But if $\bar Z = 0$, then plugging it into the equation for $c$ clearly gives $c = 0$: no drift when the mean is zero. Why can't we have a time series with a drift and a zero mean?