January 29th, 2004, 8:23 pm
Shunya,

Some preliminaries:

x(t) is the time series at time t; e(t) is the white noise shock at time t (mean zero, variance sigma^2).

AR(p) denotes an autoregressive process, i.e. the variable today is linearly related to its p previous values, e.g.
x(t) = a0 + a1*x(t-1) + ... + ap*x(t-p) + e(t)

MA(q) denotes a moving average process, i.e. the variable today is a weighted sum of the current shock and q past shocks, e.g.
x(t) = a0 + e(t) + b1*e(t-1) + ... + bq*e(t-q)

ARMA(p,q) denotes a process with both autoregressive and moving average parts, e.g.
x(t) = a0 + a1*x(t-1) + ... + ap*x(t-p) + e(t) + b1*e(t-1) + ... + bq*e(t-q)

ARIMA(p,d,q) denotes a process that has to be differenced d times before it is stationary; the d-th differences then follow an ARMA(p,q) process.

1. Think of a random walk for a stock price P(t), i.e. P(t) = P(t-1) + e(t), where we assume that the e(t) are iid(0, sigma^2) and P(0) = 1 (a constant). The variance of P(t) grows with time (Var[P(t)] = t*sigma^2), so P(t) is non-stationary. In terms of the ARIMA notation, this is an ARIMA(0,1,0) process because you have to difference it once to obtain a stationary process. The differenced process has no AR or MA component.

The DF (Dickey-Fuller) test is basically a test of the null hypothesis that a process is ARIMA(0,1,0) versus the alternative hypothesis that it is ARMA(1,0), i.e. is the process a random walk, or is it a stationary first-order autoregressive process?

The DF test is not that useful in practice because many economic time series have AR and MA components, AND we never really know the orders of these components, i.e. we don't know p and q. So it is better to use the ADF (augmented Dickey-Fuller) test, which adds lagged differences to the test regression to absorb that extra serial correlation.

Keep in mind that when we think of nonstationarity in the DF sense, we are thinking of a process that behaves like a random walk (one that we only have to difference once to obtain stationarity).
Random walk behavior is also consistent with our intuition regarding stock prices: it implies that shocks (news releases) have a permanent effect on stock prices, and it is consistent with the geometric Brownian motion used in the Black-Scholes model (although in that case we are talking about a random walk in the log of the price, and we allow for a positive drift component).

2. For stationary data you use ARMA(p,q) models, and for nonstationary data you would use ARIMA(p,d,q) models, i.e. difference the data d times until it is stationary and then fit the AR and MA components.

3. I am not sure that I completely understand your question, but constants are usually included in regressions because doing so forces the fitted residuals to have a mean of zero. (You can check this by setting up the least squares criterion for a simple regression: setting the first derivative of the sum of squared errors with respect to the intercept to zero gives the result that the residuals sum to zero at the minimum.)

4. The process Y(t) = a*Y(t-1) + u(t) is stationary when -1 < a < 1. When a = 1 or a = -1 you have the unit root problem, and when |a| > 1 the process is explosive. [Yes, the stationary range falls in the same interval as a correlation, although the boundary points are excluded. However, don't try to find some relationship between these roots and correlation per se.]

5. (Natural) log returns are useful for a couple of reasons: (a) they provide continuously compounded returns, and (b) they "stabilize" the variance. Compare a daily move in the Berkshire Hathaway stock price ($88,700) to a daily move in IBM (approx. $100). In dollar terms, a 1% move is vastly different for the two stocks (about $887 versus about $1), but the log return is approximately the same.

6. A trend stationary process is a process that has a deterministic time trend and some stationary ARMA components, e.g. x(t) = c0 + c1*t + a1*x(t-1) + ... + ap*x(t-p) + e(t) + b1*e(t-1) + ... + bq*e(t-q).
Note that there is no unit root component here!

A difference stationary process is an ARIMA(p,d,q) process where the nonstationarity is introduced not by a time trend, but by the integrated nature of the process.

Note: There was a huge debate in the economics literature about these processes. The key issue is that if GDP is trend stationary, a shock to the economy decays over time and GDP reverts back to its trend. On the other hand, if GDP is difference stationary, then a shock to the economy has a permanent effect, just like every shock to a random walk has a permanent effect on its level.

7. Consider the random walk process you used in 4, Y(t) = Y(t-1) + u(t). If you substitute in for Y(t-1), Y(t-2), etc., you end up with Y(t) = Y(0) + sum_{s=1}^{t} u(s). A running sum of shocks is the discrete-time analogue of an integral, and hence you get the name "integrated".

I hope that some of these responses will clarify some of the issues.

Regards
Owen
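
P.S. If it helps to see points 1, 4 and 7 numerically, here is a minimal sketch in plain Python. It assumes normally distributed shocks with sigma = 1 and a zero starting value, and the helper names (random_walk, difference, simulate_ar1, cross_sectional_variance) are made up for illustration, not taken from any library. It is only meant to show the qualitative behavior: the variance of a random walk keeps growing with t, while a stationary AR(1) settles near sigma^2/(1 - a^2).

```python
import random
import statistics


def random_walk(shocks, y0=0.0):
    """Point 7: Y(t) = Y(t-1) + u(t), i.e. Y(0) plus a running sum of shocks."""
    path = [y0]
    for u in shocks:
        path.append(path[-1] + u)
    return path


def difference(series):
    """First difference: series[t] - series[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def simulate_ar1(a, n_steps, sigma=1.0, y0=0.0, rng=None):
    """Point 4: simulate Y(t) = a*Y(t-1) + u(t), u(t) iid N(0, sigma^2) (assumed)."""
    rng = rng or random.Random()
    path = [y0]
    for _ in range(n_steps):
        path.append(a * path[-1] + rng.gauss(0.0, sigma))
    return path


def cross_sectional_variance(a, t, n_paths=2000, seed=0):
    """Variance of Y(t) measured across many independent simulated paths."""
    rng = random.Random(seed)
    finals = [simulate_ar1(a, t, rng=rng)[-1] for _ in range(n_paths)]
    return statistics.pvariance(finals)


if __name__ == "__main__":
    # Differencing a random walk once recovers the shocks: ARIMA(0,1,0).
    shocks = [1.0, -2.0, 0.5]
    print(difference(random_walk(shocks)))

    # a = 1 is the random walk (variance roughly t); a = 0.5 is stationary.
    for t in (25, 100, 400):
        rw = cross_sectional_variance(1.0, t)
        ar = cross_sectional_variance(0.5, t)
        print(f"t={t:4d}  random walk var ~ {rw:7.1f}   AR(1) a=0.5 var ~ {ar:5.2f}")
```

The variance is taken across many simulated paths at a fixed t rather than along one path, because for a non-stationary process a single path's sample variance does not estimate Var[Y(t)].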