Stationarity- ARIMA

Shunya · January 28th, 2004, 10:07 am

Hi,

I have just started reading Gujarati’s Basic Econometrics. I have not read much of it, only skimmed through it. I came across a chapter on time series analysis and started to read it, but ran into a few problems and questions that I hope someone can help me answer.

1. First I came across stationarity of data/time series. The book clearly states that stock price data is inherently non-stationary and that we need to find stationarity in it, and it recommends techniques like the DF and ADF tests and first differencing of the series. Do I regress the data after this? (They talk about spurious regression later.) Then it talks about second differencing as well. How does differencing make the time series stationary? Or does it mean that after you have differenced the series x or y times, its mean and variance stabilise and it is considered stationary at that point? Will the same differencing need to be applied if I use the series in an ARIMA model?

2. Does data need to be stationary to be used in a time series model like ARMA or ARIMA? Can these models not be applied to non-stationary data? From the equations it seems that the AR and ARMA equations are similar to the ADF equation. Why is this so?

3. There seems to be a constant in the equation (noise). Why do we need this, and can I solve without this constant? If I have to have white noise in it, can't or shouldn't I compute it from the data itself?

4. The first equation for the unit root test is Yt = Y(t-1) + Ut. The book says that if the coefficient of Y(t-1) is in fact equal to 1, we face what is known as the unit root problem, i.e. non-stationarity. Does anyone know what value would suggest stationarity, e.g. below 1, 0, or a negative number? Does this number have to be between 1 and -1, like a correlation?

5. Someone mentioned to me that I should take log returns and not just arithmetic returns for analysis. Does this make a difference, and why?

6. What is a “trend stationary process” and a “difference stationary process”? What do these mean, and how do I determine how to handle each? Does it make a difference? There seem to be different approaches for these two cases; is there a robust generic method?

7. I read that co-integrated means series A is integrated with series B, but what exactly does integrated mean? Integrated with itself?

I am sorry to ask so many questions all at once, but I’m a bit confused here. Could you all please help me?

Regds
Shunya

u418936 · January 28th, 2004, 3:27 pm

These questions are pretty basic, and the answers should be in any time-series book. I'm not going to answer all your questions, but I'll get you started. When you have two time series, do something like the following rough guide.

[1] Check whether they're stationary, e.g. with a Dickey-Fuller test. If they're all stationary, do your regression; if not, go to step [2].

[2] Check whether they're both I(1) (which they probably are). If they are, see whether they're cointegrated. If they're cointegrated, do your regression; if not, go to step [3].

[3] Difference the series until they're stationary and do your regression.

If you need to know how to do this stuff, I strongly suggest that you read Enders' book on applied time series. It's a pretty easy read.
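As a rough illustration of steps [1] and [3], the sketch below hand-rolls the simplest Dickey-Fuller regression (no constant, no trend, no lagged differences) using only the Python standard library. The function names `difference` and `df_stat`, the simulated random walk, the seed and the sample size are all assumptions made for illustration. The 5% critical value of roughly -1.95 applies to this no-constant case only; in real work you would normally rely on an econometrics package's ADF test and its tabulated critical values.

```python
import math
import random


def difference(series):
    """First difference: y(t) - y(t-1)."""
    return [b - a for a, b in zip(series, series[1:])]


def df_stat(series):
    """t-statistic of g in the regression dy(t) = g * y(t-1) + u(t).

    Simplest Dickey-Fuller form: no constant, no trend, no lagged differences.
    """
    lagged = series[:-1]
    dy = difference(series)
    sxx = sum(x * x for x in lagged)
    g = sum(x * d for x, d in zip(lagged, dy)) / sxx
    resid = [d - g * x for x, d in zip(lagged, dy)]
    s2 = sum(r * r for r in resid) / (len(dy) - 1)  # residual variance
    return g / math.sqrt(s2 / sxx)


rng = random.Random(0)  # fixed seed so the run is repeatable
walk = [0.0]
for _ in range(500):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))  # hypothetical I(1) series

stat_level = df_stat(walk)             # series in levels (has a unit root)
stat_diff = df_stat(difference(walk))  # first difference (white noise)

CRITICAL_5PCT = -1.95  # approximate 5% value for the no-constant DF case
print(f"levels:     {stat_level:.2f}")
print(f"first diff: {stat_diff:.2f}")
print("first difference looks stationary:", stat_diff < CRITICAL_5PCT)
```

The statistic for the levels will usually sit above the critical value (no rejection of a unit root), while the differenced series is rejected decisively. Step [2], the cointegration check, needs a different regression and different critical values, so it is left out of this sketch.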

Shunya · January 29th, 2004, 1:23 pm

Anybody else?

seanT · January 29th, 2004, 3:05 pm

Nobel 2003 Overview - Cointegration

This is a good intro; it's well worth a read anyway. If you are stuck, then try 'Econometric Analysis' by William Greene. I know it has all the answers to your questions.

I found this book very approachable, and it also has exercises that are useful. With regard to first differencing, I found it very useful to plot the first difference and see what it looks like: does it look stationary?

Regards
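A numerical stand-in for eyeballing such a plot is to cut the series into consecutive blocks and compare the mean and variance of each block. The sketch below does this for a simulated random walk and for its first difference; the helper name `block_summary`, the seed, the series length and the choice of four blocks are assumptions made for illustration, and this is only an informal check, not a formal test.

```python
import random
import statistics


def block_summary(series, n_blocks=4):
    """Return (mean, variance) for consecutive equal-sized blocks of the series."""
    size = len(series) // n_blocks
    blocks = [series[i * size:(i + 1) * size] for i in range(n_blocks)]
    return [(statistics.mean(b), statistics.variance(b)) for b in blocks]


rng = random.Random(1)  # fixed seed so the run is repeatable
level = [0.0]
for _ in range(2000):
    level.append(level[-1] + rng.gauss(0.0, 1.0))  # hypothetical price-like series
diff = [b - a for a, b in zip(level, level[1:])]

level_blocks = block_summary(level)
diff_blocks = block_summary(diff)

for name, blocks in (("level", level_blocks), ("first diff", diff_blocks)):
    print(name)
    for mean, var in blocks:
        print(f"  mean={mean:8.2f}  var={var:8.2f}")
```

For the levels, the block means typically wander from block to block, whereas for the first difference they stay near zero with a similar variance in every block, which is what a stationary-looking plot would show.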

obeelde · January 29th, 2004, 8:23 pm

Shunya,

Some preliminaries:

x(t) is the time series at time t; e(t) is the white noise (mean zero, variance sigma^2) shock at time t.

AR(p) denotes autoregressive, i.e. the variable today is linearly related to p previous values of itself, e.g. x(t) = a0 + a1*x(t-1) + ... + ap*x(t-p) + e(t)

MA(q) denotes moving average, i.e. the variable today is a weighted sum of q past shocks, e.g. x(t) = a0 + b1*e(t-1) + ... + bq*e(t-q) + e(t)

ARMA(p,q) denotes an autoregressive and moving average process, e.g. x(t) = a0 + a1*x(t-1) + ... + ap*x(t-p) + e(t) + b1*e(t-1) + ... + bq*e(t-q)

ARIMA(p,d,q) denotes a process that has to be differenced d times before it is stationary; the d'th differences then follow an ARMA(p,q) process.

1. Think of a random walk for a stock price, P(t), i.e. P(t) = P(t-1) + e(t), where we assume that the e(t) are iid(0, sigma^2) and P(0) = 1 (a constant). The variance of P(t) grows with time (it equals t*sigma^2), so P(t) is non-stationary. In terms of the ARIMA notation, this is an ARIMA(0,1,0) process because you have to difference it once to obtain a stationary process. The differenced process has no AR or MA component.

The DF test is basically a test of the null hypothesis that a process is ARIMA(0,1,0) versus the alternative hypothesis ARMA(1,0), i.e. is the process a random walk, or is it a stationary first-order autoregressive process?

The DF test is not that useful in practice because many economic time series may have AR and MA components, and we never really know the order of these components, i.e. we don't know p and q. So it is better to use the ADF test, which adds lagged differences to the test regression to absorb that extra serial correlation.

Keep in mind that when we think of nonstationarity in the DF sense, we are thinking of a process that behaves like a random walk (when we only have to difference once to obtain stationarity). And random walk behavior is consistent with our intuition regarding stock prices: it implies that shocks (news releases) have a permanent effect on stock prices, and it is consistent with the geometric Brownian motion that is used in the Black-Scholes model (although in that case we are talking about a random walk in the log of the process, and we allow for a positive drift component).

2. For stationary data you use ARMA(p,q) models, and for nonstationary data you would use ARIMA(p,d,q) models, i.e. difference the data d times until it is stationary and then fit the AR and MA components.

3. I am not sure that I completely understand your question, but constants are usually included in regressions because this forces the fitted errors to have a mean of zero. (You can check this by setting up the least squares criterion for a simple regression: setting the first derivative of the sum of squared errors with respect to the constant, or intercept, to zero gives the result that the fitted errors sum to zero at the minimum.)

4. The process Y(t) = a*Y(t-1) + u(t) is stationary when -1 < a < 1. When a = 1 or a = -1 you have the unit root problem, and when |a| > 1 the process is explosive. [Yes, it does fall in the same interval as correlation, although the boundary points are excluded. However, don't try to find some relationship between these roots and correlation per se.]

5. (Natural) log returns are useful for a couple of reasons: (1) they provide continuously compounded returns, and (2) they "stabilize" the variance. Compare a daily move in the Berkshire Hathaway stock price ($88,700) to a daily move in IBM (approx. $100). Clearly, in dollar terms a 1% move in either stock will be vastly different, but the log return will be approximately the same.

6. A trend stationary process is a process that has a deterministic time trend and some stationary ARMA components, e.g. x(t) = c0 + c1*t + a1*x(t-1) + ... + ap*x(t-p) + e(t) + b1*e(t-1) + ... + bq*e(t-q). Note that there is no unit root component here!

A difference stationary process is an ARIMA(p,d,q) process where the nonstationarity is introduced not by a time trend but by the integrated nature of the process.

Note: There was a huge debate in the economics literature about these processes. The key issue is that if GDP is trend stationary, a shock to the economy decays over time and GDP reverts back to its trend. On the other hand, if GDP is difference stationary, then a shock to the economy has a permanent effect, just like every shock to a random walk has a permanent effect on its level.

7. Consider the random walk process you used in 4, Y(t) = Y(t-1) + u(t). If you substitute in for Y(t-1), Y(t-2), etc., you end up with Y(t) = Y(0) + sum_{s=1..t} u(s). A sum is the discrete counterpart of an integral, and hence you get "integrated".

I hope that some of these responses will clarify some of the issues.

Regards
Owen
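Points 4, 6 and 7 can be checked numerically. The sketch below builds a random walk by recursion, confirms that it is the running sum of its shocks (the sense in which it is "integrated") and that differencing recovers those shocks, then compares how a single unit shock propagates when a = 1 and when a = 0.5. The seed, the series length, the helper name `impulse_response` and the value a = 0.5 are assumptions chosen for illustration.

```python
import math
import random
from itertools import accumulate

rng = random.Random(2)  # fixed seed so the run is repeatable
shocks = [rng.gauss(0.0, 1.0) for _ in range(200)]

# Random walk built by the recursion Y(t) = Y(t-1) + u(t), starting from Y(0) = 0.
walk = []
level = 0.0
for u in shocks:
    level += u
    walk.append(level)

# The same series as a running (cumulative) sum of the shocks: "integrated".
running_sum = list(accumulate(shocks))

# Differencing undoes the summation and recovers the shocks.
recovered = [walk[0]] + [b - a for a, b in zip(walk, walk[1:])]


def impulse_response(a, horizon):
    """Effect after 0..horizon periods of a unit shock in Y(t) = a*Y(t-1) + u(t)."""
    return [a ** k for k in range(horizon + 1)]


irf_unit_root = impulse_response(1.0, 10)   # a = 1: the shock never dies out
irf_stationary = impulse_response(0.5, 10)  # |a| < 1: the shock decays geometrically

print("unit root, effect after 10 periods: ", irf_unit_root[-1])
print("stationary, effect after 10 periods:", irf_stationary[-1])
```

With a = 1 the shock stays in the level forever, which is the permanent-effect property of a difference stationary series; with |a| < 1 it fades, which is the mean-reverting behavior described for the stationary case.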

ScilabGuru · January 29th, 2004, 11:29 pm

Log returns have a couple of properties that make them popular:

1. They are close to relative returns, since log(x_2/x_1) = log(1 + (x_2 - x_1)/x_1) ~ (x_2 - x_1)/x_1 when the price change is small.

2. They are additive like absolute returns, namely log(x_k/x_1) = log(x_k/x_n) + log(x_n/x_1).
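Both properties, plus the scale point made earlier in the thread with the $88,700 and $100 stocks, can be verified in a few lines. The price path below is hypothetical and the variable names are chosen only for this sketch.

```python
import math

prices = [100.0, 101.0, 99.5, 102.0]  # hypothetical price path

simple = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]
logret = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Property 1: for small moves the two kinds of return nearly coincide.
gaps = [abs(s - l) for s, l in zip(simple, logret)]

# Property 2: log returns add up across periods; simple returns do not.
total_log = math.log(prices[-1] / prices[0])
sum_log = sum(logret)
total_simple = prices[-1] / prices[0] - 1.0
sum_simple = sum(simple)

# Scale does not matter: a 1% move gives the same log return at any price level.
log_move_expensive = math.log(88700.0 * 1.01 / 88700.0)
log_move_cheap = math.log(100.0 * 1.01 / 100.0)

print("largest gap between simple and log return:", max(gaps))
print("sum of log returns vs whole-period log return:", sum_log, total_log)
print("sum of simple returns vs whole-period simple return:", sum_simple, total_simple)
print("1% move as a log return:", log_move_expensive, log_move_cheap)
```

The gap between the two return measures grows roughly with the square of the move, so the approximation in property 1 is best for daily-sized changes.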