type
Post
Created date
Jun 16, 2022 01:21 PM
category
Data Science
tags
Applied forecasting
Economics
status
Published
ARIMA models are based on the autocorrelations in the data. The name is composed of three parts: AR-I-MA.
- AR: Autoregressive models
- I: Stationarity and unit root
- MA: Moving average models
(Integrated part) Stationarity and differencing
A stationary time series is one whose statistical properties do not depend on the time at which the series is observed.
- If a time series has a trend or seasonality, it is nonstationary.
- A white noise series is stationary.
To apply ARIMA models, our time series should be stationary to begin with.

Here is what a stationary series looks like:
- Roughly horizontal (the series fluctuates around a constant mean)
- Constant variance
- No patterns that are predictable in the long term
- (You can use histograms of different sections of the series to check the distribution. If the distributions look the same, the series is stationary.)
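One quick, informal version of this check is to compare summary statistics across sections of the series: for a stationary series, the sections should have similar means and variances. A minimal pure-Python sketch (illustrative only; `section_stats` is a name invented here, not from the post):

```python
import random
import statistics

def section_stats(series, k=2):
    """Split the series into k equal sections and return (mean, stdev) per section.

    For a stationary series the sections should have similar statistics."""
    n = len(series) // k
    return [(statistics.mean(series[i * n:(i + 1) * n]),
             statistics.stdev(series[i * n:(i + 1) * n])) for i in range(k)]

random.seed(1)
# White noise: stationary, so both halves look alike.
noise = [random.gauss(0, 1) for _ in range(1000)]
# Series with a trend: nonstationary, so the section means drift apart.
trended = [0.01 * t + e for t, e in enumerate(noise)]

print(section_stats(noise))    # similar means and stdevs
print(section_stats(trended))  # the two means differ clearly
```

This is only a heuristic; the formal tools are the unit root tests discussed below.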

This raises two important questions:
- How to test if a time series is stationary?
- What to do if a time series is nonstationary?
(Integrated part) Q1: How to test if a time series is stationary?
Here are some examples. This one is tricky:
It seems like there is seasonality, but the pattern is actually cyclic.

It's cyclic in the sense that there is no fixed period: the time between the peaks or the troughs is not determined by the calendar; it's determined by the ecology of the lynx and their population cycle.
So this series is actually stationary, even though you might originally think it's not.
If I took a section of the graph of some length s, and another section of the same length starting at a completely different, randomly chosen point in time, the two sections would have the same distribution.

Alternatively, we can look at the ACF plot to determine whether the time series is stationary.
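As a sketch of what the ACF plot is actually showing, here is the sample autocorrelation computed in pure Python (illustrative only; in the post's R workflow you would use `acf()` instead):

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations r_k = sum((y_t - ybar)(y_{t+k} - ybar)) / sum((y_t - ybar)^2)."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(1, max_lag + 1)]

random.seed(0)
noise = [random.gauss(0, 1) for _ in range(500)]          # stationary
trended = [0.02 * t + e for t, e in enumerate(noise)]     # nonstationary

print([round(r, 2) for r in acf(noise, 5)])    # near zero at all lags
print([round(r, 2) for r in acf(trended, 5)])  # large and slowly decaying
```

A slowly decaying ACF is the classic signature of a nonstationary (trended) series.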


(Integrated part) Q2: What to do if a time series is nonstationary?
If the series is nonstationary, we transform it:
- First apply a log, Box-Cox, or whatever transformation is suitable (taught previously).
- Then difference the series.
- Use lag 12 if the data has yearly seasonality (monthly observations).
- If needed, apply a second differencing.
- In practice, we never go beyond the second-order difference.
Seasonal difference:
It is the difference between an observation at time t and the previous observation from the same season: y′_t = y_t − y_{t−m}.
The formula assumes m = 12 (monthly data).
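The two differencing steps can be sketched in a few lines of Python (illustrative; `difference` is a helper invented here, equivalent to R's `diff(x, lag)`):

```python
def difference(series, lag=1):
    """y'_t = y_t - y_{t-lag}; lag=1 is a regular difference, lag=m a seasonal one."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Toy monthly series with a trend and a spike every January (m = 12).
monthly = [10 + 0.5 * t + (5 if t % 12 == 0 else 0) for t in range(48)]

seasonally_diffed = difference(monthly, lag=12)  # removes the repeating seasonal pattern
twice_diffed = difference(seasonally_diffed, 1)  # removes the remaining trend

print(seasonally_diffed[:3])  # constant 6.0: the seasonal pattern is gone
print(twice_diffed[:3])       # 0.0 everywhere: the trend is gone too
```

On real data the differenced series would not be exactly constant, but the idea is the same: difference until the result looks stationary.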

Unit root test - objectively determines the need for differencing
There are several tests we can use to check whether a time series is stationary:
1. ACF
If the spikes in the ACF drop to zero quickly, the series is stationary. It is not white noise, though.

2. The Ljung-Box test
→ a small p-value means non-stationary
A small p-value implies the series is not white noise, and hence non-stationary.

Box.test(x, lag = 10, type = "Ljung-Box")
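Under the hood, the test statistic is Q = n(n+2) Σ_{k=1..h} r_k² / (n − k), compared against a chi-square distribution with h degrees of freedom. A pure-Python sketch (illustrative; the 5% critical value for 10 df is hard-coded rather than computed):

```python
import random

def ljung_box_q(series, h=10):
    """Ljung-Box statistic: Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k),
    where r_k is the lag-k sample autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, h + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q

random.seed(3)
noise = [random.gauss(0, 1) for _ in range(300)]  # white noise
ar = [0.0]
for _ in range(299):                              # strongly autocorrelated series
    ar.append(0.8 * ar[-1] + random.gauss(0, 1))

CHI2_10_95 = 18.31  # chi-square 5% critical value, 10 df (hard-coded)
print(round(ljung_box_q(noise), 1))  # compare with CHI2_10_95
print(round(ljung_box_q(ar), 1))     # compare with CHI2_10_95
```

A Q far above the critical value (equivalently, a small p-value) rejects the white-noise hypothesis.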
3. The augmented Dickey-Fuller (ADF) test
→ a small p-value means stationary
Null hypothesis: the data are non-stationary and non-seasonal.
A small p-value implies we reject the null hypothesis, and hence the series is stationary.



4. The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
→ a small p-value means non-stationary
Null hypothesis: the data are stationary; we look for evidence that the null hypothesis is false.
Small p-values (e.g., less than 0.05) suggest that differencing is required.



5. STL decomposition seasonal strength → a strength above 0.64 means non-stationary (a seasonal difference is needed)

Non-seasonal ARIMA models
(AR part) Autoregressive model

An AR(p) model regresses the variable on its own lagged values:
y_t = c + φ_1 y_{t−1} + φ_2 y_{t−2} + … + φ_p y_{t−p} + ε_t
where ε_t is white noise. You might think this looks like linear regression, and indeed it is an extension of it. The difference is that regression uses a bunch of explanatory variables, whereas in the AR(p) model we regress the series on its own lagged values.
Q: Why use a univariate model?
- Other explanatory variables are not available.
- Other explanatory variables are not directly observable.
- Examples: inflation rate, unemployment rate, exchange rate, firm's sales, gold prices, interest rate, etc.
- Changing the parameters φ_1, …, φ_p changes the time-series patterns.
- ε_t is white noise. Changing its variance σ² only changes the scale of the series, not the patterns.
- If we include the constant c, we assume the trend continues in the long term.
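These parameter effects are easy to see by simulating an AR(1) process (an illustrative pure-Python sketch; `simulate_ar1` is a name invented here):

```python
import random
import statistics

def simulate_ar1(phi, c=0.0, sigma=1.0, n=200, seed=42):
    """Simulate y_t = c + phi * y_{t-1} + e_t with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(c + phi * y[-1] + rng.gauss(0, sigma))
    return y

# phi = 0 gives white noise; phi = 1 with c = 0 gives a random walk;
# |phi| < 1 keeps the series stationary around mean c / (1 - phi).
calm = simulate_ar1(phi=0.3)        # mild persistence
persistent = simulate_ar1(phi=0.95) # long swings, much larger variance
print(statistics.pvariance(calm), statistics.pvariance(persistent))
```

Changing σ would rescale both series without altering these patterns, which is the point made in the bullet above.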
Stationarity condition
p is the order of the model.
We normally restrict autoregressive models to stationary data, in which case some constraints on the values of the parameters are required.
- For an AR(1) model: −1 < φ_1 < 1.
- For an AR(2) model: −1 < φ_2 < 1, φ_1 + φ_2 < 1, φ_2 − φ_1 < 1.
When p ≥ 3, the restrictions are much more complicated. The Fable package takes care of these restrictions when estimating a model.
(MA part) Moving Average (MA) models

- Moving Average (MA) models ≠ moving average smoothing!
- An MA(q) model is a multiple regression with past forecast errors as predictors:
y_t = c + ε_t + θ_1 ε_{t−1} + θ_2 ε_{t−2} + … + θ_q ε_{t−q}
where ε_t is white noise.
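The "regression on past errors" idea can be sketched by constructing an MA(1) series directly from a stream of white-noise errors (illustrative pure Python; the helper names are invented here):

```python
import random

def simulate_ma1(theta, c=0.0, n=300, seed=7):
    """Simulate y_t = c + e_t + theta * e_{t-1}: a regression on past *errors*."""
    rng = random.Random(seed)
    e = [rng.gauss(0, 1) for _ in range(n)]
    return [c + e[t] + theta * e[t - 1] for t in range(1, n)]

def lag_corr(series, k):
    """Sample autocorrelation at lag k."""
    n = len(series)
    m = sum(series) / n
    d = [v - m for v in series]
    return sum(d[t] * d[t + k] for t in range(n - k)) / sum(x * x for x in d)

y = simulate_ma1(theta=0.8)
# Theory: an MA(1) has autocorrelation theta / (1 + theta^2) ~ 0.49 at lag 1,
# and zero at every lag beyond 1 -- the ACF "cuts off" after lag q.
print(round(lag_corr(y, 1), 2), round(lag_corr(y, 3), 2))
```

This cutoff property is exactly what makes the ACF useful for choosing q later on.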
Non-seasonal ARIMA models
Combining differencing with autoregression and a moving average model, we obtain a non-seasonal ARIMA model.

y′_t = c + φ_1 y′_{t−1} + … + φ_p y′_{t−p} + θ_1 ε_{t−1} + … + θ_q ε_{t−q} + ε_t
- y′_t is the differenced series (it may have been differenced more than once).
- The “predictors” on the right-hand side include both lagged values of y′_t and lagged errors.
Params of ARIMA(p, d, q):
- p = order of the autoregressive part
- d = degree of first differencing involved
- q = order of the moving average part

Special cases:
- ARIMA(0, 0, 0) with c = 0 is white noise.
- ARIMA(0, 1, 0) with c = 0 is a random walk (with c ≠ 0, a random walk with drift).
- ARIMA(p, 0, 0) is an AR(p) model; ARIMA(0, 0, q) is an MA(q) model.

Q: How do you choose p and q?
- Before answering this question, we need to know how c, d, p and q affect the model.
- Changing d affects the prediction intervals: the higher the value of d, the more rapidly the prediction intervals increase in size (the larger d, the wider the intervals).
- Changing p affects the patterns in the series; for example, cyclic behaviour requires p ≥ 2.
ACF & PACF

The above shows the ACF. The problem with the ACF is that, when y_t and y_{t−1} are correlated, then y_{t−1} and y_{t−2} must also be correlated.
But then y_t and y_{t−2} might be correlated, simply because they are both connected to y_{t−1}, rather than because of any new information contained in y_{t−2} that could be used in forecasting y_t.
In short, there is an interaction effect between the lags.

PACF
Partial autocorrelations measure these relationships after removing the effects of the intervening lags 1, 2, …, k − 1.

Q: How to pick the order of AR(p) using ACF vs. PACF?
For an AR(p) series, the ACF decays exponentially (or with damped oscillation) while the PACF cuts off: it has significant spikes up to lag p and none beyond. Pick p from where the PACF cuts off.

Q: How to pick the order of MA(q) using ACF vs. PACF?
For an MA(q) series, the pattern is mirrored: the ACF cuts off after lag q while the PACF decays. Pick q from where the ACF cuts off.
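The PACF cutoff for an AR series can be demonstrated by computing partial autocorrelations with the Durbin-Levinson recursion (an illustrative pure-Python sketch; in R you would simply use `pacf()`):

```python
import random

def acf_vals(y, h):
    """Sample autocorrelations r_1..r_h."""
    n = len(y)
    m = sum(y) / n
    d = [v - m for v in y]
    den = sum(x * x for x in d)
    return [sum(d[t] * d[t + k] for t in range(n - k)) / den for k in range(1, h + 1)]

def pacf_vals(y, h):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf_vals(y, h)
    pacf, phi_prev = [], []
    for k in range(1, h + 1):
        if k == 1:
            phi = [r[0]]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi[-1])
        phi_prev = phi
    return pacf

# Simulate an AR(2) process: y_t = 0.5 y_{t-1} + 0.3 y_{t-2} + e_t.
random.seed(11)
y = [0.0, 0.0]
for _ in range(998):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + random.gauss(0, 1))

print([round(v, 2) for v in pacf_vals(y, 5)])
# Spikes at lags 1 and 2, then values near zero: the PACF cuts off at p = 2.
```

Seeing the PACF drop to roughly zero after lag 2 is how you would read off p = 2 from a PACF plot.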


Seasonal ARIMA models


It works similarly to non-seasonal ARIMA, but adds seasonal order terms written as (P, D, Q)m, where m is the seasonal period.
Q: How do you choose the seasonal order using ACF / PACF?
Look at the seasonal lags (multiples of m): the seasonal part of an AR or MA model shows up in the seasonal lags of the PACF and ACF, in the same cutoff-vs-decay patterns as the non-seasonal part.




Estimation and order selection
ARIMA modelling in R
Forecasting
Seasonal ARIMA models
ARIMA vs ETS
FAQ
A good model has unbiased residuals. How do we know if our residuals are unbiased?
Residuals are unbiased when their mean is zero. If the residual plot fluctuates around zero, the residuals are unbiased, which also suggests they form a white noise series.
What is the process to make data stationary?
1. Transform the data to remove changing variance.
2. Seasonally difference the data to remove seasonality.
3. Apply a regular difference if the data is still non-stationary.
“ARIMA-based prediction intervals tend to be too narrow.” True or False?
True, because only the variation in the errors has been accounted for.
- There is also variation in the parameter estimates, and in the model order, that has not been included in the calculation.
- The calculation assumes that the historical patterns that have been modelled will continue into the forecast period.
Math
- Author:Jason Siu
- URL:https://jason-siu.com/article/50f18eb6-2caa-4d89-acb8-cf897323aac6
- Copyright: All articles in this blog, except where otherwise stated, are licensed under the BY-NC-SA agreement. Please indicate the source!