type
Post
Created date
Jun 16, 2022 01:21 PM
category
Data Science
tags
Applied forecasting
Economics
status
Published
ARIMA models are based on the autocorrelations in the data. The name is composed of three parts: AR-I-MA.
- AR: Autoregressive models
- I: Stationarity and unit root
- MA: Moving average models
(Integrated part) Stationarity and differencing
A stationary time series is one whose statistical properties do not depend on the time at which the series is observed.
- If a time series has a trend or seasonality, it is nonstationary.
- A white noise series is stationary.
To apply ARIMA models, our time series should be stationary to begin with.

Here is what a stationary series looks like:
- Roughly horizontal (the series fluctuates around a constant mean)
- Constant variance
- No patterns that are predictable in the long term
- (You can use histograms of different sections of the series to check the distribution. If the distributions look the same, the series is stationary.)
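One quick, informal version of this check is to compare summary statistics across sections of the series: for a stationary series, the sections should have similar means and variances. A minimal pure-Python sketch (illustrative only; `section_stats` is a name invented here, not from the post):

```python
import random
import statistics

def section_stats(series, k=2):
    """Split the series into k equal sections and return (mean, stdev) per section.

    For a stationary series the sections should have similar statistics."""
    n = len(series) // k
    return [(statistics.mean(series[i * n:(i + 1) * n]),
             statistics.stdev(series[i * n:(i + 1) * n])) for i in range(k)]

random.seed(1)
# White noise: stationary, so both halves look alike.
noise = [random.gauss(0, 1) for _ in range(1000)]
# Series with a trend: nonstationary, so the section means drift apart.
trended = [0.01 * t + e for t, e in enumerate(noise)]

print(section_stats(noise))    # similar means and stdevs
print(section_stats(trended))  # the two means differ clearly
```

This is only a heuristic; the formal tools are the unit root tests discussed below.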

This raises two important questions:
- How to test if a time series is stationary?
- What to do if a time series is nonstationary?
(Integrated part) Q1: How to test if a time series is stationary?
Here are some examples. This one is tricky:
It seems like there is seasonality, but the pattern is actually cyclic.

It's cyclic in the sense that there is no fixed period: the time between the peaks or the troughs is not determined by the calendar; it's determined by the ecology of the lynx and their population cycle.
So this series is actually stationary, even though you might originally think it's not.
If I took a section of the graph of some length s, and another section of the same length starting at a completely different, randomly chosen point in time, the two sections would have the same distribution.

Alternatively, we can look at the ACF plot to determine whether the time series is stationary.
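As a sketch of what the ACF plot is actually showing, here is the sample autocorrelation computed in pure Python (illustrative only; in the post's R workflow you would use `acf()` instead):

```python
import random

def acf(series, max_lag):
    """Sample autocorrelations r_k = sum((y_t - ybar)(y_{t+k} - ybar)) / sum((y_t - ybar)^2)."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(1, max_lag + 1)]

random.seed(0)
noise = [random.gauss(0, 1) for _ in range(500)]          # stationary
trended = [0.02 * t + e for t, e in enumerate(noise)]     # nonstationary

print([round(r, 2) for r in acf(noise, 5)])    # near zero at all lags
print([round(r, 2) for r in acf(trended, 5)])  # large and slowly decaying
```

A slowly decaying ACF is the classic signature of a nonstationary (trended) series.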


(Integrated part) Q2: What to do if a time series is nonstationary?
If the series is nonstationary, we transform it:
- First apply a log, Box-Cox, or whatever transformation is suitable (taught previously).
- Then difference the series.
- Use lag 12 if the data has yearly seasonality (monthly observations).
- If needed, apply a second differencing.
- In practice, we never go beyond the second-order difference.
Seasonal difference:
It is the difference between an observation at time t and the previous observation from the same season: y′_t = y_t − y_{t−m}.
The formula assumes m = 12 (monthly data).
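The two differencing steps can be sketched in a few lines of Python (illustrative; `difference` is a helper invented here, equivalent to R's `diff(x, lag)`):

```python
def difference(series, lag=1):
    """y'_t = y_t - y_{t-lag}; lag=1 is a regular difference, lag=m a seasonal one."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Toy monthly series with a trend and a spike every January (m = 12).
monthly = [10 + 0.5 * t + (5 if t % 12 == 0 else 0) for t in range(48)]

seasonally_diffed = difference(monthly, lag=12)  # removes the repeating seasonal pattern
twice_diffed = difference(seasonally_diffed, 1)  # removes the remaining trend

print(seasonally_diffed[:3])  # constant 6.0: the seasonal pattern is gone
print(twice_diffed[:3])       # 0.0 everywhere: the trend is gone too
```

On real data the differenced series would not be exactly constant, but the idea is the same: difference until the result looks stationary.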

Unit root test - objectively determines the need for differencing
There are several tests we can use to check whether a time series is stationary:
1. ACF
If the spikes in the ACF drop to zero quickly, the series is stationary. It is not white noise, though.

2. The Ljung-Box test
→ a small p-value means non-stationary
A small p-value implies the series is not white noise, and hence non-stationary.

Box.test(x, lag = 10, type = "Ljung-Box")
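Under the hood, the test statistic is Q = n(n+2) Σ_{k=1..h} r_k² / (n − k), compared against a chi-square distribution with h degrees of freedom. A pure-Python sketch (illustrative; the 5% critical value for 10 df is hard-coded rather than computed):

```python
import random

def ljung_box_q(series, h=10):
    """Ljung-Box statistic: Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k),
    where r_k is the lag-k sample autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    dev = [y - mean for y in series]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, h + 1):
        r_k = sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        q += r_k * r_k / (n - k)
    return n * (n + 2) * q

random.seed(3)
noise = [random.gauss(0, 1) for _ in range(300)]  # white noise
ar = [0.0]
for _ in range(299):                              # strongly autocorrelated series
    ar.append(0.8 * ar[-1] + random.gauss(0, 1))

CHI2_10_95 = 18.31  # chi-square 5% critical value, 10 df (hard-coded)
print(round(ljung_box_q(noise), 1))  # compare with CHI2_10_95
print(round(ljung_box_q(ar), 1))     # compare with CHI2_10_95
```

A Q far above the critical value (equivalently, a small p-value) rejects the white-noise hypothesis.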
3. The augmented Dickey-Fuller (ADF) test
→ a small p-value means stationary
Null hypothesis: the data are non-stationary and non-seasonal.
A small p-value implies we reject the null hypothesis, and hence the series is stationary.



4. The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test
→ a small p-value means non-stationary
Null hypothesis: the data are stationary; we look for evidence that the null hypothesis is false.
Small p-values (e.g., less than 0.05) suggest that differencing is required.



5. STL decomposition seasonal strength → a strength above 0.64 means non-stationary (a seasonal difference is needed)

Non-seasonal ARIMA models
(AR part) Autoregressive model

An AR(p) model regresses the variable on its own lagged values:
y_t = c + φ_1 y_{t−1} + φ_2 y_{t−2} + … + φ_p y_{t−p} + ε_t
where ε_t is white noise. You might think this looks like linear regression, and indeed it is an extension of it. The difference is that regression uses a bunch of explanatory variables, whereas in the AR(p) model we regress the series on its own lagged values.
Q: Why use a univariate model?
- Other explanatory variables are not available.
- Other explanatory variables are not directly observable.
- Examples: inflation rate, unemployment rate, exchange rate, firm's sales, gold prices, interest rate, etc.
- Changing the parameters φ_1, …, φ_p changes the time-series patterns.
- ε_t is white noise. Changing its variance σ² only changes the scale of the series, not the patterns.
- If we include the constant c, we assume the trend continues in the long term.
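These parameter effects are easy to see by simulating an AR(1) process (an illustrative pure-Python sketch; `simulate_ar1` is a name invented here):

```python
import random
import statistics

def simulate_ar1(phi, c=0.0, sigma=1.0, n=200, seed=42):
    """Simulate y_t = c + phi * y_{t-1} + e_t with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(c + phi * y[-1] + rng.gauss(0, sigma))
    return y

# phi = 0 gives white noise; phi = 1 with c = 0 gives a random walk;
# |phi| < 1 keeps the series stationary around mean c / (1 - phi).
calm = simulate_ar1(phi=0.3)        # mild persistence
persistent = simulate_ar1(phi=0.95) # long swings, much larger variance
print(statistics.pvariance(calm), statistics.pvariance(persistent))
```

Changing σ would rescale both series without altering these patterns, which is the point made in the bullet above.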
Stationarity condition
p is the order of the model.
We normally restrict autoregressive models to stationary data, in which case some constraints on the values of the parameters are required.
- For an AR(1) model: −1 < φ_1 < 1.
- For an AR(2) model: −1 < φ_2 < 1, φ_1 + φ_2 < 1, φ_2 − φ_1 < 1.
When p ≥ 3, the restrictions are much more complicated. The Fable package takes care of these restrictions when estimating a model.
(MA part) Moving Average (MA) models

- Moving Average (MA) models ≠ moving average smoothing!
- An MA(q) model is a multiple regression with past forecast errors as predictors:
y_t = c + ε_t + θ_1 ε_{t−1} + θ_2 ε_{t−2} + … + θ_q ε_{t−q}
where ε_t is white noise.
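The "regression on past errors" idea can be sketched by constructing an MA(1) series directly from a stream of white-noise errors (illustrative pure Python; the helper names are invented here):

```python
import random

def simulate_ma1(theta, c=0.0, n=300, seed=7):
    """Simulate y_t = c + e_t + theta * e_{t-1}: a regression on past *errors*."""
    rng = random.Random(seed)
    e = [rng.gauss(0, 1) for _ in range(n)]
    return [c + e[t] + theta * e[t - 1] for t in range(1, n)]

def lag_corr(series, k):
    """Sample autocorrelation at lag k."""
    n = len(series)
    m = sum(series) / n
    d = [v - m for v in series]
    return sum(d[t] * d[t + k] for t in range(n - k)) / sum(x * x for x in d)

y = simulate_ma1(theta=0.8)
# Theory: an MA(1) has autocorrelation theta / (1 + theta^2) ~ 0.49 at lag 1,
# and zero at every lag beyond 1 -- the ACF "cuts off" after lag q.
print(round(lag_corr(y, 1), 2), round(lag_corr(y, 3), 2))
```

This cutoff property is exactly what makes the ACF useful for choosing q later on.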
Non-seasonal ARIMA models
Combining differencing with autoregression and a moving average model, we obtain a non-seasonal ARIMA model.

y′_t = c + φ_1 y′_{t−1} + … + φ_p y′_{t−p} + θ_1 ε_{t−1} + … + θ_q ε_{t−q} + ε_t
- y′_t is the differenced series (it may have been differenced more than once).
- The “predictors” on the right-hand side include both lagged values of y′_t and lagged errors.
Params of ARIMA(p, d, q):
- p = order of the autoregressive part
- d = degree of first differencing involved
- q = order of the moving average part

Special cases:
- ARIMA(0, 0, 0) with c = 0 is white noise.
- ARIMA(0, 1, 0) with c = 0 is a random walk (with c ≠ 0, a random walk with drift).
- ARIMA(p, 0, 0) is an AR(p) model; ARIMA(0, 0, q) is an MA(q) model.

Q: How do you choose p and q?
- Before answering this question, we need to know how c, d, p and q affect the model.
- Changing d affects the prediction intervals: the higher the value of d, the more rapidly the prediction intervals increase in size (the larger d, the wider the intervals).
- Changing p affects the patterns in the series; for example, cyclic behaviour requires p ≥ 2.
ACF & PACF

The above shows the ACF. The problem with the ACF is that, when y_t and y_{t−1} are correlated, then y_{t−1} and y_{t−2} must also be correlated.
But then y_t and y_{t−2} might be correlated, simply because they are both connected to y_{t−1}, rather than because of any new information contained in y_{t−2} that could be used in forecasting y_t.
In short, there is an interaction effect between the lags.

PACF
Partial autocorrelations measure these relationships after removing the effects of the intervening lags 1, 2, …, k − 1.

Q: How to pick the order of AR(p) using ACF vs. PACF?
For an AR(p) series, the ACF decays exponentially (or with damped oscillation) while the PACF cuts off: it has significant spikes up to lag p and none beyond. Pick p from where the PACF cuts off.

Q: How to pick the order of MA(q) using ACF vs. PACF?
For an MA(q) series, the pattern is mirrored: the ACF cuts off after lag q while the PACF decays. Pick q from where the ACF cuts off.
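The PACF cutoff for an AR series can be demonstrated by computing partial autocorrelations with the Durbin-Levinson recursion (an illustrative pure-Python sketch; in R you would simply use `pacf()`):

```python
import random

def acf_vals(y, h):
    """Sample autocorrelations r_1..r_h."""
    n = len(y)
    m = sum(y) / n
    d = [v - m for v in y]
    den = sum(x * x for x in d)
    return [sum(d[t] * d[t + k] for t in range(n - k)) / den for k in range(1, h + 1)]

def pacf_vals(y, h):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    r = acf_vals(y, h)
    pacf, phi_prev = [], []
    for k in range(1, h + 1):
        if k == 1:
            phi = [r[0]]
        else:
            num = r[k - 1] - sum(phi_prev[j] * r[k - 2 - j] for j in range(k - 1))
            den = 1 - sum(phi_prev[j] * r[j] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi[-1])
        phi_prev = phi
    return pacf

# Simulate an AR(2) process: y_t = 0.5 y_{t-1} + 0.3 y_{t-2} + e_t.
random.seed(11)
y = [0.0, 0.0]
for _ in range(998):
    y.append(0.5 * y[-1] + 0.3 * y[-2] + random.gauss(0, 1))

print([round(v, 2) for v in pacf_vals(y, 5)])
# Spikes at lags 1 and 2, then values near zero: the PACF cuts off at p = 2.
```

Seeing the PACF drop to roughly zero after lag 2 is how you would read off p = 2 from a PACF plot.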


Seasonal ARIMA models


It works similarly to non-seasonal ARIMA, but adds seasonal order terms written as (P, D, Q)m, where m is the seasonal period.
Q: How do you choose the seasonal order using ACF / PACF?
Look at the seasonal lags (multiples of m): the seasonal part of an AR or MA model shows up in the seasonal lags of the PACF and ACF, in the same cutoff-vs-decay patterns as the non-seasonal part.




Estimation and order selection
ARIMA modelling in R
Forecasting
Seasonal ARIMA models
ARIMA vs ETS
FAQ
A good model has unbiased residuals. How do we know if our residuals are unbiased?
Residuals are unbiased when their mean is zero. If the residual plot fluctuates around zero, the residuals are unbiased, which also suggests they form a white noise series.
What is the process to make data stationary?
1. Transform the data to remove changing variance.
2. Seasonally difference the data to remove seasonality.
3. Apply a regular difference if the data is still non-stationary.
“ARIMA-based prediction intervals tend to be too narrow.” True or False?
True, because only the variation in the errors has been accounted for.
- There is also variation in the parameter estimates, and in the model order, that has not been included in the calculation.
- The calculation assumes that the historical patterns that have been modelled will continue into the forecast period.
Math
- Author:Jason Siu
- URL:https://jason-siu.com/article/50f18eb6-2caa-4d89-acb8-cf897323aac6
- Copyright: All articles in this blog, except where otherwise stated, are licensed under the BY-NC-SA agreement. Please indicate the source!