# ARMA Models – Model fitting


7.6 Estimation of ARMA model parameters

So far we assumed that the parameters of the AR and MA models are given.

Now we consider how to estimate these parameters given n observations, x1, x2, . . . , xn.

Note lower case xt instead of upper case Xt.

We will assume the white noise is normally distributed.

The expectation μ is estimated using the sample mean μ̂ = x̄. A mean-zero AR, MA or ARMA model is then fitted to Xt − μ̂.

The parameters we want to estimate are φ1, . . . , φp for AR(p), θ1, . . . , θq for MA(q), and the white noise variance σ²W.

We have p + q + 1 unknown parameters in total.

In a first step we assume that p and q are known, e.g. from an analysis of the ACF and PACF.

Later we will discuss another model selection tool to choose p and q.


OLS regression for AR(p)

From now on we will assume that E(Xt) = 0.

A very simple way of fitting an AR(p) is to treat it as a regression model and use OLS to estimate

Xt = φ1Xt−1 + φ2Xt−2 + · · · + φpXt−p + Wt.

Suppose we have n datapoints x1, x2, . . . , xn.

The response variable is the n − p datapoints Xp+1, . . . , Xn.

The explanatory variables are the lagged sequences from lag 1 (Xp, . . . , Xn−1) to lag p (X1, . . . , Xn−p).

Hence, in using OLS regression, we will lose the information from the first p observations.

This makes this simple approach inefficient.
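As a concrete illustration, the regression set-up above can be sketched numerically. This is a minimal sketch in Python/NumPy rather than the R route used in the example that follows; the simulated AR(2) series and all variable names are illustrative, not part of the lecture code:

```python
import numpy as np

# Simulate a mean-zero AR(2) series (illustrative parameters)
rng = np.random.default_rng(0)
n, p = 500, 2
phi = np.array([0.6, 0.3])
x = np.zeros(n)
for t in range(p, n):
    # x[t - p:t][::-1] is (x[t-1], ..., x[t-p])
    x[t] = phi @ x[t - p:t][::-1] + rng.normal()

# Response: x_{p+1}, ..., x_n; regressors: lag-1 ... lag-p columns
y = x[p:]
X = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])

# OLS estimate of (phi_1, ..., phi_p); note only n - p rows are used
phi_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Note that the design matrix has only n − p rows, which is exactly the information loss described above.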


Example: OLS regression for US manufacturing data

Consider the US manufacturing data from the lab: monthly new orders in USD.

y = ts(scan("USNewOrders.txt"), frequency = 12,
       start = c(1960, 1))

ACF and PACF suggest an AR(4) model. (See next slide.) Fitting the AR model with OLS could be done by lm in R.

But generating the regressors in R would be tedious.

There is already a function in R for OLS on time series

ar.ols(y, aic = FALSE, order.max = 4)


Example: OLS regression for US manufacturing data

[Figure: time plot of the series y (1960–2010) and its sample ACF and PACF, lag in years. The PACF cuts off after lag 4, supporting an AR(4) model.]


Example: OLS regression for US manufacturing data

> ar.ols(y, aic = FALSE, order.max = 4)

Call:

ar.ols(x = y, aic = FALSE, order.max = 4)

Coefficients:
      1        2        3        4
 0.8543   0.1102  -0.0220  -0.0786

Intercept: 0.009228 (0.1537)

Order selected 4 sigma^2 estimated as 14.12


Method of moments estimators

An alternative to OLS is the method of moments.

Idea

We can derive equations for the moments E(Xtᵏ) for k = 1, 2, . . . in terms of the parameters.

It is easy to estimate the moments E(Xtᵏ) by (1/n) Σᵢ₌₁ⁿ xᵢᵏ. This allows us to work out estimators for the parameters.

For example, the first moment is E(Xt) = μ, and the moment estimator is the sample mean

μ̂ = x̄ = (1/n) Σᵢ₌₁ⁿ xᵢ.

The method of moments for AR(p) is given by the so-called Yule-Walker equations.


Example: Second moment of AR(2) process

Consider the mean zero stationary AR(2) model: Xt = φ1Xt−1 + φ2Xt−2 + Wt

Multiply both sides by Xt−h and take expectations to get:

E(XtXt−h) = φ1E(Xt−1Xt−h)+φ2E(Xt−2Xt−h)+E(WtXt−h)

E(WtXt−h) = 0 for h ≥ 1, as past values are uncorrelated with future errors. As E(Xt) = 0, the expectation of the product of two lags is the covariance:

γ(h) = φ1γ(h−1) + φ2γ(h−2) for h = 1, 2, . . .

Dividing by γ(0) = Var(Xt) gives a relation for the correlation:

ρ(h) = φ1ρ(h−1) + φ2ρ(h−2) for h = 1, 2, . . .
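Setting h = 1 in this relation, and using ρ(0) = 1 and ρ(−1) = ρ(1), gives ρ(1) = φ1/(1 − φ2). This can be checked against the sample ACF of a simulated AR(2); the following Python sketch uses made-up parameter values and is purely illustrative:

```python
import numpy as np

phi1, phi2 = 0.5, 0.3              # illustrative stationary AR(2)
rng = np.random.default_rng(42)
n = 20000
x = np.zeros(n)
for t in range(2, n):
    x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + rng.normal()

# Sample lag-1 autocorrelation
xc = x - x.mean()
rho1_hat = (xc[:-1] * xc[1:]).sum() / (xc ** 2).sum()

# Theoretical value implied by the recursion: rho(1) = phi1 / (1 - phi2)
rho1_theory = phi1 / (1 - phi2)
```

With a long simulated series the sample value should sit close to the theoretical one.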


Yule-Walker equations

Consider the general mean zero stationary AR(p) model:

Xt = φ1Xt−1 + · · · + φpXt−p + Wt

We can get a similar relation

ρ(h) = φ1ρ(h−1) + · · · + φpρ(h−p) for h = 1, 2, . . . , p.

The variance of the white noise can be written as

σ²W = γ(0)[1 − φ1ρ(1) − · · · − φpρ(p)].

The formula for ρ(h) gives a system of p linear equations in p unknowns, one for each lag h = 1, . . . , p.

For estimation replace

γ(0) with the sample variance s² = γ̂(0); and

ρ(h) with their sample estimates from the ACF, i.e. ρ̂(h) = γ̂(h)/γ̂(0).

Then solve to find (φ̂1, φ̂2, . . . , φ̂p, σ̂²W).
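The whole procedure can be sketched in a few lines. This is a Python/NumPy illustration rather than R; the `acov` helper and the simulated AR(2) series are assumptions for the demo, not part of the lecture code:

```python
import numpy as np

def acov(x, h):
    """Sample autocovariance gamma_hat(h), divisor n."""
    xc = x - x.mean()
    n = len(xc)
    return (xc[:n - h] * xc[h:]).sum() / n

# Simulated AR(2) data, just to demonstrate the solve
rng = np.random.default_rng(7)
n, p = 5000, 2
phi_true = np.array([0.5, 0.3])
x = np.zeros(n)
for t in range(p, n):
    x[t] = phi_true @ x[t - p:t][::-1] + rng.normal()

gamma = np.array([acov(x, h) for h in range(p + 1)])
rho = gamma / gamma[0]

# p x p Yule-Walker system: R phi = (rho(1), ..., rho(p))'
R = np.array([[rho[abs(i - j)] for j in range(p)] for i in range(p)])
phi_hat = np.linalg.solve(R, rho[1:])
sigma2_hat = gamma[0] * (1 - phi_hat @ rho[1:])
```

The matrix R is the Toeplitz matrix of sample autocorrelations, and the last line is the variance formula above with the estimates plugged in.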


Yule-Walker estimation for an AR(1)

Consider a mean zero causal stationary AR(1) (|φ1| < 1):

Xt = φ1Xt−1 + Wt

The Yule-Walker equations give:

ρ(1) = φ1ρ(0) = φ1,  σ²W = γ(0)[1 − φ1²]

Replacing ρ(1) with the sample ACF gives:

φ̂1 = Σₜ₌₁ⁿ⁻¹ (xₜ₊₁ − x̄)(xₜ − x̄) / Σₜ₌₁ⁿ (xₜ − x̄)²

The estimated error variance is then

σ̂²W = γ̂(0)[1 − φ̂1²].
Example: Yule-Walker for US manufacturing data
> ar.yw(y, aic = FALSE, order.max = 4)

Call:

ar.yw.default(x = y, aic = FALSE, order.max = 4)

Coefficients:
      1        2        3        4
 0.8458   0.1137  -0.0205  -0.0772

Order selected 4 sigma^2 estimated as 14.57


Method of moments comments

For AR(p) models, the Yule-Walker equations provide optimal estimators, in the sense of asymptotically normally distributed estimates with minimum variance.

The reason is that, given our assumptions, the AR(p) is a linear model and the Yule-Walker estimators are essentially least squares estimators.

Such a method of moments estimator for an MA (or ARMA) model will not provide optimal estimators, because these processes are non-linear in the parameters.


Method of moments for MA(1) model

Consider the invertible MA(1) model with its AR(∞) representation:

Xt = Wt + θ1Wt−1 = −Σⱼ₌₁^∞ (−θ1)ʲ Xt−j + Wt

This is non-linear in the parameter θ1.

The autocorrelation at lag 1 is ρ(1) = θ1/(1 + θ1²).

Plugging in the sample estimate, ρ̂(1) = θ̂1/(1 + θ̂1²), and solving the resulting quadratic for θ̂1 gives

θ̂1 = (1 ± √(1 − 4ρ̂(1)²)) / (2ρ̂(1)),

where the invertible root (|θ̂1| < 1) is taken.
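Numerically, the invertible root recovers θ1 from the sample lag-1 autocorrelation. This is an illustrative Python sketch; θ1 = 0.5, the seed, and the sample size are arbitrary choices:

```python
import math
import numpy as np

theta1 = 0.5
rng = np.random.default_rng(3)
n = 5000
w = rng.normal(size=n + 1)
x = w[1:] + theta1 * w[:-1]        # MA(1): X_t = W_t + theta1 * W_{t-1}

# Sample lag-1 autocorrelation
xc = x - x.mean()
rho1 = (xc[:-1] * xc[1:]).sum() / (xc ** 2).sum()

# Quadratic rho*theta^2 - theta + rho = 0; keep the invertible root
theta_hat = (1 - math.sqrt(1 - 4 * rho1 ** 2)) / (2 * rho1)
```

For this to work ρ̂(1) must satisfy |ρ̂(1)| ≤ 1/2, since |ρ(1)| ≤ 1/2 for any MA(1).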

In general, for an MA(q) of arbitrary order a closed-form solution does not exist, and typically an iterative (quasi-Newton) method is used to solve the non-linear problem.

A preferred alternative for MA(q) and ARMA(p, q) models is maximum likelihood estimation.


Maximum likelihood estimation (MLE)

Idea: Choose the parameters which maximise the "likelihood" of observing the data.

We consider MLE for the stationary AR(1) (|φ1| < 1): Xt = φ1Xt−1 + Wt.

The likelihood function is the joint distribution of the observations

L(φ1, σ²W) = f(x1, x2, . . . , xn | φ1, σ²W).

Dropping the conditioning on the parameters for notational convenience, this can be expanded as

L(φ1, σ²W) = f(x1, x2, . . . , xn)
= f(x1)f(x2|x1)f(x3|x1, x2) . . . f(xn|x1, . . . , xn−1)

which for AR(1) reduces to

L(φ1, σ²W) = f(x1)f(x2|x1)f(x3|x2) . . . f(xn|xn−1).
Maximum likelihood estimation AR(1)
The likelihood for the AR(1) is:

L(φ1, σ²W) = f(x1)f(x2|x1)f(x3|x2) . . . f(xn|xn−1).

Since Wt ∼ N(0, σ²W) i.i.d., the log-likelihood l(φ1, σ²W) = log L(φ1, σ²W) is

l(φ1, σ²W) = −((n−1)/2) log(2π) − ((n−1)/2) log(σ²W) − S(φ1)/(2σ²W) + log f(x1)

with

S(φ1) = Σₜ₌₂ⁿ (xₜ − φ1xₜ₋₁)².
Maximum likelihood estimation AR(1)
Two approaches are commonly used to handle the initial value x1 and its contribution to the likelihood function f(x1):

1. Treat x1 as fixed, so that log f(x1) does not contribute to the log-likelihood. This is referred to as conditional MLE.

2. Use the stationarity assumption x1 ∼ N(0, σ²W/(1 − φ1²)), and thus

log f(x1) = −(1/2) log(2π) − (1/2) log(σ²W) + (1/2) log(1 − φ1²) − x1²(1 − φ1²)/(2σ²W).

The conditional MLE of φ1 has a closed form. The minimiser of the sum of squares S(φ1) = Σₜ₌₂ⁿ (xₜ − φ1xₜ₋₁)² is found by differentiating with respect to φ1 and equating to zero. This gives:

φ̂1 = Σₜ₌₂ⁿ xₜ₋₁xₜ / Σₜ₌₂ⁿ xₜ₋₁².
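The closed form is just the no-intercept regression of xₜ on xₜ₋₁. A minimal Python check on a simulated mean-zero series (φ1 = 0.7 and the seed are illustrative choices):

```python
import numpy as np

phi1 = 0.7
rng = np.random.default_rng(11)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi1 * x[t - 1] + rng.normal()

# Conditional MLE: sum_{t=2}^n x_{t-1} x_t / sum_{t=2}^n x_{t-1}^2
num = (x[:-1] * x[1:]).sum()
den = (x[:-1] ** 2).sum()
phi1_hat = num / den
```

On a long series this should land close to the true φ1.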
Maximum likelihood estimation AR(1)
The estimate φ̂1 can be used to estimate the variance

σ̂²W = (1/(n−1)) Σₜ₌₂ⁿ (xₜ − φ̂1xₜ₋₁)².

Remark
In general, MLE of an AR(p) conditional on the first p observations is equivalent to least squares estimation.
MLE for MA(q) and ARMA(p, q) is again non-linear.
Example: MLE for US manufacturing data
MLE is what we did in the lab, using the arima function:

> usorders.ar4 = arima(y, order = c(4, 0, 0))

> usorders.ar4

Call:

arima(x = y, order = c(4, 0, 0))

Coefficients:

ar1 ar2 ar3 ar4 intercept

0.853 0.1098 -0.0215 -0.0795 0.1376

s.e. 0.041 0.0541 0.0540 0.0410 1.1060

sigma^2 estimated as 14.29: log likelihood = -1655.58, aic = 3323.17


ARMA Models Model selection

7.7 Model selection

A commonly used criterion for model selection, to choose p or q, is the AIC (Akaike Information Criterion).

It measures how much of the information present in the data is lost by using the model.

The AIC is available for all the model fitting options we have looked at.

The AIC is commonly the default in R.

When using MLE for an ARMA(p, q), the orders p and q are chosen to minimise

AIC = −2 l(φ̂1, . . . , φ̂p, θ̂1, . . . , θ̂q, σ̂²W) + 2(p + q + 1)n / (n − p − q − 2).

The final term is a penalty on the complexity of the model.

The smaller the AIC the better the model.
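The criterion above can be computed directly from a fitted model's log-likelihood. The helper below is a hypothetical sketch (not an R or lecture function), and the log-likelihood values in the comparison are made up for illustration:

```python
def aic(loglik, p, q, n):
    """Penalised criterion: -2*loglik + 2*(p+q+1)*n / (n-p-q-2)."""
    return -2.0 * loglik + 2.0 * (p + q + 1) * n / (n - p - q - 2)

# Comparing two candidate orders fitted to the same data: smaller is better.
aic_ar2 = aic(loglik=-512.3, p=2, q=0, n=200)
aic_ar4 = aic(loglik=-511.9, p=4, q=0, n=200)
```

Here the small gain in log-likelihood from the richer AR(4) does not offset its larger complexity penalty.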


Model diagnostics – residuals

As with all statistical modelling, a key stage is to assess the assumptions underlying the model.

A key assumption is that the underlying innovations Wt are a white noise process.

It is common to plot the residuals over time, and their sample ACF to assess this assumption.

The tsdiag function in R plots the standardised innovations

(xₜ − x̂ₜᵗ⁻¹) / √P̂ₜᵗ⁻¹

where x̂ₜᵗ⁻¹ is the one-step-ahead prediction of xₜ and P̂ₜᵗ⁻¹ is the estimated one-step-ahead prediction error variance (details in the textbook, not important here).

For a good model the standardised residuals should behave as a white noise sequence with mean zero and variance one.


Model diagnostics – residuals

A Ljung-Box test should be used to assess whether the residuals are white noise.

H0 : Wt is white noise. The test statistic at lag h is

Q(h) = n(n + 2) Σₖ₌₁ʰ ρ̂W(k)² / (n − k)

where ρ̂W(k) is the sample ACF of the residuals at lag k.

Under mild conditions Q(h) is asymptotically χ² distributed with h − p − q degrees of freedom. Hence, p-values can be derived from this χ² distribution.

Small p-values lead to rejection of the null, suggesting that Wt is not white noise.

The p-values are plotted by tsdiag.
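The statistic is easy to compute directly. The `ljung_box` helper below is an illustrative Python sketch (in R the p-values come from tsdiag, or from Box.test); the demo on simulated white noise and the choice h = 10 are assumptions for the example:

```python
import numpy as np

def ljung_box(resid, h, p=0, q=0):
    """Ljung-Box Q(h); compare against chi^2 with h - p - q df."""
    r = np.asarray(resid, dtype=float)
    n = len(r)
    rc = r - r.mean()
    gamma0 = (rc * rc).sum() / n
    # sample ACF of the residuals at lags 1..h
    rho = np.array([(rc[:n - k] * rc[k:]).sum() / n / gamma0
                    for k in range(1, h + 1)])
    Q = n * (n + 2) * np.sum(rho ** 2 / (n - np.arange(1, h + 1)))
    return Q, h - p - q

# On genuine white noise, Q(h) should look like a draw from chi^2 with h df
rng = np.random.default_rng(0)
Q, df = ljung_box(rng.normal(size=500), h=10)
```

For residuals from a fitted ARMA(p, q), pass p and q so the degrees of freedom are reduced accordingly.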


Example: Diagnostics for US manufacturing data

[Figure: tsdiag output for the AR(4) fit: standardised residuals over time (1960–2010), ACF of the residuals, and p-values for the Ljung-Box statistic at lags up to 10.]