Nonlinear Econometrics for Finance Lecture 3

. Econometrics for Finance Lecture 3 1 / 18

Recap: testing asset pricing models


Prices are discounted expectations of future cash flows:

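$$p_t = E_t\left[m_{t+1}\,(p_{t+1} + d_{t+1})\right].$$

Dividing both sides by $p_t$, we can re-write the pricing equation in terms of returns, where $1 + R_{t+1} = (p_{t+1} + d_{t+1})/p_t$ is the gross return:

$$1 = E_t\left[m_{t+1}\,\frac{p_{t+1} + d_{t+1}}{p_t}\right] \;\Rightarrow\; 1 = E_t\left[m_{t+1}(1 + R_{t+1})\right].$$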

We now have our pricing equation in terms of returns:

$$E_t\left(m_{t+1}(1 + R_{t+1})\right) = 1.$$

Equivalently, by taking 1 to the left-hand side, we can write a “conditional expected pricing error”:

$$E_t\left(m_{t+1}(1 + R_{t+1}) - 1\right) = 0.$$

Taking unconditional expectations of both sides, by the law of iterated expectations, we now have an “unconditional expected pricing error” or a moment condition:

$$E\left(m_{t+1}(1 + R_{t+1}) - 1\right) = 0.$$
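As a small worked example of the return-based pricing equation, the sketch below computes a gross return from prices and dividends and the realized value of m(1 + R) − 1 for a single period. The numbers and the function names `gross_return` and `realized_pricing_error` are hypothetical; the model only requires this quantity to be zero in expectation, not period by period.

```python
def gross_return(p_t, p_next, d_next):
    """Gross return 1 + R_{t+1} = (p_{t+1} + d_{t+1}) / p_t."""
    return (p_next + d_next) / p_t


def realized_pricing_error(m_next, p_t, p_next, d_next):
    """Realized value of m_{t+1} * (1 + R_{t+1}) - 1 for one period."""
    return m_next * gross_return(p_t, p_next, d_next) - 1.0


# Hypothetical numbers: price 100 today, price 103 and dividend 2 tomorrow.
one_plus_r = gross_return(100.0, 103.0, 2.0)
# A discount factor of 1 / 1.05 prices this particular payoff exactly.
error = realized_pricing_error(1.0 / 1.05, 100.0, 103.0, 2.0)
print(one_plus_r, error)
```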


Recap: testing asset pricing models

Consider, now, N assets.

We can stack the moment conditions for all N assets one on top of the other to obtain

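$$E\left[m_{t+1}\begin{pmatrix}1 + R^1_{t+1}\\ \vdots\\ 1 + R^N_{t+1}\end{pmatrix} - \begin{pmatrix}1\\ \vdots\\ 1\end{pmatrix}\right] = 0.$$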

Notice that $m_{t+1}$ depends on parameters. Different asset pricing models will, therefore, lead to a different stochastic discount factor $m_{t+1}$ and different moment conditions. In the Consumption CAPM with CRRA utility:

$$m_{t+1} = \beta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}.$$

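In this case, the moment conditions become

$$E\left[\beta\left(\frac{c_{t+1}}{c_t}\right)^{-\gamma}\begin{pmatrix}1 + R^1_{t+1}\\ \vdots\\ 1 + R^N_{t+1}\end{pmatrix} - \begin{pmatrix}1\\ \vdots\\ 1\end{pmatrix}\right] = 0,$$

where both stacked vectors are $N$-vectors.

There are two parameters to estimate: the subjective discount factor $\beta$ and the coefficient of relative risk aversion $\gamma$. We could write $m_{t+1}(\theta)$ with $\theta = (\beta, \gamma)$.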
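The CRRA stochastic discount factor and the per-period pricing errors can be sketched in a few lines of NumPy. This is only an illustration: the data are made up, and the names `crra_sdf` and `moment_vectors` are hypothetical helpers, not part of the lecture's code.

```python
import numpy as np


def crra_sdf(beta, gamma, cons_growth):
    """m_{t+1} = beta * (c_{t+1} / c_t) ** (-gamma), element-wise over time."""
    return beta * np.asarray(cons_growth, dtype=float) ** (-gamma)


def moment_vectors(theta, cons_growth, gross_returns):
    """Rows are m_{t+1}(theta) * (1 + R_{t+1}) - 1, one column per asset."""
    beta, gamma = theta
    m = crra_sdf(beta, gamma, cons_growth)              # shape (T,)
    return m[:, None] * np.asarray(gross_returns, dtype=float) - 1.0


# Hypothetical data: 3 periods of consumption growth and 2 gross returns.
cons_growth = np.array([1.00, 1.10, 0.95])
gross_returns = np.array([[1.02, 1.08],
                          [1.01, 1.15],
                          [1.03, 0.90]])
g_obs = moment_vectors((0.95, 2.0), cons_growth, gross_returns)
print(g_obs)
```

Each row of `g_obs` is one realization of the N-vector inside the expectation; the moment condition says that these rows average to zero in population at the true θ.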


Recap: testing asset pricing models

The moment conditions depend on an expectation. We do not know the expectation. We can, however, compute empirical means. Sample means converge to expectations by the law of large numbers.

Estimation: GMM estimates θ by setting the difference between the sample mean of mt+1(θ)(1 + Rt+1) and 1 as close as possible to 0:

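$$\frac{1}{T}\sum_{t=1}^{T-1}\left[m_{t+1}(\theta)\begin{pmatrix}1 + R^1_{t+1}\\ \vdots\\ 1 + R^N_{t+1}\end{pmatrix} - \begin{pmatrix}1\\ \vdots\\ 1\end{pmatrix}\right] = \frac{1}{T}\sum_{t=1}^{T-1}\left[m_{t+1}(\theta)(1 + R_{t+1}) - 1\right] = g_T(\theta),$$

where the second expression uses more compact notation: $1 + R_{t+1}$ and $1$ are $N$-vectors.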

Testing: Given an estimate for $\theta$, denoted by $\theta_T$, GMM evaluates the size of the pricing errors (Hansen, 1982): how close to 0 is the difference between the sample mean of $m_{t+1}(\theta_T)(1 + R_{t+1})$ and 1? The larger the pricing errors, the worse the pricing model.
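The sample moment vector g_T(θ) is just a column-wise average of the per-period pricing errors. The sketch below assumes CRRA preferences and synthetic data; the function name `g_T` and the array layout (time in rows, assets in columns) are illustrative choices. It averages over the available observations, which differs from the slides' 1/T scaling only by a factor that vanishes as T grows.

```python
import numpy as np


def g_T(theta, cons_growth, gross_returns):
    """Sample mean of m_{t+1}(theta) * (1 + R_{t+1}) - 1 across time (N-vector)."""
    beta, gamma = theta
    m = beta * cons_growth ** (-gamma)                  # CRRA SDF, shape (T,)
    return (m[:, None] * gross_returns - 1.0).mean(axis=0)


# Synthetic check: with constant consumption the SDF equals beta, so an asset
# paying a constant gross return of 1 / beta is priced with zero error.
beta0 = 0.96
cons_growth = np.ones(50)
gross_returns = np.column_stack([np.full(50, 1.0 / beta0),
                                 np.full(50, 1.10)])
errors = g_T((beta0, 3.0), cons_growth, gross_returns)
print(errors)
```

The first entry is numerically zero, while the second asset, with a gross return of 1.10, is mispriced by about 0.96 × 1.10 − 1.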


GMM: The criterion

Assume the dimension of the vector $\theta$ is $d$ with $N \ge d$. (The number of parameters is not larger than the number of assets.)

Typically, we cannot estimate $\theta$ to make the pricing errors exactly zero. However, we want to make the pricing errors as small as possible.

In order to do so, we minimize a quadratic criterion.

Estimation of $\theta$:

$$\theta_T = \arg\min_{\theta}\; \underbrace{g_T(\theta)^\top}_{1\times N}\,\underbrace{W_T}_{N\times N}\,\underbrace{g_T(\theta)}_{N\times 1} = \arg\min_{\theta}\; \underbrace{Q_T(\theta)}_{1\times 1}.$$

Thus, we choose $\theta_T$ so that $\partial Q_T(\theta_T)/\partial\theta \approx 0$.

Note: the weight matrix WT tells you how much emphasis you are putting on specific moments (i.e., on specific assets).

If WT = IN , i.e., the identity matrix, then you are effectively treating all assets in the same way. In this case, the criterion minimizes the sum of the squared pricing errors.
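To make the role of the weight matrix concrete, the following sketch evaluates the quadratic criterion for a given vector of average pricing errors. The pricing errors and weights are hypothetical numbers, and `criterion` is an illustrative helper name.

```python
import numpy as np


def criterion(g, W):
    """Quadratic form Q = g' W g for a pricing-error vector g and weights W."""
    g = np.asarray(g, dtype=float)
    return float(g @ np.asarray(W, dtype=float) @ g)


g = np.array([0.10, -0.20])                      # hypothetical pricing errors
q_identity = criterion(g, np.eye(2))             # sum of squared pricing errors
q_weighted = criterion(g, np.diag([2.0, 1.0]))   # asset 1 counts twice as much
print(q_identity, q_weighted)
```

With the identity matrix the criterion is 0.10² + 0.20²; doubling the weight on the first asset adds another 0.10² to it.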


GMM: The criterion

Example with 2 assets

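The model:

$$E\begin{pmatrix} m_{t+1}(\theta)(1 + R^1_{t+1}) - 1\\ m_{t+1}(\theta)(1 + R^2_{t+1}) - 1\end{pmatrix} = E\begin{pmatrix} g_1(X_{t+1},\theta)\\ g_2(X_{t+1},\theta)\end{pmatrix} = E\left(g(X_{t+1},\theta)\right) = 0,$$

where 2 is the number of assets (1 moment condition per asset).

Empirically (after replacing “expectations” with “sample means”):

$$\frac{1}{T}\sum_{t=1}^{T-1}\begin{pmatrix} m_{t+1}(\theta)(1 + R^1_{t+1}) - 1\\ m_{t+1}(\theta)(1 + R^2_{t+1}) - 1\end{pmatrix} = \begin{pmatrix}\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\\ \frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\end{pmatrix} = \frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta) = \underbrace{g_T(\theta)}_{2\times 1} \approx 0.$$

Estimation criterion:

$$\theta_T = \arg\min_{\theta}\begin{pmatrix}\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\\ \frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\end{pmatrix}^{\top} W_T \begin{pmatrix}\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\\ \frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\end{pmatrix} = \arg\min_{\theta}\;\underbrace{g_T(\theta)^\top}_{1\times 2}\,\underbrace{W_T}_{2\times 2}\,\underbrace{g_T(\theta)}_{2\times 1} = \arg\min_{\theta}\;\underbrace{Q_T(\theta)}_{1\times 1}.$$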
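The two-asset estimation criterion can be illustrated with a deliberately simple minimizer. The sketch below assumes CRRA preferences, uses synthetic data constructed so that θ = (0.95, 2.0) prices both assets exactly, and replaces a proper numerical optimizer with a coarse grid search; none of this is the lecture's own routine.

```python
import numpy as np


def g_T(theta, cons_growth, gross_returns):
    """Average pricing errors for theta = (beta, gamma) under CRRA utility."""
    beta, gamma = theta
    m = beta * cons_growth ** (-gamma)
    return (m[:, None] * gross_returns - 1.0).mean(axis=0)


def q_T(theta, cons_growth, gross_returns, W):
    """Quadratic criterion g_T' W g_T."""
    g = g_T(theta, cons_growth, gross_returns)
    return float(g @ W @ g)


# Synthetic data built so that theta0 = (0.95, 2.0) prices both assets exactly.
beta0, gamma0 = 0.95, 2.0
cons_growth = np.array([0.98, 1.00, 1.02, 1.04])
shock = np.array([-0.03, -0.01, 0.01, 0.03])           # averages to zero
r1 = 1.0 / (beta0 * cons_growth ** (-gamma0))          # m * r1 = 1 each period
gross_returns = np.column_stack([r1, r1 * (1.0 + shock)])

W = np.eye(2)                                          # equal weight on both assets
betas = np.linspace(0.90, 1.00, 11)
gammas = np.linspace(0.0, 4.0, 9)
grid = [(b, c) for b in betas for c in gammas]
theta_T = min(grid, key=lambda th: q_T(th, cons_growth, gross_returns, W))
print(theta_T)
```

Because there are as many assets as parameters here (N = d = 2), the pricing errors can be driven to zero; with more assets than parameters the minimized criterion would generally stay positive.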


GMM: The criterion

Example with 2 assets

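Recall the estimation criterion:

$$\theta_T = \arg\min_{\theta}\begin{pmatrix}\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\\ \frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\end{pmatrix}^{\top} \underbrace{W_T}_{2\times 2} \begin{pmatrix}\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\\ \frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\end{pmatrix}.$$

If $W_T = I_2$, then we minimize the sum of the squared pricing errors:

$$\theta_T = \arg\min_{\theta}\left(\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\right)^2 + \left(\frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\right)^2.$$

If $W_T$ is a generic symmetric matrix, $W_T = \begin{pmatrix} w_1 & w_2\\ w_2 & w_3\end{pmatrix}$, then we minimize a “weighted” sum of the squared pricing errors:

$$\theta_T = \arg\min_{\theta}\; w_1\left(\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\right)^2 + w_3\left(\frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\right)^2 + 2 w_2\left(\frac{1}{T}\sum_{t=1}^{T-1} g_1(X_{t+1},\theta)\right)\left(\frac{1}{T}\sum_{t=1}^{T-1} g_2(X_{t+1},\theta)\right).$$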
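As a quick numerical check of the weighted criterion, the sketch below compares the matrix form with its expansion for two assets. It assumes the symmetric weight matrix has diagonal entries w1 and w3 and off-diagonal entry w2; all numbers are hypothetical.

```python
import numpy as np

# Hypothetical average pricing errors and symmetric 2x2 weights.
g1_bar, g2_bar = 0.04, -0.01
w1, w2, w3 = 2.0, 0.5, 1.0
W = np.array([[w1, w2],
              [w2, w3]])
g_bar = np.array([g1_bar, g2_bar])

matrix_form = g_bar @ W @ g_bar
# Squared errors weighted by w1 and w3, plus twice the cross term weighted by w2.
expanded = w1 * g1_bar**2 + w3 * g2_bar**2 + 2.0 * w2 * g1_bar * g2_bar
print(matrix_form, expanded)
```

The cross term appears twice because the off-diagonal weight enters the quadratic form once from each side of the matrix.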


GMM: Some important ingredients

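Recall the criterion function:

$$\underbrace{Q_T(\theta)}_{1\times 1} = \underbrace{g_T(\theta)^\top}_{1\times N}\,\underbrace{W_T}_{N\times N}\,\underbrace{g_T(\theta)}_{N\times 1}.$$

Thus, the first derivative of the criterion function is the $d \times 1$ vector

$$\frac{\partial Q_T(\theta)}{\partial\theta} = \begin{pmatrix}\frac{\partial Q_T(\theta)}{\partial\theta_1}\\ \vdots\\ \frac{\partial Q_T(\theta)}{\partial\theta_d}\end{pmatrix},$$

where, for $m = 1, \dots, d$,

$$\frac{\partial Q_T(\theta)}{\partial\theta_m} = 2\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta)}{\partial\theta_m}\right]^{\top} \underbrace{W_T}_{N\times N} \left[\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta)\right],$$

and the second derivative of the criterion function is the $d \times d$ matrix

$$\frac{\partial^2 Q_T(\theta)}{\partial\theta\,\partial\theta^\top} = \begin{pmatrix}\frac{\partial^2 Q_T(\theta)}{\partial\theta_1\partial\theta_1} & \frac{\partial^2 Q_T(\theta)}{\partial\theta_1\partial\theta_2} & \cdots & \frac{\partial^2 Q_T(\theta)}{\partial\theta_1\partial\theta_d}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial^2 Q_T(\theta)}{\partial\theta_d\partial\theta_1} & \frac{\partial^2 Q_T(\theta)}{\partial\theta_d\partial\theta_2} & \cdots & \frac{\partial^2 Q_T(\theta)}{\partial\theta_d\partial\theta_d}\end{pmatrix},$$

where, for $m, j = 1, \dots, d$,

$$\frac{\partial^2 Q_T(\theta)}{\partial\theta_m\partial\theta_j} = 2\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta)}{\partial\theta_m}\right]^{\top} W_T \left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta)}{\partial\theta_j}\right] + 2\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial^2 g(X_{t+1},\theta)}{\partial\theta_m\partial\theta_j}\right]^{\top} W_T \left[\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta)\right].$$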
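The gradient formula can be checked numerically. The sketch below assumes the CRRA moments with θ = (β, γ), for which the derivatives of each pricing error are c^(−γ)(1 + R) with respect to β and −β ln(c) c^(−γ)(1 + R) with respect to γ, where c denotes consumption growth. It compares the analytic gradient with central finite differences on synthetic data; all function names are illustrative.

```python
import numpy as np


def g_bar(theta, cg, R):
    """Average pricing errors (N-vector) under CRRA utility."""
    beta, gamma = theta
    return ((beta * cg ** (-gamma))[:, None] * R - 1.0).mean(axis=0)


def jacobian_bar(theta, cg, R):
    """Sample mean of dg/dtheta', shape (N, d) with d = 2 for (beta, gamma)."""
    beta, gamma = theta
    d_beta = (cg ** (-gamma))[:, None] * R
    d_gamma = (-beta * np.log(cg) * cg ** (-gamma))[:, None] * R
    return np.column_stack([d_beta.mean(axis=0), d_gamma.mean(axis=0)])


def q_gradient(theta, cg, R, W):
    """Gradient of Q_T = g' W g: element m is 2 * (dg/dtheta_m)' W g."""
    return 2.0 * jacobian_bar(theta, cg, R).T @ W @ g_bar(theta, cg, R)


rng = np.random.default_rng(0)                       # synthetic data only
cg = 1.02 + 0.02 * rng.standard_normal(200)          # consumption growth
R = 1.05 + 0.10 * rng.standard_normal((200, 3))      # gross returns, 3 assets
W = np.eye(3)
theta = np.array([0.95, 2.0])

analytic = q_gradient(theta, cg, R, W)
q = lambda th: g_bar(th, cg, R) @ W @ g_bar(th, cg, R)
h = 1e-6
numeric = np.array([(q(theta + h * e) - q(theta - h * e)) / (2 * h)
                    for e in np.eye(2)])
print(analytic, numeric)
```

The two vectors should agree to several decimal places, which is a useful sanity check before handing an analytic gradient to an optimizer.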


GMM: Assumptions

1 We will assume that the data is IID for now. We will consider dependent, stationary data in the future.

2 We will assume that the weight matrix $W_T$ is such that $W_T \xrightarrow{p} W$.

3 Because WT will be defined as a fixed matrix (the identity matrix, for example) or as a data-driven sample average, this property will always be true.

4 For the sample average, it will be true by the WLLN.


GMM: A useful Taylor’s expansion (around θ0)

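By Taylor’s expansion, stopped at the first order, around the true parameter vector $\theta_0$:

$$\underbrace{\frac{\partial Q_T(\theta_T)}{\partial\theta} - \frac{\partial Q_T(\theta_0)}{\partial\theta}}_{d\times 1 \text{ vector}} = \underbrace{\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}}_{d\times d \text{ matrix}}\,\left(\theta_T - \theta_0\right).$$

Note: $\frac{\partial Q_T(\theta_T)}{\partial\theta} \approx 0$. In fact, we are minimizing $Q_T(\theta)$ with respect to $\theta$ and $\theta_T$ is the minimizer. It follows that

$$\theta_T - \theta_0 = -\left[\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right]^{-1}\frac{\partial Q_T(\theta_0)}{\partial\theta}.$$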
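The expansion can be seen at work in a toy case where it holds exactly. The sketch below assumes hypothetical moments that are linear in θ, g_T(θ) = a − Bθ, so that Q_T is quadratic and its Hessian is constant; starting from any point (standing in for θ0), subtracting the inverse Hessian times the gradient lands on the minimizer in one step. The asset pricing moments in these slides are nonlinear, so there the relation is only a first-order approximation.

```python
import numpy as np

# Toy linear moments g_T(theta) = a - B @ theta (hypothetical numbers, N = 3, d = 2).
a = np.array([1.0, 2.0, 0.5])
B = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])
W = np.eye(3)


def gradient(theta):
    """Gradient of Q_T = g_T' W g_T when g_T(theta) = a - B theta."""
    return -2.0 * B.T @ W @ (a - B @ theta)


hessian = 2.0 * B.T @ W @ B                         # constant in this toy case

theta_start = np.array([0.3, -0.2])                 # stands in for theta_0
theta_min = theta_start - np.linalg.solve(hessian, gradient(theta_start))
print(theta_min, gradient(theta_min))               # gradient is ~0 at the minimizer
```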


GMM: a useful Taylor’s expansion (around θ0)

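$$\theta_T - \theta_0 = -\left[\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right]^{-1}\frac{\partial Q_T(\theta_0)}{\partial\theta}.$$

Elements of the $d \times 1$ gradient vector $\frac{\partial Q_T(\theta_0)}{\partial\theta}$: For $m = 1, \dots, d$,

$$\frac{\partial Q_T(\theta_0)}{\partial\theta_m} = 2\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top} W_T \left[\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right].$$

Elements of the $d \times d$ Hessian matrix $\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}$: For $m, j = 1, \dots, d$,

$$\frac{\partial^2 Q_T(\theta_0)}{\partial\theta_m\partial\theta_j} = 2\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top} W_T \left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_j}\right] + 2\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial^2 g(X_{t+1},\theta_0)}{\partial\theta_m\partial\theta_j}\right]^{\top} W_T \left[\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right].$$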


Consistency: the gradient

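Elements of the $d \times 1$ gradient vector $\frac{\partial Q_T(\theta_0)}{\partial\theta}$:

$$\frac{\partial Q_T(\theta_0)}{\partial\theta_m} = 2\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top}}_{\xrightarrow{p}\; E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top} = \Gamma_{0,m}^\top}\;\underbrace{W_T}_{\xrightarrow{p}\; W}\;\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right]}_{\xrightarrow{p}\; E\left[g(X_{t+1},\theta_0)\right] = 0} \;\xrightarrow{p}\; 2\,\Gamma_{0,m}^\top W\, 0 = 0.$$

Important: Why is $E\left[g(X_{t+1},\theta_0)\right] = 0$? Because this is what the moment conditions imply! See first slide.

All convergences in probability are due to the WLLN.

Thus, for the full gradient, we have:

$$\frac{\partial Q_T(\theta_0)}{\partial\theta} \xrightarrow{p} 2\begin{pmatrix}\Gamma_{0,1} & \Gamma_{0,2} & \dots & \Gamma_{0,d}\end{pmatrix}^{\top} W\, 0 = 2\,\Gamma_0^\top W\, 0 = 0.$$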


Consistency: the Hessian

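Elements of the $d \times d$ Hessian matrix $\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}$:

$$\frac{\partial^2 Q_T(\theta_0)}{\partial\theta_m\partial\theta_j} = 2\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top}}_{\xrightarrow{p}\; E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top} = \Gamma_{0,m}^\top}\;\underbrace{W_T}_{\xrightarrow{p}\; W}\;\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_j}\right]}_{\xrightarrow{p}\; E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_j}\right] = \Gamma_{0,j}}$$

$$\qquad + 2\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial^2 g(X_{t+1},\theta_0)}{\partial\theta_m\partial\theta_j}\right]^{\top}}_{\xrightarrow{p}\; E\left[\frac{\partial^2 g(X_{t+1},\theta_0)}{\partial\theta_m\partial\theta_j}\right]^{\top}}\;\underbrace{W_T}_{\xrightarrow{p}\; W}\;\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right]}_{\xrightarrow{p}\; E\left[g(X_{t+1},\theta_0)\right] = 0} \;\xrightarrow{p}\; 2\,\Gamma_{0,m}^\top W\,\Gamma_{0,j}.$$

All convergences in probability are due to the WLLN.

Thus, for the full Hessian, we have:

$$\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top} \xrightarrow{p} 2\begin{pmatrix}\Gamma_{0,1} & \Gamma_{0,2} & \dots & \Gamma_{0,d}\end{pmatrix}^{\top} W \begin{pmatrix}\Gamma_{0,1} & \Gamma_{0,2} & \dots & \Gamma_{0,d}\end{pmatrix} = 2\,\Gamma_0^\top W\,\Gamma_0.$$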


Consistency: putting gradient and Hessian together

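$$\theta_T - \theta_0 = -\underbrace{\left[\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right]^{-1}}_{\xrightarrow{p}\;\left(2\,\Gamma_0^\top W\,\Gamma_0\right)^{-1}}\;\underbrace{\frac{\partial Q_T(\theta_0)}{\partial\theta}}_{\xrightarrow{p}\; 2\,\Gamma_0^\top W\, 0 = 0} \;\xrightarrow{p}\; 0.$$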

Conclude: The GMM estimator (θT ) is a consistent estimator for the true parameter vector (θ0).

In other words, it converges to θ0 in probability as T → ∞.
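Consistency can be illustrated with a small simulation. The sketch below uses a deliberately stripped-down toy model, not the full Consumption CAPM: risk aversion is fixed at γ = 0, so the SDF is just β, there is one asset and one moment E[β(1 + R) − 1] = 0, and the estimator that sets the sample moment to zero is β_T = 1 / mean(1 + R). The return distribution is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)          # synthetic data only
beta0 = 1.0 / 1.05                      # true parameter in this toy model

for T in (100, 10_000, 1_000_000):
    # Gross returns with mean 1.05, so that E[beta0 * (1 + R) - 1] = 0.
    gross_returns = 1.05 + 0.10 * rng.standard_normal(T)
    # One moment and one parameter: the sample moment is set exactly to
    # zero by beta_T = 1 / mean(1 + R).
    beta_T = 1.0 / gross_returns.mean()
    print(T, abs(beta_T - beta0))

final_error = abs(beta_T - beta0)       # error at the largest sample size
```

The estimation error typically shrinks as T grows, although in any single simulated path it need not fall monotonically.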


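Asymptotic normality: the standardized gradient

Elements of the $d \times 1$ standardized gradient vector $\sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta}$:

$$\sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta_m} = 2\underbrace{\left[\frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top}}_{\xrightarrow{p}\; E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta_m}\right]^{\top} = \Gamma_{0,m}^\top}\;\underbrace{W_T}_{\xrightarrow{p}\; W}\;\underbrace{\left[\frac{1}{\sqrt{T}}\sum_{t=1}^{T-1} g(X_{t+1},\theta_0)\right]}_{\xrightarrow{d}\; N\left(0,\; E\left[g(X_{t+1},\theta_0)\,g(X_{t+1},\theta_0)^\top\right]\right) = N(0,\Phi_0)} \;\xrightarrow{d}\; 2\,\Gamma_{0,m}^\top W\, N(0,\Phi_0).$$

The first two terms converge in probability (by the WLLN). The last term converges in distribution (by the CLT). The entire term converges in distribution by Slutsky’s theorem.

Thus, for the full standardized gradient, we have:

$$\sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta} \xrightarrow{d} 2\begin{pmatrix}\Gamma_{0,1} & \Gamma_{0,2} & \dots & \Gamma_{0,d}\end{pmatrix}^{\top} W\, N(0,\Phi_0) = 2\,\Gamma_0^\top W\, N(0,\Phi_0) = N\left(0,\; 4\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\right).$$

Bandi, Nonlinear Econometrics for Finance, Lecture 3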

Asymptotic normality: putting standardized gradient and Hessian together

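$$\sqrt{T}\left(\theta_T - \theta_0\right) = -\underbrace{\left[\frac{\partial^2 Q_T(\theta_0)}{\partial\theta\,\partial\theta^\top}\right]^{-1}}_{\xrightarrow{p}\;\left(2\,\Gamma_0^\top W\,\Gamma_0\right)^{-1}}\;\underbrace{\sqrt{T}\,\frac{\partial Q_T(\theta_0)}{\partial\theta}}_{\xrightarrow{d}\; N\left(0,\; 4\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\right)}$$

$$\xrightarrow{d}\; N\left(0,\; (2\,\Gamma_0^\top W\,\Gamma_0)^{-1}\, 4\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\,(2\,\Gamma_0^\top W\,\Gamma_0)^{-1}\right) = N\left(0,\; (\Gamma_0^\top W\,\Gamma_0)^{-1}\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\,(\Gamma_0^\top W\,\Gamma_0)^{-1}\right).$$

Hence,

$$V(\theta_T) = \frac{1}{T}\,(\Gamma_0^\top W\,\Gamma_0)^{-1}\,\Gamma_0^\top W\,\Phi_0\, W\,\Gamma_0\,(\Gamma_0^\top W\,\Gamma_0)^{-1},$$

with

$$\underbrace{\Gamma_0}_{N\times d} = E\left[\frac{\partial g(X_{t+1},\theta_0)}{\partial\theta^\top}\right], \qquad \Phi_0 = E\left[g(X_{t+1},\theta_0)\,g(X_{t+1},\theta_0)^\top\right].$$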


Asymptotic normality: implications

Conclude: The GMM estimator is asymptotically normally distributed (as T → ∞).

The asymptotic variance depends on three quantities, Γ0, Φ0 and W.

Γ0 and Φ0 are expectations. They can be estimated using sample means by the WLLN.

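$W$ can just be replaced by the initial weight matrix $W_T$ (which is a choice variable). Thus,

$$\widehat{V}(\theta_T) = \frac{1}{T}\,(\widehat{\Gamma}_0^\top W_T\,\widehat{\Gamma}_0)^{-1}\,\widehat{\Gamma}_0^\top W_T\,\widehat{\Phi}_0\, W_T\,\widehat{\Gamma}_0\,(\widehat{\Gamma}_0^\top W_T\,\widehat{\Gamma}_0)^{-1},$$

with

$$\widehat{\Gamma}_0 = \frac{1}{T}\sum_{t=1}^{T-1}\frac{\partial g(X_{t+1},\theta_T)}{\partial\theta^\top}, \qquad \widehat{\Phi}_0 = \frac{1}{T}\sum_{t=1}^{T-1} g(X_{t+1},\theta_T)\,g(X_{t+1},\theta_T)^\top.$$

Notice that, in order to compute $\widehat{\Gamma}_0$ and $\widehat{\Phi}_0$, we need to use $\theta_T$ (since $\theta_0$ is unknown).

Once we have $\widehat{V}(\theta_T)$, we can compute confidence intervals, test hypotheses and so on. In other words, we can do statistical inference.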
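The variance formula translates directly into a few matrix products. The sketch below is a minimal illustration, assuming the per-period moments evaluated at the estimate and the averaged Jacobian have already been computed; the function name `gmm_variance`, the argument layout and the input numbers are all hypothetical.

```python
import numpy as np


def gmm_variance(g_obs, gamma_hat, W):
    """(1/T) (G'WG)^{-1} G'W Phi W G (G'WG)^{-1} from per-period moments.

    g_obs:     (T, N) array with rows g(X_{t+1}, theta_T)
    gamma_hat: (N, d) sample mean of the Jacobian dg/dtheta'
    W:         (N, N) weight matrix used in estimation
    """
    T = g_obs.shape[0]
    phi_hat = g_obs.T @ g_obs / T                    # sample mean of g g'
    bread = np.linalg.inv(gamma_hat.T @ W @ gamma_hat)
    meat = gamma_hat.T @ W @ phi_hat @ W @ gamma_hat
    return bread @ meat @ bread / T


# Hypothetical inputs: T = 4 periods, N = 2 moments, d = 1 parameter.
g_obs = np.array([[0.1, -0.1], [-0.1, 0.1], [0.2, 0.0], [-0.2, 0.0]])
gamma_hat = np.array([[1.0], [0.5]])
V = gmm_variance(g_obs, gamma_hat, np.eye(2))
std_err = np.sqrt(np.diag(V))                        # standard errors
print(V, std_err)
```

Given the standard errors, an approximate 95% confidence interval for each parameter is the estimate plus or minus 1.96 standard errors, relying on the asymptotic normality result.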


Let us see GMM estimation in practice using Matlab …

