# Principal Components Analysis

Chris Hansman
Empirical Finance: Methods and Applications, Imperial College Business School
February 15-16

Today: Four Parts

1. Geometric Interpretation of Eigenvalues and Eigenvectors
2. Geometric Interpretation of Correlation Matrices
3. An Introduction to PCA
4. An Example of PCA

Topic 1: Geometry of Eigenvalues and Eigenvectors

1. Technical definitions of eigenvalues and eigenvectors
2. Geometry of matrix multiplication: rotate and stretch
3. Eigenvectors are only stretched
4. Length of eigenvectors doesn't matter

A Review of Eigenvalues and Eigenvectors

- Consider a square n×n matrix A.
- An eigenvalue λi of A is a (1×1) scalar.
- The corresponding eigenvector vi is an (n×1) vector.
- λi and vi satisfy:

      A vi = λi vi

Geometric Interpretation of Eigenvalues and Eigenvectors

- Consider the square n×n matrix A.
- A times any (n×1) vector gives an (n×1) vector.
- It is useful to think of this as a linear function that:
  - Takes n×1 vectors as inputs
  - Gives n×1 vectors as outputs:

      f : R^n → R^n

- Specifically, for the input vector v, this is the function that outputs:

      f(v) = A v

Geometric Interpretation of Eigenvalues and Eigenvectors

- Consider the square n×n matrix A.
- Think of this matrix as the function that maps vectors to vectors:

      f(v) = A v

- Let's say

      A = [5 0; 2 3]  and  v = (2 1)'

- What is f(v)? (menti.com)

The Matrix A Can Be Thought of as a Function

- Consider the square n×n matrix A.
- Think of this matrix as the function that maps vectors to vectors:

      f(v) = A v

- Let's say

      A = [5 0; 2 3]  and  v = (1 0)'

- What is f(v)?

      f(v) = A v = (5 2)'
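The slide's arithmetic can be checked directly; a minimal numpy sketch (numpy and the function name `f` are illustrative additions, not part of the slides):

```python
import numpy as np

# The matrix A acts as a function f(v) = A v from R^2 to R^2
A = np.array([[5.0, 0.0],
              [2.0, 3.0]])

def f(v):
    """Apply the linear map v -> A v."""
    return A @ v

print(f(np.array([1.0, 0.0])))  # the slide's example: (5 2)'
print(f(np.array([2.0, 1.0])))  # the menti question: (10 7)'
```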

The Matrix A Rotates and Stretches a Vector v

- Let's say

      A = [5 0; 2 3]  and  v = (1 0)'  ⇒  A v = (5 2)'

[Figure: v = (1 0)' is rotated and stretched into Av = (5 2)']

The Matrix A Rotates and Stretches a Vector v

- Let's say

      A = [5 0; 2 3]  and  v = (−1 2)'  ⇒  A v = (−5 4)'

[Figure: v = (−1 2)' is rotated and stretched into Av = (−5 4)']

For Some Vectors v, the Matrix A Only Stretches

- Let's say

      A = [5 0; 2 3]  and  v2 = (0 1)'  ⇒  A v2 = (0 3)' = 3 v2

[Figure: v2 = (0 1)' is not rotated, only stretched into Av2 = (0 3)' = 3 v2]

For Some Vectors v, the Matrix A Only Stretches

- Let's say

      A = [5 0; 2 3]  and  v2 = (0 1)'  ⇒  A v2 = (0 3)' = 3 v2

- Some vectors, like v2 = (0 1)', have a special relationship with A:
  - The matrix A only stretches v2
  - No rotations!
- Are there any other vectors like v2 = (0 1)'?
- Let's try v1 = (1 1)'

For Some Vectors v, the Matrix A Only Stretches

- Let's say

      A = [5 0; 2 3]  and  v1 = (1 1)'  ⇒  A v1 = (5 5)' = 5 v1

[Figure: v1 = (1 1)' is not rotated, only stretched into Av1 = (5 5)' = 5 v1]

For Some Vectors v, the Matrix A Only Stretches

- For the matrix A, we've found two vectors with this special property:

      v1 = (1 1)'  with  A v1 = (5 5)' = 5 v1   (stretching factor λ1 = 5)
      v2 = (0 1)'  with  A v2 = (0 3)' = 3 v2   (stretching factor λ2 = 3)

- We call these vectors eigenvectors of the matrix A

For Some Vectors v, the Matrix A Only Stretches

- For the matrix A, we've found two vectors with this special property:

      v1 = (1 1)'  with  A v1 = (5 5)' = 5 v1
      v2 = (0 1)'  with  A v2 = (0 3)' = 3 v2

- Note that they get stretched by different factors:
  - 5 for v1, 3 for v2
- We call these stretching factors eigenvalues:

      λ1 = 5,  λ2 = 3

Defining Eigenvalues and Eigenvectors

- This notion of only stretching is the defining feature of eigenvalues and eigenvectors:
- Eigenvalue λi and corresponding eigenvector vi are the λi, vi such that:

      A vi = λi vi

- In our example:

      A v1 = λ1 v1:  [5 0; 2 3] (1 1)' = 5 (1 1)'

- And:

      A v2 = λ2 v2:  [5 0; 2 3] (0 1)' = 3 (0 1)'
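The defining property A vi = λi vi can be verified numerically; a small sketch using numpy's general eigensolver (an illustration, not part of the slides):

```python
import numpy as np

# The slide's example matrix: eigenvalues 5 and 3,
# with eigenvectors along (1 1)' and (0 1)'
A = np.array([[5.0, 0.0],
              [2.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)  # columns are the eigenvectors

# Each pair satisfies the defining equation A v = lambda v
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)

print(np.sort(eigenvalues))  # [3. 5.]
```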

Length of Eigenvector Doesn't Change Anything

- Imagine multiplying an eigenvector by some constant (e.g. 1/2):

      A = [5 0; 2 3]  and  v1 = (0.5 0.5)'  ⇒  A v1 = (2.5 2.5)' = 5 v1

[Figure: the rescaled v1 = (0.5 0.5)' is still only stretched, by the same factor 5]

Length of Eigenvector Doesn't Change Anything

- Imagine multiplying an eigenvector by some constant (e.g. 2):

      A = [5 0; 2 3]  and  v2 = (0 2)'  ⇒  A v2 = (0 6)' = 3 v2

[Figure: the rescaled v2 = (0 2)' is still only stretched, by the same factor 3]

Length of Eigenvector Doesn't Change Anything

- Any multiple of an eigenvector is also an eigenvector:
  - If vi is an eigenvector, so is c·vi for any (nonzero) scalar c.
- As a result, we often normalize them so that they have unit length:
  - i.e. vi'vi = 1
- Best to think of an eigenvector vi as a direction
- Think of the eigenvalue λi as a stretching factor
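That any rescaled eigenvector is still an eigenvector, with the same eigenvalue, is easy to confirm; a quick numpy check (illustrative only):

```python
import numpy as np

A = np.array([[5.0, 0.0],
              [2.0, 3.0]])
v1 = np.array([1.0, 1.0])        # eigenvector of A with eigenvalue 5

for c in [0.5, 2.0, -7.0]:
    w = c * v1                   # any rescaling of v1...
    assert np.allclose(A @ w, 5.0 * w)  # ...is still only stretched by 5

# Normalizing to unit length keeps the direction and gives v1'v1 = 1
u = v1 / np.linalg.norm(v1)
print(round(u @ u, 10))  # 1.0
```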

Finding Eigenvalues of Symmetric Matrices

- From here, focus on symmetric matrices (like the covariance matrix Σx)
- How do we calculate the eigenvalues?
  - Use a computer
- But if you have to, in the 2×2 case, for

      A = [a b; b d]

  the eigenvalues are:

      λ1 = ((a+d) + √((a−d)² + 4b²)) / 2
      λ2 = ((a+d) − √((a−d)² + 4b²)) / 2

- What are the eigenvalues of:

      A = [7 0; 0 2]

  (menti.com)
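The closed-form 2×2 formula above can be implemented and cross-checked in a few lines; a sketch (the helper name `eigenvalues_2x2_symmetric` is made up for illustration):

```python
import numpy as np

def eigenvalues_2x2_symmetric(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] via the closed-form formula on the slide."""
    disc = np.sqrt((a - d) ** 2 + 4 * b ** 2)
    return (a + d + disc) / 2, (a + d - disc) / 2

# The menti question: A = [7 0; 0 2]
lam1, lam2 = eigenvalues_2x2_symmetric(7.0, 0.0, 2.0)
print(lam1, lam2)  # 7.0 2.0

# Agrees with numpy's symmetric eigensolver (which returns ascending order)
assert np.allclose([lam2, lam1], np.linalg.eigvalsh([[7.0, 0.0], [0.0, 2.0]]))
```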

Finding Eigenvectors of Symmetric Matrices

- From here, we will focus on symmetric matrices (like Σx)
- Given the eigenvalues, how do we calculate the eigenvectors?
  - Again, use a computer
- But if you have to, simply solve:

      A vi = λi vi

- Important note: symmetric matrices have orthogonal eigenvectors, that is:

      vi'vj = 0  for any i ≠ j
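Orthogonality of a symmetric matrix's eigenvectors can be seen numerically; a minimal sketch (using the correlated covariance matrix that appears later in the slides):

```python
import numpy as np

# A symmetric matrix: its eigenvectors are orthogonal
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eigh(S)   # eigh is for symmetric matrices; columns of V are orthonormal
v1, v2 = V[:, 0], V[:, 1]

print(abs(round(v1 @ v2, 10)))  # 0.0: orthogonal
print(abs(round(v1 @ v1, 10)))  # 1.0: unit length
```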

Finding Eigenvalues of Diagonal Matrices

- Diagonal matrices are a subset of symmetric matrices:

      A = [a 0; 0 d]

- How do we calculate the eigenvalues and eigenvectors?

Topic 2: Geometric Interpretations of Correlation Matrices

1. Uncorrelated assets: eigenvalues are variances
2. Correlated assets: first eigenvector finds direction of maximum variance

Uncorrelated Standardized Data: z = (za zb)'

[Figure: scatter of z1 vs z2, a roughly circular cloud centered at the origin]

      Cov(z) = Σz = [1 0; 0 1]

Uncorrelated (Non-Standardized) Data: x = (xa xb)'

[Figure: scatter of xa vs xb, a circular but more dispersed cloud]

      Cov(x) = Σx = [4 0; 0 4]

Uncorrelated (Non-Standardized) Data: x = (xa xb)'

[Figure: scatter of xa vs xb, an elliptical cloud stretched along the xa axis, with eigenvectors V1 (along xa) and V2 (along xb) overlaid]

      Cov(x) = Σx = [3 0; 0 1]

Eigenvalues of Σx with Uncorrelated Data

      Σx = [3 0; 0 1]

- What are the eigenvalues and eigenvectors of Σx?
- Uncorrelated assets: the eigenvalues are the variances of each asset return!
- Eigenvectors:

      v1 = (1 0)',  v2 = (0 1)'

- The first eigenvector points in the direction of the largest variance
- We sometimes write the eigenvectors together as a matrix:

      Γ = (v1 v2) = [1 0; 0 1]
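The claim that with uncorrelated assets the eigenvalues are the variances can be illustrated on simulated data; a sketch (the data-generating process below is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate uncorrelated returns with variances 3 and 1, as in the slide
n = 100_000
x = rng.normal(size=(n, 2)) * np.sqrt([3.0, 1.0])

sample_cov = np.cov(x, rowvar=False)
lam = np.linalg.eigvalsh(sample_cov)   # ascending order

print(np.round(lam, 1))  # approximately [1. 3.]: the eigenvalues are the variances
```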

Uncorrelated (Non-Standardized) Data: x = (xa xb)'

[Figure: the same scatter, with V1 drawn along the xa axis with length ||V1|| = λ1 = 3, and V2 along the xb axis]

      Cov(x) = Σx = [3 0; 0 1]

Uncorrelated (Non-Standardized) Data: x = (xa xb)'

[Figure: scatter of xa vs xb, an elliptical cloud stretched along the xb axis]

      Cov(x) = Σx = [1 0; 0 3]

Eigenvalues of Σx with Uncorrelated Data

      Σx = [1 0; 0 3]

- What are the eigenvalues and eigenvectors of Σx?
- With uncorrelated assets the eigenvalues are just the variances of each asset return!
- Eigenvectors:

      v1 = (0 1)',  v2 = (1 0)'

- Note that the first eigenvector points in the direction of the largest variance
- We sometimes write the eigenvectors together as a matrix:

      Γ = (v1 v2) = [0 1; 1 0]

Correlated Data: x = (xa xb)'

[Figure: scatter of xa vs xb, an elliptical cloud tilted along the 45-degree line]

      Cov(x) = Σx = [2 1; 1 2]

Eigenvalues of Σx with Correlated Data

      Σx = [2 1; 1 2]

- What are the eigenvalues and eigenvectors of Σx?
- With correlated assets the eigenvalues are a bit trickier
- The eigenvalues are 3 and 1
- Which of the following is not an eigenvector of Σx?

      r = (1/√2  1/√2)',  w = (−1/√2  1/√2)',  s = (2/√2  1/√2)'

  (menti.com)
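A quick numerical check on this matrix (illustrative numpy sketch; the "not an eigenvector" test below simply looks for a single stretching factor):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eigh(S)   # ascending order
print(lam)                   # [1. 3.]

# The eigenvector for the largest eigenvalue lies along (1 1)'
v_top = V[:, -1]
assert np.allclose(np.abs(v_top), 1 / np.sqrt(2))

# s = (2/sqrt(2), 1/sqrt(2))' is NOT an eigenvector: S s is not a multiple of s
s = np.array([2.0, 1.0]) / np.sqrt(2.0)
ratios = (S @ s) / s
print(ratios)                # [2.5 4. ]: no single stretching factor
```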

Correlated Data: x = (xa xb)'

[Figure: the tilted scatter with eigenvector directions V1 = (1 1)' and V2 = (−1 1)' overlaid]

      Cov(x) = Σx = [2 1; 1 2]

- The (normalized) eigenvectors are:

      Γ = (v1 v2) = [1/√2  −1/√2; 1/√2  1/√2]

Eigenvectors of Σx with Correlated Data

      Σx = [2 1; 1 2]

      Γ = (v1 v2) = [1/√2  −1/√2; 1/√2  1/√2]

- Just as with uncorrelated data, the first eigenvector finds the direction with the most variability
- The second eigenvector points in the direction that explains the maximum amount of the remaining variance
- Note that the two are perpendicular
- This is the geometric implication of the fact that they are orthogonal:

      vi'vj = 0

- The fact that they are orthonormal (orthogonal with unit length) also implies:

      Γ' = Γ⁻¹
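The claim Γ' = Γ⁻¹ is also easy to verify numerically (illustrative sketch):

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
_, Gamma = np.linalg.eigh(S)   # orthonormal eigenvectors as columns

# Orthonormal columns mean Gamma' Gamma = I, so the transpose is the inverse
assert np.allclose(Gamma.T @ Gamma, np.eye(2))
assert np.allclose(Gamma.T, np.linalg.inv(Gamma))
print("Gamma' equals Gamma inverse")
```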

Eigenvalues of Σx with Correlated Data

      Σx = [2 1; 1 2],  λ1 = 3,  λ2 = 1

- The eigenvalues are the same as for our uncorrelated data
- Note that the scatter plot looks quite similar to our uncorrelated data
  - Just rotated a bit
- Imagine rotating the data so that the first eigenvector is lined up with the x-axis
  - The first eigenvalue is the variance (along the x-axis) of this rotated data
  - The second eigenvalue is the variance along the y-axis

Eigenvalues Represent Variance along the Eigenvectors

[Figure: the tilted scatter with V1 = (1 1)' and V2 = (−1 1)' overlaid; rotating the cloud aligns the eigenvectors with the axes]

      Cov(x) = Σx = [2 1; 1 2]

What is This Rotation?

- So with a little rotation, we take our data drawn from

      x = (xa xb)'  with  Σx = [2 1; 1 2]

- And get back what looks like our uncorrelated data, which was generated by

      Σ̃ = [3 0; 0 1]

- How do we rotate x into this uncorrelated data? Γ'x
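The rotation Γ'x can be demonstrated on simulated data; a sketch (the simulated sample is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw correlated data with covariance [2 1; 1 2]
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=100_000)

lam, Gamma = np.linalg.eigh(Sigma)   # eigenvalues [1, 3] (ascending)
rotated = x @ Gamma                  # row t of `rotated` is (Gamma' x_t)'

# The rotated data is (approximately) uncorrelated, with variances 1 and 3
print(np.round(np.cov(rotated, rowvar=False), 1))
```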

Topic 3: Introduction to Principal Components Analysis

Principal Components Analysis

- This notion of rotation underlies the concept of Principal Components Analysis
- Consider a general cross-section of returns on m assets, x_t (an m×1 vector), with:

      E[x_t] = α,  Cov(x_t) = Σx

Principal Components Analysis

- x_t is m×1, with E[x_t] = α and Cov(x_t) = Σx
- Define the normalized asset returns: x̃_t = x_t − α
- Let the eigenvalues of Σx be given by:

      λ1 ≥ λ2 ≥ λ3 ≥ ··· ≥ λm

- Let the eigenvectors be given by:

      v1, v2, v3, ··· , vm

Principal Components Analysis

      Cov(x_t) = Σx

- Note that the eigenvectors are orthogonal: vi'vj = 0
- Because the scaling doesn't matter, we can normalize: vi'vi = 1
- These scaled, orthogonal vectors are called orthonormal
- As before, let Γ be the matrix with the eigenvectors as columns:

      Γ = [v1 v2 ··· vm]

Principal Components Analysis

- Define the principal components variables as the rotation:

      p = Γ'x̃_t

- Or written out further:

      p = ( v1'(x_t − α), v2'(x_t − α), ··· , vm'(x_t − α) )'

- E[p] = 0 (an m×1 vector of zeros)

Principal Components Analysis

- Recall the eigendecomposition:

      Σx = ΓΛΓ'

- Where Λ is the diagonal matrix of eigenvalues:

      Λ = diag(λ1, ··· , λm)

- Hence:

      Cov(p) = Cov(Γ'x̃_t) = Γ'Cov(x_t)Γ = Γ'ΓΛΓ'Γ = Λ
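The result Cov(p) = Λ can be confirmed on a simulated cross-section; a sketch (the m = 3 covariance matrix and means below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)

# A hypothetical cross-section of m = 3 asset returns
Sigma = np.array([[2.0, 1.0, 0.5],
                  [1.0, 2.0, 0.5],
                  [0.5, 0.5, 1.0]])
alpha = np.array([0.01, 0.02, 0.00])
x = rng.multivariate_normal(alpha, Sigma, size=200_000)

lam, Gamma = np.linalg.eigh(Sigma)     # Sigma = Gamma Lambda Gamma'
p = (x - x.mean(axis=0)) @ Gamma       # principal components: p_t = Gamma' x_tilde_t

# Cov(p) is (approximately) the diagonal matrix of eigenvalues
print(np.round(np.cov(p, rowvar=False), 2))
print(np.round(lam, 2))
```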

Aside: Variance Decomposition

- A nice result from linear algebra:

      Σ_{i=1}^m var(x_it) = Σ_{i=1}^m λi

- So the proportion of the total variance of x_t that is explained by the i-th principal component (with eigenvalue λi) is simply:

      λi / Σ_{i=1}^m λi
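The left-hand side of this identity is just the trace of Σx; a two-asset sketch using the slides' correlated example:

```python
import numpy as np

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
lam = np.linalg.eigvalsh(Sigma)[::-1]   # descending: [3. 1.]

# Total variance = sum of variances = sum of eigenvalues (the trace)
assert np.isclose(lam.sum(), np.trace(Sigma))

# Share of total variance explained by each principal component
print(lam / lam.sum())  # [0.75 0.25]
```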

Principal Components Analysis

- Our principal components variables provide a transformation of the data into variables that are:
  - Uncorrelated (orthogonal)
  - Ordered by how much of the total variance they explain (the size of the eigenvalue)
- What if we have many assets m, but the first few (2, 5, 20) principal components explain most of the variation?
  - Idea: use these as "factors"
  - Dimension reduction!

Principal Components Analysis

- Note that because Γ' = Γ⁻¹:

      x_t = α + Γp
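Because Γ' = Γ⁻¹, the data can be rebuilt exactly from all m components, and approximately from the first few; a dimension-reduction sketch (the simulated sample is invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
alpha = np.array([0.01, 0.02])
x = rng.multivariate_normal(alpha, Sigma, size=1_000)

lam, Gamma = np.linalg.eigh(Sigma)
lam, Gamma = lam[::-1], Gamma[:, ::-1]   # reorder by descending eigenvalue
p = (x - alpha) @ Gamma                  # p_t = Gamma'(x_t - alpha)

# Exact reconstruction: x_t = alpha + Gamma p_t
assert np.allclose(alpha + p @ Gamma.T, x)

# Dimension reduction: keep only the first component (75% of total variance)
x_approx = alpha + np.outer(p[:, 0], Gamma[:, 0])
print(np.mean((x - x_approx) ** 2))   # roughly lam[1]/2: what the dropped component held
```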