程序代写代做代考 matlab AI algorithm handoutE.dvi

handoutE.dvi

ECS130 Scientific Computing Handout E February 13, 2017

1. The Power Method

(a) Pseudocode:

Power Iteration

Given an initial vector u0,
i = 0
repeat
ti+1 = Aui
ui+1 = ti+1/‖ti+1‖2 (approximate eigenvector)
θi+1 = u

H
i+1Aui+1 (approximate eigenvalue)

i = i+ 1
until convergence

(b) Practical stopping criterion: |θi+1 − θi| ≤ tol · |θi|.

A =





−261 209 −49
−530 422 −98
−800 631 −144



 .

and λ(A) = {10, 4, 3}. Let u0 = (1, 0, 0)
T , then

i 1 2 3 · · · 10
θi 994.49 13.0606 10.07191 · · · 10.0002

(d) Convergence analysis: Assume
A = XΛX−1

with Λ = diag(λ1, λ2, . . . , λn) and |λ1| > |λ2| ≥ . . . ≥ |λn|. Then, we can show that

• ui =
Aiu0

‖Aiu0‖
→ x1/‖x1‖, where x1 = Xe1 as i → ∞.

• θi → λ1 as i → ∞.

(e) The convergence rate depends on
|λ2|
|λ1|

. Therefore, if
|λ2|
|λ1|

is close to 1, then the power
method could be very slow convergent or doesn’t converge at all.

2. Inverse Iteration

(a) Purposes:

• Overcome the drawbacks of the power method (slow convergence)

• find an eigenvalue closest to a particular given number (called shift): σ

(b) Observation: if λ is an eigenvalue of A, then

• λ− σ is an eigenvalue of A− σI,

• 1
λ−σ

is an eigenvalue of (A− σI)−1.

1/(lambda-sigma)

sigma

Inverse Iteration

Given an initial vector u0 and a shift σ
i = 0
repeat
solve (A− σI)ti+1 = ui for ti+1
ui+1 = ti+1/‖ti+1‖2 (approximate eigenvector)
θi+1 = u

H
i+1Aui+1 (approximate eigenvalue)

i = i+ 1
until convergence

(d) Convergence analysis: Assume A = XΛX−1 with Λ = diag(λ1, λ2, . . . , λn) and λk is the
eigenvalue cloest to the shift σ. It can be shown that

• ui → xk/‖xk‖ as i → ∞, where xk = Xek
• θi converges to λk i → ∞.

• Convergence rate depends on maxj 6=k
|λk−σ|
|λj−σ|

(e) Advantages: the ability to converge to any desired eigenvalue (the one nearest to the shift
σ). By choosing σ very close to a desired eigenvalue, the method converges very quickly
and thus not be as limited by the proximity of nearby eigenvalues as is the original power
method. The method is particularly effective when we have a good approximation to an
eigenvalue and want only its corresponding eigenvector.

(f) Drawbacks: (a) expensive in general: solving (A − σI)ti+1 = ui for ui+1. One LU
factorization of A−σI is required, which could be very expensive for large matrices, (b)
Only compute one eigenpair.

3. Orthogonal iteration (subspace iteration, simultaneous iteration)

(a) Purpose: compute a p-dimensional invaraint subspace, p > 1, rather than one eigenvector
at a time.

(b) Pseudocode:

Orthogonal Iteration

Given an initial n× p orthogonal matrix Z0
i = 0
repeat
Yi+1 = AZi
Yi+1 = Zi+1Ri+1 (QR decomposition)
i = i+ 1

until convergence

The use of QR decomposition keeps the vectors spanning span{AiZ0} of full rank.

(c) Convergence: under mild conditions, Zi converges to the invariant subspace spanned by
the first p eigenvectors corresponding to the p dominant eigenvalues, where

|λ1| ≥ |λ2| ≥ · · · ≥ |λp| > |λp+1| ≥ · · · ≥ |λn|.

If we let Bi = Z
T
i AZi, then

‖AZi − ZiBi‖ → 0 as i → ∞

and eigenvalues of Bi approximate the dominant eigenvalues of A.

Convergence rate depends on |λp+1|/|λp|.

(d) Example: Let Z0 = [e1, e2, e3] and

A =

-0.4326 1.1892 -0.5883 -0.0956 -0.6918 -0.3999

-1.6656 -0.0376 2.1832 -0.8323 0.8580 0.6900

0.1253 0.3273 -0.1364 0.2944 1.2540 0.8156

0.2877 0.1746 0.1139 -1.3362 -1.5937 0.7119

-1.1465 -0.1867 1.0668 0.7143 -1.4410 1.2902

1.1909 0.7258 0.0593 1.6236 0.5711 0.6686

Eigenvalues of A = -2.1659 +- 0.5560i, 2.1493, 0.2111 +- 1.9014i, -0.9548

i = 10: Eigenvalues of Z’_10*A*Z_10: -1.4383 +- 0.3479i, 2.1500

i = 30: Eigenvalues of Z’_30*A*Z_30: -2.1592 +- 0.5494i, 2.1118

i = 70: Eigenvalues of Z’_70*A*Z_70: -2.1659 +- 0.5560i, 2.1493

(e) An important special case: Let p = n and Z0 = I, then Ai = Z
T
i AZi converges to the

Schur form of A provided that

(1) all eigenvalues of A have distinct absolute values and

(2) all the principal submatrices of S have full rank, where we assume A = SΛS−1.

(f) Example: the same test matrix, numerical results of orthogonal iteration with Z0 = I:

A_10 =

-1.6994 -0.2201 -0.8787 1.4292 0.3847 0.0112

0.0007 1.1325 -1.2186 1.2245 -0.0867 -0.0648

0.2637 -1.9636 -0.1598 2.3959 -0.8136 -0.4311

-0.0364 -0.2346 0.5527 -0.4393 -1.9263 -1.2496

-0.4290 1.3482 1.1484 0.6121 -0.5937 -0.2416

0.0003 -0.0013 -0.0003 0.0011 -0.0014 -0.9554

A_30 =

-2.4055 -1.0586 -1.3420 0.0991 -1.1210 0.1720

0.0517 0.9645 -1.6519 0.8512 0.7215 0.7654

0.2248 -1.9947 -0.7656 -1.1876 -0.2736 0.1552

0.0029 0.0263 -0.0682 0.1381 -2.3094 -0.6765

0.0147 -0.0808 -0.0569 1.5462 0.3082 0.8476

0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548

A_70 =

-2.0800 1.6092 -0.6426 1.1025 -0.1553 0.0734

-0.1690 -1.6820 1.7425 -0.0311 0.9229 -0.3087

0.0677 1.2521 1.5795 -0.1458 1.2092 0.7088

0.0000 0.0000 -0.0005 0.5575 -1.7386 -1.0520

0.0000 0.0004 0.0003 2.1489 -0.1353 -0.3250

0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548

….

As we see Ak is converging to a quasi-upper triangular matrix as k increasing.

4. QR iteration

(a) Our goal is to reorganize orthogonal iteration to incorporate shifting and inverting as in
the inverse iteration. This will make it more efficient and eliminate the assumption that
eigenvalues differ in magnitude.

(b) Pseudocode

QR Iteration, without shift

A0 = A
i = 0
repeat
Ai = QiRi (QR decomposition)
Ai+1 = RiQi
i = i+ 1

until convergence

• Observe that
Ai+1 = RiQi = Q

T
i QiRiQi = Q

T
i AiQi

• Ai+1 is orthogonally similar toA0 = A. Therefore Ai+1 andA have same eigenvalues:

Ai+1 = (Q0Q1 · · ·Qi−1Qi)
TA(Q0Q1 · · ·Qi−1Qi).

Note that Q0 · · ·Qi−1Qi is an orthogonal matrix since all Qj are.

• Ai computed by QR iteration is identical to the matrix Z
T
i AZi implicitly computed

by orthogonal iteration. In fact, we have

Proposition. Let Ai be the matrix computed by QR iteration. Then Ai =
ZTi AZi, where Zi is the matrix computed from orthogonal iteration starting
with Z0 = I.

Therefore, Ai converges to Schur form if all the eigenvalues have different absolute
values.

(d) Example. The same test matrix, numerical results of QR iteration

A_10 =

-1.6994 0.2201 -0.8787 -1.4292 -0.3847 0.0112

-0.0007 1.1325 1.2186 1.2245 -0.0867 0.0648

0.2637 1.9636 -0.1598 -2.3959 0.8136 -0.4311

0.0364 -0.2346 -0.5527 -0.4393 -1.9263 1.2496

0.4290 1.3482 -1.1484 0.6121 -0.5937 0.2416

0.0003 0.0013 -0.0003 -0.0011 0.0014 -0.9554

A_30 =

-2.4055 -1.0586 1.3420 -0.0991 1.1210 0.1720

0.0517 0.9645 1.6519 -0.8512 -0.7215 0.7654

-0.2248 1.9947 -0.7656 -1.1876 -0.2736 -0.1552

-0.0029 -0.0263 -0.0682 0.1381 -2.3094 0.6765

-0.0147 0.0808 -0.0569 1.5462 0.3082 -0.8476

0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548

A_70 =

-2.0800 1.6092 -0.6426 1.1025 -0.1553 -0.0734

-0.1690 -1.6820 1.7425 -0.0311 0.9229 0.3087

0.0677 1.2521 1.5795 -0.1458 1.2092 -0.7088

0.0000 0.0000 -0.0005 0.5575 -1.7386 1.0520

0.0000 0.0004 0.0003 2.1489 -0.1353 0.3250

0.0000 0.0000 0.0000 0.0000 0.0000 -0.9548

Note that the results are identical to the orthogonal iteration.

5. QR iteration with shifts ⇒ QR Algorithm

(a) Purpose: accelerate the convergence of QR iteration by using shifts

(b) Pseudo-code

QR Iteration with Shifts

A0 = A
i = 0
repeat
Choose a shift σi
Ai − σiI = QiRi (QR decomposition)
Ai+1 = RiQi + σiI
i = i+ 1

until convergence

Ai+1 = Q
T
i AiQi.

Therefore, Ai+1 and A are orthogonally similar, and Ai+1 and A have the same eigen-
values.

(d) How to choose the shifts σi?

• If σi is an exact eigenvalue of A, then it can be shown that

Ai+1 = RiQi + σiI =

[

A′ a
0 σi

]

This means that the algorithm converges in one iteration. If more eigenvalues are
wanted, we can apply the algorithm again to the n− 1 by n− 1 matrix A′.

• In practice, a common choice of the σi is

σi = Ai(n, n).

A motivation of this choice is by observing that the convergence of the QR iteration
(without a shift), the (n, n) entry of Ai usually converges to an eigenvalue of A first.

(e) Example. The same test matrix as before. The following is the numerical result of QR
iteration with a shift.

With the shift σ0 as an exact eigenvalue σ0 = 2.1493, then

A_1 =

-1.4127 1.4420 1.0845 -0.6866 -0.1013 -0.2042

-1.2949 -0.2334 1.4047 -1.3695 1.5274 -0.7062

0.5473 0.1343 -0.7991 -0.6716 1.1585 0.0736

-0.2630 0.0284 0.5440 -1.4616 -1.5892 0.9205

-1.6063 -0.3898 0.3410 0.1623 -0.9576 -0.5795

0.0000 0.0000 0.0000 0.0000 0.0000 2.1493

With the shifts σi = Ai(n, n).

A_7 =

-2.4302 2.0264 -0.2799 -0.2384 0.3210 -0.0526

-0.1865 -1.4295 -1.3515 0.0812 0.8577 -0.0388

-0.1087 -0.8991 0.4491 0.4890 -1.8463 -1.2034

-0.0008 0.0511 -0.5997 -0.7839 -0.8088 -0.5188

-0.0916 -0.8273 1.6940 0.0645 -0.6698 -0.0854

0.0000 0.0000 0.0000 0.0000 0.0000 2.1493

We observe that by 7th iteration, we have found an eigenvalue of A.

(f) Note that the QR decomposition in the algorithm takes O(n3) flops. Even if the al-
gorithm took n iterations to converge, the overall cost of the algorithm will be O(n4).
This is too expensive (today, the complexity of algorithms for all standard matrix com-
putation problems is at O(n3).) However, if the matrix is initially reduced to upper
Hessenberg form, then the QR decomposition of a Hessenberg form costs O(n2) flops.
As a result, the overall cost of the algorithm is reduced to O(n3). This is referred to as
the Hessenberg QR algorithm, the method of choice for dense eigenvalue problem today,
say Matlab’s eigensolver eig use LAPACK’s implementation of the QR algorithm.

(g) QR algorithm is one of the top 10 algorithms in the 20th century.

Published by admin

Leave a Reply Cancel reply