# CS代写 Quiz – cscodehelp代写

Quiz

1

Question 1

If 99.9% of your sequential program execution time is spent inside a loop that can be parallelized, what is the maximum speedup you can achieve via parallelizing this loop?

49

Quiz

1

Question 1

If 99.9% of your sequential program execution time is spent inside a loop that can be parallelized, what is the maximum speedup you can achieve via parallelizing this loop?

Solution: Maximum speedup = 1/(1-99.9%) = 1/0.1% = 1000

50

Quiz

1

Question 2

If a two-hour parallel program running on one thousand cores spends one hour in its parallel portion, what is the scaled speedup of the program on one thousand cores?

51

Quiz

1

Question 2

If a two-hour parallel program running on one thousand cores spends one hour in its parallel portion, what is the scaled speedup of the program on one thousand cores?

Solution: Scaled speedup = (1+1*1000)/2 = 500.5

52

Quiz

1

Question 3

A parallel program achieves a speedup of 2 on 4 cores. Is it likely to achieve a speedup of 4 on 8 cores?

53

Quiz

1

Question 3

A parallel program achieves a speedup of 2 on 4 cores. Is it likely to achieve a speedup of 4 on 8 cores?

Speedup = 1/[f+(1-f)/4]=2 Serial fraction f = 1/3 Speedup = 1/[1/3+(1-1/3)/8] = 2.4 < 4 Impossible to achieve a speedup of 4 on 8 cores
54
1
Quiz
Question 4
A computer animation program generates feature movies frame-by- frame. Each frame can be generated and written to its own file independently. If it takes 90 seconds to render a frame and 10 second to write a frame into a file, what is the maximum speedup you can achieve if you have 100 single-processor computers to use?
55
1
Quiz
Question 4
A computer animation program generates feature movies frame-by- frame. Each frame can be generated and written to its own file independently. If it takes 90 seconds to render a frame and 10 second to write a frame into a file, what is the maximum speedup you can achieve if you have 100 single-processor computers to use?
Solution: Maximum speedup = 100 because each processor can handle separate frames independently.
56
Quiz
1
Question 5
Let n>f(p) denote the Isoefficiency relation of a parallel system and M(n) denote the amount of memory for a problem of size n. Use the scalability function to rank the following parallel systems from most to least scalable:

(a). f(p)=p and M(n)=n;

(b). f(p)=p and M(n)=n^2;

(c). f(p)=p*log(p) and M(n)=n;

(d). f(p)=p*log(p) and M(n)=n^2.

57

Quiz

1

Question 5

Let n>f(p) denote the Isoefficiency relation of a parallel system and M(n) denote the amount of memory for a problem of size n. Use the scalability function to rank the following parallel systems from most to least scalable:

(a). f(p)=p and M(n)=n;

(b). f(p)=p and M(n)=n^2;

(c). f(p)=p*log(p) and M(n)=n;

(d). f(p)=p*log(p) and M(n)=n^2.

Solution:

a. Scalability function = M(f(p))/p = (Cp)/p = C

b. Scalability function = M(f(p))/p = (Cp)^2/p = C^2p

c. Scalability function = M(f(p))/p = (Cp*log(p))/p = C*log(p)

d. Scalability function = M(f(p))/p = (Cp*log(p))^2/p = C^2*p*log^2(p) From most scalable to least scalable is a > c > b > d.

58

1

Quiz

Question 6

Your program has an execution time of n^3 and uses n^2 bytes space when the problem size is n. You have worked very hard to make it fully parallel and reduced the communication time to 24𝑛2𝑙𝑜𝑔2𝑝 on p cores. If you want to achieve a speed up of 1000 with 1024 cores, what is the minimum memory each core must have? If you have 1024 cores with 1 GB (2^30 bytes) memory per core, what is the maximum speedup you can achieve?

59

1

Quiz

Question 6

Your program has an execution time of n^3 and uses n^2 bytes space when the problem size is n. You have worked very hard to make it fully parallel and reduced the communication time to 24𝑛2𝑙𝑜𝑔2𝑝 on p cores. If you want to achieve a speed up of 1000 with 1024 cores, what is the minimum memory each core must have? If you have 1024 cores with 1 GB (2^30 bytes) memory per core, what is the maximum speedup you can achieve?

𝑛3 1 1 𝑆𝑝𝑒𝑒𝑑𝑢𝑝 = 𝑛3/𝑝 + 24𝑛2𝑙𝑜𝑔𝑝 = 1 1024 = 1

2 1024 + 24𝑙𝑜𝑔2 /𝑛 1024 + 240/𝑛

= 1000 ⇒ 𝑛 = 10240000

So the minimum memory each core must have is n^2/p = 1024*10^8 bytes = 10^8 KB = 100GB

If each core only has 1GB, n^2 /p=n^2 /1024≤2^30 ⇒n≤2^20 ⇒

𝑠𝑝𝑒𝑒𝑑𝑢𝑝 ≤ 220 = 830 So the maximum speedup is 830.

220+240 210

60

1

Quiz

Question 7

Assume a parallel program takes 1000 seconds to finish when using 1 core and 600 seconds to finish when using 2 cores. Is it possible to finish the program within 400 seconds when using 4 cores? If your experiment actually takes 500 seconds when using 4 cores, what is the minimum execution time when using 6 cores?

61

1

Quiz

Question 7

Assume a parallel program takes 1000 seconds to finish when using 1 core and 600 seconds to finish when using 2 cores. Is it possible to finish the program within 400 seconds when using 4 cores? If your experiment actually takes 500 seconds when using 4 cores, what is the minimum execution time when using 6 cores?

Solution(1/2): The speedup is 5/3 when using 2 cores, so using Karp-

we can get serial fraction 𝑒 =

Ψ1 − 𝑝1 3 − 1 1 1 = 5 2 = 10 =

When

using 4 cores, the speedup = 1/[1/5+(1-1/5)/4] = 2.5 and execution

time = 1000/2.5 = 400 seconds. So, it’s possible to finish the program

within 400 seconds when using 4 cores.

1−1 1−1 1 5 𝑝22

62

1

Quiz

Question 7

Assume a parallel program takes 1000 seconds to finish when using 1 core and 600 seconds to finish when using 2 cores. Is it possible to finish the program within 400 seconds when using 4 cores? If your experiment actually takes 500 seconds when using 4 cores, what is the minimum execution time when using 6 cores?

Solution(2/2): Actually the speedup is 2 when using 4 cores, so using

Ψ1 − 𝑝1 1 − 1 1 1

= 2 4 = 4 =

Karp- we can get serial fraction 𝑒 =

1−1 1−1 3 3 𝑝44

p24

6

e 1/5 1/3 >=1/3

Therefore, when using 6 cores, serial fraction e =

13 ⇒ Ψ ≤ 49, so the minimum execution time is 444.4 seconds.

Ψ1 − 𝑝1

1 − 1 1 − 1

= Ψ 6 = Ψ 6 ≥

1−1 𝑝66

1−1 5

63

Pop Quiz

◼ A parallel program executing on 32 processors spends 5% of its time in sequential code. What is the scaled speedup of this program?

Example 1: Reduction

◼ Sequential algorithm complexity T(n,1) = (n)

◼ Parallel algorithm

◆Computational complexity = (n/p) ◆Communication complexity = (log p)

◼ Parallel overhead T0(n,p) = (p log p)

Reduction (continued)

◼ Isoefficiency relation: n C p log p

◼ We ask: To maintain same level of efficiency, how must

n increase when p increases? ◼ M(n) = n

M(Cplogp)/ p=Cplogp/ p=Clogp ◼ The system has good scalability

Example 2: Floyd’s Algorithm

◼ Sequential time complexity: T(n,1)=(n3) ◼ Parallel computation time: (n3/p)

◼ Parallel communication time: (n2log p) ◼ Parallel overhead: T0(n,p) = (pn2log p)

Floyd’s Algorithm (continued)

◼ Isoefficiency relation

n3 C(p n2 log p) n C p log p

◼ M(n) = n2

M(Cplogp)/p=C2p2log2 p/p=C2plog2 p

◼ The parallel system has poor scalability

Example 3: Finite Difference

◼ Sequential time complexity per iteration: T(N,1)=(n2)

◼ Parallel communication complexity per iteration: (n/p) ◼ Parallel overhead: T_o(n,p)=(n p)

◼ Space Complexity: M(n) = n2

◼ What is the scalability function?

Finite Difference (continued)

◼ Isoefficiency relation n2 Cnp n C p

◼ M(n) = n2

M(C p)/p=C2p/p=C2

◼ This algorithm is perfectly scalable