# CS代考程序代写 algorithm COMP3027/COMP3927 Tutorial questions The University of Sydney 2019 Semester 1 Tutorial 12 School of Computer Science

COMP3027/COMP3927 Tutorial questions The University of Sydney 2019 Semester 1 Tutorial 12 School of Computer Science

Pre-tutorial questions

Do you know the basic concepts of this week’s lecture content? These questions are only to test yourself. They will not be explicitly discussed in the tutorial, and no solutions will be given to them.

1. Reduction.

(a) What is a polynomial time reduction? (b) Is polynomial time reductions transitive?

(c) What are the three types of standard reductions? 2. Classes

(a) What’s the definition of the class P? (b) What’s the definition of the class NP?

(c) How do you prove that a problem is in NP? 3. Revision: Algorithmic technique

(a) What is a Greedy algorithm? Example of a Greedy algorithm?

(b) What is a Divide-and-Conquer algorithm? Example of a Divide-and-Conquer algorithm?

(c) What is a Sweepline algorithm? Example of a Sweepline algorithm?

(d) What is a Dynamic Programming algorithm? Example of a Dynamic Programming algorithm?

(e) What is a Flow Network algorithm? Example of a Flow Network algorithm?

Tutorial

Problem 1

Show that the following decision problems are all in NP

1. Longest path (is there a simple s-t path of cost ≥ d)

2. Minimum Spanning Tree (is there a spanning tree of weight ≤ d) 3. Maximum Spanning Tree (is there a spanning tree of weight ≥ d) 4. Vertex Cover (can we cover the edges using ≤ d vertices)

5. Independent Set (can we pick a set of ≥ d independent vertices) 6. Min-cut (is there a s-t cut with capacity ≥ d)

1

Solution:

1. Certificate: a path P of cost ≥ d

Certifier: check that P starts at s, check that P is a simple path in the graph, check that P ends at t, check that the cost of P is at least d. If any of these are false, return NO, otherwise return YES.

Time: All the steps of the algorithm can easily be implemented in O(n + m), as the length of the path can’t exceed n (if so, it would contain a cycle so we’d immediately output NO), so the certifier is polynomial.

Correctness:

• Certifier YES implies certificate is valid: This is trivial. All the requirements of the question are satisfied, therefore P is a simple path of length ≥ d, therefore the answer should be YES.

• Certificate is valid implies certifier YES: This is trivial. If P is a simple s-t path of length ≥ d, then none of the conditions which would cause the certifier to answer NO are satisfied, so it’ll answer YES.

2. … These questions are all similarly trivial. You can use a ’solution’ to the question as the certificate, then it’s just a question of showing that you can verify that the solution is a valid one in polynomial time. Don’t forget to prove the equivalence between a valid certificate and the certifier answering yes when it’s given as input.

Ask on the discussion forum if you’re unsure about a particular problem.

Problem 2

In the k-COLOR problem, we are given an undirected graph G = (V,E) and we have to decide whether the vertices of G can be colored using at most k colors such that no edge has both endpoints with the same color. Show that 4-COLOR is NP-hard (Hint: Use a reduction from 3-COLOR.)

Solution: (Sketch.) Let G = (V, E) be an instance of 3-COLOR. Create a 4-color instance G′ = (V′,E′) where V′ consists of V plus 4 new vertices v1,v2,v3,v4, and E′ consists of E plus all the edges between the new vertices and all the edges between v1 and V . The instance G′ can be created in time O(n + m) since we only need to copy over G, and create a constant number of new vertices and edges between them plus the n new edges between v1 and V .

Suppose that G is 3-colorable. Consider a 3-coloring of G, call the colors R, G, B and let c(v) denote the color used by vertex v. To get a 4-coloring c′ of G′, for each v ∈ V , we use the same color as c. Then, we color v2, v3, v4 using the colors R, G, B and for v1 we use a fourth color, call it Y . This is a valid 4-coloring of G′: all the edges e in G have endpoints that are colored differently because c is a 3-coloring of G, the edges between v1 and V are also fine because v1 is colored differently than any vertex in G, and finally, the edges between v1, v2, v3, v4 are fine by construction.

Suppose that G′ is 4-colorable. Then it must color V using at most 3 colors since otherwise, v1 will have the same color as one of the vertices in V . Thus, G is 3-colorable.

(Alternative reduction.) Create G′ by simply adding a single vertex v1 that connects to all of V . Same proof as above.

2

Revision

1 Greedy algorithms

Greedy algorithms can be some of the simplest algorithms to implement, but they’re often among the hardest algorithms to design and analyze. You can often stumble on the right algorithm but not recognize that you’ve found it, or might find an algorithm you’re sure is correct and be unable to prove its correctness.

The standard way of proving the correctness of a greedy algorithm is by using an exchange argument. They work by showing that you can iteratively transform any optimal solution into the solution produced by the greedy algorithm without changing the cost of the optimal solution, thereby proving that the greedy solution is optimal. Typically, exchange arguments are set up as follows:

1. Define your solutions. You will be comparing your greedy solution X to an optimal solution Xopt, so it’s best to define these variables explicitly.

2. Compare solutions. Next, show that if X ̸= Xopt, then they must differ in some specific way. This could mean that there’s a piece of X that’s not in Xopt, or that two elements of X that are in a different order in Xopt, etc. You might want to give those pieces names.

3. Exchange Pieces. Show how to transform Xopt by exchanging some piece of Xopt for some piece of X. You’ll typically use the piece you described in the previous step. Then, prove that by doing so, you did not increase the cost of Xopt and you therefore have a different optimal solution.

4. Iterate. Argue that you have decreased the number of differences between X and Xopt by performing the exchange, and that by iterating this process you can turn Xopt into X without impacting the quality of the solution. Therefore, X must be optimal. This last step might require a formal argument using an induction proof. However, in most cases this is not needed.

Problem 3

Your friend is working as a camp counselor, and he is in charge of organizing activities for a set of junior- high-school-age campers. One of his plans is the following mini-triathlon exercise: each contestant must swim 20 laps of a pool, then cycle 10 km, then run 3 km. The plan is to send the contestants out in a staggered fashion, via the following rule: the contestants must use the pool one at a time. In other words, first contestant swims the 20 laps, gets out, and starts biking. As soon as the first contestant is out of the pool, the second contestant begins swimming the 20 laps; as soon as he/she’s out and starts cycling, a third contestant begins swimming . . . and so on.)

Each contestant has a projected swimming time (the expected time it will take him or her to complete the 20 laps), a projected cycling time (the expected time it will take him or her to complete the 10 km of cycling), and a projected running time (the time it will take him or her to complete the 3 km of running. Your friend wants to decide on a schedule for the triathlon: an order in which to sequence the starts of the contestants. Let’s say that the completion time of a schedule is the earliest time at which all contestants will be finished with all three legs of the triathlon, assuming they each spend exactly their projected swimming, biking, and running times on the three parts.

What’s the best order for sending people out, if one wants the whole competition to be over as early as possible? More precisely, give an efficient algorithm that produces a schedule whose completion time is as small as possible. Prove the correctness of your algorithm.

3

Solution: Let the contestants be numbered 1, . . . , n, and let si, bi, ri denote the swimming, biking, and running times of contestant i. Here is an algorithm to produce a schedule: arrange the contestants in order of decreasing bi + ri, and send them out in this order. We claim that this order minimizes the completion time.

We prove this by an exchange argument. Let X be the schedule produced by the greedy algorithm and let Xopt be an optimal schedule. If X and Xopt are identical then we are done. Suppose Xopt ̸= X then the optimal solution must contain two contestants i and j so that j is sent out directly after i, but bi + ri < bj + rj . We will call such a pair (i, j) an inversion. Consider the solution obtained by swapping the orders of i and j. In this swapped schedule, j completes earlier than he/she used to. Also, in the swapped schedule, i gets out of the pool when j previously got out of the pool; but since bi + ri < bj + rj , i finishes sooner in the swapped schedule than j finished in the previous schedule. Hence our swapped schedule does not have a greater completion time, and so it too is optimal.
Continuing in this way, we can eliminate all inversions without increasing the completion time. To be very rigorous here, you would probably proceed by induction to show this, but for most cases (including this case) this is immediate. At the end of this process, we will have a schedule in the order produced by our algorithm, whose completion time is no greater than that of the original optimal order we considered. Thus the order produced by our algorithm must also be optimal.
Problem 4
Assume that you are given n white and n black dots, lying on a line. The dots appear in any order of black and white. Design a greedy algorithm which connects each black dot with a (different) white dot, so that the total length of wires used to form such connected pairs is minimal. The length of wire used to connect two dots is equal to their distance along the line.
Solution: Suppose you have a sorted list of the white dot positions and the black dot positions. Then you should match the ith white dot with the ith black dot.
To prove that the algorithm is optimal consider an optimal solution Xopt and the solution X produced by the greedy algorithm. Again, if X = Xopt then we are done. Otherwise there must exist a matched pair wi,bj with i ̸= j in Xopt. We call such a pair an inversion. To simplify the argument we assume that i < j and that wi is the leftmost white point that is part of an inversion.
Since wi is matched to bj, bi must be matched to some wk with k ≥ i.
We argue that we can remove an inversion in the optimal solution Xopt and the cost of the matching will not increase. First consider the cost of matching these two pairs in the optimal solution:
C(wi,bj)+C(wk,bi)=bj −wi +|wk −bi|. If we replace these two pairs with (wi,bi) and (wk,bj) we would get:
C(wi,bi)+C(wk,bj)=|wi −bi|+|wk −bj|.
There are two cases depending on wk’s relative position between bi and bj. When its smaller than both of them the cost stays the same. When its smaller than just bj one saves 2wk −2bi by swapping and when its larger than both one saves 2bj − 2bi. Therefore we can remove the inversion in Xopt and guarantee that we still have an optimal solution. This argument can be iterated until we have no more inversions in the solution and, hence, the greedy must be optimal since it has no inversions.
4
2 Divide-and-Conquer
The divide-and-conquer strategy solves a problem by:
1. Breaking it into subproblems that are themselves smaller instances of the same type of the original problem.
2. Recursively solving these subproblems.
3. Appropriately combining (merging) their answers.
The real work is done in three different places: in the partitioning of problems into subproblems; when the subproblems are so small that they are solved outright; and in the gluing together of partial answers.
The standard way of proving correctness for a divide-and-conquer algorithm is by using induction as follows.
• Base case: Solve trivial instances directly, without recursing.
• Inductive step: Reduce the solution of a given instance to the solution of smaller instances, by recursing. For divide-and-conquer algorithms it usually requires a bit of work to prove that the step of merging two (or more) solutions to smaller problems into the solution for the larger problem.
Problem 5
Suppose we are given numbers a, n, where n > 0 is an integer. We wish to calculate the number an. What is the quickest way to do this? How many multiplication operations do we have to perform? Of course, we may compute 198 by calculating 19 × 19 = 361, then calculating 193 = 361 × 19 = 6859, then 194 = 6859 × 19 = 130321, and so on, until we get 198. This takes seven multiplications in total. Is this the quickest possible? Note that 8 = 2 × 4, so we can also write 198 = 194 × 194. If we compute 194 first, and then square it, we need only one more multiplication. The straightforward method would require four more multiplications: 198 = 194 × 19 × 19 × 19 × 19. Similarly, 194 = 192 × 192. So if we calculate 192 = 361 with one multiplication, 194 = 3612 = 130321 with one more, we get 198 = 1303212 = 16983563041 with the third multiplication. This cleverer method requires only three multiplications. The method above seems to work when the exponent n is even. What do we do when it is odd? Say, we would like to calculate 197. We may write 7 = 6+1, so 197 = 196 ×19, then 196 = 193 ×193, and finally 193 = 192 ×19. So 193 = 361×19 = 6859, 196 = 68592 = 47045881, and 197 = 47045881 × 19 = 893871739. The straightforward method of calculation requires 6 multiplications, and we needed only 4 here. We can combine the ideas from the two examples above to get a procedure to calculate the power an for any pair a, n.

Algorithm 1 Power

1: 2: 3: 4: 5: 6: 7: 8: 9:

10: 11: 12:

functionPower(a,n) if n=1then

return a end if

if n is even then

b = Power(a, n/2) return b2

else

b = Power(a, (n − 1)/2)

return a × b2 end if

end function

5

Prove that the algorithm correctly computes an. Can you bound the number of multiplications for each n?

Solution: The algorithm Power(a,n) correctly calculates an, as long as it correctly cal- culates the relevant smaller powers of a. We may prove this by induction over n that this happens for any number a.

Proof The base case is n = 1, where the algorithm correctly returns a. For the induction hypothesis, assume that the algorithm is correct for all k, with 1 ≤ k ≤ n − 1. Consider Power(a,n). If n is even, the algorithm returns the value b2, where b =Power(a,n/2). Since n > 1 and even, n/2 is an integer, and 1 ≤ n/2 ≤ n − 1. By the induction hypothesis, Power(a, n/2) correctly returns an/2, so the algorithm returns b2 = an. If n is odd (n > 1) then (n − 1)/2 is an integer, and 1 ≤ (n − 1)/2 ≤ n − 1. By the induction hypothesis, b =Power(a, (n − 1)/2) = a(n−1)/2 , and the algorithm correctly returns a × b2 = an in this case as well. So the algorithm is correct by induction. Consider the number of multiplications secribed by the algorithm, it can be described by the recurrence T (n) = O(1) + T (n/2) which solves to O(log n). A closer analysis shows that the bound is 2 log2 n.

3 Dynamic programming

Like greedy algorithms, dynamic programming algorithms can be deceptively simple. The algorithms, once written out, are often so straightforward that it’s hard to believe they work correctly. Consequently, one of the challenges in writing dynamic programming algorithms is rigorously establishing their correctness. Fortunately, dynamic programming proofs are often relatively straightforward and follow a standard pattern.

Typically, dynamic programming algorithms are based on a recurrence relation involving the optimal solution, so the correctness proof will primarily focus on justifying why that recurrence relation is correct. The general outline of a correctness proof for a dynamic programming algorithm is as following:

1. Define subproblems. Dynamic programming algorithms usually involve a recurrence involving some quantity OP T (…) over one or more variables (usually, these variables represent the size of the problem along some dimension). Define what this quantity represents and what the parameters mean. This might take the form “OPT(k) is the maximum number of people that can be covered by the first k cell towers” or “OP T (u, v, i) is the length of the shortest path from u to v of length at most i.”

2. Write a recurrence. Now that you’ve defined your subproblems, you will need to write out a recurrence relation that defines OPT(…) in terms of some number of subproblems. Make sure that when you do this you include your base cases.

3. Prove that the recurrence is correct. Having written out your recurrence, you will need to prove it is correct. Typically, you would do so by going case-by-case and proving that each case is correct.

4. Prove the algorithm evaluates the recurrence. Next, show that your algorithm actually evaluates the recurrence by showing that the table values match the value of OPT and that as you fill in the table, you never refer to a value that hasn’t been computed yet. To be fully rigorous, you would probably need to prove this by induction. However, in most cases a few sentences should suffice here.

5. Prove the algorithm is correct. Having shown that you’ve just evaluated the recurrence correctly, your algorithm will probably conclude with something like “return A[m,n]”. Prove that this table value is the one that you actually want to read.

Problem 6

Let G = (V,E) be an undirected graph with n nodes. Recall that a subset of the nodes is called an

6

independent set if no two of them are joined by an edge. Finding large independent sets is difficult in general; but here we’ll see that it can be done efficiently if the graph is “simple” enough.

Call a graph G = (V,E) a path if its nodes can be written as v1,v2,…,vn, with an edge between vi and vj if and only if the numbers i and j differ by exactly 1. With each node vi, we associate a positive integer weight wi.

Give an algorithm that takes an n-node path G with weights and returns an independent set of maximum total weight. independent of the values of the weights. Prove the correctness and complexity of your algorithm.

Solution:

1. Let Si denote an optimal independent set on {v1, . . . , vi}, and let OP T [i] denote its

weight. Define OPT[0] = 0 and note that OPT[1] = w1.

2-3. Now, for i > 1, either vi belongs to Si or it doesn’t. In the first case, we know that vi−1 cannot belong to Si, and so OPT[i] = wi + OPT[i − 2]. In the second case, OP T [i] = OP T [i − 1]. Thus we have the recurrence

OPT[i]=max(OPT[i−1],wi +OPT[i−2]).

4. From the recurrence it immediately follows in this simple case that we can compute the

values of OP T [i], in increasing order from i = 1 to n.

5. OPT[n] is the value we want since it stores the weight of an optimal solution of the

elements v1,…,vn.

Note that we can compute Sn by tracing back through the computations of the max operator (similar to what we did for knapsack). Since we spend constant time per iteration, over n iterations, the total running time is O(n).

4 Network Flow

The general idea we’ve used to solve a problem X with network flows is to “reduce” X to the problem of computing a max flow (or something similar). That is, we modify X into an equivalent problem that can be solved using network flows (for example, bipartite matching). Since we are reducing a problem X to another problem Y the correctness proof requires us to prove that a solution for Y is a solution for X and vice versa. For example, for bipartite matching we proved that if there is a matching of size k in the bipartite graph G then there’s a flow of value at most k in the corresponding flow network G′, and if there’s a flow of value k in G′ then there’s a matching of size at most k in G.

Problem 7

Back in the euphoric early days of the Web, people liked to claim that much of the enormous potential in a company like Yahoo! was in the “eyeballs” — the simple fact that it gets millions of people looking at its pages every day. And further, by convincing people to register personal data with the site, it can show each user an extremely targeted advertisement whenever he or she visits the site, in a way that TV networks or magazines couldn’t hope to match. So if the user has told Yahoo! that they’re a 20-year old computer science student from USyd, the site can throw up a banner ad for apartments in Glebe; on the other hand, if they’re a 50-year-old investment banker from Vaucluse the site can display a banner ad pitching Luxury Cars instead.

But deciding on which ads to show to which people involves some serious computation behind the scenes. Suppose that the managers of a popular Web site have identified k distinct demographic groups G1, G2, . . . , Gk. (These groups can overlap; for example G1 can be equal to all residents of Sydney, and

7

G2 can be equal to all people with a degree in computer science.) The site has contracts with m different advertisers, to show a certain number of copies of their ads to users of the site. Here’s what the contract with the ith advertiser looks like:

• For a subset Xi ⊆ {G1,…,Gk} of the demographic groups, advertiser i wants its ads shown only to users who belong to at least one of the demographic groups in the set Xi.

• For a number ri, advertiser i wants its ads shown to at least ri users each minute.

Now, consider the problem of designing a good advertising policy — a way to show a single ad to each user of the site. Suppose at a given minute, there are n users visiting the site. Because we have registration information on each of these users, we know that user j (for j = 1, 2, . . . , n) belongs to a subset Uj ⊆ {G1, . . . , Gk} of the demographic groups. The problem is: is there a way to show a single ad to each user so that the site’s contracts with each of the m advertisers is satisfied for this minute? (That is, for each i = 1, 2, . . . , m, at least ri of the n users, each belonging to at least one demographic group in Xi, are shown an ad provided by advertiser i.)

Give an efficient algorithm to decide if this is possible, and if so, to actually choose an ad to show each user.

Solution: We define a flow network G = (V, E) as follows.

• There is a source s, vertices v1,…,vn for each person vertices w1,…,wm for each

advertiser, and sink t.

• There is an edge of capacity 1 from vi to each wj for which person i belongs to a

demographic group that advertiser j wants to target.

• There is an edge with a capacity of 1 from s to each vi; and for each j, there is an

edge with lower bound rj from wj to t.

• Finally, the source has a demand of −j rj, and the sink has a demand of j rj. All

other nodes have demand 0.

Now, if there is a valid circulation in this graph, then there is an integer circulation. In such a circulation, one unit of flow on the edge (vi,wj) means that we show an ad from advertiser j to person i. With this meaning, each advertiser shows their required number of ads to the appropriate people.

Conversely, if there is a way to satisfy all the advertising contracts, then we can construct a valid circulation as follows. We place a unit of flow on each edge (vi,wj) for which i is shown an ad from advertiser j; we put a flow on the edge (wj,t) equal to the number of ads shown from advertiser j; and we put a unit of flow on each edge (s,vi) for which person i sees an ad.

Thus, there is a valid circulation in this graph if and only if there is a way to satisfy all the advertising contracts; and the flow values in an integer-valued circulation can be used, as above, to decide which ads to show to which people.

8