# CS代考计算机代写 information theory AI algorithm CS 591 B1: Communication Complexity, Fall 2019 Problem Set 2

CS 591 B1: Communication Complexity, Fall 2019 Problem Set 2

Due: 5:00PM, Friday, October 25, 2019.

Homework Policies:

• Submit your completed assignment by email to mbun[at]bu[dot]edu. Please include the string “CS591PS2” somewhere in your subject line.

• Solutions must be typeset, e.g., using LATEX or Microsoft Word.

• To help your instructor calibrate the length and difficulty of future assignments, please

include with each problem an estimate of how long it took you to solve it.

• You are encouraged to collaborate on the homework problems with each other in small groups (2-3 people). Collaboration may include brainstorming or exploring possible solutions together on a whiteboard, but should not include one person telling the others how to solve a problem. You must also write up the solutions independently (in your own words) and acknowledge your collaborators at the beginning of the first page.

• You may freely use without proof any results proved in class, in Mark’s lecture notes posted on the class webpage, or in the main body of the texts assigned as reading. Note that this excludes results that appear in the texts as problems and exercises. You may, of course, use such results but you have to prove them first.

• You may read papers and other outside sources to help you solve these problems. If you do so, you must acknowledge these sources and write the solutions in your own words.

• Start early! The problems are presented roughly in the order of the course content they correspond to, so you may get started on the first few problems as soon as the assignment is released. Late assignments will receive credit only with prior permission of the instructor.

1

Problem 1 (Zero- and One-Sided Error Communication). Our discussion in class has fo- cused on randomized communication with two-sided error, but we could make other choices. For example, we could consider protocols that only make one-sided error. A randomized protocol Π computes a function f with one-sided error if for every (x, y) with f (x, y) = 0,

Pr[Π(x, y) = 0] = 1 and for every (x, y) with f(x, y) = 1,

Pr[Π(x,y) = 1] ≥ 1. 2

The RPcc communication cost of a protocol is the length of the shortest public coin protocol computing f. We can similarly define the coRPcc complexity by exchanging the roles of 0s and 1s above.

(a) What is the RPcc communication complexity of the Equality function EQn? How about the coRPcc communication complexity?

(b) We may also define randomized protocols with zero error, but which may with some probability output ⊥ indicating “I don’t know.” A randomized protocol Π : X × Y → {0, 1, ⊥} computes a function f with zero error if for every (x, y),

4

Pr[Π(x, y) ∈ {f(x, y), ⊥}] = 1 Pr[Π(x,y) = ⊥] ≤ 1.

and

The ZPPcc communication complexity of a protocol is length of the shortest public coin

protocol computing f with zero error.

Show that if f has a fooling set of size s, then ZPPcc(f) ≥ log s − O(1).

(c) Letting RPcc, coRPcc, ZPPcc consist of sequences of functions with polylogarithmic complexity in each of these models respectively, show that ZPPcc = RPcc ∩ coRPcc.

Problem 2 (Shearer’s Lemma). Shearer’s Lemma is an important tool in information theory with lots of applications to combinatorics. It states that if X = (X1, . . . , Xn) is a collection of random variables and A ⊆ [n] is independent from X such that Pr[j ∈ A] ≥ ε for every j ∈ [n], then

εH(X) ≤ H(XA|A), where XA = (Xj)j∈A is the subcollection of X of indices in A.

(a) Derive the subadditivity of entropy, i.e., H(X1X2 . . . Xn) ≤ H(X1) + · · · + H(Xn), as a special case of Shearer’s Lemma.

2

(b) Here is an application to counting satisfying assignments to boolean functions. Let f1, . . . , fm : {0, 1}n → {0, 1} be boolean functions, and let F (x) = ∧mi=1fi(x). Suppose Prx∼U [fi(x) = 1] = p for every i where U is the uniform distribution over {0, 1}n. If the fi’s depend on disjoint sets of variables, i.e., each fi depends on some subset Si ⊆ [n] of variables and Si ∩ Sj = ∅ for all i ̸= j, then by independence we have

m

Pr[F(x)=1]= Pr[fi(x)=1]=pm.

x∼U

i=1

x∼U

Prove a similar result when each variable xj is allowed to appear in exactly k of the subsets Si. Namely, suppose each fi depends only on the variables in Si ⊆ [n] and #{i∈[m]:j∈Si}=kforeveryj∈[n]. Showthat

Pr [F(x) = 1] ≤ pm/k. x∼U

Hint: Apply Shearer’s Lemma with X being uniform over the satisfying assignments to F.

Problem 3 (Internal vs. External Information). Show that if μ is a product distribution, i.e., A and B are independent when (A, B) ∼ μ, then ICext(Π) = IC (Π) for every protocol

Π.

Problem 4 (Correlated Sampling Revisited). Consider the following correlated sampling problem. Let π be a distribution over X which decomposes as a product of two functions p · q. Suppose Alice holds p and Bob holds q, and moreover, that they hold estimates q′ and p′ of each others’ functions such that

1. p · q′ and p′ · q are valid probability mass functions, i.e., they take values in [0, 1] and sum to 1 and

2. There is a known parameter M > 0 such that p(x) ≤ Mp′(x) and q(x) ≤ Mq′(x) for every x ∈ X.

Show that there is a public coin protocol that allows Alice and Bob to agree on a sample x ∼ π with probability at least 1 − ε using communication poly(M, log(1/ε)).

Hint: Interpret the public randomness as a sequence of the form (xi, ai, bi) where the goal is to discover the first index i for which ai ≤ p(xi) and bi ≤ q(xi).

Problem 5 (Augmented Indexing). Consider the following Augmented Indexing Problem: Alice is given a string x ∈ {0,1}n and Bob is given an index i and the prefix x1,…,xi−1. The goal is for Bob to output xi given one-way communication from Alice to Bob with probability at least 1 − δ.

3

μμ

(a) Provealowerboundof(1−h(δ))nontheone-wayrandomizedcommunicationcomplexity of Augmented Indexing, where h(δ) = δ log(1/δ) + (1 − δ) log(1/(1 − δ)) is the binary entropy function.

(b) Augmented Indexing is a fairly contrived problem, but it turns out to be useful in applications. In the strict turnstile streaming model, a pair (it,vt) arrives in every round t = 1, . . . , m. The index it represents an element of the universe [n] and vt ∈ {−M,−M + 1,…,M} represents an additive update to the count of how many times element it has been seen. The strict turnstile model imposes the condition that at every point in time t, the count of every element is non-negative.

In the Distinct Elements problem, we wish to estimate the number of elements which have non-zero count at the end of the stream. Show that any randomized strict turnstile streaming algorithm which, with probability ≥ 0.99, estimates this count to within a factor of 2 requires space Ω(log n) even for some m = O(n) and M = O(1).

(c) Optional Challenge Problem: Show that any algorithm which estimates this count to within a factor of (1 + ε) requires space Ω(log(ε2n)/ε2) as long as log(ε2n)/ε2 ≤ n.

Hint: Use a reduction from the Gap-Hamming problem to Indexing. In the Gap- Hamming problem, Alice and Bob receive x, y ∈ {0, 1}n with the promise that |x − y| ≥ n/2 + √n or |x − y| ≤ n/2 − √n and their goal is to decide which is the case.

4