# Computational Complexity and Computability


Lecture 9 – Information & Kolmogorov Complexity

Koushik Pal

University of Toronto

February 8, 2021

Information

Two fundamental concepts in computer science:

- Algorithm
- Information

Unlike algorithms, there is no universally accepted comprehensive definition for information.

One definition of information is via computability theory.

Information

Question: Can we quantify how much information is contained in a string?

Example

x = …
y = …
z = …

Idea: The more we can “compress” a string, the less “information” it contains.

Thesis: The amount of information in a string is equivalent to the shortest way of describing that string.
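As a rough Python illustration of the compression idea (not part of the lecture's formal setup): a highly repetitive 2000-character string admits a description far shorter than the string itself, because a short expression suffices to regenerate it.

```python
# A repetitive 2000-character string can be *described* by a tiny recipe:
# the expression below regenerates the whole string.
x = "01" * 1000               # 2000 characters of content...
description = '"01" * 1000'   # ...described in far fewer characters

assert len(x) == 2000
assert eval(description) == x          # the description reproduces x
assert len(description) < len(x)       # and it is much shorter
```

A "random-looking" string of the same length would admit no such short recipe; that intuition is what the thesis above formalizes.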

Information

Question: How do we describe strings?

Answer: Use Turing machines with inputs.

To be more specific, describe a string x as ⟨M,w⟩ such that M is a TM that, on input w, halts with only x on its tape.

Kolmogorov Complexity

Definition

Let x ∈ {0, 1}∗.

The minimal description of x, denoted d(x), is the shortest string ⟨M,w⟩ (taking the lexicographically first in case of ties) such that M is a TM that, on input w, halts with only x on its tape.

The descriptive complexity (also known as Kolmogorov complexity) of x, denoted K(x), is the length of the minimal description of x, i.e.,

K(x) = |d(x)|.

Encoding

Problem: How to figure out where M ends and w starts in the encoding ⟨M, w⟩?

There are many possible solutions. Here are two examples. Assume the alphabet is {0, 1}.

Write each bit of ⟨M⟩ twice, i.e., 0 as 00 and 1 as 11, and use 01 as a delimiter between ⟨M⟩ and w; e.g.

⟨M, w⟩ = 11 00 11 ⋯ 11 01 w.

The first mismatched pair marks where ⟨M⟩ ends and w begins. In this case, |⟨M, w⟩| = 2|⟨M⟩| + |w| + 2.

If ⟨M⟩ = z1z2 ⋯ zk ∈ {0,1}∗ and w = w1w2 ⋯ wn ∈ {0,1}∗, let

⟨M,w⟩ = 1ᵏ0 z1z2 ⋯ zk w1w2 ⋯ wn,

i.e., prefix the pair with the length of ⟨M⟩ in unary. In this case, |⟨M, w⟩| = 2|⟨M⟩| + |w| + 1.
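The first delimiting scheme can be sketched in Python. This is a hypothetical encode/decode pair, assuming both ⟨M⟩ and w are binary strings:

```python
def encode(M: str, w: str) -> str:
    """Pair a machine encoding <M> with an input w, both binary strings.
    Each bit of <M> is doubled, then '01' marks where w begins."""
    assert set(M) <= {"0", "1"} and set(w) <= {"0", "1"}
    return "".join(b + b for b in M) + "01" + w

def decode(s: str) -> tuple[str, str]:
    """Recover (<M>, w): scan doubled pairs until the '01' delimiter."""
    i, M = 0, []
    while s[i] == s[i + 1]:          # still inside the doubled region
        M.append(s[i])
        i += 2
    return "".join(M), s[i + 2:]     # skip the '01' delimiter

# Round trip, and the length bound |<M,w>| = 2|<M>| + |w| + 2:
M, w = "1011", "0100"
assert decode(encode(M, w)) == (M, w)
assert len(encode(M, w)) == 2 * len(M) + len(w) + 2
```

The doubling makes the delimiter unambiguous: inside ⟨M⟩ every aligned pair of symbols is equal, so the first unequal pair can only be the 01 separator.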

Properties of Kolmogorov Complexity

Property 1

The amount of “information” in x isn’t much more than |x|.

Theorem

There is a constant c such that for all x ∈ {0, 1}∗, K(x) ≤ |x| + c.

Proof.

Define the TM Mid, which on any input w, immediately halts, thereby leaving w on the tape.

Clearly, ⟨Mid,x⟩ is a description of x, and hence

K(x) ≤ |⟨Mid, x⟩| = 2|⟨Mid⟩| + |x| + 2 = |x| + c,

where c = 2|⟨Mid⟩| + 2.
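An informal Python analogue of this proof: treat a Python program as the description ⟨Mid, x⟩. The program that prints x literally has length |x| plus a fixed overhead. Here `literal_description` is a hypothetical helper, and lengths are measured in characters rather than bits:

```python
def literal_description(x: str) -> str:
    # The Python analogue of <Mid, x>: a program that just prints x.
    return f"print({x!r}, end='')"

# The description exceeds |x| by a constant (the fixed cost of
# `print(..., end='')` plus quotes), mirroring K(x) <= |x| + c.
x = "0110100110010110"
prog = literal_description(x)
overhead = len(prog) - len(x)

assert prog == "print(" + repr(x) + ", end='')"
assert overhead == len(literal_description(""))   # same constant for every such x
```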

Properties of Kolmogorov Complexity

Property 2

The amount of “information” in xx isn’t much more than in x, i.e., repetitive strings have low information.

Theorem

There is a constant c such that for all x ∈ {0, 1}∗, K(xx) ≤ K(x) + c.

Proof.

Define a TM N as follows: On input ⟨M, w⟩:

1. Simulate M on w; let s be the result.
2. Output ss.

Let ⟨M, w⟩ be the minimal description of x. Then ⟨N, ⟨M, w⟩⟩ is a description of xx. Hence,

K(xx) ≤ |⟨N, ⟨M, w⟩⟩| = 2|⟨N⟩| + K(x) + 2 = K(x) + c, where c = 2|⟨N⟩| + 2.
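In the same informal spirit, with Python expressions playing the role of descriptions: any expression that evaluates to x can be wrapped into one that evaluates to xx at constant extra cost. The lambda wrapper below is illustrative, not the lecture's construction:

```python
def doubled_description(desc: str) -> str:
    # Wrap an expression evaluating to x into one evaluating to xx;
    # the wrapper adds a fixed number of characters, independent of x.
    return f"(lambda s: s + s)({desc})"

desc_x = '"01" * 1000'                 # an expression evaluating to x
desc_xx = doubled_description(desc_x)  # an expression evaluating to xx

assert eval(desc_xx) == eval(desc_x) * 2
# constant overhead, playing the role of c = 2|<N>| + 2:
assert len(desc_xx) - len(desc_x) == len(doubled_description(""))
```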

Properties of Kolmogorov Complexity

Corollary

There is a constant c such that for all n ≥ 1 and x ∈ {0,1}∗, K(xⁿ) ≤ K(x) + c log n.

In particular, K((01)ⁿ) = O(log n).

Proof.

Define a TM T as follows: On input ⟨n, ⟨M, w⟩⟩:

1. Simulate M on w; let s be the result.
2. Output s repeated n times.

Let ⟨M, w⟩ be the minimal description of x. Then ⟨T, ⟨n, ⟨M, w⟩⟩⟩ is a description of xⁿ. Hence,

K(xⁿ) ≤ |⟨T, ⟨n, ⟨M, w⟩⟩⟩| ≤ 2|⟨T⟩| + 2⌈log n⌉ + K(x) + O(1) ≤ K(x) + O(log n).
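The key point of the corollary, illustrated informally with Python expressions as descriptions: an expression for xⁿ needs only the numeral for n beyond a fixed wrapper, and that numeral has Θ(log n) characters. The helper below is hypothetical:

```python
def power_description(desc: str, n: int) -> str:
    # An expression for x^n (x repeated n times), built from an
    # expression for x. Beyond the fixed wrapper "() * ", the only
    # growth is the decimal numeral for n: Theta(log n) characters.
    return f"({desc}) * {n}"

desc_x = '"01"'
for n in (1, 10, 10**6):
    d = power_description(desc_x, n)
    assert eval(d) == "01" * n
    # fixed wrapper + digits of n: O(log n) extra characters
    assert len(d) == len(desc_x) + len(str(n)) + len("() * ")
```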

Does the model matter?

Question: TMs are programming languages. If we used another programming language, could we get significantly shorter descriptions?

Answer: No!

Does the model matter?

Definition

An interpreter is a semi-computable function p : {0, 1}∗ → {0, 1}∗ (which takes programs as inputs and prints their outputs).

The minimal description of x under p, denoted dp(x), is the shortest string s for which p(s) = x (ties broken lexicographically). (For example, dPython(x) is the shortest binary encoding of a Python program that outputs x.)

Finally, define Kp(x) = |dp(x)| as the descriptive complexity of x under p.
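To make Kp(x) concrete, here is a deliberately tiny, made-up interpreter p, together with a brute-force search for dp(x) in length-then-lexicographic order. Everything in this sketch is hypothetical and purely for illustration:

```python
from itertools import product

def p(s: str) -> str:
    """A toy 'interpreter' on binary programs (illustrative only):
    '0' + t outputs t verbatim; '1' + t outputs tt."""
    if not s:
        return ""
    return s[1:] if s[0] == "0" else s[1:] * 2

def minimal_description(x: str, max_len: int = 12) -> str:
    # d_p(x): search programs by length, then lexicographically.
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if p(s) == x:
                return s
    raise ValueError("no program of length <= max_len outputs x")

# K_p("0101") = 3: the doubling opcode beats spelling the string out,
# whose verbatim program '00101' has length 5.
assert minimal_description("0101") == "101"          # p("101") = "01" * 2
assert len(minimal_description("0101")) < len("0" + "0101")
```

Brute-force search works here only because p is total and the programs are short; for a real interpreter, p(s) may not halt, which is why the definition only asks that p be semi-computable.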

Does the model matter?

Theorem

For every interpreter p, there is a constant c such that for all x ∈ {0, 1}∗,

K(x) ≤ Kp(x) + c.

(In other words, using any other programming language would only

change K(x) by some constant.)

Proof.

Define the TM Mp, which on any input w, outputs p(w).

Then ⟨Mp,dp(x)⟩ is a description of x. Hence,

K(x) ≤ |⟨Mp, dp(x)⟩| = 2|⟨Mp⟩| + Kp(x) + 2 = Kp(x) + c,

where c = 2|⟨Mp⟩| + 2.