# COMP90038 Algorithms and Complexity

## Time/Space Tradeoffs: String Search Revisited

Olya Ohrimenko
(Slides from Harald Søndergaard)

Lecture 16, Semester 2, 2021


## Weekly update

- Olya’s consultation session shortly after this lecture
- Next week: non-teaching week


## Last Lecture: Revision

Balanced trees (AVL, 2-3)

For AVL trees, the balance factor at a node is the height of its left subtree minus the height of its right subtree, where an empty subtree has height −1. For example, a node whose left child is a leaf and whose right subtree is empty has balance factor 0 − (−1) = 1.


## Spending Space to Save Time

Often we can decrease the time required to solve a problem by using additional memory in a clever way.

For example, in Lecture 6 we considered the simple recursive way of finding the nth Fibonacci number and discovered that the algorithm uses exponential time.

However, suppose the same algorithm uses a table to tabulate the function Fib as we go: as soon as an intermediate result Fib(i) has been found, it is not simply returned to the caller; the value is first placed in slot i of a table (an array). Each call to Fib first looks in this table to see if the required value is there, and only if it is not does the usual recursive process kick in.


## Fibonacci Numbers with Tabulation

We assume that, from the outset, all entries of the table F are 0.

    function Fib(n)
        if n = 0 or n = 1 then
            return 1
        result ← F[n]
        if result = 0 then
            result ← Fib(n − 1) + Fib(n − 2)
            F[n] ← result
        return result

(I show this code just so that you can see the principle; in Lecture 6 we already discovered a different linear-time algorithm, so here we don’t really need tabulation.)
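The same idea as a runnable sketch in Python. Here a dictionary stands in for the table F, with "key absent" playing the role of the 0 sentinel; the names are illustrative, not from the lecture:

```python
F = {}  # the table: F[n] caches Fib(n) once it has been computed

def fib(n):
    """Fibonacci with tabulation, following the pseudocode above
    (using the lecture's convention Fib(0) = Fib(1) = 1)."""
    if n == 0 or n == 1:
        return 1
    if n not in F:                    # recurse only the first time n is requested
        F[n] = fib(n - 1) + fib(n - 2)
    return F[n]

print(fib(30))  # 1346269, with O(n) additions instead of exponentially many
```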


## Sorting by Counting

Suppose we need to sort large arrays, but we know that they will hold keys taken from a small, fixed set (so lots of duplicate keys).

For example, suppose all keys are single digits:

    6 3 3 8 1 0 8 7 9 2 5 3 5 3 1 8 7 6 5 1 2 1 5 3

Then we can, in a single linear scan, count the occurrences of each key in array A and store the result in a small table:

| key | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-----|---|---|---|---|---|---|---|---|---|---|
| Occ | 1 | 4 | 2 | 5 | 0 | 4 | 2 | 2 | 3 | 1 |

Now use a second linear scan to make the counts cumulative:

| key | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-----|---|---|---|----|----|----|----|----|----|----|
| Occ | 1 | 5 | 7 | 12 | 12 | 16 | 18 | 20 | 23 | 24 |
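Both scans are easy to check with a quick Python sketch (`accumulate` is from the standard library; the variable names are illustrative):

```python
from itertools import accumulate

digits = [int(c) for c in "633810879253531876512153"]

occ = [0] * 10
for d in digits:              # first linear scan: count each key
    occ[d] += 1

print(occ)                    # [1, 4, 2, 5, 0, 4, 2, 2, 3, 1]
print(list(accumulate(occ)))  # second scan: [1, 5, 7, 12, 12, 16, 18, 20, 23, 24]
```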


## Sorting by Counting

We can now create a sorted array S[1..n] of the items by simply slotting each item into its pre-determined position in S (a third linear scan).

    6 3 3 8 1 0 8 7 9 2 5 3 5 3 1 8 7 6 5 1 2 1 5 3

| key | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-----|---|---|---|----|----|----|----|----|----|----|
| Occ | 1 | 5 | 7 | 12 | 12 | 16 | 18 | 20 | 23 | 24 |

Place the last record (with key 3) in S[12] and decrement Occ[3] (so that the next ‘3’ will go into slot 11), and so on:

    for i ← n downto 1 do
        S[Occ[A[i]]] ← A[i]
        Occ[A[i]] ← Occ[A[i]] − 1


## Sorting by Counting (Example)

| key | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|-----|---|---|---|----|----|----|----|----|----|----|
| Occ | 1 | 5 | 7 | 12 | 12 | 16 | 18 | 20 | 23 | 24 |

    for i ← n downto 1 do
        S[Occ[A[i]]] ← A[i]
        Occ[A[i]] ← Occ[A[i]] − 1

Tracing the loop: the last record (key 3) goes into S[12], leaving Occ[3] = 11; the next-to-last (key 5) goes into S[16], leaving Occ[5] = 15; and so on until every record is in its slot of S.
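Putting the three scans together as a runnable Python sketch. Python lists are 0-based, so the loop decrements each count before placing, where the 1-based pseudocode places first and then decrements:

```python
def counting_sort(a, num_keys=10):
    """Distribution counting: sort small integer keys in linear time."""
    occ = [0] * num_keys
    for key in a:                    # scan 1: count occurrences
        occ[key] += 1
    for k in range(1, num_keys):     # scan 2: make the counts cumulative
        occ[k] += occ[k - 1]
    s = [None] * len(a)
    for key in reversed(a):          # scan 3: place records, right to left
        occ[key] -= 1                # 0-based: decrement, then place
        s[occ[key]] = key
    return s

digits = [int(c) for c in "633810879253531876512153"]
print(counting_sort(digits))
# [0, 1, 1, 1, 1, 2, 2, 3, 3, 3, 3, 3, 5, 5, 5, 5, 6, 6, 7, 7, 8, 8, 8, 9]
```

Scanning the input right to left keeps the sort stable, which matters when the keys carry whole records rather than bare digits.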


## Sorting by Counting

Note that this gives us a linear-time sorting algorithm (at the cost of some extra space).

However, it only works in situations where we have a small range of keys, known in advance.

The method never performs a key-to-key comparison.

By contrast, the time complexity of key-comparison based sorting has been proven to be in Ω(n log n).


## String Matching Revisited

Applications of string matching: text editors, natural language processing, virus scanning, web search engines.

In Lecture 5 we saw a brute-force approach to string search.

“Strings” are usually built from a small, pre-determined alphabet.

Most of the better algorithms rely on some pre-processing of strings before the actual matching process starts.

The pre-processing involves the construction of a small table (of predictable size).

Levitin refers to this as “input enhancement”.


## Horspool’s String Search Algorithm

Horspool’s algorithm compares from right to left in the pattern. It is very good for random text strings.

    STRINGSEARCHEXAMP
    EXAM

We can do better than just observing a mismatch here.

Because the pattern has no occurrence of I, we might as well slide it 4 positions along.

This decision is based only on knowing the pattern.


## Horspool’s String Search Algorithm

    STRINGSEARCHEXAMP
    EXAM
        EXAM

Here we can slide the pattern 3 positions, because the last occurrence of E in the pattern is its first position.

Continuing in the same way, the full sequence of alignments is:

    STRINGSEARCHEXAMP
    EXAM
        EXAM
           EXAM
               EXAM
                EXAM



## Horspool’s String Search Algorithm

What happens when we have longer partial matches?

         ↓
    SEARCHING
     BIRCH

Shift table for the pattern BIRCH:

| Char  | A | B | C | … | H | I | … | R | S | … | Z |
|-------|---|---|---|---|---|---|---|---|---|---|---|
| Shift | 5 | 4 | 1 | … | 5 | 3 | … | 2 | 5 | … | 5 |

The shift is determined by the character of the text aligned with the last character of the pattern (marked ↓ above). Note that this is the same as the character in the text that we first matched against. Hence the skip is always determined by that character, whether it matched or not.

Here that character is H, so the pattern slides Shift[H] = 5 positions:

    SEARCHING
          BIRCH


## Horspool’s String Search Algorithm

Building (calculating) the shift table is easy. We assume indices start from 0.

Let alphasize be the size of the alphabet.

    function FindShifts(P[·], m)          ⊲ Pattern P has length m
        for i ← 0 to alphasize − 1 do
            Shift[i] ← m
        for j ← 0 to m − 2 do
            Shift[P[j]] ← m − (j + 1)
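The same construction as a Python sketch, using a dictionary rather than an array indexed by an alphabet of size alphasize; characters missing from the dictionary implicitly keep the default shift m:

```python
def find_shifts(pattern):
    """Horspool shift table: map each pattern character except the last
    to the distance from its last occurrence to the end of the pattern."""
    m = len(pattern)
    shift = {}
    for j in range(m - 1):          # j = 0 .. m-2; later entries overwrite earlier ones
        shift[pattern[j]] = m - (j + 1)
    return shift

print(find_shifts("BIRCH"))  # {'B': 4, 'I': 3, 'R': 2, 'C': 1}; all other characters shift 5
```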


## Horspool’s String Search Algorithm

    function Horspool(P[·], m, T[·], n)
        FindShifts(P, m)
        i ← m − 1
        while i < n do
            k ← 0
            while k < m and P[m − 1 − k] = T[i − k] do
                k ← k + 1
            if k = m then
                return i − m + 1      ⊲ match found, starting at this index
            i ← i + Shift[T[i]]
        return −1                     ⊲ no match
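And the complete search as a runnable Python sketch of the pseudocode (0-based indexing; `shift.get(c, m)` supplies the default shift for characters outside the pattern):

```python
def horspool(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    # shift table as in FindShifts, with m as the default for absent characters
    shift = {pattern[j]: m - (j + 1) for j in range(m - 1)}
    i = m - 1                        # text position aligned with the pattern's last character
    while i < n:
        k = 0                        # characters matched so far, right to left
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1         # complete match: return its start
        i += shift.get(text[i], m)   # slide the pattern along the text
    return -1

print(horspool("EXAM", "STRINGSEARCHEXAMP"))  # 12, as in the EXAM example
print(horspool("BIRCH", "SEARCHING"))         # -1, no occurrence
```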