# 代写代考 FIT2004 Assessed Preparation – cscodehelp代写

Assessed Preparation
Week 6 Studio Sheet (Solutions)
Useful advice: The following solutions pertain to the theoretical problems given in the tutorial classes. You are strongly advised to attempt the problems thoroughly before looking at these solutions. Simply reading the solu- tions without thinking about the problems will rob you of the practice required to be able to solve complicated problems on your own. You will perform poorly on the exam if you simply attempt to memorise solutions to the tutorial problems. Thinking about a problem, even if you do not solve it will greatly increase your understanding of the underlying concepts. Solutions are typically not provided for Python implementation questions. In some cases, psuedocode may be provided where it illustrates a particular useful concept.
Problem1. Ahashtablecanbeusedtostoredifferenttypesofelementssuchasintegers,strings,tuples,oreven arrays. Assume we want to use a hash table to store arrays of integers (i.e., each element is an array of integers). Consider the following potential hash functions for hashing the arrays of positive integers. Rank them in terms of quality and give a brief explanation of the problems with each of them. Assume that they are all taken mod m , where m is the table size.
• Returnthefirstnumberinthearray(e.g.,ifarray=[10,5,7,2],hashindexwillbe10%m)
• Returnarandomnumberinthearray(e.g.,ifarray=[10,5,7,2],hashindexwillbearandomlychosen
element from the array mod m)
• Returnthesumofthenumbersinthearray(e.g.,ifarray=[10,5,7,2],hashindexwillbe24%m)
Give a better hash function than these three and explain why it is an improvement.
From best to worst:
1. Returnthesumofthenumbersinthearray.Thisisadecenthashfunction,asitaccountsforallof the contents of the array. It is not amazing since it will produce identical hashes for permutations of the same elements (or arrays having the same sum), e.g., array1=[10,5,7,2], array2=[2,5,7,10] and array3=[9,6,7,2] will all collide.
2. Return the first number in the array This is not very good since it only accounts for one element of the array, so any two arrays that begin with the same element will collide.
3. Returnarandomnumberinthearray.Thisisnotevenavalidhashfunctionsinceitmightreturn a different value for the same array when used multiple times. Hash functions must be consistent, that is they must always produce the same value for the same key.
A better hash function would be
h(A)=A+Ax +Ax2 +…+A[n]xn,
for some value of x > 1. This is better since if we use this hash function, all of the array is taken into account, and permuting the order of the elements is likely to change the value of the hash. We can make it even better by selecting x >any value we could see in our input (for example, if we were hashing just a-z, giving values of 1-26, then we could choose x=27) and taking the sum mod p for a suitable prime p , which yields the standard polynomial hash function.
Problem2. Insert5intothefollowingAVLtreeandshowtherebalancingprocedurestepbystep. 1
We insert 5 using the ordinary BST insertion procedure and we obtain 50
This tree is imbalanced, as the node containing 20 has a height 2 left subtree, and a height 0 right subtree, resulting in a balance factor of 2. To rectify this, we rotate the node 10:
5 20 52 60
Studio Problems
Problem3. Insert6intothefollowingAVLtreeandshowtherebalancingprocedurestepbystep.
We insert 6 using the ordinary BST insertion procedure and we obtain the following tree. 3
This tree is imbalanced at node 8 since it has a left subtree of height 3 and a right subtree of height 1, leading to a balance factor of 2. Note that the root node 3 is also imbalanced, but we always perform rotations on the lower levels first since this may fix the imbalances at higher levels.
Since node 8 has a taller left subtree and node 5 has a taller right subtree, this is a left-right imbalance so we must perform a double rotation on node 7 to correct it. We perform the first rotation to obtain:
1 5 9 −→ 1 7 9 475
646 We then perform the second rotation and are left with:
1 7 9 −→ 5
46 This is a balanced tree, so we are done.
Problem4. Placeholdertoretaincorrectquestionnumbering
Problem 5. Recall the algorithm for deleting an element from a binary search tree. If that element has no
children, we do nothing. If it has one child, we can simply remove it and move its child upwards. Otherwise, if the node has two children, we swap it with its successor and then delete it. This algorithm works because of the following fact: The successor of a node with two children has no left child. Prove this fact.
(a) First prove that the successor of a node x with two children must be contained in the right subtree of x (b) Use this fact to prove that the successor of x can not have a left child
This looks suspiciously similar to the Fibonacci sequence, so lets compare them:
h 012345678 F(h) 1 1 2 3 5 8 13 21 34 n(h) 1 2 4 7 12 20 33 54 88
This table strongly suggests the pattern
n (h ) = F (h + 2) − 1
Lets prove this by induction. We have n(0) = F (2)−1 = 1 and n(1) = F (3)−1 = 2 as required. Now suppose n (h ) = F (h + 2) − 1 for all h ≤ k for some value of k . We need to show that n (k + 1) = F (k + 3) − 1. From the recurrence, we have
n(k +1)=1+n(k)+n(k −1). Invoking the inductive hypothesis, we can write
n (k + 1) = 1 + F (k + 2) − 1 + F (k + 1) − 1, =F(k +2)+F(k +1)−1
By the definition of Fibonacci numbers, F (k + 2) + F (k + 1) = F (k + 3), hence we have n (k + 1) = F (k + 2) + F (k + 1) − 1,
= F (k + 3) − 1,
as required. Hence by strong induction on k , we have n (h ) = F (h + 2) − 1 for all h .
Next, we prove that the Fibonacci numbers have the desired bound. In the base case, we have F(0)=1≥1.5−1, F(1)=1≥1.50.
Now,supposethatF(n−1)≥1.5n−2 andF(n)≥1.5n−1. WeneedtoprovethatF(n+1)≥1.5n. Fromthe definition of Fibonacci numbers, we have
F (n + 1) = F (n ) + F (n − 1), using using the inductive hypothesis, we can write the bound
F (n + 1) ≥ 1.5n −1 + 1.5n −2 , =1.5×1.5n−2 +1.5n−2,
=(1.5+1)1.5n−2, ≥2.25×1.5n−2,
= (1.5)21.5n−2, =1.5n.
Therefore by induction on n, we have F (n) ≥ 1.5n−1.
Finally, combining the above, we have n(h) = F (h + 2) − 1 ≥ 1.5h+1 − 1. Rearranging, we find 1.5h+1 ≤
n(h)+1, and hence
h +1≤log1.5(n(h)+1) =⇒ h =O(log(n)),
Problem 13. (Advanced) Consider a hashtable implementing linear probing, with size m = 17 using the hash as required.
h(k)=(7k+11) modm.
Give a sequence of keys to insert into the table that will cause its worst-case behaviour.
We want a sequence of keys that will all hash to the same slot, which will cause linear probing to probe the entire cluster every time. Let’s design a set of keys that all hash to slot 0. We want
7k+11≡0 mod17. Adding 6 to both sides, this is equivalent to
7k≡6 mod17.
Now we want to divide out the 7 to get an expression for k . To do so, we need to multiply by some number x such that 7x ≡ 1 mod 17 (in other words, we need the modular multiplicative inverse of 7 mod 17). We can simply find this by trial and error. Looking at multiples of 7, we find
1×7=7≡7 mod17, 2×7=14≡14 mod17, 3×7=21≡4 mod17, 4×7=28≡11 mod17, 5×7=35≡1 mod17.
Therefore we find that 5 is the inverse we are looking for. We therefore can write 5×7k ≡5×6 mod17,
from which we get
Therefore, a worst-case sequence of keys would be
k =13+17i, i ≥0,
i.e. k = 13, 30, 47, 64, 81, ….
Problem 14. (Advanced) Suppose that we insert n keys into a hashtable with m slots using a totally random hash function. What is the expected number of pairs of colliding elements? The pairs (k , k ′ ) and (k ′ , k ) are considered the same and should not be counted twice.
k≡13 mod17.
First, we define the following.
􏰅1 ifh(ki)=h(kj), Ii , j = 0 otherwise.
We call this an indicator function on the condition h (ki ) = h (k j ). For each item ki , the number of elements it collides with is given by
i−1 #collisionswithki =􏰁Ii,j.
The total number of collisions is therefore #collisions=􏰁􏰁Ii,j
So, by linearity of expectation, the expected number of collisions is
n i−1 n i−1 
E 􏰁􏰁Ii,j =􏰁􏰁E􏰍Ii,j􏰎. i=1 j=1 i=1 j=1
We have by the definition of expectation E􏰍Ii,j􏰎=1×Pr􏰍h(ki)=h(kj)􏰎+0×Pr􏰍h(ki)̸=h(kj)􏰎=Pr􏰍h(ki)=h(kj)􏰎.
Since the hash is totally random,
E 􏰍 I i , j 􏰎 = P r 􏰍 h ( k i ) = h ( k j ) 􏰎 = m1 . Therefore the expected number of colliding pairs is
n i−1 1 􏰈n􏰉1 n(n−1) 􏰁􏰁= = .
n i−1 i=1 j=1
i=1j=1m 2m 2m
Problem 15. (Advanced) Recall the algorithm for insertion into a hashtable using Cuckoo hashing. Assume that m ≥ 2n and MaxLoop = O (log(n )). Given that the probability that an insertion triggers a rebuild is O (n −2 ), and insertions succeed in expected constant time when a rebuild is not triggered:
(a) Provethattheprobabilitythatasequenceofninsertionssucceedsis1−O􏰆1􏰇 n
(b) ProvethattheexpectedtimecomplexityofinsertionisO(1) Solution
(a) The probability that an insertion triggers a rebuild is O(n−2). The probability that any of the n insertions triggers a rebuild is
􏰈n􏰉 Pr 􏰑{insertion i triggers a rebuild} .
The events are not mutually exclusive, but the probability of their union is bounded above by the
sum of their individual probabilities. Formally, this is called the union bound, or Boole’s inequality.
􏰑 insertion i triggers a rebuild ≤ P 􏰆insertion i triggers a rebuild􏰇 , i=1 i=1
≤nO(n−2), = O 􏰊 1 􏰋.
So the probability that a rebuild is not triggered is 1 − O 􏰆 1 􏰇 as required. n
(b) Let X denote the time taken by an insertion. A successful insertion takes O (1), and an unsuccessful insertion takes O (MaxLoop) to try, plus the cost of n sub