# 程序代写代做代考 database algorithm CS44800 – Homework 1

CS44800 – Homework 1

Spring 2017

Due: Thursday, February 23, 2017, in class. Before class starts.

(There will be 10% penalty for each late day. After 4 late days, the homework will not be

accepted.)

___________________________________________________________________________

Part 1: Relational Algebra and Calculus (10 points)

Suppliers (sid : integer, sname : string, address : string)

Parts (pid : integer, pname : string, color : string)

Catalog (sid : integer, pid : integer, cost : integer)

The key fields are underlined, and the domain of each field is listed after the field name.

Therefore, sid is the key for Suppliers, pid is the key for Parts, and sid and pid are together form

the key for Catalog. The Catalog relation lists the prices charged for parts by Suppliers.

Based on the above schema, write the following queries in SQL, relational algebra, tuple

relational calculus and domain relational calculus:

1. Find the sids of suppliers who supply some red part or are at 221 Packer Ave.

2. Find pairs of sids such that the supplier with the first sid charges more for some part than

the supplier with the second sid.

3. Find the pids of parts supplied by at least two different suppliers.

4. Find the pids of the most expensive parts supplied by suppliers named Yosemite Sham.

5. Find the pids of parts supplied by every supplier at less than $200. (If any supplier either

does not supply the part or charges more than $200 for it, the part is not selected.)

___________________________________________________________________________

Part 2: Tree-Based Indexing (35 points)

1. Show the result of inserting 12, 30, 15, 29, 1, 17, 3, 13, 35, 18, 27 and 8 into an initially

empty B+ tree of order 3.

2. Show the result of deleting 15, 8 and 30 from the previous B+ tree. Show the B+

tree after each deletion.

3. Assume that you have just built a dense B+ tree index using Alternative (2) on a heap

file containing 20,000 records. The key field for this B+ tree index is a 40-byte string, and

it is a candidate key. Pointers (i.e., record ids and page ids) are exactly 10-byte values.

The size of one disk page is 1000 bytes. The index was built in a bottom-up fashion

using the bulk-loading algorithm, and the nodes at each level were filled with a fill factor

of 100%.

Please answer the following:

a. How many levels does the resulting tree have? Show your steps.

b. For each level of the tree, how many nodes are at that level? Show your steps.

4. If your database system supported both a static and a dynamic tree index (e.g, ISAM

and B+ trees), would you ever consider using the static index in preference to the

dynamic index? Justify your answer.

___________________________________________________________________________

Part 3: Hash Based Indexing (25 points)

1. Assume that you have an empty hash table where each page can store up to two

elements (i.e., page size = 2). Assume further that the initial number of buckets you have

is two. Show the result of inserting 12, 30, 15, 29, 1, 17, 3 and 13 into the hash table

using:

a. Extensible Hashing

b. Linear Hashing

2. Describe a situation where extendable hashing suffers from performance limitations.

Does linear hashing suffer from the same performance limitation you described? Justify

your answer.

State clearly any assumptions you may assume.

___________________________________________________________________________

Part 4: Disk and Files (30 points)

Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, 5

double-sided platters and average seek time of 10 msec, rotational delay time of 2 msec and

the transfer rate is 100MB per second. Suppose a block size of 1024 bytes is chosen and a file

containing 100,000 records of 100 bytes each is to be stored on a such a disk and no record is

allowed to span 2 blocks.

1. How many records fit onto a block?

2. How many blocks are required to store the entire file? If the file is arranged sequentially

on disk how many surfaces are needed?

3. How many records of 100 bytes each can be stored using this disk?

4. If pages are stored sequentially on disk, with page 1 on block 1 of track 1, what page is

stored on block 1 of track 1 on the next disk surface? How would your answer change if

the disk were capable of reading and writing from all heads in parallel?

5. What time is required to read a file containing 100,000 records of 100 bytes each

sequentially? Again, how would your answer change if the disk were capable of reading /

writing from all heads in parallel (and the data was arranged optimally)?

6. What is the time required to read a file containing 100,000 records of 100 bytes each in

a random order? To read a record, the block containing the record has to be fetched

from disk. Assume that each block request incurs the average seek time and rotational

delay.