Discussion 9 handout

Group members (names & NetIDs)

Objectives

Implement a dictionary’s “put” and “get” operations using a hash table
Resolve hash table collisions using chaining
Dynamically resize a hash table to maintain expected performance
Verify implementation using “specs”-style unit tests

Orientation

Download the dis09 project code and open it as a project in IntelliJ as you would an assignment.

The file “HashDict.java” starts by declaring an Entry class, which stores a single key and its associated value. Note that the value of an Entry can be changed.

Next, review the fields of HashDict itself. Its state includes an array of “buckets”, each of which is represented as a linked list. Remember that the elements of an array of objects default to null, so a bucket that has never been used is null rather than an empty list. Each non-null bucket holds zero or more Entry objects.
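To make the null default concrete, here is a minimal sketch of a bucket array. It assumes the buckets are java.util.LinkedList objects; the class BucketArrayDemo, its nested Entry class, and the helpers newBuckets and countNullBuckets are hypothetical names for illustration and are not taken from HashDict.java.

```java
import java.util.LinkedList;

class BucketArrayDemo {
    /** Hypothetical stand-in for the handout's Entry: one key and its changeable value. */
    static class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Allocates a bucket array; every element starts out as null. */
    @SuppressWarnings("unchecked")
    static <K, V> LinkedList<Entry<K, V>>[] newBuckets(int length) {
        // Java does not allow creating an array of a generic type directly,
        // so create a raw array and cast it.
        return (LinkedList<Entry<K, V>>[]) new LinkedList[length];
    }

    /** Counts the buckets that are still null (no list allocated yet). */
    static <K, V> int countNullBuckets(LinkedList<Entry<K, V>>[] buckets) {
        int count = 0;
        for (LinkedList<Entry<K, V>> bucket : buckets) {
            if (bucket == null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        LinkedList<Entry<String, Integer>>[] buckets = newBuckets(4);
        System.out.println(countNullBuckets(buckets)); // prints 4

        // Allocating one list leaves the other three elements null.
        buckets[2] = new LinkedList<>();
        buckets[2].add(new Entry<>("example", 42));
        System.out.println(countNullBuckets(buckets)); // prints 3
    }
}
```

Any code that indexes into the array therefore has to be prepared to find null instead of a list.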

Object diagram

Suppose we put the following key–value pairs into an instance of HashDict<String, Integer>:

“Donnie” (hash code: 7) → 1386
“Leo” (hash code: 6) → 1452
“Mikey” (hash code: 3) → 1475

Assume that buckets has a length of 4 and that modular hashing is used to derive an index from the hash code. Draw an object diagram for the dictionary instance. You may represent a LinkedList<T> as an object with a head field of type Node<T>, which has fields for data (a T) and next (a Node<T>).
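As a reminder of what modular hashing computes, here is a small sketch that maps a hash code to a bucket index. The class BucketIndexDemo and the method indexFor are hypothetical, and the use of Math.floorMod is an assumption; the project code may derive its index differently. The sample hash codes are deliberately not the ones from the exercise.

```java
class BucketIndexDemo {
    /** Maps a hash code to an index in [0, tableLength) using modular hashing. */
    static int indexFor(int hashCode, int tableLength) {
        // Math.floorMod returns a non-negative result for a positive modulus.
        // The % operator can return a negative value, which matters because
        // hashCode() is allowed to be negative.
        return Math.floorMod(hashCode, tableLength);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(10, 4)); // prints 2
        System.out.println(indexFor(-3, 4)); // prints 1
        System.out.println(-3 % 4);          // prints -3, not a valid array index
    }
}
```

Two keys whose hash codes yield the same index land in the same bucket, which is the collision that chaining resolves.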

Implementing operations

Get

Study the partial implementation of get(). What should be done for each Entry in the bucket that the requested key belongs in?

Complete the TODO accordingly.

Put

Note that the implementation of put() has a similar structure to get(), but put() has different responsibilities in each of the cases it handles. Complete the first three TODOs by answering the question posed by each one.

Tip: try to avoid doing too much work in “special cases” if it is possible for that work to be handled in the general case. Bugs tend to hide in special cases.

What should happen when the bucket the key belongs in is null?

What should be done for each entry in the bucket that the key belongs in?

What should be done if we did not return in the above loop?

Testing

Run the HashDictTest test suite. You should pass all of the tests except the one checking the load factor limit. Notice how the tests are organized into a hierarchy of requirements that the class’s operations should fulfill. This is called “specification-style testing”, and it helps guide coverage while making failures easier to diagnose.

Resizing

In order to maintain good performance as the dictionary grows, the hash table must be dynamically resized. Implement the resize() method according to its specifications.

Next, return to put() and call resize() there to enforce the load factor invariant documented in the class description. When the table must grow, how much bigger should the new table be so that the cost of resizing, amortized over all calls to put(), is O(1) per call?
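One way to explore this question is to count how many entries get copied in total under different growth rules. The sketch below is a simplified model, not the HashDict implementation: it assumes the table is resized only when the number of entries equals its capacity (a load factor limit of 1), and the class ResizeCostDemo and the method totalCopies are hypothetical names.

```java
import java.util.function.IntUnaryOperator;

class ResizeCostDemo {
    /**
     * Simulates n insertions into a table that starts at initialCapacity and
     * grows whenever it is full. Returns the total number of entries copied
     * by all resizes. The grow function maps the old capacity to the new one.
     */
    static long totalCopies(int n, int initialCapacity, IntUnaryOperator grow) {
        int capacity = initialCapacity;
        long copies = 0;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                // The table is full: every existing entry must be rehashed.
                copies += size;
                capacity = grow.applyAsInt(capacity);
            }
        }
        return copies;
    }

    public static void main(String[] args) {
        int n = 1024;
        // Growth by a constant amount versus growth by a constant factor.
        System.out.println(totalCopies(n, 4, c -> c + 4));
        System.out.println(totalCopies(n, 4, c -> c * 2));
    }
}
```

Try several values of n and compare how the two totals grow relative to n.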

Ensure that the entire test suite now passes.

Reflection

Handling the case where a bucket is null complicates the implementation of every operation in the class. Alternatively, we could construct a new empty list for every element of the bucket array whenever a new array is allocated. Are there any drawbacks to this alternative approach?