Discussion 9 handout
Group members (names & NetIDs)
Objectives
- Implement a dictionary’s “put” and “get” operations using a hash table
- Resolve hash table collisions using chaining
- Dynamically resize a hash table to maintain expected performance
- Verify implementation using “specs”-style unit tests
Orientation
Download the dis09 project code and open it as a project in IntelliJ as you would an assignment.
The file “HashDict.java” starts by declaring an Entry class, which stores a single key and its associated value.
Note that the value of an Entry can be changed.
Next, review the fields of HashDict itself.
Its state includes an array of “buckets”, each of which is represented as a linked list.
Remember that the elements of arrays of objects default to null.
Inside each bucket are zero or more Entry objects.
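To make the null default concrete, here is a minimal sketch. It assumes java.util.LinkedList and String elements purely for illustration; the class NullBucketsDemo and its helper methods are hypothetical and are not part of the dis09 project, whose bucket type may differ.

```java
import java.util.LinkedList;

// Hypothetical demo class; not part of the dis09 project code.
class NullBucketsDemo {
    /** Allocates an array of `length` bucket slots; no lists are created. */
    @SuppressWarnings("unchecked")
    static LinkedList<String>[] newBuckets(int length) {
        // Java cannot create a generic array directly, so allocate a raw array and cast.
        return (LinkedList<String>[]) new LinkedList[length];
    }

    /** Counts how many slots of the array currently hold no list. */
    static int countNullBuckets(LinkedList<String>[] buckets) {
        int count = 0;
        for (LinkedList<String> bucket : buckets) {
            if (bucket == null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        LinkedList<String>[] buckets = newBuckets(4);
        System.out.println(countNullBuckets(buckets)); // prints 4: every slot starts as null

        buckets[2] = new LinkedList<>(); // a bucket exists only once we create it
        buckets[2].add("Raph");
        System.out.println(countNullBuckets(buckets)); // prints 3
    }
}
```

Calling a method on buckets[0] before assigning a list to it would throw a NullPointerException, which is why operations on the table need to account for null buckets.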
Object diagram
Suppose we put the following key–value pairs into an instance of HashDict<String, Integer>:
- “Donnie” (hash code: 7) → 1386
- “Leo” (hash code: 6) → 1452
- “Mikey” (hash code: 3) → 1475
Assume that buckets has a length of 4 and that modular hashing is used to derive an index from the hash code.
Draw an object diagram for the dictionary instance.
You may represent a LinkedList<T> as an object with a head field of type Node<T>, which has fields for data (a T) and next (a Node<T>).
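As a reminder of how modular hashing turns a hash code into a bucket index, here is a small sketch. The class BucketIndexDemo and the method indexFor are hypothetical names, and the use of Math.floorMod is an assumption; the project code may derive its index differently. The hash codes below are deliberately different from the ones in the exercise.

```java
// Hypothetical demo class; not part of the dis09 project code.
class BucketIndexDemo {
    /** Maps a hash code onto a valid index in a table with `tableLength` buckets. */
    static int indexFor(int hashCode, int tableLength) {
        // floorMod always returns a value in [0, tableLength), even for a negative hash code.
        return Math.floorMod(hashCode, tableLength);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(10, 4)); // prints 2, since 10 = 2*4 + 2
        System.out.println(indexFor(-1, 4)); // prints 3; the plain % operator would give -1
    }
}
```

Two keys whose hash codes leave the same remainder land in the same bucket, which is the collision that chaining resolves.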
Implementing operations
Get
Study the partial implementation of get().
What should be done for each Entry in the bucket that the requested key belongs in?
Complete the TODO accordingly.
Put
Note that the implementation of put() has a similar structure to get(), but the method has different responsibilities in each case.
Complete the first three TODOs by answering the question posed by each one.
Tip: try to avoid doing too much work in “special cases” if it is possible for that work to be handled in the general case. Bugs tend to hide in special cases.
- What should happen when the bucket the key belongs in is null?
- What should be done for each entry in the bucket that the key belongs in?
- What should be done if we did not return in the above loop?
Testing
Run the HashDictTest test suite.
You should pass all of the tests except the one checking the load factor limit.
Notice how the tests are organized into a hierarchy of requirements that the class’s operations should fulfill.
This is called “specification-style testing”, and it helps guide coverage while making failures easier to diagnose.
Resizing
In order to maintain good performance as the dictionary grows, the hash table must be dynamically resized.
Implement the resize() method according to its specifications.
Next, return to put() and make use of this method to enforce the load factor invariant documented in the class description.
When enforcing the load factor invariant, how much bigger should the new table be to ensure an amortized O(1) resizing cost?
Ensure that the entire test suite now passes.
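For reference, a table's load factor is the number of stored entries divided by the number of buckets. The sketch below illustrates the calculation only. The class LoadFactorDemo, its method names, and the limit of 0.75 are assumptions made for this example; use the limit and the exact comparison stated in the HashDict class description.

```java
// Hypothetical demo class; not part of the dis09 project code.
class LoadFactorDemo {
    /** Load factor: number of stored entries divided by number of buckets. */
    static double loadFactor(int size, int bucketCount) {
        return (double) size / bucketCount; // cast first to avoid integer division
    }

    /** Returns true if a table with these dimensions is strictly above `limit`. */
    static boolean exceedsLimit(int size, int bucketCount, double limit) {
        return loadFactor(size, bucketCount) > limit;
    }

    public static void main(String[] args) {
        double limit = 0.75; // assumed value for illustration only

        System.out.println(loadFactor(3, 4));          // prints 0.75
        System.out.println(exceedsLimit(3, 4, limit)); // prints false: exactly at the limit
        System.out.println(exceedsLimit(4, 4, limit)); // prints true: 1.0 is above the limit
    }
}
```

Note the cast to double: without it, 3 / 4 would be evaluated with integer division and yield 0.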
Reflection
Handling the case where a bucket is null complicates the implementation of every operation in the class. Alternatively, we could have constructed a new empty list for every element whenever a new bucket array is allocated. Are there any drawbacks to this alternative approach?