7. Sorting Algorithms
Sorting is a central component in the design of data structures and algorithms. We have already seen the power of sorted data when we analyzed binary search; when an array is sorted, we can locate a particular entry (or determine it is not present in the array) in \(O(\log N)\) time, versus the \(O(N)\) time of a linear search. As we progress in the course, we will see that sorting has connections to other data structures such as binary search trees and heaps. Sorting does not come for free. We need to develop algorithms that can sort data, and the choices that we make in their design will impact different factors of their performance. Today, we’ll introduce three sorting algorithms. We’ll use loop invariants to understand how they work, and we’ll analyze their complexities.
Insertion Sort
The first algorithm that we’ll consider is insertion sort. You may be familiar with this sorting approach already, as it is what many people will use to perform small sorts in real life, such as alphabetizing a small stack of papers. In insertion sort, we build up a sorted “subarray”, adding one additional entry to this subarray in each iteration. We can represent this with array diagrams. The “Pre” and “Post” diagrams are rather uninteresting; at the start, we know nothing about the order of the entries. At the end, all the entries should be sorted.
Now, we'll introduce a loop variable i to track the progress of the sort. Entering the i'th iteration, insertion sort guarantees that the first i entries of the array have been sorted, and the remaining entries are in their original positions.
We can use this invariant diagram to sketch out the code for insertionSort(). Since this loop accesses every entry of a in order from left to right, it is naturally amenable to a for-loop (though it would be equally valid to express with a while-loop). At the end of each iteration, we increment i. To maintain the loop invariant, we need to "insert" a[i] into its correct position within a[..i) so that a[..i+1) becomes sorted. We will defer this insertion to a helper method, insert(), that we will define next.
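As a sketch (not necessarily the official listing), the main loop might look like this in Java, assuming an insert(a, i) helper with the specification described above:

```java
// Sketch of the main loop, assuming an insert(a, i) helper with the
// specification described above (the official listing may differ in details).
static void insertionSort(int[] a) {
    for (int i = 0; i < a.length; i++) {
        // Invariant: a[..i) is sorted; a[i..] still holds its original entries.
        insert(a, i); // place a[i] so that a[..i+1) is sorted
    }
}
```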
We've just demonstrated a useful programming technique: delegating a non-trivial "subtask" to a helper method that we can implement later. By breaking up the logic into separate, simpler methods, we can focus on each one individually, using the specs to help guide our development. While this insertionSort() code is still relatively simple, delegating to helper methods will become more beneficial when we develop more intricate subroutines later in this lecture and throughout the course.
Let's see how this main loop of insertionSort() works in action. Step through the following animation to see how the sort progresses. Here, the shading indicates the entries that remain unchanged. The unshaded segment is sorted and grows over the course of the algorithm. We use the specification of insert() to understand its behavior.
insert()
Now, we must complete the definition of the insert() helper method. It is easiest to think of this method as working from right to left. The entry a[i] starts in index i but needs to move to the left if it is smaller than entries of a[..i). We continue scanning left until we see an entry that is \(\leq\) a[i], which indicates that we have found a[i]'s sorted position. We can swap a[i] with its larger left neighbors as we are performing this scan, which will result in all the elements being in their sorted positions once the scan concludes. Use this description to develop the "Pre", "Post", and "Inv" array diagrams for the loop in insert(). We'll use j to denote its loop variable.
Let's use these diagrams to define insert(). From the "Pre" diagram, we should initialize our loop variable j = i. Using the "Post" and "Inv" diagrams, we should decrement j in each iteration and guard the loop by the condition j > 0 && a[j-1] > a[j] (i.e., the element being inserted, currently at index j, still has a larger element to its left, so it hasn't reached its sorted position). Within the loop, we should swap a[j] and a[j-1] so we can grow the "\(>\) a[i]" range and then decrement j to make progress.
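One possible rendering of insert(), together with the swap() helper it relies on; treat this as a sketch rather than the official listing:

```java
/** Insert a[i] into the sorted range a[..i) so that a[..i+1) becomes sorted. */
static void insert(int[] a, int i) {
    int j = i; // a[j] always holds the value being inserted
    while (j > 0 && a[j - 1] > a[j]) {
        swap(a, j - 1, j); // shift the larger left neighbor one position to the right
        j--;
    }
}

/** Exchange a[x] and a[y]. */
static void swap(int[] a, int x, int y) {
    int tmp = a[x];
    a[x] = a[y];
    a[y] = tmp;
}
```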
Complexity Analysis
Let's compute the space and time complexities of insertionSort() in terms of \(N\) = a.length. For the space complexity, we note that at most 3 stack frames are active at any point of execution (one for insertionSort(), one for insert(), and one for swap()). Each of the methods uses \(O(1)\) space for local variables, so insertionSort() has an overall \(O(1)\) space complexity.
For the time complexity, let's consider the methods from inside to outside. As we noted previously, swap() performs \(O(1)\) work. The runtime of insert() is bounded by its loop, which runs at most \(i = O(N)\) iterations, each performing \(O(1)\) work, giving a worst-case \(O(N)\) time complexity. The insertionSort() method consists of \(O(N)\) calls to insert(), giving it a worst-case time complexity of \(O(N) \cdot O(N) = O(N^2)\).
Benefits / Drawbacks
Generally, \(O(N^2)\) is poor performance for a sorting algorithm. The next algorithm that we will consider, merge sort, will have worst-case time complexity \(O(N \log N)\). However, one nice feature of insertionSort() is that it is naturally adaptive, meaning it will perform better when the original order of the entries is close to sorted. In this case, the number of iterations of the insert() loop can be significantly smaller than \(O(N)\), leading to a better runtime. We explore this idea more in Exercise 7.4.
In practice, many programming languages default to insertionSort() for small or nearly sorted inputs in their provided sorting routines because of its simple implementation and adaptivity. Of the "standard" \(O(N^2)\) sorting algorithms, which also include selection sort and bubble sort (both introduced in the exercises), it almost always exhibits the best "wall-clock" performance.
Another benefit of insertionSort() is that it works as an online algorithm. It does not require that all the array elements are present upfront; rather, it can receive the elements as it is executing and proceed to insert() them where they belong. Online algorithms have performance benefits in real-time systems since data can be processed as it arrives. A final benefit of insertionSort() is stability.
We say that a sorting algorithm is stable if it preserves the relative order of equivalent elements.
When we are sorting primitive types, as we are in this lecture, stability doesn't mean much. However, stability is an important property when sorting and managing more complicated data, such as entries in a database. We will return to this idea when we talk about the Comparable/Comparator interfaces in upcoming lectures, which give us a way to sort reference types.
Merge Sort
If you have ever worked with someone else to sort a large stack of papers, you should be somewhat familiar with our next sorting approach. Having two people look at the same papers is inefficient, so you likely split the pile in half, separately sorted these halves on your own, and then “merged” your sorted piles once you were both finished.
merge()
Let's start our development of merge sort by implementing this merging step in a helper method merge(). Before we look at the code, which is a bit intricate, let's think about the high-level ideas of how such a merging procedure works. Suppose you are given two sorted arrays a and b and a third "empty" array c that is large enough to fit all the entries of a and b. How would you go about filling in the entries of c so that it contains all the entries of a and b in sorted order?
We can fill in the entries of c one at a time, starting from the left.
- The first entry should be the minimum entry among everything in a and b. Since a and b are sorted, this will either be a[0] or b[0], and we determine which one with a comparison. Suppose b[0] is smaller; then we copy it to c[0] and move on to the next entry.
- The second entry will be the minimum entry among a and b except for b[0]. Again, using the fact that a and b are sorted, this is either a[0] or b[1] (the next-smallest entry of b after b[0]). The smaller of these two entries should be copied to c[1].
We proceed with this logic. The fact that a and b are sorted means we only ever need to consider one of their entries at a time. With a single comparison, we can determine the next entry of c. We'll need a couple of local variables to keep track of things as we run this procedure. We'll need a variable (i) tracking our position in the a array, a variable (j) tracking our position in the b array, and a variable (k) tracking our position in the c array. With these variables, the general merge logic becomes "copy the smaller of a[i] and b[j] into c[k], then increment k and either i or j (the index of the smaller element), and repeat". Once either a or b is empty, we copy the remaining entries of the other to the end of c.
Now that we have the intuitive idea for the merge() method, we are ready to look at its signature and specifications.
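A plausible rendering of that signature and specification (the official starter code may word the spec differently):

```java
/**
 * Merge the adjacent sorted ranges a[begin..mid) and a[mid..end) so that
 * a[begin..end) contains the same entries in sorted order.
 * Requires: 0 <= begin <= mid <= end <= a.length, and both input ranges are sorted.
 */
static void merge(int[] a, int begin, int mid, int end) {
    // Implementation developed below.
}
```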
Rather than taking in two arrays and producing a third array, our merge() method takes in two contiguous ranges of the same array a (delineated by indices begin, mid, and end) and places its output in their combined range. We will see why this is useful when we get to the main mergeSort() method. Unfortunately, we will not perform the entire merge() operation in-place (i.e., within array a itself by swapping around its entries). Doing this naively requires large shifts similar to insert() and will not result in the \(O(N \log N)\) time complexity we desire. More advanced approaches for in-place merge sorting of arrays fall well beyond our scope.
Instead, we'll allocate a separate work array to use as scratch space during the method. We'll copy a[begin..mid) to the work array at the start of the method. Then, we can treat work and a[mid..end) as the two input arrays "a" and "b" in our above reasoning and treat a[begin..end) as the output array "c".
Here, it was critical that we copied the left range to work rather than the right range, since this "clears" space at the beginning of a[begin..end), where we start writing output. Even in the extreme case that every element of a[begin..mid) is less than every element of a[mid..end), we will have space to write these elements without overwriting a[mid..end). In other words, our method will maintain an invariant that k < j. This is what allows us to use the same space for one input array and the output array.
Let's draw the array diagrams to plan out the loop in merge(). The "Pre" diagram will depict the state after the copy to the work array. We'll use shading to indicate the ranges of the array whose values are not relevant.
At the end of the loop, the entire range a[begin..end) should be sorted. We omit the work array from this diagram, since it is not relevant to the post-condition.
For the invariant, we will use the loop variables i, j, and k as described above to track our positions in the two input ranges and the output range. We've included two versions of the diagram below. The first diagram has some of the range labels omitted. As a good checkpoint, try filling in the conditions yourself before comparing with the second, hidden diagram.
Using these array diagrams, we can implement the merge() method as follows:
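A sketch of the method body that follows this plan, using i to index work, j to index the right range, and k to index the output (the index conventions may differ slightly from the diagrams; the <= comparison prefers the left range on ties, which keeps the merge stable):

```java
static void merge(int[] a, int begin, int mid, int end) {
    // Scratch copy of the left range (we replace this allocation later).
    int[] work = java.util.Arrays.copyOfRange(a, begin, mid);

    int i = 0;     // next unread index in work
    int j = mid;   // next unread index in a[mid..end)
    int k = begin; // next index of a[begin..end) to write

    while (k < j) {
        if (j == end || work[i] <= a[j]) {
            a[k] = work[i]; // left entry is next (or the right range is exhausted)
            i++;
        } else {
            a[k] = a[j];    // right entry is strictly smaller
            j++;
        }
        k++;
    }
}
```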
Let’s take some time to unpack some of the expressions that appear in this loop.
- We've chosen to use the condition k < j as the loop guard. Once k == j, all the entries from the work array will have been written back to a[begin..end), leaving a fully sorted array.
- Within the condition of the if-statement, if j == end, then we have exhausted all the elements in the right input. The element that we write to a[k] must be work[i].
- When j != end and k < j, there are elements left in both input ranges. Thus, we should compare work[i] and a[j] and write the smaller one to a[k].
We have finished the definition of merge() for now; we'll make one more modification later to reduce its space complexity. The following animation traces through an execution on the example given above.
The mergeSort() Algorithm
Let's use the merge() method that we just developed to build a recursive sorting algorithm, mergeSort(). The algorithm has the following structure:
- Divide the input array (range) into two half-ranges at its midpoint.
- Sort both of these half-ranges.
- Use merge() to combine these sorted half-ranges into the full sorted output.

We'll carry out the sorting in step 2 recursively, by making two calls to mergeSort(). As our base cases, when we reach a range that has length 0 or 1, its entries will be (trivially) sorted, so we can immediately return. Just as with our merge() method, it will be more convenient to pass array views into our recursive sorting method. As we saw last time, we can accomplish this through the introduction of auxiliary index parameters to the recursive method signature (separate from the cleaner client-facing method). The full code is given below.
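A sketch of how this might look in Java, following the structure above; mergeSortRecursive() is the recursive helper the notes refer to, while the exact parameter conventions are our assumption:

```java
/** Client-facing method: sort all of a in ascending order. */
static void mergeSort(int[] a) {
    mergeSortRecursive(a, 0, a.length);
}

/** Sort the range a[begin..end). */
static void mergeSortRecursive(int[] a, int begin, int end) {
    if (end - begin <= 1) {
        return; // a range of length 0 or 1 is trivially sorted
    }
    int mid = begin + (end - begin) / 2; // split at the midpoint
    mergeSortRecursive(a, begin, mid);   // sort the left half-range
    mergeSortRecursive(a, mid, end);     // sort the right half-range
    merge(a, begin, mid, end);           // combine the sorted half-ranges
}
```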
Merge sort is categorized as a divide-and-conquer algorithm. It works by splitting the problem (sorting) into smaller versions of the same problem, recursively solving those subproblems, and then combining the results to obtain a solution to the larger problem. The following animation helps to illustrate this structure and will be useful for our complexity analysis of mergeSort().
Time Complexity
To analyze the time complexity of mergeSort(), let's start with the merge() method. The array copy will require \(O(\texttt{mid - begin})\) operations. The loop runs for at most end - begin iterations (since k is incremented in each iteration and k \(\leq\) j \(\leq\) end) and performs \(O(1)\) work per iteration. Thus, we can bound the runtime of merge() by \(O(\texttt{end - begin})\), linear in the length of its subarray. Outside of this merge() call, each invocation of mergeSortRecursive() performs only \(O(1)\) non-recursive work, so the runtime of merge() dominates.
Next, we must understand the call structure of the recursion, which we can do by analyzing the call stack diagram (which is closely related to the diagram in the previous animation). We’ll label each call with the size of its subarray, which is the parameter of interest for the time complexity.
The depth of the recursion is \(O(\log N)\) (the number of times we can halve the initial array length before reaching a base case). In each of the \(O(\log N)\) levels, the sum of the lengths of all the subarrays at that level is equal to \(N\); each entry belongs to exactly one subarray at each level of splitting. Thus, the total amount of work done at each level is \(O(N)\). Summing over all the levels gives an overall runtime of \(O(N \log N)\).
There is a bit of a subtlety in this runtime analysis. We could not use the naive strategy of multiplying a (uniform) bound on the non-recursive work done in any call \(O(N)\) by the total number of call frames (which turns out to be \(2N - 1 = O(N)\)), as this would give a correct but too-loose runtime bound of \(O(N^2)\). Instead, we gave a non-uniform bound on the non-recursive work (linear in the subarray length) and summed these bounds up in a clever way (first across each depth level of the recursion, and then over the levels) to get a tighter \(O(N \log N)\) bound. This will be one of the trickiest runtime calculations that we will see in the course.
Space Complexity
Next, let's analyze the space complexity of mergeSort(). From the call stack diagram shown above, we see that the depth of the recursion is \(O(\log N)\), which will contribute \(O(\log N)\) to the space complexity. However, there is another factor we must consider. Within the merge() method, we allocate (heap) memory for our work arrays. The largest work array will have length up to half of the total array length, so \(O(N)\). If we assume that the space for these work arrays is deallocated when their call frames are removed from the runtime stack, we can bound their memory usage by the sum of the maximum subarray length at each depth, which is \(1 + 2 + 4 + \dots + \frac{N}{2} = O(N)\). Without this assumption on when memory is deallocated, the space complexity bound would grow to \(O(N \log N)\) by a similar calculation as our runtime analysis.
However, we can make a small modification to our code to ensure at most \(O(N)\) memory is used. Note that merge() is non-recursive; at any point in the execution of mergeSort(), we can be executing merge() in at most one call frame. Thus, we can allocate one work array at the start of mergeSort() that is shared by all the call frames. We'll need to "plumb" a reference to this common array through all the recursive calls so that it is accessible from within the merge() method. This work array must be long enough for the outermost call frame, which will have the longest subarray to copy (length \(\lfloor \frac{N}{2} \rfloor = O(N)\)).
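One way this modification might look (a sketch under the same assumptions as before): the shared work array is allocated once and "plumbed" through the recursion, and merge() fills it with System.arraycopy() instead of allocating its own copy.

```java
static void mergeSort(int[] a) {
    int[] work = new int[a.length / 2]; // long enough for the largest left half-range
    mergeSortRecursive(a, 0, a.length, work);
}

static void mergeSortRecursive(int[] a, int begin, int end, int[] work) {
    if (end - begin <= 1) {
        return;
    }
    int mid = begin + (end - begin) / 2;
    mergeSortRecursive(a, begin, mid, work);
    mergeSortRecursive(a, mid, end, work);
    merge(a, begin, mid, end, work);
}

static void merge(int[] a, int begin, int mid, int end, int[] work) {
    // Fill the shared scratch array with the left half-range (no new allocation).
    System.arraycopy(a, begin, work, 0, mid - begin);
    int i = 0, j = mid, k = begin;
    while (k < j) {
        if (j == end || work[i] <= a[j]) {
            a[k] = work[i++];
        } else {
            a[k] = a[j++];
        }
        k++;
    }
}
```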
Note that, in merge(), we've replaced the Arrays.copyOfRange() call, which constructs a new array, with a System.arraycopy() call, which copies entries from one existing array to another. After making this modification, our mergeSort() algorithm has a guaranteed \(O(N)\) worst-case space complexity.
Benefits / Drawbacks
Merge sort is a widely used sorting algorithm because of its \(O(N \log N)\) runtime, which can be shown to be optimal for comparison-based sorting algorithms (i.e., sorting algorithms that work by comparing and swapping around the entries and do not rely on assumptions about their possible values). It is another example of a stable sorting algorithm. Together, these facts lead to merge sort being the default stable sorting algorithm in many language libraries (including Java’s).
Another benefit of merge sort is that its divide-and-conquer recursive structure makes it naturally amenable to parallel and/or distributed computing (i.e., execution on multiple different machines). This becomes important when dealing with large data sets, which may not fit in the memory of a single machine or for which even the \(O(N \log N)\) runtime guarantee is impractical. Since the two recursive calls made from each invocation of mergeSortRecursive() concern disjoint subarrays of data, they can be processed independently, and their results can be combined later.
One potential drawback of merge sort is that it is not very adaptive; it will have the same asymptotic performance no matter how "mixed up" the array is (including if the array is already sorted). If you suspect that the data you are working with may be nearly sorted, as is the case in many applications, an alternate algorithm (even insertionSort(), which has a worse worst-case time complexity) may be preferable.
Quicksort
Finally, we’ll consider the quicksort algorithm. It is a nice companion to merge sort, since it uses some of the same ideas. It is also a recursive algorithm in which the recursive case consists of two recursive calls and a linear pass over its subarray. However, quicksort does these in the opposite order, first performing its linear partition step and then recursively sorting both segments of the partition.
To motivate quicksort, suppose that we are given an (unsorted) array:
Now, let’s identify a particular entry of this array that we’ll call the pivot. Suppose that we select the 3 at index 0. How much work do we need to do to move the pivot to its correct, sorted position? Take some time to think about this question, as it forms the basis for the quicksort algorithm.
We can determine the pivot's sorted location in \(O(N)\) time (where \(N\) again represents the length of the array). We'll iterate over the other entries and count how many are smaller than the pivot. If there are \(i\) smaller entries, then the pivot's sorted position is at index \(i\). Rather than just counting the entries, we can rearrange them so that all the smaller entries sit to the left of the pivot and the pivot is in its correct position (meaning everything to the right of the pivot will be at least as large as the pivot). We'll soon see that we can accomplish this rearrangement (or partitioning) in-place with a single, linear-time scan over the array (loop invariant incoming!). After the partitioning, we are left with the following situation:
The pivot is in its correct position, everything to its left will remain to its left in the final sorted array, and everything to its right will remain to its right. Thus, we have transformed our one large sorting problem into two separate smaller sorting problems, which we can solve recursively. Just as in merge sort, we can ground this recursion with a simple base case: a subarray with length 0 or 1 is trivially sorted, so we can immediately return. This is the quicksort algorithm.
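A sketch of this recursive structure; we assume a partition(a, begin, end, pivotIndex) helper that rearranges a[begin..end) as described and returns the pivot's final index (the official signature may differ):

```java
/** Client-facing method: sort all of a in ascending order. */
static void quicksort(int[] a) {
    quicksortRecursive(a, 0, a.length);
}

/** Sort the range a[begin..end). */
static void quicksortRecursive(int[] a, int begin, int end) {
    if (end - begin <= 1) {
        return; // trivially sorted
    }
    // Naive "first element" pivot selection, as in the running example.
    int p = partition(a, begin, end, begin); // p is the pivot's final index
    quicksortRecursive(a, begin, p);   // sort the "< pivot" segment
    quicksortRecursive(a, p + 1, end); // sort the ">= pivot" segment
}
```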
The following animation executes this implementation of quicksort() on our example array, relying on the specifications for partition() to illustrate its behavior.
partition()
To complete our implementation of quicksort(), we must define the partition() method. At the start of the method, we'll swap the pivot into the begin index. We have no information about the other entries.
At the end of the loop, we’d like the pivot to be moved to its sorted position in the range, with all smaller elements appearing to its left and all larger elements appearing to its right.
We'll accomplish this by growing two segments from the left and right of the array view (similar to paritySplit() from the loop invariants lecture), with small elements growing from the left and large elements growing from the right. We'll need a second loop variable j to keep track of the boundary of the right segment.
Let's set up the loop. From the "Pre" diagram, we initialize i = begin and j = end - 1. From the "Post" diagram, we terminate the loop once the "?" range disappears, which happens when i == j. Thus, we guard the loop on i < j.
It remains to fill in the loop body. Within the loop, we inspect the next entry a[i+1]. We make the following observations.
- If a[i+1] < a[i], then a[i+1] belongs in the left segment of the array diagram. We can move it there by swapping it with a[i] (the pivot) and then incrementing i to restore the loop invariant.
- If a[i+1] >= a[i], then a[i+1] belongs in the right segment of the array diagram. We can move it there by swapping it with a[j] and then decrementing j to restore the loop invariant.
These observations result in the following completed partition() definition.
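A sketch of the completed method consistent with these observations; the pivotIndex parameter and the returned final index are our assumed interface:

```java
/** Partition a[begin..end) around the pivot a[pivotIndex]; return the pivot's final index. */
static int partition(int[] a, int begin, int end, int pivotIndex) {
    swap(a, begin, pivotIndex); // move the chosen pivot to the front of the range
    int i = begin;              // a[begin..i) < pivot, and a[i] is the pivot
    int j = end - 1;            // a(j..end) >= pivot; a(i..j] is not yet examined
    while (i < j) {
        if (a[i + 1] < a[i]) {
            swap(a, i, i + 1); // a[i+1] joins the left segment; the pivot shifts right
            i++;
        } else {
            swap(a, i + 1, j); // a[i+1] joins the right segment
            j--;
        }
    }
    return i; // the pivot's sorted position
}
```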
The following animation walks through an invocation of partition(). We use shading to indicate the "?" region.
Complexity Analysis
Let's analyze the complexity of quicksort(). The runtime of partition() is dominated by the loop, which runs for at most end - begin iterations and performs \(O(1)\) work per iteration. In total, each call to quicksortRecursive() performs an amount of non-recursive work proportional to the length of its subarray.
Understanding the recursive call structure of quicksort() is a bit more subtle, as it depends on which pivot is selected in each call. A good pivot will land close to the middle of its range, splitting the problem into two roughly equally sized subproblems.
In this case, the recursive structure is similar to that of mergeSort(), which results in an \(O(N \log N)\) runtime. In the worst case, however, the pivot does not equally divide each subarray. Instead, it is either the smallest or largest element, causing one segment of the partition to be empty and the other segment to contain all the other elements. One way to realize this behavior is to use our naive "first element" pivot selection on an array of unique elements sorted in descending order. In this case, the recursion will have depth \(O(N)\), which results in a worst-case \(O(N^2)\) runtime.
Since none of the methods used in quicksort() allocate more than a constant amount of memory, its space complexity is bounded by the depth of the recursion, which we just saw is \(O(N)\) in the worst case. It is possible to make some clever adjustments to reduce the space complexity to worst-case \(O(\log N)\), but we relegate this discussion to Exercise 7.7.
Expected Runtime
As we just saw, the runtime of quicksort() depends on the instance and can vary widely from \(O(N \log N)\) to \(O(N^2)\). In light of this variation, practitioners are often interested in a different notion of performance, the expected time complexity.
Given a probability distribution over inputs of a fixed size \(N\) (e.g., a uniform distribution over all possible orderings of some set of entries for quicksort()), the expected time complexity with respect to this distribution is the expected value of the random variable representing the number of operations performed by the method on inputs from this distribution, expressed as a function of \(N\).
This definition is a bit wordy, but you can think about it as measuring the "typical" performance of the method (as opposed to the specially tailored inputs that give the best-case and worst-case complexities). One can calculate (it's beyond our scope, but you may do this in CS 4820) that quicksort() with a reasonable choice of pivot has an \(O(N \log N)\) expected time complexity. In other words, its performance is typically much closer to that on best-case inputs than on worst-case inputs.
Other Performance Considerations
The above expected performance result relies on having a "reasonable" choice of pivot. While we'll leave this idea a bit fuzzy, we offer a few thoughts on how to improve the pivot selection.
- Our "first element" pivot rule performs poorly on real-world data, which is often close to sorted. While we showed above that data sorted in descending order is a worst case for quicksort(), it turns out that already-sorted data is another worst case (even though no rearranging is necessary).
- One way to improve the pivot selection is to look at multiple elements before selecting a pivot. For example, the "median of 3" heuristic chooses the middle of 3 inspected entries, typically the first, last, and midpoint entries (to be robust against sorted data).
- The ideal pivot, which would guarantee \(O(N \log N)\) performance, would be the median of the subarray. However, computing this exactly is challenging in practice. There are \(O(N)\) algorithms such as "median of medians" that can approximate the median well. However, the additional computation time for these approaches often makes them worse than the following, simpler approach.
- Choosing a random pivot tends to work very well in practice. To augment this approach, one can check that the partition() is "relatively balanced" (e.g., at worst a 1/3 - 2/3 split) and re-run partition() with a new random pivot if not.
We explore another optimization for quicksort(), which considers a smarter way to handle values equal to the pivot, in Exercise 7.8. Despite having a bad worst-case runtime complexity, quicksort() tends to perform very well in practice, often even better than mergeSort(). The "long-range" swapping used in partition() makes quicksort() unstable, and there is no easy way to achieve stability without a deleterious change in performance. Together, these facts lead to quicksort() being the default unstable sorting algorithm in many language libraries (including Java's).
Main Takeaways:
- There are many different sorting algorithms, each with their own benefits and drawbacks that make certain algorithms preferable to others in different settings.
- A sorting algorithm is stable if it preserves the order of equivalent elements. It is adaptive if its performance improves when the input array is almost sorted.
- Insertion sort builds up a sorted array one entry at a time. It is stable and highly adaptive with an \(O(N^2)\) worst-case time complexity and an \(O(1)\) space complexity.
- Merge sort recursively divides an array into smaller pieces, sorts these pieces, and then merges together the sorted results. It is stable but not adaptive, and it has an \(O(N \log N)\) worst-case time complexity and an \(O(N)\) space complexity.
- Quicksort partitions an array about a pivot and then recursively sorts both segments of the partition. It is unstable and not adaptive, and it has an \(O(N^2)\) worst-case time complexity and an \(O(N)\) space complexity.
- Quicksort tends to perform well in practice and has an \(O(N \log N)\) expected runtime complexity. Using a good pivot-selection strategy improves its performance.
Exercises
a is the array to be sorted in ascending order and j is not changed during the loop): (1) a[0..k) and a(k..j] are sorted, and (2) a[k] < a(k..j]?
A version of Merge Sort is written below. One of the statements has been replaced by // TODO.
Fill in the // TODO to complete the MS() definition. Pay careful attention to whether array bounds are inclusive or exclusive.
Consider the array [9,4,13,11,16,18,3,22].
Sorting algorithm \( X \) is run on the array. At some time during the execution of \( X \), the array looks like [3,4,9,13,11,16,22,18]. What is \( X \)?
Insertion sort guarantees, entering the i'th iteration, that the first i elements of the array are the same as the original ones, but sorted. Selection sort guarantees that the first i elements of the array are the i smallest elements from the entire array, and that these elements are sorted. While insertion sort guarantees that the remainder of the array (a[i..]) is unchanged, selection sort does not have this property; the smallest elements must somehow be brought into the first i positions. Consider the following example:
Entering the i'th iteration, what guarantees can you make on the first i elements or on the remaining ones?
select() procedure and one for the full selectionSort() procedure that contains the outer loop that calls the select() method.
select() and selectionSort() methods.
Is your selectionSort() implementation stable? Explain your answer.
Is your selectionSort() implementation adaptive? Explain your answer.
Trace an execution of bubbleSort() on the following array:
How does bubbleSort() make progress and eventually result in a sorted array?
bubbleSort() loop.
bubbleSort() loop.
bubbleSort() definition.
The bubbleSort() definition is not stable. Give an example input that demonstrates this instability. What change can you make to the bubbleSort() definition to make it stable?
Is your bubbleSort() implementation adaptive? Explain your answer.
Consider an array a. An inversion is defined as a pair of indices \( (i, j) \) with \( 0 \leq i < j < \texttt{a.length} \) and \( \texttt{a}[i] > \texttt{a}[j] \).
Consider the array {1, 5, 2, 3, 4}. How many inversions are there in this array? For an array of size \( N \) with one element out of place, what is the maximum possible number of inversions?
In the i'th iteration of insertion sort, the insert() procedure requires finding the correct index of the new element nums[i] among nums[..i). Since the outer loop invariant guarantees that nums[..i) is sorted, a student proposes to use binary search to find the index to insert at. They reason that this improves the runtime to \( O(N\log N) \) since in each of the \( N \) iterations, binary search is used on the sorted subarray, which has a complexity of \( O(\log N) \). Is the student correct in this analysis? If so, implement this upgraded insertion sort. Otherwise, explain why not.
Given int[] nums = {7, 3, 4, 4, 2, 3, 5}, draw nums after each iteration of the while-loop when calling merge(nums, 1, 4, 7).
For the array nums, draw nums after each iteration of the while-loop when calling partition(nums, 0, 5, 0).
A student proposes a modification to quicksort(). They suggest first recursively calling quicksort on the partition with fewer elements, instead of always recursing on the "< pivot" section first.
Implement quicksortSmallerFirst(). Assume you have access to the partition() method defined in the lecture notes.
What is the worst-case space complexity of quicksortSmallerFirst()?
Recall the partition() procedure in quick sort. After partitioning, we get three sections: less than the pivot, the pivot itself, and greater than or equal to the pivot. We modify this partition() to a partitionThreeWay(). This partitions the input array into three sections: less than, equal to, and greater than the pivot. The "equal to" section of the partition is already in its sorted place, so we only need to recurse on the "less than" and "greater than" sections of the array. A student argues that this can help since we wouldn't need to re-sort the equal elements as we do in the "\(\geq\) pivot" section in the partition() procedure.
Implement partitionThreeWay() according to the specification above. Use a loop invariant to guide your implementation. This method should still run in \(O(N)\) time and \(O(1)\) space, as does partition().
Modify quicksort() to use this new partitionThreeWay() method. Does this help improve the runtime? If so, explain how. If not, under what assumptions may this help with the runtime?
Each of our sorting methods currently returns void. Refactor the implementations of insertionSort(), mergeSort(), and quicksort() to return an int representing the number of comparisons needed to sort the array.
Run your implementations of the three methods on randomly generated arrays of lengths \( N = \{10, 10^2, 10^3, 10^4, 10^5\}\) with elements in the range \([0..10^4]\). Take note of the number of comparisons needed for each sorting algorithm. Plot these on a graph using some tool, such as Desmos. For your convenience, you can use the following method to generate an array of size n.
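A plausible stand-in for such a helper; the name randomArray() and the use of java.util.Random are our own choices, and the course's provided version may differ:

```java
/** Return a random array of length n with entries drawn uniformly from [0..10^4]. */
static int[] randomArray(int n) {
    java.util.Random rng = new java.util.Random();
    int[] a = new int[n];
    for (int i = 0; i < n; i++) {
        a[i] = rng.nextInt(10_001); // nextInt's bound is exclusive, so this covers 0..10000
    }
    return a;
}
```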
Next, run the three sorting methods on arrays nums of length \( N = 100 \) with the following initial configurations: (1) an already sorted array, (2) the array of integers where nums[i] = 100 - i, and (3) an array where each element is the same. Which algorithm performs the best and worst for each, in terms of the number of comparisons?