23. Shortest Paths
In our final lecture about graphs, we’ll turn our attention to the question of locating the shortest path between two vertices. Finding optimal paths is critical for transportation and navigation software (e.g., Google Maps). It also has uses in other optimization problems, such as edge detection for image segmentation, where we can recast the problem as finding the optimal dividing line between neighboring pixels. We’ll begin our discussion by revisiting the BFS algorithm. We can augment its traversal to track additional information that enables us to locate shortest paths in unweighted graphs. By expanding on these ideas, we can adapt this procedure to also work for weighted graphs, giving us the celebrated shortest path algorithm of Edsger Dijkstra.
Unweighted Graphs
In the previous lecture, we saw that a BFS discovers (and visits) a graph’s vertices in level order:
- The source vertex is designated as belonging to Level 0.
- All the neighbors of the source vertex comprise Level 1.
- The neighbors of the vertices in Level 1 (that are neither in Level 0 nor Level 1) comprise Level 2.
- etc.
We can use this reasoning to describe an iterative process that builds up the levels of the graph one at a time.
From this process, we can derive the following two facts about the relationship between the vertex levels and the edges in our graph.
1. For each edge \((v,w)\) in a graph, the level of \(w\) is at most one greater than the level of \(v\).
Suppose that \(v\) belongs to level \(\ell\) in the graph, and consider the point of our level-labeling process where we are considering the edges crossing out of level \(\ell\). If the edge \((v,w)\) is one of these crossing edges, then our process will place \(w\) in level \(\ell + 1\). The only way that \((v,w)\) is not one of these crossing edges is if \(w\) was already assigned to a level earlier in the process. In either case, the level of \(w\) is at most one greater than the level of \(v\).
A common misconception is to assume that the level of \(w\) will always be exactly one more than the level of \(v\). Our example from above shows that this is not true. For example, the edge \((d,t)\) is between two vertices in level 3, and the edge \((d,b)\) crosses "back" from Level 3 to Level 1.
2. For each vertex \(w\) in Level \(\ell \geq 1\), there is some edge \((v,w)\) in the graph reaching \(w\) from a vertex \(v\) in Level \(\ell - 1\).
This follows immediately from the definition of our level-labeling process; the presence of a crossing edge from a vertex in Level \(\ell-1\) to \(w\) was what allowed us to include \(w\) in Level \(\ell\).
Again, we should interpret this fact (in particular, its quantifiers, if you're familiar with this term from CS 2800) carefully. It guarantees the existence of at least one such incoming edge to \(w\) with this property. It does not guarantee that all incoming edges to \(w\) will have this property.
These two facts allow us to reach the following result connecting a vertex’s level and its shortest path from the source vertex \(s\): a vertex \(w\) in Level \(\ell\) has a shortest \(s \rightsquigarrow w\) path of length exactly \(\ell\).
We can use Fact 1 to show that a shorter path cannot exist. Each edge in the graph can move us up at most one level. To get from the source vertex (at Level 0) to vertex \(w\) (at Level \(\ell\)), we’ll need to traverse at least \(\ell\) edges (i.e., cross over \(\ell\) level boundaries).
Next, we can use Fact 2 to show that a path of length \(\ell\) must exist. Working backward from the end of the path, we know that there must be an edge from some vertex \(v\) in Level \(\ell-1\) to vertex \(w\). Similarly, there must be an edge from some vertex \(u\) in Level \(\ell-2\) to vertex \(v\). Repeating this reasoning \(\ell\) times (which can be formalized with a proof by induction), we will eventually conclude that there must be an edge from a vertex in Level 0 (which must be \(s\), the only vertex in Level 0), to a vertex in Level 1 that has a path to \(w\). Combining all these edges produces our desired \(s \rightsquigarrow w\) path with length \(\ell\).
Augmented BFS
The level-labeling procedure that we just described can be achieved with a minor modification to our BFS code from the previous lecture. We know that our BFS is guaranteed to discover (and visit) the vertices in level order, beginning with the source vertex at Level 0. In each outer-loop iteration, we iterate over the outgoing edges of some vertex \(v\) (where \(v\) was the vertex at the front of the frontier queue) and determine which connect to a yet-undiscovered vertex \(w\). Each of these edges \((v,w)\) is a “crossing” edge, adopting our terminology from the previous section.
If we kept track of the levels of all discovered vertices (including \(v\)), we would know how to label the level of \(w\): one more than the level of \(v\). We can perform this bookkeeping by replacing our discovered set with a map that associates each discovered vertex with its level. Then, we can return this map of levels at the end of the traversal.
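The lecture's code listing is not reproduced in this text. As a stand-in, here is a minimal sketch of the augmented BFS, substituting a plain adjacency-list map for the lecture's Graph interface (all names below are ours, not the lecture's):

```java
import java.util.*;

public class BfsLevels {
    /** Return a map associating each vertex reachable from source with its
     *  level, i.e., its shortest-path distance in an unweighted graph. */
    static Map<String, Integer> bfsLevels(Map<String, List<String>> adj, String source) {
        Map<String, Integer> levels = new HashMap<>(); // replaces the discovered set
        Queue<String> frontier = new ArrayDeque<>();
        levels.put(source, 0);                         // the source sits at Level 0
        frontier.add(source);
        while (!frontier.isEmpty()) {
            String v = frontier.remove();              // visit v
            for (String w : adj.getOrDefault(v, List.of())) {
                if (!levels.containsKey(w)) {          // (v,w) is a crossing edge
                    levels.put(w, levels.get(v) + 1);  // w is one level deeper than v
                    frontier.add(w);
                }
            }
        }
        return levels;
    }
}
```
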
In the lecture code, we construct the 6-vertex graph from above and run the bfsLevels() method to confirm that it computes the vertex levels correctly.
Reconstructing the Shortest Path
While our bfsLevels() method allows us to determine the length of the shortest path from the source vertex \(s\) to every vertex \(v\) in our graph, it does not tell us which edges comprise each of these paths. How can we further augment our BFS code to provide this information?
For this, we can take inspiration from our earlier observations. If \(w\) is a vertex in Level \(\ell\), then we know that the last edge in its shortest path will be some edge \((v,w)\) from a vertex \(v\) in Level \(\ell-1\). The other \(\ell-1\) edges in the shortest \(s \rightsquigarrow w\) path will connect \(s\) to \(v\), so they will form a shortest \(s \rightsquigarrow v\) path. Let’s record this observation since it will be very important going forward:
Using this observation, we see that all the vertices will be able to determine their shortest paths from \(s\) as long as they keep track of the last edge of this path. If \(w\) knows that its last shortest path edge is \((v,w)\), then it can ask \(v\) about the second-to-last edge (which will be the last edge of \(v\)’s shortest path), continuing this process until it retraces the path back to \(s\).
Let’s augment our BFS to keep track of these final edges. Rather than storing the edge directly, we’ll store a reference to the tail vertex of the edge, which we’ll denote by prev (the “prev”ious vertex in the shortest path). If vertex \(w\) belongs to Level \(\ell\) of the graph, then this prev vertex can be any vertex with an edge to \(w\) that sits in Level \(\ell-1\). In particular, we can choose the vertex \(v\) whose outgoing edge “discovered” vertex \(w\) during the BFS.
We’ll modify our discovered map to store both a vertex’s level and its prev vertex, which we can package up in a record class, PathInfo.
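The record could be as simple as the following sketch (the lecture's exact field names may differ, and we assume vertices are Strings):

```java
/** Bookkeeping for one discovered vertex: its BFS level and the previous
 *  vertex on a shortest path from the source (null for the source itself). */
record PathInfo(int level, String prev) {}
```
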
We can use this PathInfo class to complete the definition of our augmented BFS procedure, bfsPaths().
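A sketch of what bfsPaths() might look like, again substituting an adjacency-list map for the lecture's Graph type (the PathInfo record is repeated here so the example is self-contained):

```java
import java.util.*;

public class BfsPaths {
    record PathInfo(int level, String prev) {}

    /** Augmented BFS: map each reachable vertex to its level and to the
     *  vertex whose outgoing edge discovered it. */
    static Map<String, PathInfo> bfsPaths(Map<String, List<String>> adj, String source) {
        Map<String, PathInfo> discovered = new HashMap<>();
        Queue<String> frontier = new ArrayDeque<>();
        discovered.put(source, new PathInfo(0, null));
        frontier.add(source);
        while (!frontier.isEmpty()) {
            String v = frontier.remove();
            int vLevel = discovered.get(v).level();
            for (String w : adj.getOrDefault(v, List.of())) {
                if (!discovered.containsKey(w)) {      // v's edge discovers w
                    discovered.put(w, new PathInfo(vLevel + 1, v));
                    frontier.add(w);
                }
            }
        }
        return discovered;
    }
}
```

With one set of edges consistent with the lecture's example graph (s→a, s→b, a→c, c→d, c→t, d→t, d→b), this reproduces the level and prev values shown below.
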
When we run bfsPaths() on our example 6-vertex graph and print the resulting map (our full client method definition is provided with the lecture release code), we obtain:
s: {level = 0, prev = null}
a: {level = 1, prev = s}
b: {level = 1, prev = s}
c: {level = 2, prev = a}
d: {level = 3, prev = c}
t: {level = 3, prev = c}
Take some time to complete the definition of the following method that uses the Map returned by bfsPaths() to reconstruct the shortest path (modeled as a list of vertices beginning with the source vertex and ending with the destination vertex) to a given vertex.
Compare your implementation with ours below.
reconstructPath() definition
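The original definition is not reproduced in this text. As a stand-in, here is one possible implementation, sketched under the same assumptions (String vertices and a PathInfo record with level and prev components):

```java
import java.util.*;

public class Paths {
    record PathInfo(int level, String prev) {}

    /** Rebuild the shortest path to dest by walking prev pointers back to
     *  the source; prepending at each step yields source-to-dest order. */
    static List<String> reconstructPath(Map<String, PathInfo> paths, String dest) {
        LinkedList<String> path = new LinkedList<>();
        for (String v = dest; v != null; v = paths.get(v).prev()) {
            path.addFirst(v); // prepend so the path reads source -> dest
        }
        return path;
    }
}
```
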
Toward Dijkstra’s Algorithm
We’ve completed our implementation of a shortest path algorithm for unweighted graphs. Before we move on to weighted graphs, let’s stop to reflect on some of the big ideas that we’ll need for this approach. First, we’ll recall some of the terminology that we used to describe the state of a vertex at different points in a graph traversal.
- At the start of the traversal, all the vertices (except for the source vertex) are undiscovered.
- A vertex becomes discovered the first time that it is identified as the head of an outgoing edge from a vertex we are visiting.
- Between the time that a vertex is discovered and when it is visited, it belongs to the traversal’s frontier.
- Once we have finished visiting a vertex, we say that it is settled.
We can thus visualize the state of a vertex over the course of the traversal as follows:
The shading of these time segments corresponds to how we visualize the vertices in our animations. Now, let’s imagine that we take a snapshot of our graph at some point during the traversal and consider what states its vertices are in. Since the traversal begins at the source vertex and radiates outward, the settled vertices will be closest to the source. The undiscovered vertices will be furthest from the source. Finally, and most crucially for Dijkstra’s algorithm, the frontier vertices will form a thin layer (1-vertex wide) between the settled vertices and the undiscovered vertices.
Using these pictures, we make the following observations about the properties (or invariants) maintained by BFS.
1. At any point in the algorithm, our discovered map records the shortest (known) path distance to every node that we’ve discovered.
In the case of unweighted graphs, this distance can never decrease since it is set the first time that a vertex is the head of a crossing edge, and the tails of these crossing edges are considered in increasing level order.
2. The next vertex to be removed from the frontier queue is always the unvisited vertex with the lowest level.
In other words, each outer-loop iteration visits (and settles) the closest remaining vertex to the source.
3. As soon as a vertex is visited/settled, we are guaranteed that we have located the shortest path to it from the source vertex, and this path contains only vertices that were previously settled.
This follows from the fact that BFS settles vertices in level order, and a shortest path connects vertices in increasing order of level.
Weighted Graphs
The setting for Dijkstra’s algorithm is a directed graph with a designated source vertex \(s\) in which each edge \((u,v)\) is labeled with a non-negative value \(w(u,v)\) that we’ll call its weight. The goal of the algorithm is to identify the shortest path (i.e., the path whose edges have the lowest possible weight sum) from \(s\) to each possible destination vertex \(t\).
Here, the assumption that all the edge weights are non-negative will be critical, as it provides us with the guarantee that adding additional edges to the path can never decrease its total weight. When negative edges are allowed, a different dynamic programming procedure called the Bellman-Ford algorithm can be used to locate shortest paths. You'll learn about this in CS 4820.
Let’s try to mirror the ideas from BFS in an edge-weighted graph.
- We’ll again use a map to keep track of information about all the vertices that we’ve discovered. At any point in the algorithm’s execution, the map will record the length (i.e., edge weight sum) of the shortest known path from \(s\) to each discovered vertex \(v\), which we’ll denote by \(d(s,v)\).
- We’ll again group the discovered vertices into two classifications: the frontier vertices (that are queued up to be visited) and the settled vertices (that have already been visited). In each iteration of the algorithm’s main loop, we will visit one vertex in the frontier, allowing us to mark it as settled and make progress toward termination (which takes place once the frontier is empty).
- We would like it to be the case that as soon as a vertex \(v\) becomes settled, we are guaranteed to have located the shortest \(s \rightsquigarrow v\) path.
Understanding how to achieve the third property is the key insight for the development of Dijkstra’s algorithm.
Dijkstra’s Invariant
To formalize the key invariant of Dijkstra’s algorithm, let’s return to this picture visualizing the frontier of our graph traversal.
In this picture, we have already settled all the vertices in the dark red shaded area. Trusting property 3 from the previous section, we have identified the shortest path to each of these vertices from \(s\); that is, we know the true shortest-path distance from \(s\) to each settled vertex \(u\), which we’ll denote \(d^*(s,u)\).
Now, any path from \(s\) to any unsettled vertex \(w\) in the graph will need to include at least one frontier vertex \(v\) (which may be the same vertex as \(w\) or some other intermediary vertex along the path). Take a minute to think about why this is true.
Why must the path include a frontier vertex?
Now, let’s consider the shortest known distances to each of the frontier vertices in the discovered map. Among all of these, we’ll let \(v^*\) be the vertex with the minimum shortest known distance \(d(s,v^*)\). Dijkstra’s invariant tells us that this distance \(d(s,v^*)\) must be the true shortest path distance to \(v^*\), namely \(d^*(s,v^*)\).
If \(v^*\) is the vertex in the frontier set with the minimum shortest known distance from \(s\), \(d(s,v^*)\), then the shortest known \(s \rightsquigarrow v^*\) path is a shortest \(s \rightsquigarrow v^*\) path in the graph, and \[ d^*(s,v^*) = d(s,v^*). \]
Dijkstra’s invariant tells us that we can safely settle the vertex \(v^*\) and satisfy property 3 from above, which will allow us to make progress in each iteration of our traversal algorithm. Before we finish developing the algorithm, let’s understand why the invariant holds.
We want to argue that our shortest known \(s \rightsquigarrow v^*\) path, with length \(d(s,v^*)\), is truly a shortest \(s \rightsquigarrow v^*\) path. To do this, let’s consider any alternate \(s \rightsquigarrow v^*\) path. From our earlier observation, this alternate path must pass through at least one frontier vertex, so we’ll let \(v'\) denote the first frontier vertex in this alternate path. We can split our alternate \(s \rightsquigarrow v^*\) path into the portion from \(s \rightsquigarrow v'\) and the portion from \(v' \rightsquigarrow v^*\). Since all edges have non-negative weights, the length of the path portion from \(v' \rightsquigarrow v^*\) must be non-negative. Thus, the length of our alternate path is at least the length of its \(s \rightsquigarrow v'\) portion.
Since \(v'\) was the first frontier vertex on our alternate path, all vertices besides \(v'\) in this \(s \rightsquigarrow v'\) path portion are settled (i.e., this path was “known” at this point in the algorithm). Thus, this \(s \rightsquigarrow v'\) path has length at least \(d(s,v')\). Finally, our choice of \(v^*\) tells us that \(d(s,v') \geq d(s,v^*)\), so our alternate path has length at least \(d(s,v^*)\). To conclude, we note that since an arbitrary \( s \rightsquigarrow v^*\) path has length at least \(d(s,v^*)\), the length of the shortest path \( s \rightsquigarrow v^*\) must be \(d^*(s,v^*) = d(s,v^*)\).
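This argument can be condensed into a single chain of inequalities, writing \(\mathrm{len}(\cdot)\) for the length of a path portion and “alt” for the alternate path:

\[
\mathrm{len}(\text{alt}) \;\ge\; \mathrm{len}(s \rightsquigarrow v') \;\ge\; d(s,v') \;\ge\; d(s,v^*),
\]

where the first inequality uses the non-negativity of the edge weights, the second uses the fact that the \(s \rightsquigarrow v'\) portion is a known path, and the third uses our choice of \(v^*\).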
Coding Dijkstra’s Algorithm
Our code for Dijkstra’s algorithm will largely follow the same structure as the bfsPaths() method. We will maintain a collection of vertices in the frontier and visit one frontier vertex in each iteration. When we visit a frontier vertex, we’ll explore its outgoing edges to potentially discover new vertices, adding them to the frontier. We’ll maintain a discovered map to track information during our traversal, which will allow us to reconstruct the shortest paths at the end. Now, let’s identify how our traversal must be modified to account for edge weights.
Tracking Distances
Rather than storing the traversal level of discovered vertices in our map, we’ll store the distance of the shortest known path to that vertex from \(s\). We’ll update our PathInfo record class accordingly:
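A sketch of the updated record (nested in a holder class here purely for self-containment; the field names are ours):

```java
public class DijkstraInfo {
    /** Per-vertex bookkeeping for Dijkstra's algorithm: the length of the
     *  shortest known path from the source, and the previous vertex on it. */
    record PathInfo(double distance, String prev) {}
}
```
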
In the notation from the previous section, our PathInfo object associated with a vertex \(v\) stores the quantity \(d(s,v)\), along with the pointer to the previous path vertex. When we discover a new path to a vertex \(v'\) during our visit of vertex \(v\), this offers a new candidate for the shortest \(s \rightsquigarrow v'\) path. Namely, we can follow the shortest known path from \(s \rightsquigarrow v\), which we now know is the shortest \(s \rightsquigarrow v\) path by Dijkstra’s invariant, and then follow the \((v,v')\) edge. The length of this path is \(d(s,v) + w(v,v')\), which we can easily compute since \(d(s,v)\) is stored in our discovered map and \(w(v,v')\) can be queried from the graph.
If this is the first path we’ve found to \(v'\), we’ll add \(v'\) to the frontier and the discovered map. Otherwise, if this path is shorter than the previous best-known path, we should update the discovered map. Finally, if this path is the same length or longer than the previous best-known path, we don’t need to make any updates.
Modeling the Frontier
For BFS, we modeled the frontier as a Queue. Our analysis of BFS guaranteed us that the elements were removed from the Queue in level order, so the dequeued element was always the closest currently enqueued element to the source. In the case of a weighted graph, it is true that vertices are initially added to the frontier in increasing distance order. However, the possibility of updating distances when a shorter candidate path is discovered means that this guarantee would not extend to the removal order from a Queue. Instead, we’d like another data structure that can guarantee removals in increasing distance order and will allow efficient insertions and updates of its elements. The DynamicPriorityQueue from a few lectures ago provides exactly these guarantees.
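The DynamicPriorityQueue interface itself is not reproduced in this text, but based on the operations this lecture names (add(), updatePriority(), remove()), it might look like the following sketch. The TreeSet-backed implementation below only illustrates the contract; the version described in lecture combines a binary heap with a map.

```java
import java.util.*;

interface DynamicPriorityQueue<E> {
    void add(E elem, double priority);            // insert a new element
    void updatePriority(E elem, double priority); // change an element's priority
    E remove();                                   // remove the min-priority element
    boolean isEmpty();
}

/** Simple reference implementation: a TreeSet keeps elements ordered by
 *  their current priority (ties broken by natural order). */
class TreeSetDpq<E extends Comparable<E>> implements DynamicPriorityQueue<E> {
    private final Map<E, Double> priorities = new HashMap<>();
    private final TreeSet<E> order = new TreeSet<>(
            Comparator.comparingDouble((E e) -> priorities.get(e))
                      .thenComparing(Comparator.<E>naturalOrder()));

    public void add(E elem, double priority) {
        priorities.put(elem, priority);
        order.add(elem);
    }

    public void updatePriority(E elem, double priority) {
        order.remove(elem);              // remove under the old priority
        priorities.put(elem, priority);
        order.add(elem);                 // re-insert under the new one
    }

    public E remove() {
        E min = order.pollFirst();
        priorities.remove(min);
        return min;
    }

    public boolean isEmpty() {
        return order.isEmpty();
    }
}
```
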
We can put these pieces together to complete the definition of Dijkstra’s algorithm. Note that WeightedEdge is a subtype of the Edge interface that introduces a method to access an edge’s weight().
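The lecture's listing is not reproduced in this text. As a stand-in, here is a self-contained sketch that uses a weighted adjacency map instead of the lecture's Graph and WeightedEdge types, and a simple TreeSet-based frontier in place of a heap-backed DynamicPriorityQueue; the numbered comments in the main loop correspond to the line references used in the complexity analysis later in this lecture. All names here are ours.

```java
import java.util.*;

public class Dijkstra {
    record PathInfo(double distance, String prev) {}

    /** Minimal stand-in for a DynamicPriorityQueue, ordered by priority. */
    static class Dpq {
        private final Map<String, Double> pri = new HashMap<>();
        private final TreeSet<String> order = new TreeSet<>(
                Comparator.comparingDouble((String v) -> pri.get(v))
                          .thenComparing(Comparator.naturalOrder()));
        void add(String v, double p) { pri.put(v, p); order.add(v); }
        void updatePriority(String v, double p) { order.remove(v); pri.put(v, p); order.add(v); }
        String remove() { String v = order.pollFirst(); pri.remove(v); return v; }
        boolean isEmpty() { return order.isEmpty(); }
    }

    /** Shortest-path distances and prev pointers from source; weights are
     *  given by adj.get(v).get(w) and must be non-negative. */
    static Map<String, PathInfo> dijkstra(Map<String, Map<String, Double>> adj, String source) {
        Map<String, PathInfo> discovered = new HashMap<>();
        Dpq frontier = new Dpq();
        discovered.put(source, new PathInfo(0.0, null));
        frontier.add(source, 0.0);
        while (!frontier.isEmpty()) {                                      // line 1
            String v = frontier.remove();                                  // line 2: settle v
            for (Map.Entry<String, Double> e                               // line 3: v's edges
                    : adj.getOrDefault(v, Map.of()).entrySet()) {
                String w = e.getKey();                                     // line 4: edge head
                double cand = discovered.get(v).distance() + e.getValue(); // line 5: d(s,v)+w(v,w)
                if (!discovered.containsKey(w)) {                          // line 6: first path to w
                    discovered.put(w, new PathInfo(cand, v));              // line 7
                    frontier.add(w, cand);                                 // line 8
                } else if (cand < discovered.get(w).distance()) {          // line 9: shorter path
                    discovered.put(w, new PathInfo(cand, v));              // line 10
                    frontier.updatePriority(w, cand);                      // line 11
                }
            }
        }
        return discovered;
    }
}
```

Note that with non-negative weights, line 9 can never fire for an already-settled vertex, so settled vertices are never re-added to the frontier.
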
Here, we omit the particular DynamicPriorityQueue implementation from the method definition. The lecture release code applies a “hacky” patch to Java’s provided PriorityQueue class to support priority updates (note that Java does not include a standard DynamicPriorityQueue implementation in its class library). We encourage you to replace this with the DynamicPriorityQueue described in Lecture 19, which utilizes a heap and a map.
Example Dijkstra’s Algorithm Execution
Step through the following animation that visualizes an execution of Dijkstra’s algorithm.
Complexity Analysis
Finally, let’s analyze the complexity of Dijkstra’s algorithm.
Space Complexity
The method constructs two data structures as local variables, the discovered map and the frontier priority queue. The discovered map includes one entry per vertex. Each entry stores a distance and a reference to the prev vertex, which each use \(O(1)\) space. Thus, the overall size of the discovered map is \(O(|V|)\).
The frontier priority queue stores a vertex reference and a double priority value in each entry, and requires \(O(1)\) space per entry. At worst, the frontier can contain all the vertices (technically, all but the source vertex if the source vertex had an outgoing edge to each other vertex). Thus, its overall size is also \(O(|V|)\).
The remaining local variables each occupy \(O(1)\) space. Assuming that our priority queue and map implement their operations iteratively, the stack space required will be \(O(1)\); even recursive implementations would never need to exceed an \(O(|V|)\) depth, so would not dominate the space complexity.
Overall, we find that Dijkstra’s algorithm has an \(O(|V|)\) space complexity.
Time Complexity
Next, let’s analyze the time complexity. To do this, we’ll focus on the main loop, which will dominate the runtime.
Let’s analyze the runtime line by line.
- The main loop runs one iteration per element that is added to the frontier priority queue. Each vertex is added at most once over the course of the algorithm, so the main loop runs for \(O(|V|)\) iterations.
- The remove() operation on line 2, assuming a heap-backed priority queue, runs in \(O(\log |V|)\) time. Here, we use the fact that the maximum size of the priority queue is \(O(|V|)\). Thus, this line contributes \(O(|V| \log |V|)\) to the overall runtime.
- In a particular iteration of the main loop, the inner loop beginning on line 3 runs one iteration per outgoing edge from \(v\). Summing over all the vertices \(v\) (i.e., over all iterations of the main loop), there will be a total of \(O(|E|)\) iterations of this inner loop.
- Getting the head() vertex of an edge on line 4 is an \(O(1)\) operation, so this line contributes \(O(|E|)\) to the runtime.
- The computations on lines 5, 6, 7, 9, and 10 are dominated by operations on the discovered map (get(), put(), and containsKey()). These all have expected \(O(1)\) runtime assuming the use of a HashMap, so overall these lines contribute an expected \(O(|E|)\) to the runtime.
- The add() and updatePriority() operations on lines 8 and 11 require \(O(\log |V|)\) time, assuming the use of a heap-based priority queue. Therefore, these lines contribute an expected \(O(|E| \log |V|)\) to the runtime.
In total, lines 8 and 11 dominate the runtime. We find that our implementation of Dijkstra’s algorithm has an expected worst-case runtime of \(O(|E| \log |V|)\).
By using an alternate data structure for the DynamicPriorityQueue called a Fibonacci heap, one can reduce the runtime of Dijkstra's algorithm to \(O(|E| + |V| \log |V|)\).
Main Takeaways:
- Breadth-first search is guaranteed to visit the vertices in increasing distance order from the source vertex.
- We can augment a traversal by using a map to keep track of extra information about the vertices as we visit them. By tracking the incoming edges that discovered the vertices, we can reconstruct paths to the vertices from the traversal source.
- Dijkstra's invariant tells us an order in which we can settle the vertices with a guarantee that we have located the shortest path to them.
- We use a DynamicPriorityQueue to manage the frontier in Dijkstra's algorithm. Vertices' best-known distances may decrease as we locate new incoming paths while settling other vertices.
Exercises
Using Dijkstra’s algorithm to find the shortest paths from vertex “A” to all reachable vertices, we obtain the following table of auxiliary vertex data.
| vertex | A | B | C | D | E | F | G | H | I |
|---|---|---|---|---|---|---|---|---|---|
| distance | 0 | 1 | 5 | 2 | 7 | 6 | 11 | 15 | 12 |
| prev | null | A | A | B | C | D | D | E | G |
State the shortest path that was found to "I".
Suppose we are running Dijkstra’s algorithm to find the shortest distance between Melbourne and various other cities. At some point during the search, the distances and back-pointers in the frontier set look like this:
| City | Distance | Backpointer |
|---|---|---|
| Perth | 3500km | Adelaide |
| Brisbane | 1800km | Canberra |
| Sydney | 900km | Canberra |
Consider a directed graph in which vertices represent bus stops, edges represent portions of bus routes, and edge weights correspond to average travel time between stops. Suppose we use Dijkstra’s algorithm to find the fastest way to get from the stop at Helen Newman to all other stops (assuming no wait time between buses). As we run the algorithm, we record details about all the stops discovered so far in a table. At some point during the execution of the algorithm, our table looks like this:
| Bus Stop | Fastest known time from Helen Newman | Previous stop | Settled? |
|---|---|---|---|
| Rockefeller | 2 min | Helen Newman | Yes |
| Uris Hall | 8 min | Rockefeller | Yes |
| Baker Flagpole | 7 min | Helen Newman | Yes |
| Dairy Bar | 10 min | Uris Hall | Yes |
| Collegetown | 11 min | Baker Flagpole | No |
| Vet School | 20 min | Dairy Bar | No |
| Commons | 15 min | Uris Hall | No |
Trace the algorithm until the conclusion of the loop iteration in which vertex \(d\) is settled, so that the invariant has been restored, recording its state in the table below. If properties change over the course of running the algorithm, you should neatly cross out their old values.
| vertex | \(a\) | \(b\) | \(c\) | \(d\) | \(e\) |
|---|---|---|---|---|---|
| distance | 0 | | | | |
| prev | null | | | | |
| discovered? | ✓ | | | | |
| settled? | | | | | |
PathInfo map with their associated weights.
PathInfos will form such a tree. What properties of a tree ensure this?
DynamicPriorityQueue implementation backed by a map and a heap.
MinHeapDynamicPriorityQueue. Note that here we desire a min heap rather than a max heap.
PathInfos?