26. Concurrency
Up to this point in the course, we have conceptualized programs as having a single execution path. Our code specifies a sequence of instructions that are carried out in order. We can understand the state of our program by taking a single snapshot of its runtime stack, which gives us a sense of what the program is currently doing (executing the method of its highest call frame), what variables and objects it has instantiated and what they store (by looking at the call frame entries and the objects allocated on the heap), and what other work there is still left to do before the program will finish (by examining the lower frames on the runtime stack).
This computation model has been convenient. It has given us ways to reason about the complexity of our code and the lifetimes of its variables. However, it is not a true representation of how our computers actually work. When you use your computer, it is never carrying out a single task. Right now, you’re reading these notes in one tab of your browser, which is executing code to render this text correctly on your screen. At the same time, you may have other tabs open. Perhaps you are listening to music that is playing in another tab. Perhaps you have a tab open that will alert you to any incoming emails. Perhaps you have IntelliJ open to work on your CS 2110 assignment (which, itself, simultaneously executes multiple code routines to manage its application view, statically analyze your code to highlight syntax errors and warnings, interact with your file system to sync and save your work, etc.).
We need to expand our understanding of computation to incorporate concurrency, the ability to simultaneously make progress on multiple computational tasks. This will be the focus of the next two lectures. Today, we’ll introduce a lot of the central terminology around concurrency, with a focus on the concurrency model of Java. We’ll see that concurrency offers some performance benefits while also introducing new complications. In the next lecture, we’ll introduce synchronization, a technique that helps us to address many of these complications and approach concurrency in a principled way.
Motivating Concurrency
Imagine you are preparing your house for a dinner party this evening. There are three big jobs that you need to complete to be ready for your guests to arrive. First, you need to make the dessert, a cake. For this, you’ll need to:
Bake Cake:
- Prepare the cake batter (30 minutes)
- Bake the cakes (1 hour)
- Allow the cakes to cool (1 hour)
- Prepare the frosting (30 minutes)
- Frost and decorate the cake (30 minutes)
Next, you also need to make the main dish, a lasagna. For this, you’ll need to:
Make Lasagna:
- Prep the ingredients for the meat sauce (30 minutes)
- Simmer the meat sauce (3 hours)
- Prepare the pasta and cheese filling (45 minutes)
- Bake the lasagna (1 hour)
Finally, you’ll need to clean your house so it is presentable for your guests. For this, you’ll need to:
Clean House:
- Vacuum the floors (20 minutes)
- Clean and organize the kitchen counters (30 minutes)
- Set the table (10 minutes)
There are many different ways you can go about completing all of these tasks. The first, and most basic, approach is to do the tasks sequentially. You can follow all of the instructions to bake the cake. Once this is finished, you can start to make the lasagna. Once the lasagna is finished, you can clean and set up for the party. In all, this approach will take 9 hours and 45 minutes, so it’s doable as long as you woke up early enough. How can we complete this work in less time?
You might observe that a lot of the time in the cooking tasks is passive. When the sauce is simmering or the lasagna or cake is baking, you aren’t actively involved and are free to work on something else. This could be getting a head start on the next step (for example, preparing the cheese filling while the sauce is simmering), but it can also be starting work on one of the other jobs (such as setting the table while the cake is in the oven). We say that you are completing these tasks (or their steps) concurrently, since you are making progress on multiple steps at the same time.
Concurrency describes the ability to carry out multiple procedures at the same time.
In particular, you are achieving concurrency by time slicing.
Time slicing describes switching back and forth between actively working on multiple different procedures, so that all of them make progress over the same window of time.
In time slicing, you are only ever actively working on one task at any instantaneous time point (you’re never simultaneously vacuuming the floor and stirring the meat sauce). However, the interleaving of your actions as you jump between the tasks allows for concurrency when viewed at the large scale: there will be hour-long time blocks where we can say that you were both cooking and cleaning. While time slicing might actually add a little bit of work for you to manage these context switches (e.g., you’ll probably need to wash your hands a few more times as you switch between cleaning and cooking tasks), overall, you’ll save time by repurposing the passive times to complete other active work.
As an alternate approach, you can ask one of your dinner party friends to come over early to help you with the tasks. One of you can take ownership of baking the cake while the other makes the lasagna. In this case, concurrency is achieved through parallelism.
Parallelism describes the ability to simultaneously carry out multiple procedures at an instantaneous point in time, achievable by dividing the work among multiple entities, or processors.
Having another person to help with the tasks should help finish the tasks faster. However, it will also add a new challenge of coordination. You may both want to access shared resources (such as a mixing bowl or the oven) at the same time, and one of you will need to wait for that resource to become available. When you are vacuuming the floor, your friend may need to temporarily stop decorating the cake to move out of your way. This coordination problem is a new phenomenon brought about by concurrency; you did not need to worry about being in your own way when you were working on the tasks alone, one at a time.
How Computers Compute
Similar to our dinner party analogy, your computer must also contend with running multiple programs simultaneously. With the rise of graphical applications and operating system support for window management and multi-tasking views, we have come to expect concurrent execution. To understand how this is done, we’ll need to peek under the hood of the computer to get a high-level view of how it carries out computations.
After a lot of simplifications and abstractions, your computer largely consists of three components: its CPU (central processing unit) cores, its memory (RAM), and its I/O and storage peripherals.
The CPU is responsible for carrying out computations. It executes billions of machine code instructions per second. Each of these instructions either moves data between its temporary holding spots, called its registers, and the memory, or it performs a basic calculation whose inputs and output are stored in registers. Compilers know how to translate the source code you write into long sequences of simple machine instructions, and the speed of the CPU allows large computations to be performed very quickly.
The RAM is the short-term storage that is used to hold the values of variables and objects that you create in your code. The CPU has random access to the memory, meaning it can access values stored anywhere in the memory in roughly the same amount of time (just like the random access guarantee for arrays); this time is relatively quick, though still significant when compared to the speed of instruction execution in the CPU.
Finally, the I/O and storage peripherals handle all of the other tasks of your computer. These include the persistent disk/drive storage on your computer, as well as devices like screens, printers, keyboards, track pads, and mice that serve as an interface between the calculations done by the computer and its user. Interacting with these peripherals to read/write data is significantly slower than accessing information stored in RAM.
Your operating system is a program that is responsible for managing these computer resources.
An operating system is a program that begins executing when you turn on your computer and manages the execution of all other programs.
When you start a program on your computer (which we’ll refer to as a process), the operating system allocates it a chunk of the RAM (some subset of all of the possible memory addresses) that it is allowed to use during its execution. The operating system provides a guarantee that the process will have exclusive access to this memory, ensuring that other running processes cannot mess with the state of this process.
A process models an application or program that is given exclusive access to a chunk of computer memory by the operating system.
The operating system also handles any requests to write or read information from the computer’s peripherals, and determines when the process is allowed to execute its code instructions on the CPU. As a process is running, it is free to use any memory in its designated chunk (the operating system will complain if it tries to access memory outside of this) and execute code to make progress toward its objective. Behind the scenes, the operating system will do all of the bookkeeping necessary to manage multiple processes. It will ensure that their memory chunks are disjoint, coordinate their access and usage of peripheral devices, and schedule their time executing instructions on the computer’s (likely multiple) CPU cores. When it recognizes that a process is waiting for input from a peripheral device, it can schedule time for another process to use the CPU. In times when no processes are waiting for peripherals, the operating system can achieve concurrency by time slicing among the processes or schedule them to execute in parallel on different CPU cores.
Threads
A single process may also wish to achieve concurrency. For example, one Java program might want to divide its work into multiple separate tasks that can be executed concurrently. The abstraction we use for these separate tasks is called a thread.
A thread models a single, sequential execution of instructions that performs work within a process.
Unlike processes, which each have exclusive access to a disjoint chunk of memory, the threads belonging to a process all have equal shared access to its memory. When a process is created, it begins with a single thread. During its execution, it can create additional threads to achieve concurrency. Once all of these threads stop (they run out of instructions to execute), the process terminates.
Let’s better understand these ideas in the context of a Java program. When you start a Java program in IntelliJ, this creates a new process. Your operating system hands over a chunk of memory to Java that can be used by your program. We know that Java programs enter their main() method when they start executing. Behind the scenes, the process creates a main thread where this execution will take place and designates some of its memory to contain the runtime stack for this main thread. The bottom call frame in this stack will be the main() call frame, and the main thread will push and pop other call frames onto the call stack as it executes your code. As soon as it returns from the main() method, the main thread stops. Almost always, this has been the only thread of the Java process, so the application terminates.
We also saw one case where the program involved multiple threads, our Swing applications. In this case, the main thread executing the main() method stopped almost immediately. The only statement in the main() method was a call to SwingUtilities.invokeLater(). This call created a new thread (the event dispatch thread) where the rest of the code was executed. In this case, the presence of the running event dispatch thread prevented the Java process from terminating. As soon as the application window was closed, this stopped the event dispatch thread (because of our requested JFrame.DISPOSE_ON_CLOSE behavior), which terminated the application.
Next, we’ll see how we can leverage additional threads in our Java programs to introduce concurrency.
Concurrency in Java: The Thread Class
Java uses the Thread class to model an execution thread. We won’t do a deep dive into all of the functionality of this class. Rather, we’ll focus our attention on a few of its central methods.
Constructing New Threads
When you execute a Java program, we enter its main() method on its main thread. To obtain the Thread object corresponding to the currently executing thread, we can use the static Thread.currentThread() method. For example, running the program
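```java
// A sketch of the omitted listing; the class name ThreadDemo is illustrative.
public class ThreadDemo {
    public static void main(String[] args) {
        // Print the Thread object for the thread executing this code.
        System.out.println(Thread.currentThread());
    }
}
```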
prints:
Thread[#1,main,5,main]
This is a String representation of the main thread, which is identified by a unique number (#1) and its name (“main”). The creation of this thread happens automatically; you can think about its constructor running before we enter the main() method. If we want to create additional threads (necessary to achieve concurrency in our code), we’ll need to construct additional Thread objects, either directly or by calling some helper method like SwingUtilities.invokeLater(). We’ll focus our attention on the second Thread constructor, which accepts a Runnable as its argument.
Take a look at the other overloaded Thread constructors as well. These provide the ability to associate a name with the new thread (which may be useful for debugging), and to manually set the size of that thread's runtime stack (which may be important in applications with many threads). The ability to assign a thread to a ThreadGroup is somewhat outdated.
You might recall that the SwingUtilities.invokeLater() method also accepted a Runnable argument, and this is a functional interface that helps us package up code that we’d like to execute elsewhere (i.e., on this new thread we’re constructing). We typically instantiate this argument using a lambda expression that calls another method. For example,
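```java
// A sketch of the omitted listing; the helper method name sayHello() is illustrative.

/** Print a greeting identifying the thread that executes this method. */
public static void sayHello() {
    System.out.println("Hello from " + Thread.currentThread());
}

public static void main(String[] args) {
    // Package up a call to sayHello() to be executed by the new thread.
    Thread other = new Thread(() -> sayHello());
}
```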
Thread.run() and Thread.start()
We’ve just seen that we can construct a new Thread object to encapsulate a piece of code (i.e., a Runnable) that we want to execute. How do we actually get this to execute? A common but flawed first attempt is to call the run() method on the Thread object that we just created. This will execute the Runnable.run() method of the argument that we passed to the constructor, so this code will be executed. However, it will be executed from the main thread (or whatever thread we are calling the run() method from). We can see this by slightly expanding our demo code. We’ll add a name argument to the Thread constructor call and then execute run() on this thread.
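Sketching this change (continuing with our illustrative sayHello() helper):

```java
public static void main(String[] args) {
    System.out.println("Entering main() in " + Thread.currentThread());
    Thread other = new Thread(() -> sayHello(), "other");
    other.run(); // executes sayHello() on the current thread, not on `other`
    System.out.println("Exiting main()");
}
```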
When we run this demo, we obtain output:
Entering main() in Thread[#1,main,5,main]
Hello from Thread[#1,main,5,main]
Exiting main()
All of the code ran from within the main thread. We never executed anything on the other thread. To actually utilize another execution thread in the background of our Java program, we need to call the start() method instead of run(). This has the effect of creating a new execution thread behind the scenes (associated with the Thread object other) and calling the run() method from that execution thread.
Let's pause here for a minute to appreciate the subtlety of using the Thread class. This class provides an abstraction for us to create and interact with different execution threads in our programs. Creating a Thread object doesn't automatically start executing code on a different thread. Instead, it provides us the ability (through its methods) to make this happen "behind the scenes". A lot of the lower-level details of how threads are created and managed fall well beyond our scope.
Suppose we make this change in our demo code:
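```java
public static void main(String[] args) {
    System.out.println("Entering main() in " + Thread.currentThread());
    Thread other = new Thread(() -> sayHello(), "other");
    other.start(); // begin executing sayHello() on the new `other` thread
    System.out.println("Exiting main()");
}
```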
Then, we obtain the output:
Entering main() in Thread[#1,main,5,main]
Exiting main()
Hello from Thread[#21,other,5,main]
The “Hello” message is printed from the other thread. Notice something else, though: we exited from the main() method before the “Hello” message was printed. The start() method (which executes on the main thread) returns immediately, allowing us to print the exit message, return from the main() method, and stop the main thread. The execution of the run() method happened to take place after returning from main(). We’ll see in a little while that we don’t have control over the order of executions across multiple threads. Recall that our Java process won’t terminate until all of its threads have stopped, meaning that our “Hello” message can still be printed from the other thread.
Thread.sleep() and Thread.join()
Sometimes, we might want to try to influence the order in which statements across multiple threads get executed. A rather simplistic way to do this is to add artificial pauses into our code’s execution using the static Thread.sleep() method, passing in the number of milliseconds this pause should last. When this method is called, it will delay the execution of the next instruction on that thread for the prescribed amount of time. As an example, we can sleep() the main thread so that it should exit after the “Hello” message from the other thread is printed. Note that a sleeping thread can be interrupted, in which case sleep() throws a checked InterruptedException, which we must disclose in a throws clause:
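```java
public static void main(String[] args) throws InterruptedException {
    System.out.println("Entering main() in " + Thread.currentThread());
    Thread other = new Thread(() -> sayHello(), "other");
    other.start();
    Thread.sleep(1000); // pause the main thread for 1 second (a guess at how long to wait)
    System.out.println("Exiting main()");
}
```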
This results in the output:
Entering main() in Thread[#1,main,5,main]
Hello from Thread[#21,other,5,main]
Exiting main()
This use of sleep() is a bit hacky, as it requires us to guess how long we will need to wait for the other thread to finish its work. Perhaps the code that the main thread needs to execute next relies on an answer computed by the other thread. In this case, we need to be sure that this other thread is truly finished. A more principled approach would be to pause execution until we are alerted that the other thread has stopped. This is accomplished with the join() method.
I find this naming a bit odd. The way I remember it is to imagine that when I called start() on my thread, I sent the other thread away to do its own work. Now, I am asking it to join() back up with me when it is finished to let me know I can keep going.
Let’s add a join() call to our demo. We’ll also have our other thread sleep() to ensure that some time passes before it stops.
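In sketch form (note that a Runnable cannot throw a checked exception, so the sleep() inside sayHello() must catch the InterruptedException itself):

```java
/** Pause briefly, then print a greeting identifying the executing thread. */
public static void sayHello() {
    try {
        Thread.sleep(1000); // ensure some time passes before this thread stops
    } catch (InterruptedException e) {
        throw new RuntimeException(e);
    }
    System.out.println("Hello from " + Thread.currentThread());
}

public static void main(String[] args) throws InterruptedException {
    System.out.println("Entering main() in " + Thread.currentThread());
    Thread other = new Thread(() -> sayHello(), "other");
    other.start();
    other.join(); // wait here until `other` has stopped
    System.out.println("Exiting main()");
}
```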
In this case, we obtain the expected output:
Entering main() in Thread[#1,main,5,main]
Hello from Thread[#21,other,5,main]
Exiting main()
In more realistic applications, we'll often need more sophisticated coordination between multiple threads. They may need to pause and wait multiple times during their execution for other threads to reach certain points. This becomes particularly important when the threads share state, as we will discuss next. Synchronization allows for this more fine-grained coordination and will be the topic of the next lecture.
Shared State and Capturing
As we mentioned earlier, threads differ from processes in that they are given shared access to a chunk of memory. Focusing on Java, this means that we can have a Thread object whose Runnable argument refers to (or captures) an object created on the main thread. Consider the following example.
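Here is a sketch of such a program; the names SharedListDemo, addOnes(), and sleepRandomly() are our own:

```java
import java.util.ArrayList;
import java.util.List;

public class SharedListDemo {
    /** Add five copies of 1 to `nums`, sleeping briefly after each addition. */
    static void addOnes(List<Integer> nums) {
        for (int i = 0; i < 5; i++) {
            nums.add(1);
            sleepRandomly();
        }
    }

    /** Sleep for a random duration of up to 50 milliseconds. */
    static void sleepRandomly() {
        try {
            Thread.sleep((long) (Math.random() * 50));
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> nums = new ArrayList<>();
        Thread other = new Thread(() -> addOnes(nums)); // captures `nums`
        other.start();
        for (int i = 0; i < 5; i++) {
            nums.add(0);
            sleepRandomly();
        }
        other.join(); // wait for `other` to stop before printing
        System.out.println(nums);
    }
}
```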
Let’s break down what this example does:
- From the main thread, we construct an ArrayList object, nums, on the heap.
- We construct a Thread object, other, that will execute the addOnes() method on the captured variable nums. This method will add five copies of 1 to the nums list.
- Separately, the main thread will add five copies of 0 to the (same) nums list.
- After each list addition, the respective thread will sleep for a random duration. We will not know the exact order in which the 10 calls to nums.add() will be made.
When we execute this code, we expect that it will print out a list containing five 1s and five 0s in some order. This is exactly what happens, and we show the results of multiple executions below:
[0, 1, 1, 0, 1, 0, 1, 1, 0, 0]
[0, 1, 0, 1, 0, 0, 1, 1, 0, 1]
[0, 1, 1, 1, 0, 1, 1, 0, 0, 0]
[0, 1, 0, 1, 0, 0, 0, 1, 1, 1]
It turns out that even without these random sleep()s, we still cannot guarantee the order of execution of these calls to add(). This introduces a phenomenon called a race condition into concurrent programs, which we will discuss next.
Race Conditions
When our code includes multiple threads, Java (and our operating system) separately decides when instructions from each thread are executed. These threads may execute simultaneously in parallel, utilizing different cores of the machine. It’s also possible that the operating system will utilize time slicing, arbitrarily pausing (or preempting) one of the threads to switch to executing the other, doing this switching many times in rapid succession to present the illusion of parallelism.
While the instructions on any particular thread will run sequentially (just like all of the programs that we have studied all semester), this switching or parallel execution precludes any guarantees about the order in which lines are executed across multiple threads. Said another way, we don’t know how the two (or more) sequences of instructions will interleave. This ambiguity does not matter if our threads do not share any state variables. They essentially act as independent processes, and we can use invariants that are maintained throughout their execution to reason about their correctness. However, in the presence of shared state, this ambiguity of execution order becomes a concern, which we call a race condition.
Multi-threaded code experiences a race condition if its behavior depends on the specific order in which the instructions across multiple threads are executed.
To see an example of a race condition, let’s consider the following code:
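```java
// A sketch of the omitted listing; SharedInt is nested here so the example is self-contained.
public class RaceDemo {
    /** A mutable wrapper around an int value that our lambdas can capture. */
    static class SharedInt {
        int x; // initially 0
    }

    /** Increment i.x by 1, holding the intermediate value in a local variable. */
    static void increment(SharedInt i) {
        int tmp = i.x; // read the shared value
        i.x = tmp + 1; // write back the incremented value
    }

    public static void main(String[] args) throws InterruptedException {
        SharedInt s = new SharedInt();
        Thread t1 = new Thread(() -> increment(s));
        Thread t2 = new Thread(() -> increment(s));
        t1.start();
        t2.start();
        t1.join(); // ensure both threads have stopped before printing
        t2.join();
        System.out.println(s.x); // we expect 2, but this is not guaranteed
    }
}
```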
In this code, we’re using the SharedInt class as a hacky way to create a mutable object wrapping an int value that can be captured by our lambda expressions in the Thread constructors. The initial value of the shared int s.x is 0, and both of the threads will increment this value by 1 when they execute. The calls to join() ensure that the print statement happens after both threads t1 and t2 have stopped, meaning both calls to increment() have returned. Therefore, we’d expect the printed value of s.x to be 2. Many times, this will be the case, and the execution will proceed similarly to the following animation.
[Animation: an interleaving in which the increments happen one after the other, so the program prints 2.]
However, this is not the only possible execution. We walk through another one in the following animation.
[Animation: an interleaving in which both threads read s.x before either writes it back, so the program prints 1.]
In this case, the two different executions of our code (both possible due to the non-determinism of concurrent execution) result in different outputs. In the lecture release code, we include a more extreme example of this type of race condition that includes even more threads and even more calls to increment(). This more clearly shows the variety of possible outputs (it is very rare to actually get the example code above to print 1).
You might wonder whether we could get around this problem by simplifying our increment() method to remove the local variable storage (which was useful for our animation but unnecessary in practice). Suppose we instead write i.x++. While this appears to solve the problem (it reduces the method body to one line), it actually does not help. Behind the scenes, the Java code is translated to multiple machine instructions: read the old value of i.x from memory into a register, add one to this value and store the result in a register, then write this register value back to memory. Thread preemption can happen between any consecutive machine instructions, so this same race condition is still present.
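Roughly, that one statement behaves like the following three-step sequence, and a thread can be preempted between any two of the steps:

```java
int tmp = i.x;  // 1. read the old value of i.x from memory
tmp = tmp + 1;  // 2. add one to this value
i.x = tmp;      // 3. write the result back to memory
```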
Race conditions are problematic. They can cause our code to behave in ways that we did not intend. In the previous example, we likely expected the value 2 to be printed out. Usually, this does happen; however, it doesn’t always happen. This uncertainty is unacceptable in production code. This is not a problem we can solve with careful testing, either. Our unit tests cannot control the execution order of concurrent statements. Our code may pass all of the unit tests (under one execution order) and then fail to work correctly when it is actually deployed (and executes in a different order).
Instead, we’ll need to find ways to design our code to avoid the possibility of race conditions. One option is to eliminate shared mutable state, only allowing threads to read and write from variables that they exclusively control. Under this model of concurrency, the execution order of statements across different threads will not affect any computations, since the actions of threads will not be “visible” to other concurrently executing threads. This is the approach taken by the Swing framework, which prevents modifications to any Swing components outside of the event dispatch thread.
Alternatively, we can use synchronization, which allows us to build up larger blocks of instructions during which a thread is promised exclusive access to some shared mutable state. Synchronization is a more robust approach for safely achieving concurrency that comes with its own considerations and challenges. It will be the focus of our next lecture.
Main Takeaways:
- Concurrency describes the ability of your computer to make progress on multiple tasks during one window of time. It can be achieved through time slicing and/or execution across multiple parallel processors.
- A process is a computational task that is allocated its own memory. A process can perform its work on one or more threads, each of which has complete access to the same shared memory pool of the process.
- In Java, calling the start() method on a Thread causes its work to be done on a separate execution thread, achieving concurrency.
- Race conditions arise when multiple threads can simultaneously modify an object in their shared memory. The final state of this object depends on how the threads' operations were interleaved, which is non-deterministic.
Exercises
Below is the outline of a class ParallelProgram. When the program is executed, main() first creates an instance of ParallelProgram and stores it in a local variable pp. Its method run() is supposed to execute in parallel with the remainder of method main().
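The outline was roughly as follows (implementing Runnable is an assumption consistent with the exercise):

```java
public class ParallelProgram implements Runnable {
    @Override
    public void run() {
        // work that should proceed in parallel with main()
    }

    public static void main(String[] args) {
        ParallelProgram pp = new ParallelProgram();
        // TODO
        // ... remainder of main() ...
    }
}
```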
What statement should replace the // TODO comment to have this happen?

Suppose your CPU only has a single core and is incapable of executing multiple threads simultaneously in parallel. Nevertheless, you decide to start multiple threads in your program. Your program works with instances of a Student class, which has mutable state. You know that, as long as a Student’s method’s preconditions are met when it is called, then it is guaranteed to meet its postconditions and restore the class invariant.
Suppose your program accesses the same Student object with multiple threads. Is it possible for your program to trigger undefined behavior in the Student if its method arguments always satisfy their preconditions?

Two threads share an instance s of the class Shared, defined as follows:
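```java
// Reconstructed from the statements below; at minimum, Shared exposes an int field x.
public class Shared {
    int x;
}
```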
Initially, the value of s.x is 2. The threads execute the following statements concurrently:
| Thread 1 | Thread 2 |
|---|---|
| s.x = s.x + 1; | s.x = s.x + 2; |
Suppose the threads access s in a way that guarantees that any updates made by both threads will be observed. What are the possible values of s.x?

Recall the Counter class from Lecture 8.
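Its definition was roughly the following (a sketch; the lecture version may differ in its details):

```java
public class Counter {
    private int count; // the current count, initially 0

    /** Increment the count by 1. */
    public void increment() {
        count = count + 1;
    }

    /** Return the current count. */
    public int count() {
        return count;
    }
}
```

[The remaining listings and the question for this exercise are omitted.]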
Recall the SinglyLinkedList class from Lecture 13. Each of the following figures shows a possible state of list after the following code finishes executing.
[Code listing and figures of possible list states omitted.]
SwingWorker
SwingWorker is a utility class for safely performing long-running tasks in the background of Swing applications. Suppose we are creating a GUI application that runs Dijkstra's algorithm on a very large graph. As the algorithm progresses, we want to visualize the frontier and all settled vertices.
Why should we perform this task using a SwingWorker?
SwingWorker allows us to publish both intermediate and final results. Intermediate results allow us to update the GUI’s view throughout the execution of the background task. Implementations do this by overriding doInBackground(), which runs in a separate thread. Why are we allowed to modify the model but not the view in doInBackground()?