1. Introduction to Java
Welcome to CS 2110, the second course in Cornell’s introductory programming sequence. Your first programming course (either CS 1110 or 1112 at Cornell, or perhaps AP Computer Science) taught you how to write code. You were introduced to the basic units of programming, variables and objects, and operations that can be performed on them. You learned about control structures such as conditionals and loops to modify the execution of your code based on the values of variables. You also learned how to use subroutines (functions) to promote modularity and reuse within your code. In 2110, we will revisit and build on all of these ideas, but we will push a lot farther. Beyond teaching you how to code, we’ll focus on how to write better code. What does it mean for code to be “better”? There are many possibilities:
- Better code has fewer bugs. We are more confident that it works as intended.
Ultimately, the most important aspect of our code is that it is correctly carrying out the task it was designed to do. In terminology we will use throughout the course, the code meets its spec (or specification). We’ll talk about how to carefully document specifications and invariants in our code that help us to reason about its correctness. We’ll also learn about how to write thorough unit tests for our code and see strategies for principled debugging.
- Better code runs faster and is more space efficient.
For large-scale systems that need to quickly process vast amounts of data, seemingly small choices can have profound impacts on performance. Early in the course, we’ll introduce the tool of asymptotic analysis, which provides a way to account for the time (the number of operations being performed) and space (the amount of computational “scratch space”) complexity of the subroutines that we write. We’ll use this as a tool throughout the course to compare various ways of organizing and manipulating data in our code, and we’ll study a collection of data structures that trade off different aspects of performance.
- Better code is more readable and maintainable.
Most of the projects you have written so far were probably your own, or limited to a couple of developers, and their scope was likely known when you started on the project. In industry, code bases can incorporate work from thousands of developers and remain in use for decades. Therefore, writing code that is easy to understand, extend, and maintain becomes a first-order concern. Object-oriented languages like Java were developed with large software systems in mind, and we’ll learn about how to leverage object-oriented features to make programs more modular. We’ll also discuss many software design patterns that provide engineers with a common language to discuss ubiquitous programming paradigms.
Ultimately, balancing all of these considerations is a challenging endeavor, but we hope that you will come out of this course with a greater appreciation for the art of writing good code and with the tools to make it happen.
The Java ecosystem
CS 2110 uses Java as its principal programming language. Java is a programming language designed by James Gosling at Sun Microsystems (now under the control of Oracle) and released in 1995. It is inspired by C++ and shares many of its syntax features but was designed to address some of C++’s security and portability issues. Java was widely adopted because of its ability to run on many different platforms, including servers, personal computers with different operating systems, mobile devices, and web browsers. Java achieves this portability through the use of the Java Virtual Machine (JVM), a program built for each kind of host hardware that translates Java bytecode (a low-level machine-independent language) into machine instructions.
Compilation
The presence of the JVM introduces the necessity for an extra step between writing and executing Java code: compilation. A compiler is a program that accepts the Java code that you write as its input and produces Java bytecode files (called “Class files” because of their “.class” extension) that are understandable by the JVM as its output. The JVM takes in these Class files and interprets the bytecode one instruction at a time, allowing your computer to execute the code you have written.
While carrying out this intermediate translation to bytecode, the compiler takes many steps to analyze your code. Beyond identifying syntax errors (misplaced punctuation, misspelled variable names, etc.), the compiler runs many checks that verify the soundness of your code. We will talk about many of these checks in greater detail in upcoming lectures, but these include checking appropriate static variable typing (today’s focus), variable scope and visibility, and exception handling. In addition, the compiler can optimize code, rewriting portions to achieve the same end behavior with fewer, more efficient machine instructions. Remarkably, the compiler does all of this without actually executing your code. Instead, it relies on formal guarantees provided by the semantics of the Java language that we will discuss in upcoming lectures. Today, we’ll see one motivating example for the benefits of the static typing rules enforced by the compiler.
Variable Types
Consider the following Python program.
convert.py
This program consists of one user-defined function, days_to_minutes(). Even if you have never written code in Python before, it should be clear (both from the documentation comment atop the function and the calculations performed within it) that this function accepts a number of days as a parameter and returns the number of minutes in this many days. The second half of the program incorporates this function into a simple command line application that asks the user to input a number of days, uses the days_to_minutes() function to convert this to a number of minutes, and prints a message with this conversion to the console.
Suppose we use this program to calculate the number of minutes in 3 days:
> python3 convert.py
Enter a number of days: 3
What do you expect the output will be? Give it a guess before reading on.
Rather than 4320, the program prints a very long run of the digit 3.
What happened here? This is way too many minutes! Is there a mistake in our calculation? Technically, no; everything is working as it should, just not in the way that we intended. The strange behavior of this program arises out of an improper understanding of variable types.
Dynamic vs. Static Typing
In programming, the type of a variable determines what data it can store and how that variable can be used in our program.
The (static) type of a variable characterizes the set of possible values (or states) that it can be assigned, and the ways that it can be used within the program. The latter includes which operators can be applied to the variable and to which methods the variable can be passed as an argument.
Python is a dynamically typed language, meaning the types of its variables are inferred while the program is running. This lowers the overhead to the programmer, who does not need to explicitly specify the types of all the variables they use. However, it also allows the possibility of runtime errors when the program attempts to carry out an operation on a variable that is not supported by its type. Even worse, when the inferred type of a variable differs from the programmer’s expectation, unexpected behaviors can occur. In the preceding example, Python’s input() function accepted 3 from the user on the command line. This was not the integer 3, as we may have expected; it was the String “3”. This String was passed into the days_to_minutes() function, where it was multiplied, first by 24 and then by 60 (for a total multiplication by 1,440). Does it make sense to multiply an integer by a String? In Python, it does, and it has the effect of concatenating together that integer number of copies of that String. In this case, the function computed and returned the String consisting of 1,440 “3”s, which was output to the console.
Of course, this is not what we intended. We expected the parameter days to be an integer. We expected the multiplication within the days_to_minutes() function to be normal integer multiplication. We expected the days_to_minutes() function to return an integer. However, we could not enforce this. Since Python is dynamically typed, it was responsible for determining the types at runtime, and it did this in a way that was technically correct but deviated from our expectations. Dynamic typing can lead to strange behaviors, so it is something we may wish to avoid in critical software. An alternative approach, taken in Java, is static typing.
In a statically typed language, the types of variables are determined before the program is executed. Often, the programmer is responsible for explicitly declaring the types of all variables that they use, as well as the return type and types of all parameters of functions that they define. Then, the compiler of a statically typed language uses these type declarations to identify type errors before the code is executed, preventing many runtime errors.
While static typing passes more responsibility to the programmer, it can often save time in the long run by avoiding unclear program behaviors arising from unexpected type inferences; imagine trying to identify the issue in our earlier example when it is buried within thousands of lines of code across many function calls.
Our First Java Code
Now, let’s translate our time-conversion code into a Java program. As a warning, Java has a reputation for verbosity; it takes a lot of “boilerplate” code to write even simple programs in Java, so this example may look a bit overwhelming. We’ll walk through a lot of the syntax here but leave some of the discussion to future lectures.
In CS 2110, we will write our Java code in the IntelliJ IDE (integrated development environment). This is a program that includes many useful features for developing Java code, including syntax highlighting, code completion and suggestions, and debugging tools. It also automatically manages the compilation and execution of your code (rather than having to rely on more complicated build systems or command-line tools). For more information about how to install and set up IntelliJ and navigate some of its main features, refer to our setup page.
Java is an object-oriented programming language. The basic building blocks of Java programs are classes, which serve as containers for code subroutines called methods. All of the code that we write in Java must belong to a class, and (for the most part… we will see exceptions to this rule soon) each class lives in a “.java” file with the same name. Classes and files in Java begin with an uppercase letter by convention. Let’s create a class Convert in a file Convert.java that will contain our example code. We declare a class using the keywords public class, followed by the class name. We use curly braces { ... } to demarcate the code in this class.
Convert.java
Within the Convert class, we’d like to define our conversion function. In Java, we use camelCase for function names, starting the name with a lowercase letter and then capitalizing the first letter of all subsequent words, so days_to_minutes() from Python’s snake_case becomes daysToMinutes(). This method accepts a single parameter, days, which is an integer. We denote this by preceding the parameter name with the type keyword int in the parameter list. Since daysToMinutes() returns an integer (the number of minutes), we declare its return type by adding the int keyword before the method name. We must add the static keyword at the start of the method declaration to tell Java that this method does not act on an instance of the Convert class but rather is a “free function” (we’ll discuss this distinction more soon… for now, just add static in front of all method definitions). We demarcate the method’s definition using more curly braces. Finally, we’ll add a Javadoc comment describing the (eventual) behavior of this method using the /** ... */ syntax above the method declaration.
Convert.java
The method body will consist of multiple statements, each of which must end with a semicolon (;). Within this method, we use two local variables, hours and minutes. Both of these local variables have type int and must be declared. This can be done in two ways. First, we can add a separate declaration statement like int hours; above the first use (i.e., assignment) of hours in the code. Alternatively, we can write an initialization statement, such as int minutes = 60 * hours; which declares the static type on the left of the first assignment. We demonstrate both of these options below. Note that the // syntax turns the rest of the line into a comment.
Convert.java
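As a rough sketch of where the class stands at this point, the method might read as follows. The Javadoc wording is our own guess, and the two multiplications simply follow the "first by 24, then by 60" description of the Python version; treat it as one plausible rendering rather than the exact listing.

```java
public class Convert {

    /** Returns the number of minutes in the given number of `days`. */
    static int daysToMinutes(int days) {
        int hours;                // option 1: a separate declaration statement...
        hours = 24 * days;        // ...followed later by the first assignment
        int minutes = 60 * hours; // option 2: declare and assign in one initialization statement
        return minutes;
    }
}
```

This compiles on its own, but it is not yet executable because the class has no main() method.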
We’ve completed the definition of daysToMinutes(), and now, we’d like a way to run our program. In Java, we make a class executable by adding a main() method with the following signature: public static void main(String[] args). We will learn more about the meaning of each word of this method signature soon, but we already have been exposed to some of this syntax:
- The static keyword is used to declare a “free function” (like the kind we saw in Python that is not tied to a particular instance of a class).
- The void keyword that precedes the method name main is the return type of this method. Here, void denotes that nothing is returned by this method.
- This method accepts a single parameter, args, which has type String[] (an array of Strings).
Note that the signature (the word we use to describe the name, return type, and parameters of a method) of the main() method must match this exactly, or Java will not recognize it as the program entry point. Let’s add the following main() method definition to our Convert class:
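A minimal sketch of such a main() method, shown inside the full class so that it compiles on its own, could look like this. The variable names sc, days, and mins come from the discussion in these notes; the use of a try-with-resources block around the Scanner and the exact message strings are our assumptions.

```java
import java.util.Scanner;

public class Convert {

    /** Returns the number of minutes in the given number of `days`. */
    static int daysToMinutes(int days) {
        int hours = 24 * days;
        int minutes = 60 * hours;
        return minutes;
    }

    public static void main(String[] args) {
        System.out.print("Enter a number of days: ");
        // Assumed form of the try syntax: the Scanner is closed when the block ends.
        try (Scanner sc = new Scanner(System.in)) {
            int days = sc.nextInt(); // read the next console token as an int
            int mins = daysToMinutes(days);
            System.out.println("There are " + mins + " minutes in " + days + " days.");
        }
    }
}
```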
Here, the System.out.print() function is used to output text to the console. The Scanner object is used to read the next integer input from the console and assign it to the int variable days. The try syntax (which we will discuss more soon) is code that contends with the possibility that the IO (input-output) operation of reading user input from the console could fail. The remaining code calls the daysToMinutes() function, assigns the result to the int variable mins, and prints an appropriate output message. Running this code, we see:
Enter a number of days: 3
There are 4320 minutes in 3 days.
In this case, the code works correctly because we have made sure to interpret the user’s input as an int (using the nextInt() method). However, what happens if we are not so careful and do something similar to our Python code? If we instead attempt to store the result of sc.next() to our int variable days, we get the following error message from the compiler:
Incompatible types. Found: ‘java.lang.String’, required: ‘int’
The compiler sees that next() will return a String and knows that this String cannot be stored in the days variable with int type, so it refuses to accept this code. If we redeclare days as a String variable, we get a different error message:
‘daysToMinutes(int)’ in ‘Convert’ cannot be applied to ‘(java.lang.String)’
Now, the compiler sees that we are trying to pass the String days as an argument to the daysToMinutes() method, whose signature includes only an int parameter. Since the argument and parameter types do not match, the compiler refuses to accept this code as well. While it might be annoying when the compiler complains about what we have written, this is much nicer behavior than letting the code run with unexpected results. This is the power that static typing, a requirement that the types of all variables, return values, and parameters be declared explicitly by the programmer, affords us.
Primitive types
As we have just seen, understanding types plays a central role in statically typed languages like Java. Thus, we should be familiar with which types Java provides and how we can define our own types to use in our programs. We will return to the latter question soon. The former question is the focus of the rest of this and the next lecture. Java organizes its types into two categories, primitive types and reference types.
There are eight primitive types in Java. Variables with these primitive types directly store their values. Most of the language's built-in operators (syntactically represented in infix notation with punctuation characters) act on primitive types. All other types in Java are reference types. Variables with reference types store a reference to an instance of that type (i.e., an object).
We will discuss reference types in our next lecture. The following chart summarizes the eight primitive types. For our purposes in CS 2110, we will mostly use ints, doubles, booleans, and chars.
Primitive type | Size | Values |
---|---|---|
byte | 8 bits | An integer in the range -128 (\(-2^7\)) to 127 (\(2^7-1\)), inclusive |
short | 16 bits | An integer in the range -32,768 (\(-2^{15}\)) to 32,767 (\(2^{15}-1\)), inclusive |
int | 32 bits | An integer in the range -2,147,483,648 (\(-2^{31}\)) to 2,147,483,647 (\(2^{31}-1\)), inclusive |
long | 64 bits | An integer in the range \(-2^{63}\) to \(2^{63}-1\), inclusive |
float | 32 bits | A single-precision floating point number represented with the IEEE 754 standard |
double | 64 bits | A double-precision floating point number represented with the IEEE 754 standard |
boolean | 1 bit* | true or false |
char | 16 bits | A single Unicode character |
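To make a few rows of the chart concrete, here is a small sketch that declares variables of the types we will use most and checks some of the listed ranges against the constants the standard library provides (Integer.MAX_VALUE, Long.MAX_VALUE, Byte.MIN_VALUE, Byte.MAX_VALUE). The class and variable names are hypothetical.

```java
public class PrimitiveDemo {
    public static void main(String[] args) {
        int count = 2147483647;          // the largest int, 2^31 - 1
        long big = 9223372036854775807L; // the largest long, 2^63 - 1 (note the L suffix)
        double price = 4.05;
        boolean done = true;
        char grade = 'C';

        System.out.println(count == Integer.MAX_VALUE);               // true
        System.out.println(big == Long.MAX_VALUE);                    // true
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE); // -128 to 127
        System.out.println(price + " " + done + " " + grade);         // 4.05 true C
    }
}
```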
Expressions, Operators, and Assignment
Expressions are the building blocks that we use to develop our programs. They have values that can be written to and read from memory locations as we perform computations, and their values are manipulated through operators in the computer’s central processing unit (CPU) to obtain other expressions.
An expression is any unit of code that can be assigned a type and a value.
Variables are one example of an expression; they are declared with a static type and can be given a value using an assignment statement. Another example of an expression is a primitive literal, such as the literal 3 representing the int 3, the literal 4.05 representing the double 4.05, the literal true representing the boolean value true, and the literal 'C' representing the char uppercase C. We can use operators to combine simpler expressions into more complicated expressions.
An operator accepts one or more expressions (of particular types) as inputs and evaluates to an expression (of a particular type).
For example, the Java code 3 + 4 is an expression whose type is int and whose value is 7. We obtain this expression by applying the addition operator + to the int literal expressions 3 and 4 (we call this an infix operator since the operator symbol + appears in between its argument expressions). As another example, if x and y are both double variables, then x * y is an expression whose type is double and whose value is the product of the values of x and y.
Note that an operator can be applied to any expression of the appropriate type. This allows for the creation of more complicated, compound expressions like 3 * (x + 4) - y, which are built up by applying one operator at a time in the order dictated by the language. See the “Transitioning to Java” page for a more detailed accounting of Java’s operators and their semantics.
A final example of an expression is the return value of a method. Its type is specified by the method’s return type, declared in the method signature. For example, the expression daysToMinutes(3) has type int and value 4,320.
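The kinds of expressions described above can be collected into one short sketch. The class name and the particular values chosen for x and y are hypothetical; daysToMinutes() is a condensed version of the method from earlier in these notes.

```java
public class ExpressionDemo {

    /** Returns the number of minutes in the given number of `days`. */
    static int daysToMinutes(int days) {
        return 60 * (24 * days);
    }

    public static void main(String[] args) {
        double x = 2.0; // variables are expressions with a declared type
        double y = 1.5;

        int sum = 3 + 4;                   // operator applied to two int literals: 7
        double product = x * y;            // double expression: 3.0
        double compound = 3 * (x + 4) - y; // compound expression: 3 * 6.0 - 1.5 = 16.5
        int minutes = daysToMinutes(3);    // method return value used as an expression: 4320

        System.out.println(sum);
        System.out.println(product);
        System.out.println(compound);
        System.out.println(minutes);
    }
}
```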
The other main ingredients of Java programs are statements.
A statement is a unit of Java code that is executed for its side effect. It may or may not evaluate to an expression.
The primary examples of statements are function calls with a void return type (which, by definition, do not have a value) and assignment statements (which technically are expressions, but which should almost never be used as such to avoid confusion). An assignment statement in Java has the form <variable> = <expression>; where <variable> is some previously declared variable and <expression> is any expression whose type agrees with that variable. The effect of an assignment statement is to store the value of the expression on its righthand side (RHS) into the variable on its lefthand side (LHS). For example, the statement days = 3; where days is a previously declared int variable, has the effect of storing the value 3 into the variable days.
Assignment statements can be subtle to reason about because they are not symmetric; they treat a variable differently depending on whether it appears on the LHS or RHS of the assignment operator =. A variable on the left is a container; it names the memory location where the value of the RHS expression will be stored. Variables on the right are expressions, and we use the values that they currently store to evaluate the RHS. At the end of the lecture, we will see a diagram that helps us to visualize the effect of an assignment statement. For now, we record the following facts about how an assignment statement is compiled and executed.
At compile time: The compiler performs (static) type checking on the assignment statement. It infers the types of all literal sub-expressions, uses the operator semantics to determine the type of each operator expression, and uses the method signatures to determine the type of each method return value. Together, this allows it to determine the type of the RHS expression. By looking at the static type of the variable on the LHS, the compiler confirms that it can store the RHS expression (more on this in upcoming lectures). If so, it compiles this assignment statement to bytecode. If not, a compile time error is issued.
At runtime: The JVM fully evaluates the expression on the RHS of the assignment statement, which might involve multiple method calls or operator applications. Once its value is determined, it is stored in the location corresponding to the variable on the LHS.
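The asymmetry between the two sides of = is easiest to see when the same variable appears on both. The following sketch (class and variable names are hypothetical) traces a few assignments in its comments.

```java
public class AssignmentDemo {
    public static void main(String[] args) {
        int days;        // declaration: days now names a memory location
        days = 3;        // the value 3 is stored in days
        days = days + 1; // RHS is evaluated first (3 + 1 = 4), then 4 is stored in days

        int copy = days; // RHS reads the current value of days (4) and stores it in copy
        days = 10;       // changing days afterward does not change copy

        System.out.println(days); // 10
        System.out.println(copy); // 4
    }
}
```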
Coercion
As we just saw, understanding the input and output types of each operator is crucial for the compiler’s static type checking of assignment statements. As a general rule, most built-in operators for primitive types in Java preserve types. For example, adding two ints, 5 + 7, results in an int (12), while adding two doubles, 5.0 + 7.0, results in a double (12.0). One weird case to be aware of is integer division. An expression like 4 / 8, which we’d intuitively like to assign value 0.5, has two int arguments, so Java will assign this operator an integer result. It does this by truncating the exact quotient toward 0 (discarding any fractional part); 4 / 8 evaluates to the int 0.
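A quick sketch of these rules in action is shown below. The negative example is our own addition to show that truncation goes toward 0 rather than always downward.

```java
public class DivisionDemo {
    public static void main(String[] args) {
        System.out.println(5 + 7);     // int + int is an int: 12
        System.out.println(5.0 + 7.0); // double + double is a double: 12.0
        System.out.println(4 / 8);     // integer division: 0
        System.out.println(7 / 2);     // 3.5 truncated toward 0: 3
        System.out.println(-7 / 2);    // -3.5 truncated toward 0: -3
    }
}
```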
When an operator is applied to two primitives with different types, the “narrower” type (or the type that has a smaller range of values) is automatically coerced (or widened) to the “wider” type. Floating point types are “wider” than integral types, and larger types are “wider” than smaller types. For example, in the expression 3.5 * 4, we are multiplying a double (3.5) by an int (4). The double is a wider type, so 4 is coerced to the double 4.0 and double multiplication is performed, yielding the double result 14.0.
Coercion can also be done manually through the use of a cast. Write the name of the type to which you’d like to convert the expression in parentheses to the left of that expression. For example, (int) 3.5 will cast the double literal 3.5 to an int (again truncating toward 0) to give the int value 3. Casting can also be applied to the result of an operator (i.e., a more complicated expression); (int) (3.5 * 4) has int value 14. We can also use casting together with widening to fix unwanted integer division. We saw that 4 / 8 evaluates to 0, but (double) 4 / 8 evaluates to 0.5. The cast (double) 4 evaluates to the double 4.0. During the evaluation of the division, we (implicitly) widen int 8 to double 8.0 so the types agree and then perform double division to obtain 0.5.
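The widening and casting examples above can be checked with a short sketch. The last two declarations are our own additions: one casts a negative value, and the other shows that casting after the integer division has already happened is too late to recover 0.5.

```java
public class CoercionDemo {
    public static void main(String[] args) {
        double widened = 3.5 * 4;          // 4 is widened to 4.0: 14.0
        int truncated = (int) 3.5;         // cast truncates toward 0: 3
        int product = (int) (3.5 * 4);     // cast applied to the whole product: 14
        double fixed = (double) 4 / 8;     // 4.0 / 8 widens 8 to 8.0: 0.5

        int negative = (int) -3.5;         // truncation toward 0 again: -3
        double tooLate = (double) (4 / 8); // 4 / 8 is already the int 0, so this is 0.0

        System.out.println(widened);
        System.out.println(truncated);
        System.out.println(product);
        System.out.println(fixed);
        System.out.println(negative);
        System.out.println(tooLate);
    }
}
```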
One place where the “operators preserve primitive types” rule does not hold is for many operators involving booleans. Comparison operators such as >, !=, and <= take numerical operands and produce a boolean: 3 > 4 evaluates to false. The conditional operator has the form <boolean expression> ? <true branch> : <false branch>. It is evaluated by first evaluating the <boolean expression>. If this is true, the value of the entire expression is the value of <true branch>. Otherwise, the value of the entire expression is the value of <false branch>. For example, the expression (3 % 2 == 0) ? 4 : 5 evaluates to 5 since 3 % 2 == 0 is false.
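As a brief illustration of comparison operators and the conditional operator (the class and variable names are hypothetical, and the second conditional is our own contrasting case):

```java
public class BooleanDemo {
    public static void main(String[] args) {
        boolean greater = 3 > 4;             // two ints in, one boolean out: false
        boolean different = 3 != 4;          // true
        int oddCase = (3 % 2 == 0) ? 4 : 5;  // condition is false, so the value is 5
        int evenCase = (4 % 2 == 0) ? 4 : 5; // condition is true, so the value is 4

        System.out.println(greater);
        System.out.println(different);
        System.out.println(oddCase);
        System.out.println(evenCase);
    }
}
```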
Program Execution
To conclude today’s lecture, we’ll introduce a diagram that is useful for tracking the state of execution of a Java program. As we learn more about the Java language in upcoming lectures, we will add more features to this diagram.
When a method is called in Java, space is allocated for its local variables in a region called the runtime stack. All of these variables are grouped together in a unit called a call frame. Since all variables are declared explicitly in Java, the compiler knows the names and types of all local variables used in a method, so it knows exactly how much memory the call frame will occupy.
We draw the call frame as a rectangle on the left of the diagram. Within this rectangle, we draw a smaller box for each local variable or parameter, one per line, labeled with the variable’s name, followed by a colon, followed by its type. We write the current value of the variable in the box. In our Convert code, the daysToMinutes() method would have the following call frame:
Program execution begins with the main() method, which is the bottom frame on the runtime stack. Whenever another method is called, the call frame for its invocation is added on top of the previous call frame. The arguments passed into this method are evaluated and stored in the corresponding parameter variables, and execution proceeds in the new call frame. When a method completes execution (reaching a return statement or the end of its body), its call frame is destroyed and its return value is passed back to the previous call frame, where execution continues. We conclude today’s lecture by diagramming the execution of our Convert program. Note that details of diagramming the args and sc variables have been elided; we will discuss these more next lecture when we talk about reference types and semantics.
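In place of a drawn diagram, the same trace can be approximated with comments that record each call frame's contents as execution proceeds. This is only a sketch: to keep it free of console input, days is hard-coded to 3 in main() (standing in for the value read by the Scanner), and the "name: type = value" notation in the comments is our own shorthand for the boxes described above.

```java
public class Convert {

    static int daysToMinutes(int days) { // new frame pushed on top of main's: days: int = 3
        int hours = 24 * days;           // frame now also holds hours: int = 72
        int minutes = 60 * hours;        // frame now also holds minutes: int = 4320
        return minutes;                  // frame is destroyed; 4320 is passed back to main's frame
    }

    public static void main(String[] args) { // bottom frame on the runtime stack
        int days = 3;                    // main's frame: days: int = 3
        int mins = daysToMinutes(days);  // argument 3 is copied into the callee's parameter;
                                         // after the call returns, mins: int = 4320
        System.out.println("There are " + mins + " minutes in " + days + " days.");
    }
}
```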
Main Takeaways:
- There is a two-step process to execute Java code that you write. First, the compiler translates your Java code to bytecode, performing static checks during this translation. Then, the JVM interprets this bytecode using instructions on your machine.
- Dynamically typed languages (like Python) determine variable types at runtime, while statically typed languages (like Java) check types at compile time. Static type checking affords more safety at the expense of more work for the programmer to declare types.
- Operators transform input expressions into output expressions according to rules built into the language. Most Java operators preserve primitive types.
- Primitive types can be coerced (modified) either through implicit widening or explicit casting.
- Local variables are stored in call frames on the runtime stack. Call frames are created when a method is called and deleted when that method returns.
Exercises
Match each of the following code snippets to their best descriptor.
/** Returns the weight expressed in ounces. */ | Assignment |
int ounces; | Declaration |
static int poundsToOunces(int pounds) | Initialization |
ounces = 16 * pounds; | Invocation |
poundsToOunces(5) | Specification |
int ounces = 16 * pounds; | Signature |
Consider the following abbreviated Java code snippet:
Assume method doMath() has been defined as follows:
What value is printed when the following code is executed?
5
8.0
5 + 8
5 + 8.0
8 % 2 == true
8 > 7.9
false ? 1 : 1.0
true ? false : 1.0
isSingleDigit(7) where isSingleDigit() is defined
isSingleDigit('7') using the same definition above.
x, y, z, and b at the end of the following block of code.
(5 + 4) / 2
(12 % 5) + 4
3 + 2 / 3
3 + 2.0 / 3
(13 > 13)
!(true && !false)
(12 > 13) || (-12 > -13)
3 % 4 == 0 ? 3 : 3
f(1, 2.0)
f(1.0, 2)
f(1.0, 2.0)
f(1, 2)
g(f(1, 2.0))
h(g(1), 2.0)
h(g(1), f(2, 3.0))
Math Class
Math class. The Math class is a component of the Java Standard Library, a vast codebase providing useful and frequently re-used code for programmers. The Math class is imported by default in every Java file. This means to access a field or method of the class, like PI, you can simply write Math.PI. Using fields and methods from the Math class, implement the following.
area(), \(A\), and circumference, \(C\), of a circle, given the radius as a double r. Recall the formulas \( A = \pi r ^ 2\) and \( C = 2 \pi r\).
A of a circle and returns its radius. Do the same with a function that takes the circumference C. Use the formulas \(r = \sqrt { A / \pi }\) and \(r = \frac{C}{2 \pi}\).
celsiusToFahrenheit() and fahrenheitToCelsius() in the following extension of the Convert class from above. The conversion formulas are given:
\[
\begin{aligned}
C &= \frac 5 9 \left(F - 32\right) \\
F &= \frac 9 5 C + 32
\end{aligned}
\]
main() method. Are you getting the expected results?
main() method to check whether your code works as intended.
maxOfThree() that returns the maximum value among its three int parameters.
formTriangle() that takes in three ints and returns whether or not (i.e., a boolean) these can be the lengths of the three sides of a triangle. Recall that three numbers form the sides of a triangle if they are all positive and the sum of the two smaller numbers is greater than the largest number.
triangleSum() that takes in an int, \(n\), and returns the sum \(\sum_{i=1}^{n} i\), the sum of all positive integers less than or equal to \(n\). You can do this with a loop or use the closed formula (if you know it) to compute the answer directly.
isTriangular() that takes in an int, \(x\), and returns whether or not \(x\) is a triangular number. Triangular numbers are those that can be represented in the form from the previous part, as a sum of the first \(n\) positive integers for some \(n\).