Transition to Java
This document aims to help you adapt your knowledge of another procedural programming language to the context of Java, the language used in CS 2110. It provides explicit comparisons to Python and MATLAB, as those are the languages currently used in Cornell’s introductory programming courses CS 1110 and CS 1112. See the setup page for instructions on how to install a Java development environment on your computer.
Where to write code
While many aspects of a language’s syntax can be illustrated in isolated code snippets, it is important to understand the context in which that code might appear. In particular, fluency in a language requires practice (i.e., writing and running code and interpreting the results), so you need to know where to write code so you can see its effects first-hand.
Interactive programming
A great way to explore new programming concepts is to interact with them one statement at a time, printing a text representation of each statement’s value (if any) after it is run. In Python, this can be done by running the `python` (or `ipython`) interpreter, and in MATLAB, this is done by typing into the Command Window. A program that facilitates this kind of interaction is called a read-eval(uate)-print loop, or REPL, though more colloquially it may be called an “interpreter” or “shell”.
Modern versions of Java include a program called JShell that serves this purpose. If your JDK is on your `PATH` (optional for this class), you can launch its command-line interface by running `jshell`. Alternatively, IntelliJ IDEA provides a graphical interface to JShell under Tools | JShell Console; this is especially useful because you can interact with the classes in a Project (such as a lecture demo or programming assignment) in addition to those from Java’s standard library.
When using JShell, you can treat Java like a scripting language and have the ability to redeclare variables and even types. This is great for trying out code snippets, but remember that this mixture of definitions and execution is not reflective of how most Java code is written and run.
Source files
Interacting with code is great for learning or performing one-off tasks, but most Java code is written to create applications or libraries that are used multiple times by many people. Such code is saved to a file so that it can be shared and revised over time, much like a word processor document. Professional software engineers actually spend much more time reading old code than writing new code, so the style of code saved to files is just as important as its correctness and efficiency.
In Python and MATLAB, code can be written in a file just as it would have been typed interactively; this creates a script that can be executed multiple times. Here is where Java diverges—while you could ask JShell to read inputs from a file, “real” Java code is organized differently from a script. Java is a compiled language, meaning that applications do not execute source code directly. Instead, a program called a compiler first reads all of your source code and translates it into machine instructions without executing any of it. This is typically done by the software’s developer, who may not be the end user (think manufacturer vs. customer). The output of this process, called bytecode in Java, can then be packaged with other bytecode to form one or more applications that the end user can execute.
Java encourages modularity and reuse, so the basic unit of Java code is a class. Typically, every class in Java is defined in its own file, so the code defining a `Vehicle` class would be saved to a file named “Vehicle.java”. Within a class, you can define methods, which are analogous to functions in other languages. And within a method body you can write statements of Java code. Executing Java code therefore implies calling methods. Most methods are called by other methods, forming a chain of method calls known as the call stack. But one method has to come first in this chain, defining the first code that is executed when an application is started. This method must be named `main()`; it can be defined in any class, with the name of that class effectively being the name of the application (if multiple classes in a project have a `main()` method, then your project effectively contains multiple applications; there’s nothing wrong with that).
Here, then, is a minimal “hello world” application written in all three languages:
Python | Java (HelloWorld.java) | MATLAB |
---|---|---|
|
|
|
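As a hedged sketch of the Java side of this comparison, assuming the class name `HelloWorld` and the printed text “Hello world” that the surrounding discussion mentions, the application might look like this:

```java
// Saved in a file named HelloWorld.java (the file name matches the class name).
public class HelloWorld {
    // Execution of the application starts in main().
    public static void main(String[] args) {
        // Print one line of text to the console.
        System.out.println("Hello world");
    }
}
```

Running this class from an IDE should print a single line of text to the console.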
Wow, does Java look complicated! The amount of “boilerplate” code required for such a simple task is a common criticism of the language; it’ll be a few lectures before you understand `public`, `static`, `String[]`, and `System.out`. But keep in mind that most software in the world is not as simple as printing “Hello world”, and the rigid structure imposed by Java becomes a big advantage when managing large projects. For now, focus on the following:
- To write any Java code at all, we needed to define a class. The class was named `HelloWorld`, and the code was saved in a file named “HelloWorld.java”. The class’s definition is enclosed by curly braces (`{}`).
- Defining a class wasn’t enough; we also needed to define a method. The code we want to execute can then go in the method’s body. Because we wanted this code to run as an application, we named the method `main()`. The method’s body is enclosed by curly braces (`{}`).
- The code we want to execute should print text to the console. A method named `println()` seems to do the job; the text it should print goes in double quotes (`""`). The statement of code ends with a semicolon (`;`).
Exercise
Write a Java application in a class named `HelloGoodbye` that prints “Hola” on the first line, followed by “Adios” on the next line. Run this application from your IDE to ensure that it compiles and behaves as you expect.
Basic syntax
Statements
In Python and MATLAB, the end of a line of code implies the end of whatever statement was on that line, unless special “line continuation” syntax is used. If you want to put multiple short statements on the same line (usually considered poor style), you can separate them with semicolons. And in MATLAB, ending an assignment statement with a semicolon will prevent MATLAB from printing the result of the assignment.
Java works differently—statements may span several lines of code without any line continuation syntax because every statement must be terminated by a semicolon. Here is an example:
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
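A minimal sketch of the Java behavior described here; the class and method names (`StatementDemo`, `spreadOut`) are made up for illustration. It shows one statement spanning several lines and, for contrast, two statements sharing a line:

```java
public class StatementDemo {
    static int spreadOut() {
        // One statement spread across three lines; only the semicolon ends it.
        int total = 1 + 2
                + 3 + 4
                + 5;
        return total;
    }

    public static void main(String[] args) {
        // Two statements on one line: legal, but usually considered poor style.
        int a = 1; int b = 2;
        System.out.println(spreadOut());
        System.out.println(a + b);
    }
}
```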
Operators
Many arithmetic and relational operators in Java are identical to those in other languages (`+` for addition, `/` for division, `<=` for less-than-or-equal-to, etc.). Here is a table of the most common operators that may differ in syntax from other languages you have learned:
Operator | Python | Java | MATLAB |
---|---|---|---|
NOT | `not` | `!` | `~` |
OR (short-circuit) | `or` | `\|\|` | `\|\|` |
AND (short-circuit) | `and` | `&&` | `&&` |
Equality | `==` | `==` (primitive types), `.equals()` (reference types) | `==` (value types), `isequal()` (handle types) |
Non-equality | `!=` | `!=` (primitive types) | `~=` (value types) |
Identity (reference types) | `is` | `==` | `==` |
Non-identity (reference types) | `is not` | `!=` | `~=` |
Remainder (aka modulus) | `%` (positive operands) | `%` | `rem()` (`mod()` matches Python) |
Exponentiation | `**` | `Math.pow()` | `^` |
Additionally, the `+` operator is used for string concatenation (this is the only operator besides identity `==`/`!=` that can act on operands of reference type).
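The following sketch exercises several of the Java operators discussed above. The class name `OperatorDemo` and the sample values are illustrative assumptions; the second string is built with `new String(...)` purely so that the two references point to different objects.

```java
public class OperatorDemo {
    public static void main(String[] args) {
        boolean a = true;
        boolean b = false;
        System.out.println(!a);      // NOT
        System.out.println(a || b);  // short-circuit OR
        System.out.println(a && b);  // short-circuit AND

        String s1 = "java";
        String s2 = new String("java");     // a distinct object with the same contents
        System.out.println(s1.equals(s2));  // equality of contents: true
        System.out.println(s1 == s2);       // identity (same object?): false

        System.out.println(-7 % 3);           // remainder takes the sign of the left operand: -1
        System.out.println(Math.pow(2, 10));  // exponentiation via a static method: 1024.0
        System.out.println("score: " + 5);    // + concatenates when an operand is a String
    }
}
```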
Note that type widening rules come into play when determining the type of an arithmetic operation. Here is a quick summary:
- If all operands are integer types other than `long`, the result has type `int`. For this purpose, `char` is an integer type.
- If all operands are integer types and at least one is `long`, the result has type `long`.
- If any operand is a `float` and none is a `double`, the result has type `float`.
- If any operand is a `double`, the result has type `double`.
This is most relevant for division: integer division is truncating (rounds towards zero). If you want to perform floating-point division between two integers, you must cast at least one of them to a floating-point type:
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
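To make the widening rules and truncating division concrete, here is a small sketch in Java; the class name `DivisionDemo` and the values are assumptions chosen for illustration.

```java
public class DivisionDemo {
    public static void main(String[] args) {
        int a = 7;
        int b = 2;
        System.out.println(a / b);           // int / int truncates: 3
        System.out.println(-7 / 2);          // rounds toward zero: -3
        System.out.println((double) a / b);  // cast one operand first: 3.5
        System.out.println(a / 2.0);         // a double operand widens the result: 3.5

        long big = 1_000_000 * 3L;  // int * long has type long
        char c = 'a';
        int next = c + 1;           // char + int has type int
        System.out.println(big);
        System.out.println(next);
    }
}
```

Note that the cast in `(double) a / b` applies to `a` before the division happens; writing `(double) (a / b)` would truncate first and then convert.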
Literals
A literal is a value that can be expressed as a single token in source code. `false` is a Boolean literal (type `boolean`); `5` is an integer literal (type `int`); `3.14` is a floating-point literal (type `double`), as is `1e-6` (scientific notation). `"Hello"` is a string literal (type `String`; strings and arrays are the only reference types with literals other than `null`).
In Java, string literals are always enclosed in double quotes (`""`). Single quotes (`''`) denote a character literal of type `char`. This is different from Python, which can use either single or double quotes around strings, and MATLAB, which uses single quotes for character arrays.
An `int` can only represent values whose magnitude is less than two billion (roughly). If you want an integer literal to have type `long` instead of `int`, add a suffix of `l` or `L` (e.g. `987654321L`). If you want a floating-point literal to have type `float` instead of `double` (sometimes desired for multimedia performance), add a suffix of `f` or `F`.
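The literal forms above can be seen side by side in a short sketch. The class name `LiteralDemo` and the variable names are hypothetical, chosen only for illustration.

```java
public class LiteralDemo {
    public static void main(String[] args) {
        boolean done = false;           // boolean literal
        int count = 5;                  // integer literal (type int)
        long population = 8000000000L;  // too large for an int, so the L suffix is required
        double pi = 3.14;               // floating-point literal (type double)
        double tiny = 1e-6;             // scientific notation
        float volume = 0.5f;            // the f suffix gives type float
        char grade = 'A';               // single quotes: exactly one character
        String greeting = "Hello";      // double quotes: a string
        int[] primes = {2, 3, 5};       // array written with curly braces

        System.out.println(population);
        System.out.println(grade);
        System.out.println(greeting);
        System.out.println(primes.length);
    }
}
```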
Comments
Java has three different kinds of comments: line comments (`//`, like `#` in Python and `%` in MATLAB), block comments (`/* */`), and documentation comments (`/** */`, analogous to Python docstrings). Note that documentation comments come before the declaration being documented, not after.
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
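A brief sketch of all three comment styles in Java; the class `CommentDemo` and the method `circleArea` are invented for this example.

```java
public class CommentDemo {
    /**
     * Returns the area of a circle with the given radius.
     * A documentation comment like this one comes before the declaration it documents.
     */
    static double circleArea(double radius) {
        // Line comment: everything up to the end of the line is ignored.
        /* Block comment: this one
           spans two lines. */
        return Math.PI * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(circleArea(1.0));
    }
}
```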
Blocks
It is often necessary to specify that a group of statements all go together; examples include defining the body of a method or the `else` branch of an `if` statement. A group of statements is called a block, and blocks are often nested. Python uses indentation to indicate which statements belong to the same block, while MATLAB uses the `end` keyword to denote the end of a block. Java uses curly braces (`{}`) to enclose blocks. Note that, while indentation is not syntactically relevant in Java and MATLAB, it is an essential part of good style, so blocks should always be indented in addition to using the appropriate delimiters.
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
In Java, braces are technically optional if a block only consists of a single statement. However, this shorthand tends to lead to bugs and maintenance headaches and is banned by most professional style guides. Blocks should always be enclosed by braces in this course (but you should be aware of the shorthand because older materials may use it).
Blocks in Java also establish a scope for local variables, to be discussed later. Occasionally you may see braces used to segment a block without any associated control structure, typically with the intent of restricting variable scope.
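Here is a small sketch of braces delimiting nested blocks, plus a bare block used only to limit a variable’s scope. The names `BlockDemo`, `describe`, and `temp` are hypothetical.

```java
public class BlockDemo {
    static String describe(int n) {
        String label;
        if (n < 0) {
            label = "negative";
        } else {
            // Blocks can be nested; each level gets its own braces and indentation.
            if (n == 0) {
                label = "zero";
            } else {
                label = "positive";
            }
        }
        return label;
    }

    public static void main(String[] args) {
        {
            // Bare braces with no control structure: temp exists only inside this block.
            int temp = 42;
            System.out.println(temp);
        }
        System.out.println(describe(-3));
    }
}
```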
Methods
We’ve said that “methods” are what Java calls functions and procedures. They are object-oriented by default: “instance methods” are used to request that an object perform some operation for you. But object-oriented programming comes later; for now, know that if you put `static` in front of a method declaration, it will behave as a “free function” and can be called without needing to construct an object. You will need to write the name of its enclosing class when calling it; for example, calling `Math.sqrt(4.0)` invokes the static `sqrt()` method in the `Math` class in order to compute the square root of its argument.
To declare a method, you need to give it a name, a return type (or `void` if the method does not return a value), and a list of parameters. Each parameter must have both a type and a name. The body of a method is enclosed in curly braces (`{}`). The `return` keyword is used to indicate when to stop executing the method and what value to produce for the caller. A method’s body may have multiple `return` statements (though only one will be executed per call); they are not restricted to the last line. Some people consider early returns to be “unstructured” (you may have been told to avoid them in prior CS classes), but in Java they are common and can improve the readability of code when used judiciously.
In Java (and unlike Python and MATLAB), a method can only return a single value (or none if the return type is `void`). To produce multiple pieces of information, you must aggregate them in a custom class (to be discussed later in the course).
In the snippets above, `main()`, `reciprocal()`, and `max3()` are all examples of method definitions. The return type comes first, followed by the name, followed by the parameter list. Remember that methods (even static ones) need to be defined inside of a class.
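As an illustration of these rules, here is one plausible way that static methods named `reciprocal()` and `max3()` could be written. The bodies, the enclosing class `MethodDemo`, and the `greet()` method are assumptions made for this sketch, not the original snippets.

```java
public class MethodDemo {
    /** Returns 1/x. Return type first, then the name, then the parameter list. */
    static double reciprocal(double x) {
        return 1.0 / x;
    }

    /** Returns the largest of a, b, and c. */
    static int max3(int a, int b, int c) {
        if (a >= b && a >= c) {
            return a;  // early return
        }
        if (b >= c) {
            return b;
        }
        return c;
    }

    /** Prints a greeting; void means that no value is returned. */
    static void greet(String name) {
        System.out.println("Hello, " + name);
    }

    public static void main(String[] args) {
        System.out.println(reciprocal(4.0));
        System.out.println(max3(3, 9, 5));
        System.out.println(Math.sqrt(4.0));  // a static method of another class
        greet("CS 2110");
    }
}
```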
Variables and assignment
Let’s consider how to store the integer value `5` in a variable named `score`:
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
In all three cases the syntax for assignment is similar: a variable name goes on the left, followed by an equals sign (`=`), which represents the assignment operator, followed by the value that should be assigned to the variable. MATLAB even ends the statement with a semicolon (though that is just to suppress automatically printing the assigned value, whereas in Java it is required to indicate that the statement is finished).
Remember that, despite the use of an equals sign, assignment does not establish an equality relation between two symbols; it simply stores the value (or reference) on the right-hand side in the variable named on the left-hand side. Other languages denote assignment differently to break the visual symmetry and avoid this confusion (`score := 5`, `score ← 5`), but Java, like many others, chose to use an equals sign, which you should pronounce as “gets” or “becomes”, not “equals”. In order to test an equality relation, use the double-equals (`==`) operator (and if you want to assert symbolic equality, well, that’s a different kind of programming altogether).
But Java has something extra: a variable declaration (`int score;`). In Java, variables must be declared before they can be used, which helps catch typos (in other languages, if an assignment is made to a variable that hasn’t been used before, a new variable is created, whether or not that was your intent).
To declare a variable in Java, write the type of the variable, followed by its name. In this example, we want `score` to be a variable that can hold integers, so we declare its type to be `int`. Other built-in types in Java include `double` (for double-precision floating-point numbers), `boolean` (for true/false), and `String` (for text).
To keep code shorter, declaration and assignment can be combined into a single initialization statement: `int score = 5;`. But remember that a variable only needs to be declared once (declaration is a prerequisite to assignment, not a substep of it), so an initialization statement may only be used the first time a variable is assigned (to assign a new value later on, simply write `score = 4;` without a type prefix).
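Putting declaration, assignment, initialization, and reassignment together, a minimal sketch might look like the following. The variable `score` comes from the text; the class name `VariableDemo` and the other variables are illustrative assumptions.

```java
public class VariableDemo {
    public static void main(String[] args) {
        int score;      // declaration: score can hold int values
        score = 5;      // assignment
        int bonus = 2;  // initialization: declaration and assignment in one statement
        score = 4;      // reassignment: no type prefix the second time

        double average = 3.5;
        boolean passed = score == 4;  // == tests equality, = assigns
        String name = "Ada";

        System.out.println(score + bonus);
        System.out.println(average);
        System.out.println(passed);
        System.out.println(name);
    }
}
```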
Static typing
One of the biggest differences between Java and languages you have learned before is that Java is statically typed, while Python and MATLAB are dynamically typed. In a dynamically-typed language, any variable can hold any kind of value; you could assign the integer `5` to variable `a`, say, then later reassign it to point to the string `'hello'`. Function return values are similarly flexible: a function may return a floating-point number when given one set of arguments, then return a boolean given a different set of arguments. As a result, one cannot know whether the values involved in an operation are compatible until the program is actually run.
In this context, “static” refers to properties of a program that can be inferred from its source code alone, regardless of what values may actually be present when it is run. By declaring types for variables and return types for methods, Java allows us to reason about the types of expressions statically. This means that Java can ensure that operands will be compatible (to an extent) regardless of what inputs a user may provide when a program is run.
The upshot is that, when a variable is declared in Java, you must specify its type, and by doing so, it is only legal to assign values of that type to the variable. So if variable `score` is declared as `int score;`, then you may only ever assign integer values to `score`, and if variable `name` is declared as `String name;`, then you may only ever assign string values to `name`. You cannot change the type of a variable after it is declared (except in JShell). Because every variable, method return value, and operator result has a statically-known type, every expression in Java has a statically-known type: the type of `(1 + 1)` is `int`, the type of `Math.sqrt(2)` is `double`, the type of `(5 < 3)` is `boolean`, and the type of `"hello".substring(1, 5)` is `String`.
The benefit of this type annotation is that bugs are caught much earlier, saving you time in the long run (even if it feels frustrating at first). Incompatible types are caught during compilation without having to run the program; this process is so fast that, in practice, such bugs are underlined in red as you type. This gives you immediate feedback on the logical consistency of the code you are trying to write—pretty spiffy.
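One way to see static types at work is to store each of the example expressions in a variable of the matching type. In this sketch the class name `TypeDemo` and the variable names are hypothetical, and the lines that would be rejected by the compiler are left commented out.

```java
public class TypeDemo {
    public static void main(String[] args) {
        int sum = 1 + 1;                        // int + int has type int
        double root = Math.sqrt(2);             // Math.sqrt() returns a double
        boolean less = 5 < 3;                   // a comparison has type boolean
        String tail = "hello".substring(1, 5);  // substring() returns a String

        // sum = "two";        // would not compile: a String cannot be assigned to an int
        // String s = 5 < 3;   // would not compile: a boolean cannot be assigned to a String

        System.out.println(sum);
        System.out.println(root);
        System.out.println(less);
        System.out.println(tail);
    }
}
```

Uncommenting either rejected line in an IDE should produce a type error before the program is ever run.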
Scope
Above, we noted that most Java code is written in method bodies, and methods get defined inside of classes. And within a method body, we’re likely to have blocks and nested blocks associated with various control structures. Each of these levels of nesting provides scope for variables declared within them.
For the procedural code snippets in this document, we are concerned with local variables, which are variables declared inside of a method body. Such variables are only accessible by other code in that method body; in fact, they are only accessible by other code in the same block (including nested blocks). Local variables are also confined to a single method invocation; if a method is called recursively, each recursive call gets its own copy of any local variables. Local variables are never used to communicate results between different pieces of code; such coordination must instead be done via return values, shared mutable values (not “shared variables”), or fields.
In the following example, variable `x` is declared inside of the `if` block, which restricts its scope. `x` does not exist outside of that scope, so trying to access its value is meaningless and results in a compile-time error. This forces you to reconsider your intent: if you want to refer to the same variable both inside and outside the `if` block, then it must be declared outside that block (and that declaration would need to provide an initial value). Note that Python and MATLAB would encounter a runtime error if the condition were changed to false (e.g. `-1 > 0`) because they do not provide an initial value.
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
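A hedged sketch of the scoping rule in Java: `x` is confined to the `if` block, while a variable declared (and initialized) outside the block remains usable afterward. The names `ScopeDemo`, `doubledIfPositive`, and `result` are invented for illustration.

```java
public class ScopeDemo {
    static int doubledIfPositive(int n) {
        int result = 0;     // declared outside the if block, with an initial value
        if (n > 0) {
            int x = n * 2;  // x is local to this block
            result = x;
        }
        // System.out.println(x);  // would not compile: x is out of scope here
        return result;
    }

    public static void main(String[] args) {
        System.out.println(doubledIfPositive(3));
        System.out.println(doubledIfPositive(-1));
    }
}
```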
In addition to local variables, there is another kind of variable in Java used for object-oriented programming. Variables declared at class scope are known as instance variables, member variables, or fields (in MATLAB they are called “properties”; in Python, “data attributes”). From a class’s own methods, these would be prefixed by `self.` in Python or MATLAB, but Java does not require any prefix to distinguish them (unless they are shadowed by a parameter or local variable of the same name). You may optionally distinguish them with the prefix `this.`, however (which is also how you would avoid shadowing ambiguity).
It is possible to mimic “global variables” in Java via `static` fields; don’t do this. Global variables make code difficult to reuse and will not play a role in this course.
Arrays
Arrays are somewhat analogous to Python lists or MATLAB vectors, but there are some very important differences:
- Arrays are always passed by reference (like Python), not by value (as in MATLAB).
- Once created, arrays always have the same length. There is no way to append to a Java array.
  - In this course, we will see how to use arrays to build higher-level data structures (like `ArrayList`) that provide features like appending and removal.
- Square brackets (`[]`, like Python) are used to access an element of an array at a particular index (MATLAB uses parentheses).
- Array indices start at 0 (like Python; MATLAB indices start at 1).
- An array’s length can be accessed as a field of an array reference (e.g. `v.length`, unlike Python’s `len(v)` or MATLAB’s `length(v)`).
- A multi-dimensional array is simply an array that contains other arrays in its elements. It need not be rectangular (analogous to Python lists or MATLAB “cell arrays”).
- Array literals are denoted with curly braces (`{}`), not square brackets as in Python and MATLAB.
- There is no “syntactic sugar” for taking array slices or indexing backwards from the end of the array. Do not try to use negative indices in Java.
- When allocating an array of a given size, all elements will be initialized to `0`/`false`/`null` as appropriate.
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
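The array properties listed above can be checked with a short sketch. The class `ArrayDemo`, the helper `zeroFirst`, and the sample values are assumptions made for this example.

```java
public class ArrayDemo {
    // The parameter refers to the caller's array, so the caller sees this change.
    static void zeroFirst(int[] v) {
        v[0] = 0;
    }

    public static void main(String[] args) {
        int[] v = {10, 20, 30};          // curly braces, not square brackets
        double[] zeros = new double[4];  // every element starts at 0.0

        System.out.println(v[0]);             // indices start at 0
        System.out.println(v[v.length - 1]);  // last element; negative indices are not allowed
        System.out.println(v.length);         // length is a field, not a method call

        zeroFirst(v);
        System.out.println(v[0]);             // modified through the shared reference

        int[][] jagged = {{1}, {2, 3}, {4, 5, 6}};  // rows may differ in length
        System.out.println(jagged[2].length);
        System.out.println(zeros[0]);
    }
}
```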
Control structures
Conditionals and loops typically behave similarly to other languages, so this comparison table should help you translate the syntax in most cases:
Python | Java (snippet) | MATLAB |
---|---|---|
|
|
|
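As a rough guide to the Java side of this comparison, the sketch below shows an `if`/`else if`/`else` chain, a `while` loop, a counting `for` loop, and an enhanced `for` loop over an array. All class and method names here are hypothetical.

```java
public class ControlDemo {
    static String sign(int n) {
        if (n < 0) {
            return "negative";
        } else if (n == 0) {
            return "zero";
        } else {
            return "positive";
        }
    }

    /** Counts how many times n can be halved before reaching 1. */
    static int countHalvings(int n) {
        int count = 0;
        while (n > 1) {
            n = n / 2;
            count = count + 1;
        }
        return count;
    }

    /** Sums the integers 1..n with a counting loop. */
    static int sumTo(int n) {
        int sum = 0;
        for (int i = 1; i <= n; i++) {
            sum = sum + i;
        }
        return sum;
    }

    /** Sums the elements of an array with an enhanced for loop. */
    static int sumArray(int[] values) {
        int total = 0;
        for (int v : values) {
            total = total + v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sign(-2));
        System.out.println(countHalvings(8));
        System.out.println(sumTo(4));
        System.out.println(sumArray(new int[] {1, 2, 3}));
    }
}
```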
Misc. syntax
You won’t need to use the following syntax in your own code, but you may come across it in examples, so it’s worth knowing what to look out for.
- Arithmetic-assignment statements: `i += 1;`, `b /= 2;`, etc.
  - Shorthand for operating on a variable and storing the result back in that variable. Equivalent to `i = i + 1;`, `b = b/2;`, etc. Can also be used for string concatenation. They’re harmless, and some programmers prefer them.
- Pre-increment operators: `++i`, `--j`
  - Increment (or decrement) the variable by 1 and evaluate to the new value. This forms a side-effecting expression, which can be dangerous for programmers to reason about. While some data structure operations can be written very succinctly using such operators, saving a line of code is not worth the risk. This course will only use these operators as statements, not expressions (for example, incrementing a loop variable in a `for`-loop, where it is the dominant idiom).
- Post-increment operators: `i++`, `j--`
  - Increment (or decrement) the variable by 1 and evaluate to the old value. The above remarks about pre-increment operators also apply here. Note that, when the expression’s value is unused (i.e., when employed as an increment statement), there is no difference between pre-increment and post-increment operators.
- Conditional expression: `(a < b) ? -1 : 1`
  - This is a ternary operator, meaning it involves three operands. It acts like an `if`/`else` statement, but in an expression context. If the first operand is true, the expression evaluates to the second operand; otherwise, it evaluates to the third operand. Code that uses this operator can be hard to read, but it is much more succinct and takes on a “functional programming” flavor. We will use this operator sparingly in this course.
- Exponentiation
  - Java does not have an exponentiation operator to take numbers to various powers (e.g. `5**3` in Python or `5^3` in MATLAB to compute `5*5*5==125`). The closest analogue is the static method `Math.pow(5, 3)`, but note that it returns a floating-point result. For small integer powers, just write out the multiplication, or write a utility function. Do not try to use the caret (`^`) operator for exponentiation in Java; it has a very different meaning, just as in Python (see below).
- Bitwise operators: `~`, `^`, `&`, `|`, `<<`, `>>`, `>>>`
  - Sometimes you need to manipulate the bits that make up binary numbers. These operators perform bitwise NOT, XOR, AND, and OR operations, or shift bits to the left or right (possibly preserving the sign bit). We do not intend to use them in this course, but expect to see them in CS 3410.
- Hexadecimal and octal literals: `0x1F == 31 == 037`
  - The prefix `0x` signifies that the following letters and digits represent a base-16 (hexadecimal) integer; this is often convenient when doing bitwise arithmetic. Never write a leading zero in front of an integer in Java (e.g. `037`); doing so interprets the number in base-8 (octal), which no one will be expecting.
Summary
We know this is a lot to take in at once, so bookmark this page as a reference. For more details and examples of Java syntax, read the Language Basics lesson in the Java Tutorial.