Equality and Comparison in Java: Pitfalls and Best Practices
Java has different methods of comparing objects and primitives, each with its own semantics. Using the “wrong” one can lead to unexpected results and might introduce subtle, hard-to-catch bugs.
Before we can learn about the pitfalls and best practices of equality and comparison in Java, we need to understand the different kinds of types and their behavior.
Primitives vs. Objects
The Java type system is two-fold, consisting of eight primitive data types (boolean, byte, char, short, int, long, float, double) and object reference types.
Primitives
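Primitives in Java can’t be null. Fields and array elements of a primitive type always have a default value, while local variables must be assigned before they are used. The default represents 0 in a form suitable for the specific data type: false for boolean, '\u0000' for char, 0 for the integer types, and 0.0 for float and double.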
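The defaults can be observed on fields that are never assigned. The following is a minimal sketch; the class name PrimitiveDefaults and its field names are made up for illustration.

```java
// Hypothetical demo class: every field below is left unassigned on purpose.
class PrimitiveDefaults {
    // Fields (unlike local variables) receive a default value automatically.
    static boolean flag;
    static byte b;
    static char c;
    static short s;
    static int i;
    static long l;
    static float f;
    static double d;

    public static void main(String[] args) {
        System.out.println(flag);    // false
        System.out.println(b);       // 0
        System.out.println((int) c); // 0, the numeric value of the NUL character
        System.out.println(s);       // 0
        System.out.println(i);       // 0
        System.out.println(l);       // 0
        System.out.println(f);       // 0.0
        System.out.println(d);       // 0.0
    }
}
```

A local variable declared without a value would not compile once it is read, which is why the sketch uses fields.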
Primitive wrapper classes
Every primitive data type has a corresponding wrapper class in java.lang that encapsulates its value in a Java object: Boolean, Byte, Character, Short, Integer, Long, Float, and Double.
Being objects allows them to be used in a wider range of scenarios:
- Generic types (e.g., List<Integer>).
- Being passed around as object references instead of copied primitive values.
- Ability to be null.
- etc.
But we also have to deal with the disadvantages, like NullPointerException, a bigger memory footprint, and a performance impact.
Autoboxing and unboxing
The last thing we need to understand before we can learn about equality and comparison is boxing.
Even though primitives and object references have different semantics, they can be used interchangeably, thanks to the Java compiler.
Autoboxing is the automatic conversion of primitive types into their corresponding wrapper classes, and unboxing is the conversion in the other direction. This lets us use both kinds of types almost interchangeably.
For example, a List<Integer> uses the wrapper type Integer, but code that adds an int variable i to it still compiles. That’s possible because the compiler changes our code by autoboxing i, effectively replacing it with Integer.valueOf(i).
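As a rough sketch of this rewrite, the two methods below show the code as we would write it and approximately what the compiler turns it into. The class and method names are hypothetical, and the second method is an approximation of the compiled form, not decompiled output.

```java
import java.util.ArrayList;
import java.util.List;

class AutoboxingSketch {
    // What we write: an int is added to a List<Integer>.
    static List<Integer> asWritten() {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            values.add(i); // the compiler autoboxes i here
        }
        return values;
    }

    // Roughly what the compiler makes of it.
    static List<Integer> asCompiled() {
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            values.add(Integer.valueOf(i)); // explicit boxing
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(asWritten().equals(asCompiled())); // true
    }
}
```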
The same is true the other way around. Even though operators like % and + aren’t available to the object type Integer, code that applies them to Integer values compiles fine, because the compiler unboxes the wrapper type by calling intValue().
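The unboxing direction can be sketched the same way. Again, the names are invented for illustration, and the second method only approximates what the compiler generates.

```java
import java.util.Arrays;
import java.util.List;

class UnboxingSketch {
    // What we write: % and + applied directly to Integer values.
    static int sumOfEvens(List<Integer> values) {
        int sum = 0;
        for (Integer value : values) {
            if (value % 2 == 0) {
                sum = sum + value;
            }
        }
        return sum;
    }

    // Roughly what the compiler makes of it.
    static int sumOfEvensUnboxed(List<Integer> values) {
        int sum = 0;
        for (Integer value : values) {
            if (value.intValue() % 2 == 0) {
                sum = sum + value.intValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumOfEvens(values));        // 6
        System.out.println(sumOfEvensUnboxed(values)); // 6
    }
}
```

If the list contained a null element, both versions would throw a NullPointerException at the point of unboxing.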
Equality
If we look at other programming languages, the most logical conclusion for how to compare values might be the == operator and its counterpart !=.
Yes, we can use them to check for equality, and they compare values against each other, but it might not be the value you’re expecting.
Primitives
Primitive variables hold their values directly, so they can be tested for equality with ==.
Except when they can’t.
In contrast to the other primitive data types, the floating-point data types float and double can’t reliably be checked for equality with ==. They are stored as binary floating-point numbers, which can’t represent many decimal fractions exactly, so results of calculations are often only approximations.
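A classic illustration of the problem, as a minimal sketch (the class and method names are hypothetical):

```java
class FloatingPointEquality {
    // Compares the result of a calculation with the value we would expect on paper.
    static boolean naiveEquals() {
        double sum = 0.1 + 0.2;
        return sum == 0.3;
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);     // 0.30000000000000004
        System.out.println(naiveEquals()); // false
    }
}
```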
We’ve got two options to deal with this: either we use java.math.BigDecimal, which is exact, or we use threshold-based comparisons.
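Both options could look roughly like the sketch below. The tolerance EPSILON is an arbitrary value chosen for illustration; a suitable threshold depends on the magnitude of the numbers involved. All names are hypothetical.

```java
import java.math.BigDecimal;

class ExactOrThreshold {
    // Assumed tolerance for this sketch only; pick one that fits the use case.
    static final double EPSILON = 1e-9;

    // Threshold-based comparison: equal if the difference is small enough.
    static boolean nearlyEqual(double left, double right) {
        return Math.abs(left - right) < EPSILON;
    }

    // Exact arithmetic: the String constructor avoids inheriting double's inexactness.
    static boolean exactSumMatches() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(nearlyEqual(0.1 + 0.2, 0.3)); // true
        System.out.println(exactSumMatches());           // true
    }
}
```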
Arrays
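Another pitfall is arrays of primitives: arrays aren’t a primitive type, they’re objects. Comparing two arrays with == or equals compares their references, not their elements, whereas java.util.Arrays.equals compares the content.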
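A short sketch of the difference, using hypothetical helper names:

```java
import java.util.Arrays;

class ArrayEquality {
    // Reference comparison: true only for the very same array object.
    static boolean sameReference(int[] left, int[] right) {
        return left == right;
    }

    // Content comparison: true if both arrays hold the same elements in the same order.
    static boolean sameContent(int[] left, int[] right) {
        return Arrays.equals(left, right);
    }

    public static void main(String[] args) {
        int[] first = {1, 2, 3};
        int[] second = {1, 2, 3};
        System.out.println(sameReference(first, second)); // false
        System.out.println(first.equals(second));         // false, inherited from Object
        System.out.println(sameContent(first, second));   // true
    }
}
```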
Objects
If you compare objects with ==, it also compares values. The catch is that the value of an object variable is actually its reference, hence the name object reference type.
This means two values are only equal if they point to the same object in memory.
In practice, variables might be equal in some cases, but not in others:
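The following sketch shows such a case with strings. The variable names are chosen for illustration, and the method returns the three results as an array only to make them easy to inspect.

```java
class StringIdentity {
    static boolean[] compare() {
        String a = "a";
        String b = "b";
        String ab = "ab";
        String alias = ab;

        boolean result1 = ab == alias; // true: both variables hold the same reference
        boolean result2 = ab == "ab";  // true: identical literals refer to one interned String
        boolean result3 = ab == a + b; // false: the runtime concatenation creates a new String

        return new boolean[] {result1, result2, result3};
    }

    public static void main(String[] args) {
        boolean[] results = compare();
        System.out.println(results[0]); // true
        System.out.println(results[1]); // true
        System.out.println(results[2]); // false
    }
}
```

Note that a and b are deliberately not declared final; with final constants, the compiler would fold a + b into the literal "ab" at compile time.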
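The compiler and the JVM intern string constants, so comparing two identical literals, such as "ab" == "ab", is true. Comparing a literal with the result of concatenating two variables, a + b, is false because the concatenation creates a new object in memory. Whether two variables end up sharing an object depends on how each value was created, so == is not a reliable way to compare strings.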
Another “not so obvious” pitfall can happen with primitive wrapper types:
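A sketch of the effect, assuming default JVM settings (the upper bound of the Integer cache can be raised with a JVM option, which would change the second result). All names are hypothetical.

```java
class IntegerCachePitfall {
    static boolean smallValuesIdentical() {
        Integer a = 127;
        Integer b = 127;
        return a == b; // true: both references come from the Integer cache
    }

    static boolean largeValuesIdentical() {
        Integer c = 128;
        Integer d = 128;
        return c == d; // false with default settings: two distinct objects
    }

    static boolean wrapperVersusPrimitive() {
        Integer c = 128;
        int e = 128;
        return c == e; // true: c is unboxed and the int values are compared
    }

    public static void main(String[] args) {
        boolean equal = smallValuesIdentical();
        boolean notEqual = largeValuesIdentical();
        boolean equalAgain = wrapperVersusPrimitive();
        System.out.println(equal);      // true
        System.out.println(notEqual);   // false
        System.out.println(equalAgain); // true
    }
}
```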
What? This one took me by surprise, too.
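The valueOf(...) methods of java.lang.Integer and java.lang.Long, which autoboxing uses, cache values in the range -128 to 127. Two variables a and b that are both autoboxed from 127 therefore refer to the same object, but two variables c and d that are both autoboxed from 128 do not. And thanks to unboxing, comparing a wrapper with a primitive, such as c == 128, is true.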
Object.equals(Object other) and Object.hashCode()
The java.lang.Object class provides an equals method for all its subclasses, with a quite simple implementation: it returns true only if both references point to the same object (this == obj).
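To see the inherited behavior in action, here is a minimal sketch with a made-up Point type that does not override equals:

```java
class DefaultEquals {
    // Hypothetical type that relies on the equals inherited from Object.
    static class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    static boolean inheritedEquals(Point left, Point right) {
        return left.equals(right); // behaves like left == right
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = new Point(1, 2);
        System.out.println(inheritedEquals(p, p)); // true: same object
        System.out.println(inheritedEquals(p, q)); // false: same content, different objects
    }
}
```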
By default, every one of our types inherits the “problematic” comparison of object references. To be able to use equals for actual equality, we need to override it in our types so that it has the following properties:
- Reflexive: An object should be equal to itself: obj.equals(obj) == true.
- Symmetric: If a.equals(b) == true, then b.equals(a) must also be true.
- Transitive: If a.equals(b) == true and b.equals(c) == true, then a.equals(c) should be true.
- Consistent: a.equals(b) should always have the same value for unmodified objects.
- Null handling: a.equals(null) should be false.
- Hash code: Equal objects must have the same hash code.
If we provide our own equals method, we also need to override hashCode.
Since Java 7, the class java.util.Objects provides helpers such as Objects.equals(...) and Objects.hash(...) that simplify our code.
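One possible shape of such an implementation is sketched below. The Person type and its fields are invented for illustration, and the strict getClass() check is one design choice among several.

```java
import java.util.Objects;

// Hypothetical value type used only to illustrate equals and hashCode.
class Person {
    private final String name;
    private final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true; // reflexive shortcut
        }
        // Strict class comparison: null and other types are never equal.
        if (other == null || getClass() != other.getClass()) {
            return false;
        }
        Person that = (Person) other;
        // Objects.equals is null-safe for the name field.
        return age == that.age && Objects.equals(name, that.name);
    }

    @Override
    public int hashCode() {
        // Built from the same fields that equals compares.
        return Objects.hash(name, age);
    }

    public static void main(String[] args) {
        Person first = new Person("Ada", 36);
        Person second = new Person("Ada", 36);
        System.out.println(first.equals(second));                  // true
        System.out.println(first.hashCode() == second.hashCode()); // true
        System.out.println(first.equals(null));                    // false
    }
}
```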
Be aware of how equals compares the classes of both objects. We might be inclined to use instanceof instead of an exact getClass() check, but if a subclass overrides equals or hashCode, this might break the symmetry of equals or violate the general contract between equals and hashCode: Equal objects must have the same hash code.
Of course, we can design our objects so that even subclasses are equal to their parents. But the definition of equality must be the same for both, and the hash code calculation must occur in the base class.
The classes java.util.Date and its subclass java.sql.Date are defined that way. The sql version doesn’t have an equals or hashCode method of its own, and the base class builds its hash code solely from the timestamp.
Another example is collection classes: java.util.ArrayList and java.util.LinkedList are both subclasses of java.util.AbstractList and share the equals and hashCode semantics it implements. Equality for collections is most of the time defined by the equality of their content, so using instanceof and not a hard class check seems appropriate.
Comparison
Just testing for equality is seldom enough. The other significant kinds of operations are comparisons of values.
Primitives
Like in other languages, we can compare the values of primitives with the <, >, <=, and >= operators.
The same problems of floating-point data types apply to them, so be aware. Also, boolean isn’t comparable except for equality with == and !=.
java.lang.Comparable
Objects don’t support these operators. To compare object types, we need to implement the interface java.lang.Comparable<T> with its single method int compareTo(T).
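The result of left.compareTo(right) is supposed to be a negative integer if left is less than right, zero if both are equal, and a positive integer if left is greater than right.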
The result represents the natural order of our type, not just arithmetical comparability. This way, we can make collections of our type sortable.
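As a minimal sketch, the hypothetical Version type below defines its natural order by major and then minor number, which makes a list of versions sortable. A real type would normally also override equals and hashCode consistently with compareTo; that is left out here for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical type with a natural order: by major, then by minor.
class Version implements Comparable<Version> {
    final int major;
    final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public int compareTo(Version other) {
        int byMajor = Integer.compare(major, other.major);
        // Only fall back to minor if the major numbers are equal.
        return byMajor != 0 ? byMajor : Integer.compare(minor, other.minor);
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }

    public static void main(String[] args) {
        List<Version> versions = new ArrayList<>(
                Arrays.asList(new Version(2, 0), new Version(1, 5), new Version(1, 2)));
        Collections.sort(versions); // uses compareTo
        System.out.println(versions); // [1.2, 1.5, 2.0]
    }
}
```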
Best Practices
There are some simple rules we should follow to avoid wrong results when comparing values for equality or by their natural order.
Never compare objects with ==
It only works if it’s the same object. Different objects with the same value are not equal under ==. Always use boolean equals(Object other) to compare for equality.
Always implement equals and hashCode if needed
To make a type testable for equality, we need to implement both equals and hashCode to ensure correct and consistent behavior.
Floating-point data types aren’t exact
Always remember that floating-point data types aren’t exact and are harder to compare.
If we need to work with decimal values and need absolute precision, we should use java.math.BigDecimal. But be aware that its equals also takes the scale into account, so 1.0 and 1.00 are not considered equal.
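A small sketch of this behavior (the class and method names are hypothetical):

```java
import java.math.BigDecimal;

class BigDecimalEquals {
    // equals compares both the numeric value and the scale.
    static boolean strictlyEqual(String left, String right) {
        return new BigDecimal(left).equals(new BigDecimal(right));
    }

    public static void main(String[] args) {
        System.out.println(strictlyEqual("1.0", "1.0"));  // true
        System.out.println(strictlyEqual("1.0", "1.00")); // false: same value, different scale
    }
}
```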
If we need a more relaxed comparison that ignores the scale, we can use compareTo.
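Sketched with the same kind of hypothetical helper, a scale-insensitive check could look like this:

```java
import java.math.BigDecimal;

class BigDecimalCompare {
    // compareTo only looks at the numeric value, so 1.0 and 1.00 compare as equal.
    static boolean sameValue(String left, String right) {
        return new BigDecimal(left).compareTo(new BigDecimal(right)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(sameValue("1.0", "1.00")); // true
        System.out.println(sameValue("1.0", "1.01")); // false
    }
}
```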
The BigDecimal API isn’t pretty compared to primitive operations, primarily due to its immutability. But that’s actually a good thing: correctness comes before beauty.
Be aware of autoboxing/unboxing
Because the compiler does this behind the scenes, we must be sure whether we are comparing primitives or wrapper objects. Thanks to Integer/Long caching, == on wrapper objects appears to work for small values but fails for others.
To be 100% sure, we could use Comparable<T>#compareTo(T) of the wrapper types instead, which always uses the encapsulated value, and not the object reference.
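A brief sketch of that approach, with hypothetical names. Note that compareTo throws a NullPointerException if either wrapper is null, so this assumes non-null values.

```java
class WrapperComparison {
    // Compares the encapsulated values, never the references.
    static boolean sameValue(Integer left, Integer right) {
        return left.compareTo(right) == 0;
    }

    public static void main(String[] args) {
        Integer c = 128;
        Integer d = 128;
        System.out.println(sameValue(c, d)); // true, although 128 is outside the default cache range
        System.out.println(c.equals(d));     // true, equals also compares the value
        System.out.println(sameValue(c, 5)); // false
    }
}
```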