How to Iterate with Java

2021-02-23 · 7 min

Iterating over data structures is one of the most common programming tasks. Everyone knows the classics, like for or while. But Java offers more ways to iterate, and they provide a lot more functionality.

iteration
noun it·er·a·tion | \ ˌi-tə-ˈrā-shən \
one execution of a sequence of operations or instructions in an iteration
— Merriam-Webster

Java 8 is assumed, but the Java 10 feature “Local Variable Type Inference (var)” is used where appropriate to increase readability. Some examples also use List.of, which requires Java 9.

The Classics

There are 3 classic ways to iterate with language-integrated keywords:

for
Enhanced for (Java 5+)
while / do-while

The `for`-loop

A for-loop is defined as:

for (<initialization>; <termination>; <increment>) {
    ...
}

The 3 expressions in the parentheses have different life cycles:

Initialization: Executed once, before the loop starts
Termination: Evaluated before every iteration; the loop ends as soon as it evaluates to false
Increment: Executed after every iteration

Usually, all 3 expressions are used, but they aren’t required:

java

var fibonacci = new int[] { 0, 1, 1, 2, 3, 5, 8 };

for (int idx = 0; idx < fibonacci.length; idx++) {
    int value = fibonacci[idx];
    System.out.println(value);
}

// IS EQUIVALENT TO

int idx = 0;
for ( ; idx < fibonacci.length; ) {
    int value = fibonacci[idx];
    System.out.println(value);
    idx++;
}

Even though it’s possible to write code like this, I wouldn’t recommend it. The for-loop is a great tool to confine the iteration logic to a single place by putting the 3 expressions in a single pair of parentheses. If we split up the expressions, our code becomes harder to reason about.

Without a termination expression, the loop would run endlessly:

java

for ( ; ; ) {
    ...
}

We would need to use break inside the loop to terminate it. Endless loops are often used for a technique called busy waiting, which is considered an anti-pattern. There are many other ways available to us to deal with concurrency, waiting for threads, etc., but these are beyond the scope of this article.
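As a small illustration of leaving an endless loop with break, the following sketch looks for the first Fibonacci number above a limit. The class and method names are made up for this example, and it assumes small limits, so int overflow is not handled.

```java
final class EndlessLoopDemo {

    // Returns the first Fibonacci number that is greater than the given limit.
    static int firstFibonacciAbove(int limit) {
        int previous = 0;
        int current = 1;

        // No initialization, termination, or increment expression:
        // the loop only ends via the break below.
        for ( ; ; ) {
            if (current > limit) {
                break;
            }
            int next = previous + current;
            previous = current;
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(firstFibonacciAbove(8)); // prints 13
    }
}
```

In a case like this, a while-loop with the condition in its head would usually be easier to read. The sketch only shows the mechanics.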

Enhanced for-loop

The traditional expression-based for-loop does its job well, but it’s also a noisy and verbose construct. Most of the time, our loops start with the same logic:

Iterate over all items and access the current item

Java 5 introduced the enhanced for-loop to simplify this common task. It works with any type that implements java.util.Iterable<T>, and with arrays.

Every array has Object as its base class and implements Serializable and Cloneable, but not Iterable; the enhanced for-loop supports arrays directly. One of the main advantages of arrays is their ability to store primitive types (e.g., int, long) and not just full-fledged objects, so we don’t have to rely on auto-boxing.

The syntax is simple:

for (<assignment> : <Iterable/array>) {
    ...
}

Our previous example becomes much more readable:

java

var fibonacci = new int[] { 0, 1, 1, 2, 3, 5, 8 };

for (var number : fibonacci) {
    System.out.println(number);
}

Usually, the actual type is clear from the immediate context, so using var reduces the noise from longer type names.

while / do-while

The while-loop can be seen as a simplified for-loop with only a termination expression:

while (<termination>) {
    ...
}

As with for, the termination expression breaks the loop if it evaluates to false.

There are scenarios where the termination expression needs to be evaluated after the loop-block. That’s what do-while does:

do {
    ...
} while (<termination>);

This way, the loop-block is run at least once, regardless of the termination expression.
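The difference is easiest to see when the termination expression is false from the start. The following sketch counts how often each loop body runs; the class and method names are invented for illustration.

```java
final class DoWhileDemo {

    // Counts the loop-block executions of a while-loop.
    static int runsWithWhile(int start, int limit) {
        int runs = 0;
        int i = start;
        while (i < limit) {
            runs++;
            i++;
        }
        return runs;
    }

    // Counts the loop-block executions of a do-while-loop.
    static int runsWithDoWhile(int start, int limit) {
        int runs = 0;
        int i = start;
        do {
            runs++;
            i++;
        } while (i < limit);
        return runs;
    }

    public static void main(String[] args) {
        // The termination expression (5 < 5) is false right away
        System.out.println(runsWithWhile(5, 5));   // prints 0
        System.out.println(runsWithDoWhile(5, 5)); // prints 1
    }
}
```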

Iterator-based

Iterable

The interface java.util.Iterable<E> makes a data structure usable for the enhanced for-loop by providing a java.util.Iterator<E>:

java

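interface Iterator<E> {

    boolean hasNext();

    E next();

    // Optional operation: since Java 8 a default method that throws
    // UnsupportedOperationException unless overridden
    void remove();

    // Java 8+ (default method)
    void forEachRemaining(Consumer<? super E> action);
}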

It’s a simple abstraction over a data structure: it knows whether there is another element and how to get it. The surrounding frame for the actual iteration is usually provided by a loop:

java


var fibonacci = List.of(0, 1, 1, 2, 3, 5, 8);

Iterator<Integer> iter = fibonacci.iterator();

while (iter.hasNext()) {
    int value = iter.next();
    System.out.println(value);
}

That’s quite noisy compared to an enhanced for-loop, so what’s the advantage?

The Iterator#remove() method makes all the difference. While iterating with an enhanced for-loop, we can’t structurally modify the collection we’re iterating over. This code will throw a java.util.ConcurrentModificationException:

java

var fibonacci = List.of(0, 1, 1, 2, 3, 5, 8);

// Make it mutable
var mutable = new ArrayList<Integer>(fibonacci);

// throws java.util.ConcurrentModificationException
for (var value : mutable) {
    if (value % 2 == 0) {
        mutable.remove(value);
    }
}

Actually, the ConcurrentModificationException isn’t thrown during the call to List#remove(Object) in this case. Instead, the Iterator<Integer> behind the for-loop throws the exception on its following call to next().

By using an Iterator directly, we can remove elements while iterating:

java

var fibonacci = List.of(0, 1, 1, 2, 3, 5, 8);

// We need a mutable List
List<Integer> mutable = new ArrayList<>(fibonacci);

var iter = mutable.iterator();

while (iter.hasNext()) {
    int value = iter.next();
    if (value % 2 == 0) {
        iter.remove();
    }
}

// mutable => [1, 1, 3, 5]

We could also use a for-loop instead, and combine initialization and termination into a single line:

java

var fibonacci = List.of(0, 1, 1, 2, 3, 5, 8);

// We need a mutable List
List<Integer> mutable = new ArrayList<>(fibonacci);

for (var iter = mutable.iterator(); iter.hasNext();) {
    int value = iter.next();
    if (value % 2 == 0) {
        iter.remove();
    }
}

// mutable => [1, 1, 3, 5]
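To see how Iterable and Iterator fit together, here is a minimal sketch of a custom type that can be used in an enhanced for-loop. Countdown is a hypothetical class invented for this example. It doesn’t override remove(), so the default implementation throws an UnsupportedOperationException.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical type: iterates from a start value down to 1.
final class Countdown implements Iterable<Integer> {

    private final int start;

    Countdown(int start) {
        this.start = start;
    }

    @Override
    public Iterator<Integer> iterator() {
        // Every call returns a fresh Iterator with its own state
        return new Iterator<Integer>() {

            private int current = start;

            @Override
            public boolean hasNext() {
                return current > 0;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current--;
            }
        };
    }

    // Collects all values with an enhanced for-loop.
    static List<Integer> collect(int start) {
        List<Integer> result = new ArrayList<>();
        for (int value : new Countdown(start)) {
            result.add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(collect(3)); // prints [3, 2, 1]
    }
}
```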

ListIterator

Being an Iterator<E> at heart, the ListIterator<E> provides additional functionality to navigate backward, and for modification:

java

interface ListIterator<E> extends Iterator<E> {

    // NAVIGATION

    boolean hasPrevious();

    E previous();

    int nextIndex();

    int previousIndex();

    // MODIFICATION

    void set(E e);

    void add(E e);
}

A ListIterator can be visualized as being between elements:

             E[0]   E[1]   ... E[n]  
Positions: ^      ^      ^          ^

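set(E) and remove() operate on the last element returned by next() or previous(). add(E) instead inserts the new element at the current cursor position, directly before the element that next() would return.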
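A short sketch of these operations in action, assuming an ArrayList as the underlying list. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

final class ListIteratorDemo {

    // Multiplies even numbers by 10 and inserts the negated value after odd numbers.
    static List<Integer> modify(List<Integer> source) {
        List<Integer> result = new ArrayList<>(source);
        ListIterator<Integer> iter = result.listIterator();

        while (iter.hasNext()) {
            int value = iter.next();
            if (value % 2 == 0) {
                // set() replaces the element last returned by next()
                iter.set(value * 10);
            } else {
                // add() inserts at the cursor, directly after the element
                // just returned; later next() calls skip the new element
                iter.add(-value);
            }
        }
        return result;
    }

    // Walks the list backward, starting with the cursor behind the last element.
    static List<Integer> reversed(List<Integer> source) {
        List<Integer> result = new ArrayList<>();
        ListIterator<Integer> iter = source.listIterator(source.size());

        while (iter.hasPrevious()) {
            result.add(iter.previous());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(modify(numbers));   // prints [1, -1, 20, 3, -3]
        System.out.println(reversed(numbers)); // prints [3, 2, 1]
    }
}
```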

Lambda-based

The introduction of lambdas with Java 8 brought new possibilities to iterate. Thanks to default methods, every type implementing Iterable<T> gains the ability to apply a Consumer<T> to every element:

java

interface Iterable<T> {
    
    default void forEach(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        for (T t : this) {
            action.accept(t);
        }
    }
}

As we can see by its simple implementation, we won’t gain much compared to a normal enhanced for-loop. That’s why it’s best for simple use cases, like calling a method reference:

java

var fibonacci = List.of(0, 1, 1, 2, 3, 5, 8);

fibonacci.forEach(System.out::println);

forEach with Streams

Streams are lazy-sequential data pipelines of functional blocks, which means they will iterate over the data, applying the functional blocks appropriately.

In addition to iterating with the stream’s intermediate operations themselves, we can also use the terminal operation Stream#forEach(Consumer<? super T> action). But just like with Iterable, smaller and more focused blocks improve our code’s general clarity.

It’s better to move as much logic as possible into the intermediate operations, and to filter as early in the pipeline as possible, so the total number of operations stays as small as possible:

java

var fibonacci = List.of(0, 1, 1, 2, 3, 5, 8);

fibonacci.stream()
         .filter(number -> number % 2 != 0)
         .forEach(System.out::println);
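The effect of filtering early can be made visible by counting how often a later step runs. The following sketch is only an illustration: the class and method names are invented, and the AtomicInteger exists purely to count calls. Squaring keeps a number odd or even, so both pipelines produce the same elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

final class StreamOrderDemo {

    // Filters first, so only the odd numbers reach the mapping step.
    static int mapCallsFilterFirst(List<Integer> numbers) {
        AtomicInteger calls = new AtomicInteger();
        numbers.stream()
               .filter(n -> n % 2 != 0)
               .map(n -> { calls.incrementAndGet(); return n * n; })
               .forEach(n -> { });
        return calls.get();
    }

    // Maps first, so every number reaches the mapping step.
    static int mapCallsFilterLast(List<Integer> numbers) {
        AtomicInteger calls = new AtomicInteger();
        numbers.stream()
               .map(n -> { calls.incrementAndGet(); return n * n; })
               .filter(n -> n % 2 != 0)
               .forEach(n -> { });
        return calls.get();
    }

    public static void main(String[] args) {
        List<Integer> fibonacci = Arrays.asList(0, 1, 1, 2, 3, 5, 8);
        System.out.println(mapCallsFilterFirst(fibonacci)); // prints 4
        System.out.println(mapCallsFilterLast(fibonacci));  // prints 7
    }
}
```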

Lambda vs. Traditional

We have now learned about different ways to iterate over data structures.

But how do we choose which to use?

As much as I love lambdas, they come with multiple downsides, especially regarding iteration:

Exceptions: Handling (checked) exceptions isn’t fun with lambdas. I wrote a whole article on how to deal with them in a functional manner.
No breaks: A return in a lambda will function like a continue in a traditional loop, but there’s no equivalent for break.
JVM optimizations: No loop-unrolling. This doesn’t mean there are no optimizations of lambdas, but they are different compared to traditional loops.
Deeper callstack: A stack frame will be created for the additional method call.
Debugging: Even though IDEs got better at handling lambdas and streams, it’s still simpler to step through a loop. Especially with the additional stack frame.
No side effects: Only effectively final local variables can be captured by a lambda, so it can’t reassign a local variable such as a counter declared outside of it.
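Two of these points, the missing break and the capture rules, can be sketched with Iterable#forEach next to an enhanced for-loop. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LambdaLimitsDemo {

    // forEach: return only ends the current invocation of the lambda.
    static List<Integer> oddsViaForEach(List<Integer> numbers) {
        // The reference "result" is effectively final, so the lambda may
        // capture it. Reassigning a local variable (e.g., incrementing a
        // local int counter) inside the lambda would not compile.
        List<Integer> result = new ArrayList<>();

        numbers.forEach(n -> {
            if (n % 2 == 0) {
                return; // acts like continue
            }
            result.add(n);
        });
        return result;
    }

    // Enhanced for-loop: break stops the whole iteration early.
    static List<Integer> oddsUpTo(List<Integer> numbers, int limit) {
        List<Integer> result = new ArrayList<>();

        for (int n : numbers) {
            if (n > limit) {
                break; // no equivalent inside forEach
            }
            if (n % 2 == 0) {
                continue;
            }
            result.add(n);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> fibonacci = Arrays.asList(0, 1, 1, 2, 3, 5, 8);
        System.out.println(oddsViaForEach(fibonacci)); // prints [1, 1, 3, 5]
        System.out.println(oddsUpTo(fibonacci, 3));    // prints [1, 1, 3]
    }
}
```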

At first read, that might sound really bad. But lambdas and streams also offer many advantages:

Parallel execution: Streams can be parallelized without needing an additional ExecutorService.
Fluency: A fluent call can split up operations into more manageable blocks. The whole call can be more concise and explicit compared to a loop-block.
No side effects: Fewer side effects due to effectively final variables.
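As a rough sketch of the parallel execution point, the same pipeline can run sequentially or in parallel by adding a single call. The class and method names are invented for this example. Parallel streams run on the common ForkJoinPool by default, and whether they are actually faster depends on the workload.

```java
import java.util.stream.IntStream;

final class ParallelSumDemo {

    // Sums the squares of 1..upTo, optionally as a parallel stream.
    static long sumOfSquares(int upTo, boolean parallel) {
        IntStream range = IntStream.rangeClosed(1, upTo);
        if (parallel) {
            // No ExecutorService needed: one call switches the stream to parallel
            range = range.parallel();
        }
        return range.mapToLong(n -> (long) n * n)
                    .sum();
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000, false)); // prints 333833500
        System.out.println(sumOfSquares(1000, true));  // prints 333833500
    }
}
```

The lambda has no side effects and the sum doesn’t depend on the order of the elements, so both variants return the same result.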

Conclusion

As you can see, there’s no simple answer to which type of iteration to use. It all depends on the context. Code readability and maintainability should play a big role in deciding which kind of iteration fits best.