Functional Programming With Java: map, filter, reduce

2020-09-30 · 7 min

The concepts of map, filter, and reduce are cornerstones of functional programming.

Usually, our data pipelines consist of one or more intermediate operations, transforming (aka mapping) and/or filtering elements, and a terminal operation to gather the data again (aka reducing).

With just these three, we can do a lot, so it’s worth knowing them intimately. But they have some close relatives that can be useful, too.


This article assumes Java 9 or later. The examples that use var need Java 10.
Method signatures and visibility modifiers are shortened for readability.


map

Stream#map(Function<T, R> mapper) is an intermediate stream operation that transforms each element.

It applies its argument, a Function<T, R>, to every element and returns a Stream<R>:

java

List.of("hello", "world")
    .stream()
    .map(String::toUpperCase)
    .forEach(System.out::println);

// Output:
// HELLO
// WORLD

That’s the gist; map is pretty straightforward to use. But there are specialized map functions depending on the type.

flatMap

Stream#flatMap(Function<T, Stream<R>>) is the often-misunderstood sibling of map.

Sometimes the mapping function will return an arbitrary number of results, wrapped in another type, like java.util.List:

java

var identifier = List.of(1L, 5L);

Function<Long, List<String>> mapper = (id) -> ...;

identifier.stream()     // Stream<Long>
          .map(mapper)  // Stream<List<String>>
          ???

Most likely, we want to work on the list’s content, not the list itself.

By using flatMap, we can map the Stream<List<String>> to a Stream<String>:

java

var identifier = List.of(1L, 5L);

Function<Long, List<String>> mapper = (id) -> ...;

identifier.stream()                    // Stream<Long>
          .map(mapper)                 // Stream<List<String>>
          .flatMap(Collection::stream) // Stream<String>
          ...
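
To make the pipeline above concrete, here is a minimal runnable sketch. The lookup table TAGS_BY_ID and the method tagsFor are hypothetical stand-ins for the elided mapper logic, not part of the original example:

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class FlatMapSketch {

    // Hypothetical lookup table standing in for the elided mapper logic.
    static final Map<Long, List<String>> TAGS_BY_ID = Map.of(
            1L, List.of("a", "b"),
            5L, List.of("c"));

    static List<String> tagsFor(List<Long> identifier) {
        // Assumption: unknown ids simply map to an empty list.
        Function<Long, List<String>> mapper =
                id -> TAGS_BY_ID.getOrDefault(id, List.of());

        return identifier.stream()                    // Stream<Long>
                         .map(mapper)                 // Stream<List<String>>
                         .flatMap(Collection::stream) // Stream<String>
                         .collect(Collectors.toList());
    }
}
```

Calling FlatMapSketch.tagsFor(List.of(1L, 5L)) yields a single flat list containing "a", "b", and "c". Ids that map to an empty list contribute no elements at all.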

Optional#flatMap

In the case of java.util.Optional<T>, the flatMap method is used to avoid a nested Optional<Optional<T>> when the mapping function itself returns an Optional:

java

MyBean myBean = ...;
Optional.ofNullable(myBean)           // Optional<MyBean>
        .map(MyBean::returnsOptional) // Optional<Optional<String>>
        ???

// CAN BE REPLACED WITH
Optional.ofNullable(myBean)               // Optional<MyBean>
        .flatMap(MyBean::returnsOptional) // Optional<String>
        ...

In fact, the implementation of flatMap does even less than map: it returns the mapper’s Optional as-is instead of wrapping the result in a new Optional.
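
A small runnable sketch of the same idea follows. The MyBean class here is a hypothetical bean with a single optional field, invented only for illustration:

```java
import java.util.Optional;

class OptionalFlatMapSketch {

    // Hypothetical bean; the field and method names are illustrative only.
    static class MyBean {
        private final String nickname;

        MyBean(String nickname) {
            this.nickname = nickname;
        }

        Optional<String> returnsOptional() {
            return Optional.ofNullable(nickname);
        }
    }

    static Optional<String> nicknameOf(MyBean myBean) {
        return Optional.ofNullable(myBean)                // Optional<MyBean>
                       .flatMap(MyBean::returnsOptional); // Optional<String>
    }
}
```

Both a null bean and a bean without a nickname end up as an empty Optional<String>, so the caller only has to handle one "missing" case.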

Value-type map / flatMap

Until Project Valhalla with generic specialization arrives, combining value types (primitives such as int, long, and double) with generics remains a special case.

We could rely on auto-boxing, but we can’t deny that there’s an added overhead. The JDK includes specialized Stream types to improve dealing with value types:

IntStream
LongStream
DoubleStream

If our mapping function returns one of the related value types, we can use the corresponding mapTo...(mapper) / flatMapTo...(mapper) method, for example mapToInt or flatMapToInt, to create a value-type-based Stream.

This way, we can get a real array of int, without intermediate boxing:

java

int[] hashCodes = List.of("hello", "world")
                      .stream()
                      .mapToInt(String::hashCode)
                      .toArray();
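
As a further sketch of the specialized variants, the following shows mapToInt and flatMapToInt side by side. The class and method names are made up for this example:

```java
import java.util.List;

class PrimitiveStreamSketch {

    // Sums the lengths of all words without boxing each length into an Integer.
    static int totalLength(List<String> words) {
        return words.stream()
                    .mapToInt(String::length) // IntStream
                    .sum();
    }

    // flatMapToInt flattens each word into its char values (as ints).
    static int[] allChars(List<String> words) {
        return words.stream()
                    .flatMapToInt(String::chars) // IntStream
                    .toArray();
    }
}
```

The IntStream also brings value-type-specific terminal operations such as sum(), which a plain Stream<Integer> doesn't offer.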

forEach

As mentioned before, map is an intermediate operation. Many other languages use it to perform actions on all elements, discarding any return value.

We could abuse map like that too, as long as a terminal operation follows to trigger the pipeline, but there’s a better way.

By utilizing the terminal operation Stream#forEach(Consumer<T>), we apply the consumer on every element of the stream:

java

listeners.stream()
         ... // other operations
         .forEach(this::trigger);
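
The difference matters because intermediate operations are lazy. The sketch below, with listeners modeled as plain strings and hypothetical method names, contrasts a map call without a terminal operation with forEach:

```java
import java.util.ArrayList;
import java.util.List;

class ForEachSketch {

    // map without a terminal operation: the pipeline never runs.
    static List<String> viaMapOnly(List<String> listeners) {
        List<String> triggered = new ArrayList<>();
        listeners.stream()
                 .map(triggered::add); // lazy, never executed
        return triggered;
    }

    // forEach is a terminal operation, so every element is consumed.
    static List<String> viaForEach(List<String> listeners) {
        List<String> triggered = new ArrayList<>();
        listeners.stream()
                 .forEach(triggered::add);
        return triggered;
    }
}
```

viaMapOnly returns an empty list, while viaForEach returns all listeners in encounter order.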

filter

Stream<T>#filter(Predicate<T> predicate) is used for, you guessed it, filtering elements.

If the predicate returns true for an element, that element travels further down the stream:

java

Predicate<Long> isEven = (value) -> value % 2L == 0;
List.of(1L, 2L, 3L, 5L, 8L, 13L)
    .stream()
    .filter(isEven)
    .forEach(System.out::println);

// Output:
// 2
// 8

If we use a variable for the predicate, it’s easily negatable using Predicate<T>#negate(). Java 11 even provides us with the static <T> Predicate<T> not(Predicate<? super T> target) method, so we can negate a lambda or method reference directly:

java

// Java 11+
List.of(1L, 2L, 3L, 5L, 8L, 13L)
    .stream()
    .filter(Predicate.not(value -> value % 2L == 0))
    .forEach(System.out::println);

// Output:
// 1
// 3
// 5
// 13

Not all of us are already on Java 11. But we can replicate it in a helper class:

java

// Java < 11
class StreamHelpers {
    static <T> Predicate<T> not(Predicate<? super T> predicate) {
        Objects.requireNonNull(predicate);
        return (Predicate<T>) predicate.negate();
    }
}
// How to use it in our code
List.of(1L, 2L, 3L, 5L, 8L, 13L)
    .stream()
    .filter(StreamHelpers.not(value -> value % 2L == 0))
    .forEach(System.out::println);

// Output:
// 1
// 3
// 5
// 13

The helper method could also be statically imported, so we can omit the StreamHelpers prefix.
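
For completeness, the Predicate<T>#negate() approach mentioned earlier works on every Java version that has streams. A minimal sketch, with a hypothetical class name:

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class NegateSketch {

    static final Predicate<Long> IS_EVEN = value -> value % 2L == 0;

    // Keeps only the values that do NOT match IS_EVEN.
    static List<Long> oddValues(List<Long> values) {
        return values.stream()
                     .filter(IS_EVEN.negate())
                     .collect(Collectors.toList());
    }
}
```

The drawback is that the predicate must be stored in a variable or constant first, which is exactly what Predicate.not and the helper method avoid.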

takeWhile / dropWhile

The two methods, takeWhile and dropWhile, are close relatives to filter. Their names are pretty self-explanatory.

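They are stateful intermediate operations that depend on the encounter order. takeWhile is also short-circuiting, so it doesn’t process all elements of a stream if not necessary.

As soon as the predicate returns false for the first time, the rest of the stream is discarded (takeWhile), or everything before that element is discarded (dropWhile):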

java

// TAKE WHILE
List.of(1L, 5L, 7L, 10L, 11L, 12L)
    .stream()
    .takeWhile(value -> value % 2 != 0)
    .forEach(System.out::println);

// Output:
// 1
// 5
// 7


// DROP WHILE
List.of(1L, 5L, 7L, 10L, 11L, 12L)
    .stream()
    .dropWhile(value -> value % 2 != 0)
    .forEach(System.out::println);

// Output:
// 10
// 11
// 12
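
The short-circuiting behavior of takeWhile is easiest to see on an infinite stream. The following sketch (the method name is an assumption for this example) terminates even though Stream.iterate never ends on its own:

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TakeWhileSketch {

    // Powers of two below the given limit, drawn from an infinite stream.
    static List<Long> powersOfTwoBelow(long limit) {
        return Stream.iterate(1L, value -> value * 2L) // infinite, ordered
                     .takeWhile(value -> value < limit) // stops at the first mismatch
                     .collect(Collectors.toList());
    }
}
```

Replacing takeWhile with filter here would never finish, because filter has to test every element of the infinite stream.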

Unordered streams

As long as a stream is ordered, these methods work as intended. In the case of unordered streams, they can easily become non-deterministic. If some, but not all, elements match the predicate, the returned elements are arbitrary:

java

Set.of(1L, 5L, 7L, 10L, 11L, 12L)
    .stream()
    .dropWhile(value -> value % 2 != 0)
    .forEach(System.out::println);
// Possible output (may vary between runs):
// 10
// 7
// 5
// 1
// 12

The reason is simple: Because it’s not clear in which order the predicate encounters the elements, the result can’t be deterministic.
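
One way to get a deterministic result, sketched here under the assumption that natural ordering is the order we actually want, is to impose an encounter order with sorted() before calling dropWhile:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class OrderedDropWhileSketch {

    static List<Long> dropLeadingOdd(Set<Long> values) {
        return values.stream()
                     .sorted()                           // imposes an encounter order
                     .dropWhile(value -> value % 2 != 0) // now deterministic
                     .collect(Collectors.toList());
    }
}
```

For Set.of(1L, 5L, 7L, 10L, 11L, 12L), this always returns 10, 11, and 12, matching the List-based example.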

Parallel streams

Due to the ordered nature of the methods, using them in parallel streams is quite expensive, impacting overall performance. Usually, a sequential stream is a better choice for takeWhile or dropWhile.

reduce

The reduce method, also known as fold in functional programming lingo, accumulates the elements of the stream with a BinaryOperator<T> and reduces them to a single value:

T reduce(T initialValue, BinaryOperator<T> accumulator)

A common use case is summing up values:

java

// List of cart item beans
var cartItems = List.of(...);
var total =
    cartItems.stream()                   // Stream<CartItem>
             .map(CartItem::getQuantity) // Stream<BigDecimal>
             .reduce(BigDecimal.ZERO,
                     (acc, current) -> acc.add(current));
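
Here is a self-contained version of that summation. The CartItem class is a hypothetical bean that models only the quantity, and BigDecimal::add is used as a shorthand for the lambda:

```java
import java.math.BigDecimal;
import java.util.List;

class CartTotalSketch {

    // Hypothetical cart item bean; only the quantity is modeled here.
    static class CartItem {
        private final BigDecimal quantity;

        CartItem(String quantity) {
            this.quantity = new BigDecimal(quantity);
        }

        BigDecimal getQuantity() {
            return quantity;
        }
    }

    static BigDecimal totalQuantity(List<CartItem> cartItems) {
        return cartItems.stream()                   // Stream<CartItem>
                        .map(CartItem::getQuantity) // Stream<BigDecimal>
                        .reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```

For an empty cart, the result is simply the initial value, BigDecimal.ZERO.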

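There are two additional reduce variants available:

Optional<T> reduce(BinaryOperator<T> accumulator)
U reduce(U identity, BiFunction<U, T, U> accumulator, BinaryOperator<U> combiner)

The first one doesn’t require an initial value. As a consequence, there might be no elements to accumulate at all, hence the Optional<T> as return type.

The second one is geared towards parallel streams, and its result type may differ from the element type. The accumulation can be parallelized, and the combiner merges the multiple partial results.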
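
Both variants can be sketched in a few lines. The class and method names below are made up for this example:

```java
import java.util.List;
import java.util.Optional;

class ReduceVariantsSketch {

    // No initial value: an empty stream yields Optional.empty().
    static Optional<Long> sum(List<Long> values) {
        return values.stream()
                     .reduce(Long::sum);
    }

    // Initial value + accumulator + combiner:
    // the result type (Integer) differs from the element type (String).
    static int totalLength(List<String> words) {
        return words.parallelStream()
                    .reduce(0,
                            (acc, word) -> acc + word.length(), // folds one element in
                            Integer::sum);                      // merges partial results
    }
}
```

The combiner is only needed to merge partial results, but it has to be compatible with the accumulator so that sequential and parallel runs agree.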

count/sum/min/max

Common reduce use cases are already available to us, depending on the stream type:

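long Stream<T>#count()
Optional<T> Stream<T>#min(Comparator<T> comparator)
Optional<T> Stream<T>#max(Comparator<T> comparator)
int IntStream#sum()
long LongStream#sum()
double DoubleStream#sum()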

To better understand reduce operations, let’s make a naive implementation ourselves:

java

// COUNT
Long count =
    List.of(1L, 2L, 3L)
        .stream()
        .reduce(0L,
                (acc, cur) -> acc + 1L);
// SUM
Long sum =
    List.of(1L, 2L, 3L)
        .stream()
        .reduce(0L,
                (acc, cur) -> acc + cur);
// MIN
Long min =
    List.of(10L, 5L, 11L)
        .stream()
        .reduce(Long.MAX_VALUE,
                (acc, cur) -> acc.compareTo(cur) < 0 ? acc : cur);
// MAX
Long max =
    List.of(10L, 5L, 11L)
        .stream()
        .reduce(Long.MIN_VALUE,
                (acc, cur) -> acc.compareTo(cur) > 0 ? acc : cur);
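
For comparison, the same results can be obtained with the built-in operations. This is a sketch with hypothetical wrapper methods, not a replacement for the naive versions above:

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class BuiltInReductionsSketch {

    static long count(List<Long> values) {
        return values.stream().count();
    }

    static long sum(List<Long> values) {
        return values.stream()
                     .mapToLong(Long::longValue) // LongStream
                     .sum();
    }

    static Optional<Long> min(List<Long> values) {
        return values.stream().min(Comparator.naturalOrder());
    }

    static Optional<Long> max(List<Long> values) {
        return values.stream().max(Comparator.naturalOrder());
    }
}
```

Note one difference to the naive min and max: for an empty stream, the built-in versions return an empty Optional instead of Long.MAX_VALUE or Long.MIN_VALUE.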

Collectors

Collectors are thematically related to reduce by aggregating elements of a stream. We can achieve similar results with both, but the difference between them is more subtle.

A reduce operation creates a new value by combining two values in an immutable way. Collectors, however, use mutable accumulation objects.

Let’s implement String concatenation with both:

java

List.of("Hello", "World", "Reader")
    .stream()
    .reduce("",
            (acc, cur) -> acc.isEmpty() ? cur : acc + " " + cur); // avoids a leading space
// VS
List.of("Hello", "World", "Reader")
    .stream()
    .collect(Collector.of(() -> new StringJoiner(" "),
                          StringJoiner::add,
                          StringJoiner::merge,
                          StringJoiner::toString));

The reduce version creates many String objects because it can only work in an immutable way. But the collector can leverage a mutable accumulation object to reduce instantiations.

Which one we should prefer depends on our requirements, considering the actual intended purpose, performance considerations, etc. If we’re dealing with immutable value types, a typical reduction should be used. But if we need to accumulate into a mutable data structure, a collector might make more sense.
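
In everyday code, we rarely need to assemble such a collector by hand. As a sketch, the JDK's predefined Collectors.joining covers the String concatenation case (the class and method names are hypothetical):

```java
import java.util.List;
import java.util.stream.Collectors;

class JoiningSketch {

    // Joins all words with a single space between them.
    static String joinWithSpaces(List<String> words) {
        return words.stream()
                    .collect(Collectors.joining(" "));
    }
}
```

Calling JoiningSketch.joinWithSpaces(List.of("Hello", "World", "Reader")) returns "Hello World Reader".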

Conclusion

It’s always a good idea to know the most important tools in our (functional) toolbox.

map applies a transformation to each element.
filter keeps only the elements matching a Predicate<T>.
reduce accumulates all elements into a single value, using immutable values.


Resources

Map (higher-order function) (Wikipedia)
Filter (higher-order function) (Wikipedia)
Fold (higher-order function) (Wikipedia)
The Java Tutorials: Reduction (Oracle)
MIT 6.005 Software Construction (MIT)

#java #functional
