Functional Programming With Java: Method References

2022-09-19 · 9 min

Besides lambda expressions, Java 8 introduced another language syntax change in the form of a new operator, :: (double colon), to create so-called method references.

Even though I talked about and used method references in earlier articles, it’s time to take a closer look.

What are Method References?

Method references let you reference an existing method or constructor and use it in lieu of a lambda expression. You write the referenced target, the :: operator, and the method name, without its arguments or parentheses. Only methods that match the functional interface they’re supposed to represent can be referenced.

The following Stream pipeline shows how you can replace simple lambda expressions with their corresponding method references:

java

List<Customer> customers = ...;

// LAMBDAS

customers.stream()
         .filter(customer -> customer.isActive())
         .map(customer -> customer.getName())
         .map(name -> name.toUpperCase())
         .peek(name -> System.out.println(name))
         .toArray(count -> new String[count]);

// METHOD REFERENCES

customers.stream()
         .filter(Customer::isActive)
         .map(Customer::getName)
         .map(String::toUpperCase) 
         .peek(System.out::println)
         .toArray(String[]::new);

Replacing obvious lambda expressions with method references removes a lot of the usual noise without compromising the readability or understandability of your code too much.
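To try the pipeline above end to end, here is a minimal runnable sketch. The Customer record and the sample data are hypothetical stand-ins, since the article doesn't show the real type, and the peek step is left out to keep the output easy to compare. It assumes Java 16 or later for the record syntax.

```java
import java.util.Arrays;
import java.util.List;

final class CustomerNames {

  // Hypothetical stand-in for the article's Customer type.
  record Customer(String name, boolean active) {
    String getName() { return name; }
    boolean isActive() { return active; }
  }

  static String[] withLambdas(List<Customer> customers) {
    return customers.stream()
                    .filter(customer -> customer.isActive())
                    .map(customer -> customer.getName())
                    .map(name -> name.toUpperCase())
                    .toArray(count -> new String[count]);
  }

  static String[] withMethodRefs(List<Customer> customers) {
    return customers.stream()
                    .filter(Customer::isActive)
                    .map(Customer::getName)
                    .map(String::toUpperCase)
                    .toArray(String[]::new);
  }

  public static void main(String[] args) {
    var customers = List.of(new Customer("Ada", true),
                            new Customer("Bob", false),
                            new Customer("Cy", true));

    // Both variants print [ADA, CY]
    System.out.println(Arrays.toString(withLambdas(customers)));
    System.out.println(Arrays.toString(withMethodRefs(customers)));
  }
}
```

Both methods return the same array, which is the point: the method references only change how the pipeline reads, not what it does.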

The Different Types of Method References

There are four different types of method references available, depending on what kind of lambda you want to replace and which method you want to reference:

Static method references
Bound non-static method references
Unbound non-static method references
Constructor references

Static Method References

A static method reference refers, as you might have guessed, to a static method of a specific type, like Integer::toHexString:

java

// EXCERPT OF java.lang.Integer

class Integer extends Number {

  public static String toHexString(int i) {
    // ..
  }
}


// LAMBDA EXPRESSION

Function<Integer, String> asLambda = i -> Integer.toHexString(i);


// STATIC METHOD REFERENCE

Function<Integer, String> asRef = Integer::toHexString;

The general syntax for static method references is ClassName::staticMethodName
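As a small usage sketch, the same static method reference can be dropped into a Stream pipeline. The class name StaticRefs and the sample values are made up for illustration.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

final class StaticRefs {

  static List<String> toHex(List<Integer> values) {
    // Integer.toHexString expects an int, so each Integer element is unboxed automatically.
    Function<Integer, String> asRef = Integer::toHexString;
    return values.stream()
                 .map(asRef)
                 .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    System.out.println(toHex(List.of(10, 255, 4096))); // prints [a, ff, 1000]
  }
}
```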

Bound non-static Method References

A bound non-static method reference refers to an existing object’s non-static method. The lambda arguments are passed 1:1 as the method arguments to the referenced method of that specific object:

java

// The existing object
var now = LocalDate.now();

// LAMBDA BASED ON EXISTING OBJECT
Predicate<LocalDate> isNowAfter = date -> now.isAfter(date);

// BOUND NON-STATIC METHOD REFERENCE
Predicate<LocalDate> isNowAfterAsRef = now::isAfter;

The existing object instance does not have to be an intermediate variable, though. Using the :: operator directly on another method call is fine:

java

// BIND RETURN VALUE
Predicate<LocalDate> isNowAfterAsRef = LocalDate.now()::isAfter;

// BIND CLASS LITERAL
Function<Object, String> castToStr = String.class::cast;

The current instance and its super implementation are also valid reference origins by using this:: or super:: respectively:

java

public class SuperClass {

  public String doWork(String input) {
    return "super: " + input;
  }
}

public class SubClass extends SuperClass {

  @Override
  public String doWork(String input){
    return "this: " + input;
  }

  public void superAndThis(String input) {

    // ACCESSING THIS INSTANCE
    Function<String, String> thisWorker = this::doWork;
    var thisResult = thisWorker.apply(input);

    // ACCESSING THE SUPER IMPLEMENTATION
    Function<String, String> superWorker = super::doWork;
    var superResult = superWorker.apply(input);

    // ...
  }
}

Bound method references are a great way to utilize already existing methods on variables, the current instance, or the super implementation. They also allow you to refactor non-trivial or more complex lambdas into methods and use method references to streamline your code. Fluent pipelines, like Streams or Optionals, especially profit from replacing and refactoring lambda expressions into method references.

The general syntax for bound non-static method references is objectName::instanceMethodName
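Here is one way a bound reference might look in a fluent pipeline: the reference is tied to one specific Set instance and used as a filter. The names BoundRefs, onlyAllowed, and the sample data are illustrative assumptions, not part of the original example.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class BoundRefs {

  static List<String> onlyAllowed(Set<String> allowed, List<String> candidates) {
    // Bound to this particular Set: every element is passed to allowed.contains(...)
    Predicate<String> isAllowed = allowed::contains;
    return candidates.stream()
                     .filter(isAllowed)
                     .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    var allowed = Set.of("a", "c");
    System.out.println(onlyAllowed(allowed, List.of("a", "b", "c"))); // prints [a, c]
  }
}
```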

Unbound non-static Method References

Unbound non-static method references aren’t, as their name suggests, bound to a specific object and refer to an instance method of a type instead:

java

// EXCERPT FROM java.lang.String
class String implements ... {

  String toLowerCase() {
    // ...
  }
}

// AS LAMBDA EXPRESSION
Function<String, String> toLowerCaseLambda = $ -> $.toLowerCase();

// AS METHOD REFERENCE
Function<String, String> toLowerCaseRef = String::toLowerCase;

The general syntax for unbound non-static method references is ClassName::instanceMethodName

Instead of representing a static method on a type, ClassName signifies the instance type in which the referenced instance method is defined. It also becomes the first argument of the lambda expression it replaces. This way, the referenced method is called on the incoming instance and not on an explicitly referenced instance of that type.
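The "first argument becomes the receiver" rule is easier to see with a two-argument functional interface. The following sketch uses made-up constant names to show both cases:

```java
import java.util.function.BiFunction;
import java.util.function.Function;

final class UnboundRefs {

  // One argument: the incoming String is the receiver of toLowerCase().
  static final Function<String, String> LOWER = String::toLowerCase;

  // Two arguments: the first one is the receiver, the second one is
  // passed on to startsWith(String).
  static final BiFunction<String, String, Boolean> STARTS_WITH = String::startsWith;

  public static void main(String[] args) {
    System.out.println(LOWER.apply("JAVA"));               // prints java
    System.out.println(STARTS_WITH.apply("method", "me")); // prints true
  }
}
```

STARTS_WITH.apply("method", "me") is equivalent to calling "method".startsWith("me").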

Constructor References

The last type of method reference points to a constructor:

java

// AS LAMBDA EXPRESSION
Function<String, Locale> newLocaleLambda = language -> new Locale(language);

// AS METHOD REFERENCE
Function<String, Locale> newLocaleMethodRef = Locale::new;

Constructor references might look like static or unbound non-static method references, but the referenced method isn’t an actual method. Instead, the constructor is referenced via the new keyword.

The general syntax for constructor method references is ClassName::new
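Which constructor is actually used depends on the functional interface the reference is assigned to. A short sketch with three different shapes, using illustrative names and JDK types only:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.IntFunction;
import java.util.function.Supplier;

final class ConstructorKinds {

  // No arguments: behaves like () -> new ArrayList<>()
  static final Supplier<List<String>> NEW_LIST = ArrayList::new;

  // One argument: behaves like text -> new StringBuilder(text)
  static final Function<String, StringBuilder> NEW_BUILDER = StringBuilder::new;

  // Array constructor: behaves like size -> new String[size]
  static final IntFunction<String[]> NEW_ARRAY = String[]::new;

  public static void main(String[] args) {
    System.out.println(NEW_LIST.get().isEmpty());          // prints true
    System.out.println(NEW_BUILDER.apply("ref").length()); // prints 3
    System.out.println(NEW_ARRAY.apply(4).length);         // prints 4
  }
}
```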

Behind-the-Scenes

At first glance, method references appear to be syntactic sugar, making the code from the previous section effectively identical. In some cases, like constructor references, the generated ByteCode seems identical. For example, take the following class with constructor method references based on the code from the previous section:

java

public class ConstructorRefs {

  public Function<String, Locale> asLambda() {
    return $ -> new Locale($);
  }

  public Function<String, Locale> asMethodRef() {
    return Locale::new;
  }
}

The generated ByteCode operations for both methods are identical.

// LAMBDA

0: invokedynamic #7,  0   // InvokeDynamic #0:apply:()Ljava/util/function/Function;
5: astore_1
6: aload_1
7: areturn


// METHOD REFERENCE

0: invokedynamic #11,  0  // InvokeDynamic #1:apply:()Ljava/util/function/Function;
5: astore_1
6: aload_1
7: areturn

Even though the ops are identical, the big difference lies in the invokedynamic call. This ByteCode op was introduced in Java 7 to support more dynamic languages, like JRuby or Groovy. Instead of linking dynamic methods, like lambdas, directly at compile-time, the JVM links a dynamic call site. On its first call, a bootstrap method does the actual work of handling the initialization and returns a call site that is backed by a method handle. It’s lazily initialized code, similar to a reflection-like mechanism, but it’s done directly by the JVM in a safe manner.

The correct bootstrap method number is visible in the comment for the invokedynamic call, not the op itself. Let’s take a look at the bootstrap methods:

// LAMBDA

0: #30 REF_invokeStatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
  Method arguments:
    #37 (Ljava/lang/Object;)Ljava/lang/Object;
    #39 REF_invokeStatic C.lambda$lambdaConstructor$0:(Ljava/lang/String;)Ljava/util/Locale;
    #42 (Ljava/lang/String;)Ljava/util/Locale;


// METHOD REFERENCE

1: #30 REF_invokeStatic java/lang/invoke/LambdaMetafactory.metafactory:(Ljava/lang/invoke/MethodHandles$Lookup;Ljava/lang/String;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodType;Ljava/lang/invoke/MethodHandle;Ljava/lang/invoke/MethodType;)Ljava/lang/invoke/CallSite;
  Method arguments:
    #37 (Ljava/lang/Object;)Ljava/lang/Object;
    #43 REF_newInvokeSpecial java/util/Locale."<init>":(Ljava/lang/String;)V
    #42 (Ljava/lang/String;)Ljava/util/Locale;

That’s quite a mouthful, but the parts to look out for are how the returned method handle is created in #39 and #43.

The lambda variant in #39 returns a method handle to a static method generated by the compiler.

The method reference variant in #43 uses REF_newInvokeSpecial instead. The ByteCode op invokespecial is used to call constructors and private methods. So instead of relying on a previously generated method to create a lambda, a method handle of the constructor itself is returned directly.
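To get a feeling for what "a method handle of the constructor itself" means, you can look one up by hand with the java.lang.invoke API. This is only a sketch of the idea, not what the compiler or LambdaMetafactory literally executes, and the class and method names are made up for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandleInfo;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Locale;

final class ConstructorHandleSketch {

  // Looks up Locale(String) directly, similar in spirit to entry #43 above.
  static MethodHandle localeConstructor() {
    try {
      // Constructors are described with a void return type plus their parameter types.
      MethodType signature = MethodType.methodType(void.class, String.class);
      return MethodHandles.lookup().findConstructor(Locale.class, signature);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
  }

  // Asks the JVM which kind of reference the handle represents.
  static int referenceKind(MethodHandle handle) {
    return MethodHandles.lookup().revealDirect(handle).getReferenceKind();
  }

  static Locale newLocale(String language) {
    try {
      return (Locale) localeConstructor().invoke(language);
    } catch (Throwable t) {
      throw new IllegalStateException(t);
    }
  }

  public static void main(String[] args) {
    System.out.println(newLocale("de").getLanguage()); // prints de

    int kind = referenceKind(localeConstructor());
    System.out.println(kind == MethodHandleInfo.REF_newInvokeSpecial); // prints true
    System.out.println(MethodHandleInfo.referenceKindToString(kind));
  }
}
```

The reference kind reported for the handle is the same REF_newInvokeSpecial that shows up in the bootstrap method arguments of the method reference variant.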

Other types of method references differ in the generated ByteCode, not just the bootstrap methods. Bound method references introduce an additional null-check that’s missing on lambda variants:

java

public class BoundMethodRefs {

  public Supplier<String> asLambda(String input) {
    return () -> input.toLowerCase();
  }

  public Supplier<String> asMethodRef(String input) {
    return input::toLowerCase;
  }
}

The generated ByteCode:

// LAMBDA

0: aload_1
1: invokedynamic #7,  0   // InvokeDynamic #0:get:(Ljava/lang/String;)Ljava/util/function/Supplier;
6: areturn


// METHOD REFERENCE

0: aload_1
1: dup
2: invokestatic  #11      // Method java/util/Objects.requireNonNull:(Ljava/lang/Object;)Ljava/lang/Object;
5: pop
6: invokedynamic #17,  0  // InvokeDynamic #1:get:(Ljava/lang/String;)Ljava/util/function/Supplier;
11: areturn

The bootstrap methods are similar to the constructor references, as the lambda variant will also call a static method, and the method reference one will directly look up the toLowerCase method.

To be honest, I can’t tell you why the additional null-check is inserted for the method reference. My best guess is that it’s cheaper for the JVM to check before calling the bootstrap method instead of delaying the exception into the bootstrap method. If anyone has more information, I’m happy to hear all about it!
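Whatever the reason behind it, the extra check has an observable effect: with a null receiver, the bound method reference already fails when it's created, while the lambda only fails once it's called. This matches how the Java Language Specification describes bound receivers, which are evaluated (and null-checked) when the method reference expression itself is evaluated. The following sketch, with a hypothetical helper called failsOnCreation, makes the difference visible:

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class NullReceiver {

  static Supplier<String> asLambda(String input) {
    return () -> input.toLowerCase();
  }

  static Supplier<String> asMethodRef(String input) {
    return input::toLowerCase;
  }

  // Returns true if merely creating the Supplier with a null receiver throws.
  static boolean failsOnCreation(Function<String, Supplier<String>> factory) {
    try {
      factory.apply(null);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(failsOnCreation(NullReceiver::asLambda));    // prints false
    System.out.println(failsOnCreation(NullReceiver::asMethodRef)); // prints true
  }
}
```

The lambda variant still throws a NullPointerException, just later, when get() is finally called.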

The different approach to the generated ByteCode isn’t that unexpected if you think about it. A method reference is an explicit call to a method or constructor, restricting the call context to a single purpose, so directly referencing them is a great optimization opportunity for the JVM. On the other hand, a lambda is primarily a lambda, regardless of its actual logic.

I’m sure at some point, the compiler will be able to analyze lambdas and replace them if they match a method reference.

Method References As Glue Between Functional Interfaces

Functional interfaces have to live in Java’s static type system. Therefore, lambdas are statically typed, too. Thanks to type inference from the surrounding context, we don’t have to worry most of the time about the actual type of a lambda, at least if it’s declared inline. If you create a lambda expression and store it in a variable, you might be in for a surprise.

Even if the single abstract methods of two functional interfaces have 100% identical signatures, we can’t cast between their types:

java

// EXCERPT OF java.util.function.Predicate
interface Predicate<T> {
  boolean test(T value);
}

interface AlsoPredicate<T> {
  boolean test(T value);
}


Predicate<String> isNull = $ -> $ == null;

AlsoPredicate<String> wontCompile = isNull;
// Error:
// incompatible types: java.util.function.Predicate<java.lang.String> cannot
// be converted to AlsoPredicate<java.lang.String>

Casting between lambda expressions is the same as casting any other two objects. If the types aren’t related, your code won’t compile.

The types might be incompatible, but the method references to a single abstract method aren’t!

java

Predicate<String> isNull = $ -> $ == null;

AlsoPredicate<String> compiles = isNull::test;

By using a method reference instead of trying to cast between the “identical but different” functional interfaces, you can refer to the SAM instead to make your code compile. But why?

Method references are decoupled from their surrounding type, so the compiler uses only the method signature to detect compatibility, not the overall type.

It’s a nice trick to circumvent functional interface incompatibilities. But it also indicates that our code might be refactored to use common types if possible. Still, it’s a good emergency tool to have in your functional kit, especially if you’re transitioning from a legacy code base to a more functional approach.
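If you need this kind of bridge more than once, it can be wrapped in a small helper. The sketch below is self-contained; AlsoPredicate is the made-up interface from the example above, and adapt is a hypothetical helper name.

```java
import java.util.function.Predicate;

final class FunctionalGlue {

  // Made-up interface with the same shape as java.util.function.Predicate.
  @FunctionalInterface
  interface AlsoPredicate<T> {
    boolean test(T value);
  }

  static <T> AlsoPredicate<T> adapt(Predicate<T> predicate) {
    // A bound method reference to the SAM bridges the two unrelated types.
    return predicate::test;
  }

  public static void main(String[] args) {
    Predicate<String> isNull = value -> value == null;
    AlsoPredicate<String> alsoIsNull = adapt(isNull);

    System.out.println(alsoIsNull.test(null));   // prints true
    System.out.println(alsoIsNull.test("text")); // prints false
  }
}
```

Keep in mind that the result is a new object that delegates to the original one, so every adaptation adds a level of indirection.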

Conclusion

Method references are a great way to simplify your functional code and gain the best possible optimizations the JVM has to offer. In most cases, the readability of your code isn’t affected because you trade the usual lambda boilerplate for the related type and the :: operator. For simple one-argument lambda expressions, that’s great. But keep in mind that it hides the arguments, which can be confusing on overloaded methods if it’s not clear which method it actually references.
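A small sketch of how overloads can hide behind the same method reference: String::valueOf resolves to different overloads depending on the functional interface's argument type, and nothing at the reference itself tells you which one. The constant names are illustrative.

```java
import java.util.function.Function;

final class OverloadedRefs {

  // Resolves to String.valueOf(char[]) because the input type is char[].
  static final Function<char[], String> FROM_CHARS = String::valueOf;

  // Resolves to String.valueOf(Object) because the input type is Object.
  static final Function<Object, String> FROM_OBJECT = String::valueOf;

  public static void main(String[] args) {
    char[] chars = {'a', 'b'};

    // The characters are joined into a String: prints ab
    System.out.println(FROM_CHARS.apply(chars));

    // The array's toString() is used, so the output starts with [C@
    System.out.println(FROM_OBJECT.apply(chars));
  }
}
```

An explicit lambda with a typed argument, like (char[] value) -> String.valueOf(value), spells out the intent in such cases.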

Like all the other functional goodies we got since Java 8, try to experiment with them in your code and find the ones that feel natural and improve your productivity. Even if a method reference might improve theoretical performance characteristics, it’s not worth doing if your code readability suffers. First and foremost, you have to be comfortable and productive with a language. But don’t discard them just because they’re unfamiliar, either.

Resources

The Java Tutorials - Method References (Oracle)
Java Virtual Machine Specifications - invokedynamic (Oracle)
java.lang.invoke.MethodHandleInfo (JavaDoc)
Local Variable Type Inference
