Looking at Java 22: Foreign Function & Memory API

2024-04-17 · 8 min

The Foreign Function & Memory API, part of Project Panama, improves Java’s interoperability with code and data outside the JVM. Calling native libraries and working with native memory becomes safer and more straightforward than using the fragile and often dangerous JNI.

Grasping Beyond the JVM

As Java developers, we interact with non-JVM libraries and services all the time.

Whether accessing data via JDBC, using web services through an HTTP client, or using more low-level techniques like Unix domain socket channels to communicate with other processes, it all goes beyond the boundaries of the JVM and back in a well-defined and safe fashion.

One area where the JDK was a bit lackluster, though, was accessing code and data outside the JVM on the same machine. The Foreign Function & Memory API (FFM) fixes that by providing two main components:

Foreign Function Interface (FFI)
Memory Access API

Invoking native code, or foreign functions, isn’t a new concept for Java.

The Java Native Interface (JNI) has been available since Java 1.1 and enables Java code to call, and be called by, applications and libraries that are “native” to the current hardware and operating system. These are typically written in languages such as C, but many other languages support JNI in some form.

The Problems with JNI

While JNI allows us to integrate Java with native code, it has several downsides that can affect performance, safety, and development complexity:

Complexity and Error-Proneness

A significant amount of boilerplate code is required to bridge Java and native code, making the surface brittle in the case of API changes.

Performance Overhead

Calls across the boundary between Java and native code involve context switching, which can incur serious overhead, especially in performance-critical code.

The same is true for passing data between Java and native code. It usually means converting or copying it (marshaling/unmarshaling), which also has performance implications.

Manual Memory Management

Memory used in native code isn’t automatically handled like in Java code. Any improper manual handling can easily lead to memory leaks.

This also can interfere with the Garbage Collector, impacting overall memory management and performance.

Safety and Security

JNI introduces big possible safety and security holes in both directions.

Native code bypasses all the JVM’s safety checks we’re used to and exposes us to things like buffer overflows and what often comes with them, such as crashes.

These crashes can lead to security problems. Therefore, using native code via JNI means including all possible security issues, too.

Portability

Native code is, well, native to the current platform, making our Java code no longer as easily portable as before.

For example, an SCSS library I was working on needs to include both the Linux and macOS aarch64 version of libsass to make it work on all our development and production machines. If I want to open-source the code, I most likely should also include Linux ARM, Windows, macOS Intel, too.

Maintenance Complexity

Our code is no longer “Java-only”, requiring us to have a certain amount of knowledge of the native language being used. And needing to debug an issue in C or C++ isn’t much fun if you’re not used to it…

Filling the JNI Gap

The awesome Java community and ecosystem is usually quick to fill any gaps found in the JDK and improve the developer experience.

In the case of FFI, projects like Java Native Access (JNA), Java Abstracted Foreign Function Layer (JNR-FFI), or JavaCPP provide the missing pieces for simpler, more efficient access to native code. Still, compared with other languages like Python or Rust that offer easier, first-class native interop with little or no glue code, the Java side of things looks a little bleak.

But don’t fret, the new FFI API is here to brighten up our day!

Calling Foreign Functions

Let’s look at an example first, then check out the different parts involved.

All types are located in the java.lang.foreign package if not stated otherwise.

The C standard library function for getting the length of a string is defined as follows:

size_t strlen(const char* str);

This simple function accepts a pointer to a null-terminated string and returns its length as an unsigned integer of type size_t.

To call the function, a “few” lines of Java code are necessary:

java

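// Instance main methods are a preview feature in Java 22 (run with --enable-preview)
import java.lang.foreign.*;
import java.lang.invoke.MethodHandle;

// MethodHandle::invoke is declared to throw Throwable, so main has to declare it, too
void main(String[] args) throws Throwable {

  // STEP 1: FIND FOREIGN FUNCTION
  Linker linker = Linker.nativeLinker();
  SymbolLookup stdlib = linker.defaultLookup();
  MemorySegment strlenAddress = stdlib.find("strlen").orElseThrow();

  // STEP 2: DEFINE IN/OUT AND CREATE METHOD HANDLE
  // size_t is mapped to JAVA_LONG, which assumes a 64-bit platform
  FunctionDescriptor descriptor =
    FunctionDescriptor.of(ValueLayout.JAVA_LONG,
                          ValueLayout.ADDRESS);
  MethodHandle strlen = linker.downcallHandle(strlenAddress,
                                              descriptor);

  // STEP 3: MANAGE OFF-HEAP MEMORY
  try (Arena offHeap = Arena.ofConfined()) {

    // STEP 4: MAKE ARGUMENT C-COMPATIBLE IN OFF-HEAP MEMORY
    MemorySegment funcArg = offHeap.allocateFrom(args[0]);

    // STEP 5: CALL THE FUNCTION
    long len = (long) strlen.invoke(funcArg);
    System.out.println("strlen = " + len);
  }
}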

Even though the code is much clearer than any JNI-based approach, there are still a lot of parts to go over:

Step 1: Find the foreign function

First, we need a Linker that gives us access to foreign functions. The Linker supports both downcalls (Java -> Native) and upcalls (Native -> Java).

The nativeLinker() call gives a platform-specific Linker conforming to the current Application Binary Interface (ABI).

Even though the Linker interface is “neutral”, the native variant is optimized for the calling conventions of the following platforms:

Linux (x64, AArch64, RISC-V, PPC64, s390x)
macOS (x64, AArch64)
Windows (x64, AArch64)
AIX (ppc64)

Other platforms are supported via libffi.

Next, we have to look up the address of the symbol. Linker.defaultLookup() returns a SymbolLookup for libraries that are “widely recognized as useful” on the current OS/processor combination. In our case, the C standard library is among them. However, there’s no definitive list; each Linker implementation is responsible for choosing its libraries.

Finally, we need to find the memory where the function is actually located as a MemorySegment. This interface gives access to memory, either in the Java heap or from a native segment of memory (“off-heap”).
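To make the lookup a bit more tangible, here is a minimal sketch that asks the default lookup for a symbol and checks whether it was found. The class name SymbolLookupSketch, the helper findSymbol, and the made-up symbol name are assumptions for illustration, and which symbols are actually available depends on the platform’s Linker.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.util.Optional;

public class SymbolLookupSketch {

    // Looks up a symbol by name in the libraries the native linker exposes by default.
    static Optional<MemorySegment> findSymbol(String name) {
        SymbolLookup lookup = Linker.nativeLinker().defaultLookup();
        return lookup.find(name);
    }

    public static void main(String[] args) {
        // A found symbol is a zero-length native segment whose address points at the function.
        findSymbol("strlen").ifPresent(segment ->
            System.out.println("strlen found, byteSize = " + segment.byteSize()));

        // Hypothetical symbol name that is assumed not to exist in any default library.
        boolean found = findSymbol("definitely_not_a_real_symbol_42").isPresent();
        System.out.println("made-up symbol found: " + found);
    }
}
```

Because find returns an Optional, a missing symbol is an ordinary empty result we can react to, rather than a linkage error at some later point.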

Step 2: Define input and output arguments and create a method handle

Now that we have the memory where the foreign function is located, it’s time to define the function signature with a FunctionDescriptor, accepting MemoryLayout instances.

With a FunctionDescriptor and MemorySegment at hand, we can create a java.lang.invoke.MethodHandle for the downcall from Java to native code.
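As a rough sketch of how a descriptor maps a C signature to Java types, the following example builds two descriptors and inspects them. The class DescriptorSketch and its helper methods are made up for illustration, and mapping size_t to JAVA_LONG assumes a 64-bit platform.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodType;

public class DescriptorSketch {

    // size_t strlen(const char* str): return layout first, then the argument layouts.
    static FunctionDescriptor strlenDescriptor() {
        return FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS);
    }

    // void free(void* ptr): functions without a return value use ofVoid(...).
    static FunctionDescriptor freeDescriptor() {
        return FunctionDescriptor.ofVoid(ValueLayout.ADDRESS);
    }

    public static void main(String[] args) {
        // The method type shows which Java carrier types the downcall handle will use:
        // ADDRESS is carried by MemorySegment, JAVA_LONG by long.
        MethodType strlenType = strlenDescriptor().toMethodType();
        System.out.println("strlen handle type: " + strlenType);

        // A void function has no return layout at all.
        System.out.println("free has return layout: "
            + freeDescriptor().returnLayout().isPresent());
    }
}
```

Checking toMethodType() is a cheap way to see which argument and return types the resulting MethodHandle expects before invoking it.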

Step 3: Memory Management

An Arena manages access to native memory and ensures each allocated memory block is released after its scope ends, thanks to the try-with-resources.

There are multiple kinds of Arena available:

Kind	Bounded lifetime	Explicitly closable	Accessible from multiple threads
Global	✗	✗	✓
Automatic	✓	✗	✓
Confined	✓	✓	✗
Shared	✓	✓	✓

This way, “manual” memory management becomes quite straightforward and bearable.
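The bounded lifetime of a confined Arena can be observed directly. The following is a minimal sketch (the class ArenaLifetimeSketch and its methods are made up for illustration) that lets a segment escape its try-with-resources block and then tries to read from it.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class ArenaLifetimeSketch {

    // Allocates an int in a confined arena and returns the segment AFTER the arena was closed.
    static MemorySegment allocateAndClose() {
        MemorySegment segment;
        try (Arena arena = Arena.ofConfined()) {
            segment = arena.allocate(ValueLayout.JAVA_INT);
            segment.set(ValueLayout.JAVA_INT, 0, 42); // fine, the arena is still open
        }
        return segment; // the backing memory has already been released
    }

    // Returns true if reading from the stale segment is rejected.
    static boolean accessFailsAfterClose() {
        MemorySegment stale = allocateAndClose();
        try {
            stale.get(ValueLayout.JAVA_INT, 0);
            return false;
        } catch (IllegalStateException e) {
            // the segment's scope is no longer alive
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("scope alive: " + allocateAndClose().scope().isAlive());
        System.out.println("access rejected: " + accessFailsAfterClose());
    }
}
```

Instead of a use-after-free in native memory, the stale access surfaces as a regular Java exception.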

Step 4: Preparing the argument

Calling a native function requires native argument types, so we need to convert any Java type to its respective native counterpart.

We allocate the needed memory with the Arena#allocateFrom call.

The previous ByteBuffer approach to off-heap memory in JNI is replaced with the safer and more straightforward MemorySegment representing contiguous areas of memory.

Check out SegmentAllocator for the many available options.
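To illustrate what allocateFrom does with different Java types, here is a small sketch. The class AllocateFromSketch and its helpers are assumptions for illustration, not part of the API.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class AllocateFromSketch {

    // Size of the off-heap C string, including the trailing null terminator.
    static long cStringSize(String text) {
        try (Arena arena = Arena.ofConfined()) {
            return arena.allocateFrom(text).byteSize();
        }
    }

    // Copies a Java string off-heap (UTF-8, null-terminated) and reads it back.
    static String roundTrip(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            return cString.getString(0);
        }
    }

    // Copies an int array off-heap and back into a new heap array.
    static int[] roundTripInts(int... values) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeInts = arena.allocateFrom(ValueLayout.JAVA_INT, values);
            return nativeInts.toArray(ValueLayout.JAVA_INT);
        }
    }

    public static void main(String[] args) {
        System.out.println(cStringSize("hello")); // 5 bytes of text plus 1 terminator byte
        System.out.println(roundTrip("hello"));
        System.out.println(roundTripInts(1, 2, 3).length);
    }
}
```

The results are read back before the Arena closes, so only heap values (a long, a String, an int[]) leave the try block.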

Step 5: Call the function

The java.lang.invoke.MethodHandle works as you’d expect. Invocation is done by either the lenient invoke(Object...), which performs conversions for the arguments and return type if necessary, or the strict invokeExact(Object...) method, which requires an exact type match between the arguments and the caller type descriptor.
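The difference between the two invocation modes is easiest to see side by side. The following sketch reuses the strlen handle from above. The class InvokeSketch and its helper methods are made up for illustration, and it assumes strlen is available via the default lookup on a 64-bit platform.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.WrongMethodTypeException;

public class InvokeSketch {

    // Handle type is (MemorySegment)long
    private static final MethodHandle STRLEN = createStrlenHandle();

    static MethodHandle createStrlenHandle() {
        Linker linker = Linker.nativeLinker();
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();
        return linker.downcallHandle(address,
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
    }

    // invokeExact: the call site must be exactly (MemorySegment)long.
    static long exact(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            return (long) STRLEN.invokeExact(cString);
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
    }

    // invoke: the long return value is adapted (boxed) to fit the Object call site.
    static Object lenient(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            Object result = STRLEN.invoke(cString);
            return result;
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
    }

    // invokeExact with an Object call site does not match the handle type and is rejected.
    static boolean exactRejectsMismatch(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            Object ignored = STRLEN.invokeExact(cString);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException("native call failed", t);
        }
    }
}
```

In this sketch, invokeExact is the stricter choice that avoids any adaptation at the call site, whereas invoke is more forgiving while prototyping.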

The simple example for strlen doesn’t require any special handling of the return value besides a cast. Just be aware that depending on arguments and return type, it might become a little bit more complicated, like copying memory back from off-heap to Java’s heap by reinterpreting memory. See the Java 22 documentation for more information about MemorySegment::reinterpret.
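To give an idea of what reinterpreting looks like, the sketch below simulates a raw pointer as it would come back from a native function: a zero-length MemorySegment that only knows its address. The class ReinterpretSketch and its methods are made up for illustration, and the pointer is created from memory we allocated ourselves, so the size passed to reinterpret is known to be correct here. With a real native pointer, getting that size right is our responsibility.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

public class ReinterpretSketch {

    // Size of a segment created from a bare address: the JVM knows nothing about its bounds.
    static long rawPointerSize() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment original = arena.allocateFrom("hello");
            return MemorySegment.ofAddress(original.address()).byteSize();
        }
    }

    // Reads a C string through a raw pointer by giving the segment a size first.
    static String readThroughRawPointer(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment original = arena.allocateFrom(text);

            // Stand-in for a pointer returned by native code (zero-length segment).
            MemorySegment pointer = MemorySegment.ofAddress(original.address());

            // reinterpret is a restricted method: we vouch for the size ourselves.
            MemorySegment sized = pointer.reinterpret(original.byteSize());

            // getString copies the bytes up to the terminator into a heap String.
            return sized.getString(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(rawPointerSize());
        System.out.println(readThroughRawPointer("hello"));
    }
}
```

The returned String lives on the Java heap, so it stays valid after the Arena and its off-heap memory are gone.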

Conclusion

Looking at the goals defined in the JEP, a certain pattern seems to emerge with the latest features. From a more general point of view, it boils down to productivity, performance, soundness, and integrity. We want to see these properties in any new feature, and the FFM API is a great example of that.

The FFM API addresses many of the limitations of JNI:

Reduced complexity and less boilerplate needed to get things running.
Enhanced performance by reducing the needed overhead
Simpler memory management, thanks to more straightforward and safer abstractions that reduce the possibility of memory leaks.
Better portability thanks to less of the previously quite brittle boilerplate.
Easier maintenance, as the API is not as complex as before and requires less knowledge of the native language used.

Even though most developers seldom come into contact with native code and memory, having a more modern, safer, more efficient, and straightforward API for a historically complicated and error-prone task is still an enormous boon.

Resources

JEP 454: Foreign Function & Memory API
Foreign Function and Memory API (Java 22 Documentation)
Package summary: java.lang.foreign
Java Foreign Function & Memory API (FFM) (HappyCoders.eu)

Looking at Java 22

Intro
Statements before super (JEP 447)
Unnamed Variables & Patterns (JEP 456)
Class-File API (JEP 457)
Stream Gatherers (JEP 461)

#java #looking-at-java-22
