Looking at Java 22: Foreign Function & Memory API
The Foreign Function & Memory API, part of Project Panama, improves Java’s interoperability with code and data outside the JVM. Calling native libraries and working with native memory becomes safer and more straightforward than using the fragile and often dangerous JNI.
Grasping Beyond the JVM
As Java developers, we interact with non-JVM libraries and services all the time.
Whether accessing data via JDBC, using web services through an HTTP client, or more low-level techniques like Unix domain socket channels for communicating with processes, it all goes beyond the boundaries of the JVM and back in a well-defined and safe fashion.
One area where the JDK was a bit lackluster, though, was accessing code and data outside the JVM on the same machine. The Foreign Function & Memory API (FFM) fixes that by providing two main components:
- Foreign Function Interface (FFI)
- Memory Access API
Invoking native code, that is, foreign functions, isn’t a new concept for Java.
The Java Native Interface (JNI) has been available since Java 1.1. It enables Java code to call, and be called by, applications and libraries that are “native” to the current hardware and operating system. These are typically written in languages such as C, but many other languages have JNI support in some form.
The Problems with JNI
While JNI allows us to integrate Java with native code, it has several downsides that can affect performance, safety, and development complexity:
Complexity and Error-Prone
A significant amount of boilerplate code is required to bridge Java and native code, making the surface brittle in the case of API changes.
Performance Overhead
Calls across the boundary involve a transition between Java and native code, which can incur serious overhead, especially in performance-critical code.
The same is true for passing data between Java and native code. It usually means converting or copying it (marshaling/unmarshaling), which also has performance implications.
Manual Memory Management
Memory used in native code isn’t automatically handled like in Java code. Any improper manual handling can easily lead to memory leaks.
This can also interfere with the Garbage Collector, impacting overall memory management and performance.
Safety and Security
JNI opens up potentially serious safety and security holes in both directions.
Native code bypasses all the JVM’s safety checks we’re used to and exposes us to things like buffer overflows and what often comes with them, such as crashes.
These crashes can lead to security problems. Therefore, using native code via JNI means including all possible security issues, too.
Portability
Native code is, well, native to the current platform, making our Java code no longer as easily portable as before.
For example, an SCSS library I was working on needs to include both the Linux and the macOS aarch64 versions of libsass to make it work on all our development and production machines. If I want to open-source the code, I most likely should also include Linux ARM, Windows, and macOS Intel.
Maintenance Complexity
Our code is no longer “Java-only”, requiring us to have a certain amount of knowledge of the native language being used. And needing to debug an issue in C or C++ isn’t much fun if you’re not used to it…
Filling the JNI Gap
The awesome Java community and ecosystem is usually quick to fill any gaps found in the JDK and improve the developer experience.
In the case of FFI, projects like Java Native Access (JNA), Java Abstract Foreign Function Layer (JNR-FFI), or JavaCPP provide the missing pieces for simpler, more efficient access to native code. Still, compared with other languages like Python or Rust that offer easier, first-class native interop with little or no glue code, the Java side of things looks a little bleak.
But don’t fret, the new FFI API is here to brighten up our day!
Calling Foreign Functions
Let’s look at an example first, then check out the different parts involved.
All types are located in the `java.lang.foreign` package if not stated otherwise.
The C standard library function for getting the length of a string is declared as `size_t strlen(const char *s)`.
This simple function accepts a pointer to a null-terminated string and returns its length as an unsigned integer of type `size_t`.
To call the function, a “few” lines of Java code are necessary:
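A minimal sketch of such a call, assuming Java 22 or later and a 64-bit platform whose default lookup exposes `strlen`, might look like the following. The class name `StrlenExample`, the helper method, and the variable names are illustrative choices, and mapping `size_t` to `JAVA_LONG` is an assumption that holds on common 64-bit platforms.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class StrlenExample {

    // Calls the C function `size_t strlen(const char *s)` for a Java String.
    static long strlen(String text) {
        // STEP 1: find the foreign function
        Linker linker = Linker.nativeLinker();
        SymbolLookup stdlib = linker.defaultLookup();
        MemorySegment strlenAddress = stdlib.find("strlen").orElseThrow();

        // STEP 2: describe the signature and create a method handle
        // (assumption: size_t is 64 bits wide, so it maps to JAVA_LONG)
        FunctionDescriptor descriptor =
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS);
        MethodHandle handle = linker.downcallHandle(strlenAddress, descriptor);

        // STEP 3: manage the native memory with an Arena
        try (Arena arena = Arena.ofConfined()) {
            // STEP 4: copy the String off-heap as a null-terminated C string
            MemorySegment cString = arena.allocateFrom(text);

            // STEP 5: call the function
            return (long) handle.invoke(cString);
        } catch (Throwable t) {
            // MethodHandle invocations are declared to throw Throwable
            throw new IllegalStateException("Calling strlen failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("Hello, World!")); // prints 13
    }
}
```

Depending on the JDK version, running this may print a warning about restricted methods unless native access is enabled for the module, for example with `--enable-native-access=ALL-UNNAMED`.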
Even though the code is considerably clearer than any JNI-based approach, there are still a lot of parts to go over:
Step 1: Finding the Foreign Function
First, we need a `Linker` that gives us access to foreign functions.
The `Linker` supports both downcalls (Java -> Native) and upcalls (Native -> Java).
The `nativeLinker()` call gives a platform-specific `Linker` conforming to the current Application Binary Interface (ABI).
Even though the `Linker` interface is “neutral”, the native variant is optimized for the calling conventions of the following platforms:
- Linux (x64, AArch64, RISC-V, PPC64, s390)
- macOS (x64, AArch64)
- Windows (x64, AArch64)
- AIX (ppc64)
Other platforms are supported via `libffi`.
Next, we have to look up the address of the symbol.
`Linker.defaultLookup()` is supposed to return a `SymbolLookup` for libraries that are “widely recognized as useful” on the current OS/processor combo.
In our case, the C standard library is among them.
However, there’s no definitive list, and each `Linker` implementation is responsible for curating the list instead.
Finally, we need to find the memory where the function is actually located as a `MemorySegment`.
This interface gives access to memory, either in the Java heap or from a native segment of memory (“off-heap”).
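As a small sketch of the lookup step, assuming Java 22 or later: `SymbolLookup.find` returns an `Optional<MemorySegment>`, so a missing symbol shows up as an empty `Optional` rather than a crash. The class name `LookupExample` and the non-existent symbol name are made up for illustration.

```java
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.util.Optional;

public class LookupExample {

    // Looks up a symbol in the libraries exposed by the native linker's default lookup.
    static Optional<MemorySegment> lookup(String symbolName) {
        Linker linker = Linker.nativeLinker();
        SymbolLookup stdlib = linker.defaultLookup();
        return stdlib.find(symbolName);
    }

    public static void main(String[] args) {
        // A symbol that exists: the segment only carries the address and has size 0.
        lookup("strlen").ifPresent(segment ->
                System.out.println("strlen @ 0x" + Long.toHexString(segment.address())
                        + ", byteSize = " + segment.byteSize()));

        // A hypothetical symbol that should not exist anywhere.
        System.out.println(lookup("surely_not_a_libc_symbol_42").isPresent());
    }
}
```

The segment returned for a symbol is zero-length: it marks where the function lives, but it isn’t meant to be read or written directly.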
Step 2: Defining the Input and Output Arguments and Creating a Method Handle
Now that we have the memory where the foreign function is located, it’s time to define the function signature with a `FunctionDescriptor`, accepting `MemoryLayout` instances.
With a `FunctionDescriptor` and `MemorySegment` at hand, we can create a `java.lang.invoke.MethodHandle` for the downcall from Java to native code.
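To make the relationship between the descriptor and the resulting handle more tangible, here is a sketch, assuming Java 22 or later and a 64-bit `size_t`. The return layout comes first in `FunctionDescriptor.of`, followed by the argument layouts, and the Java-side method type is derived from them. The class name `DescriptorExample` is illustrative.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class DescriptorExample {

    // size_t strlen(const char *s)
    // return layout first, then the argument layouts
    // (assumption: size_t is 64 bits wide, so it maps to JAVA_LONG)
    static final FunctionDescriptor STRLEN_DESCRIPTOR =
            FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS);

    // Creates the downcall handle for strlen from its address and descriptor.
    static MethodHandle strlenHandle() {
        Linker linker = Linker.nativeLinker();
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();
        return linker.downcallHandle(address, STRLEN_DESCRIPTOR);
    }

    public static void main(String[] args) {
        // The carrier types of the layouts become the Java-side signature:
        // ADDRESS is carried as MemorySegment, JAVA_LONG as long.
        System.out.println(STRLEN_DESCRIPTOR.toMethodType());
        System.out.println(strlenHandle().type());
    }
}
```

A function without a return value would use `FunctionDescriptor.ofVoid(...)` instead.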
Step 3: Memory Management
An `Arena` manages access to native memory and ensures each allocated memory block is released after its scope ends, thanks to a `try-with-resources` block.
There are multiple kinds of `Arena` available:
Kind | Bounded lifetime | Explicitly closable | Accessible from multiple threads |
---|---|---|---|
Global | ✗ | ✗ | ✓ |
Automatic | ✓ | ✗ | ✓ |
Confined | ✓ | ✓ | ✗ |
Shared | ✓ | ✓ | ✓ |
This way, “manual” memory management becomes quite straightforward and bearable.
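The following sketch, assuming Java 22 or later, illustrates the bounded lifetime of a confined arena: once the `try-with-resources` block ends, the segment’s scope is no longer alive and any access is rejected with an `IllegalStateException` instead of touching freed memory. The class and method names are illustrative.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class ArenaExample {

    // Allocates an int off-heap, writes to it, and returns the segment AFTER its arena was closed.
    static MemorySegment allocateAndClose() {
        MemorySegment segment;
        try (Arena arena = Arena.ofConfined()) {
            segment = arena.allocate(ValueLayout.JAVA_INT);
            segment.set(ValueLayout.JAVA_INT, 0, 42);
            System.out.println("alive inside the block: " + segment.scope().isAlive());
        } // the arena is closed here and the native memory is released
        return segment;
    }

    // Returns true if reading from the segment is rejected because its scope has ended.
    static boolean failsAfterClose(MemorySegment segment) {
        try {
            segment.get(ValueLayout.JAVA_INT, 0);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        MemorySegment stale = allocateAndClose();
        System.out.println("alive after the block: " + stale.scope().isAlive());
        System.out.println("access rejected: " + failsAfterClose(stale));
    }
}
```

Swapping `Arena.ofConfined()` for `Arena.ofShared()` keeps the explicit lifetime but allows access from other threads, as listed in the table.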
Step 4: Preparing the Argument
Calling a native function requires native argument types, so we need to convert any Java type to its respective native counterpart.
We allocate the needed memory with the `Arena#allocateFrom` call.
The previous `ByteBuffer` approach to off-heap memory in JNI is replaced with the safer and more straightforward `MemorySegment` representing contiguous areas of memory.
Check out `SegmentAllocator` for the many available options.
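A few of those options, sketched under the assumption of Java 22 or later (the class and method names are illustrative): `allocateFrom(String)` copies the text as a null-terminated UTF-8 string, so the segment is one byte larger than the encoded text, and there are overloads for primitive arrays as well.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.Arrays;

public class AllocateExample {

    // Size of the off-heap copy of a String, including the trailing null byte.
    static long cStringSize(String text) {
        try (Arena arena = Arena.ofConfined()) {
            return arena.allocateFrom(text).byteSize();
        }
    }

    // Copies a String off-heap and reads it back as a null-terminated string.
    static String roundTripString(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            return cString.getString(0);
        }
    }

    // Copies an int array off-heap and back onto the Java heap.
    static int[] roundTripInts(int... values) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment nativeInts = arena.allocateFrom(ValueLayout.JAVA_INT, values);
            return nativeInts.toArray(ValueLayout.JAVA_INT);
        }
    }

    public static void main(String[] args) {
        System.out.println(cStringSize("Hello"));                    // prints 6
        System.out.println(roundTripString("Hello"));                // prints Hello
        System.out.println(Arrays.toString(roundTripInts(1, 2, 3))); // prints [1, 2, 3]
    }
}
```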
Step 5: Calling the Function
The `java.lang.invoke.MethodHandle` works as you’d expect.
Invocation is done by either the lenient `invoke(Object...)`, which performs conversions for the arguments and return type if necessary, or the strict `invokeExact(Object...)` method, which requires an exact match between the method handle’s type and the type descriptor of the call site.
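The difference is easiest to see with the same handle called in different ways. The sketch below assumes Java 22 or later and a `strlen` handle of type `(MemorySegment)long`, built as in the steps above. The class and method names are illustrative.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.WrongMethodTypeException;

public class InvokeExample {

    // Downcall handle for strlen with the Java-side type (MemorySegment)long.
    private static final MethodHandle STRLEN = createHandle();

    private static MethodHandle createHandle() {
        Linker linker = Linker.nativeLinker();
        MemorySegment address = linker.defaultLookup().find("strlen").orElseThrow();
        return linker.downcallHandle(address,
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
    }

    // invokeExact: the call site is exactly (MemorySegment)long, so the call succeeds.
    static long exact(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment cString = arena.allocateFrom(text);
            return (long) STRLEN.invokeExact(cString);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invoke: the call site is (Object)Object, which gets adapted (cast and boxing).
    static Object lenient(String text) {
        try (Arena arena = Arena.ofConfined()) {
            Object cString = arena.allocateFrom(text);
            return STRLEN.invoke(cString);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // invokeExact with the same (Object)Object call site is rejected.
    static boolean exactRejectsMismatch(String text) {
        try (Arena arena = Arena.ofConfined()) {
            Object cString = arena.allocateFrom(text);
            Object ignored = STRLEN.invokeExact(cString);
            return false;
        } catch (WrongMethodTypeException e) {
            return true;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(exact("Java"));
        System.out.println(lenient("Java"));
        System.out.println(exactRejectsMismatch("Java"));
    }
}
```

Note that the cast on the result is part of the call-site type: `(long)` and `(int)` are different signatures as far as `invokeExact` is concerned.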
The simple example for `strlen` doesn’t require any special handling of the return value besides a cast.
Just be aware that depending on arguments and return type, it might become a little bit more complicated, like copying memory back from off-heap to Java’s heap by reinterpreting memory.
See the Java 22 documentation for more information about `MemorySegment::reinterpret`.
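As a rough sketch of such a case, consider a native function that returns a pointer. The example below assumes Java 22 or later and a POSIX-like platform where `strdup`, `strlen`, and `free` are available through the default lookup. The class name and helper methods are illustrative. A pointer returned from native code arrives as a zero-length `MemorySegment`, so it has to be reinterpreted with a known size before its content can be copied back to the Java heap.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class ReinterpretExample {

    private static final Linker LINKER = Linker.nativeLinker();

    // Helper: creates a downcall handle for a symbol from the default lookup.
    private static MethodHandle handle(String name, FunctionDescriptor descriptor) {
        MemorySegment address = LINKER.defaultLookup().find(name).orElseThrow();
        return LINKER.downcallHandle(address, descriptor);
    }

    // Duplicates a string with the native strdup and copies the result back to the Java heap.
    static String duplicateNatively(String text) {
        // char *strdup(const char *s)
        MethodHandle strdup = handle("strdup",
                FunctionDescriptor.of(ValueLayout.ADDRESS, ValueLayout.ADDRESS));
        // size_t strlen(const char *s), assuming a 64-bit size_t
        MethodHandle strlen = handle("strlen",
                FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        // void free(void *ptr)
        MethodHandle free = handle("free",
                FunctionDescriptor.ofVoid(ValueLayout.ADDRESS));

        try (Arena arena = Arena.ofConfined()) {
            MemorySegment original = arena.allocateFrom(text);

            // The returned pointer is a zero-length segment that is not owned by the arena.
            MemorySegment copy = (MemorySegment) strdup.invokeExact(original);
            if (copy.address() == 0L) {
                throw new OutOfMemoryError("strdup returned NULL");
            }
            try {
                // Ask the native side for the length, then widen the segment accordingly.
                long length = (long) strlen.invokeExact(copy);
                MemorySegment readable = copy.reinterpret(length + 1); // + null terminator
                return readable.getString(0);
            } finally {
                // Memory allocated by strdup must be released with free.
                free.invokeExact(copy);
            }
        } catch (Throwable t) {
            throw new IllegalStateException("Native call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(duplicateNatively("off-heap"));
    }
}
```

`reinterpret` is a restricted method because the JVM has no way to verify the size it is given. Passing a size that is too large lets Java code read or write memory it doesn’t own.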
Conclusion
Looking at the goals defined in the JEP, a certain pattern seems to emerge with the latest features. From a more general point of view, it boils down to productivity, performance, soundness, and integrity. We want to see these properties in any new feature, and the FFM API is a great example of that.
The FFM API addresses many of the limitations of JNI:
- Reduced complexity and less boilerplate to get things running.
- Enhanced performance by reducing the needed overhead.
- Simpler memory management, thanks to more straightforward and safer abstractions that reduce the possibility of memory leaks.
- Better portability thanks to less of the previously quite brittle boilerplate.
- Easier maintenance, as the API is not as complex as before and requires less knowledge of the native language used.
Even though most developers seldom come into contact with native code and memory, having a more modern, safer, more efficient, and straightforward API for a historically complicated and error-prone task is still an enormous boon.
Resources
- JEP 454: Foreign Function & Memory API
- Foreign Function and Memory API (Java 22 Documentation)
- Package summary: `java.lang.foreign`
- Java Foreign Function & Memory API (FFM) (HappyCoders.eu)
Looking at Java 22
- Intro
- Statements before super (JEP 447)
- Unnamed Variables & Patterns (JEP 456)
- Class-File API (JEP 457)
- Stream Gatherers (JEP 461)