Looking at Java 22: Class-File API


Class files and the underlying Bytecode serve as the universal language within the Java ecosystem. Parsing, generating, and transforming class files are essential tasks enabling many of the tools and libraries we use daily.

Today’s JEP 457 is a preview feature that introduces a brand-new API for working with Class files and aims to eventually replace the JDK’s internal copy of ASM.


Playing Catch-Up

The available libraries for parsing and generating Class files are plentiful, like ASM, BCEL, or Javassist. Each library is tailored to a specific need with its own design goals, advantages over others, and limitations.

Frameworks relying on Class-file generation usually bundle their favorite one. For example, the open-source framework I spend most of my time with, Apache Tapestry, bundles ASM to generate runtime Class files.

Another project you might have heard of that bundles a copy of ASM is the JDK. It’s used to implement tools like jar and jlink and to support lambda expressions at run time, while the javac compiler contains its own Class-file library. But using a third-party library in the JDK itself for such an essential task creates the problem of “playing catch up”.

For ASM to support a new JDK version X, it must wait for that version to be finalized before implementing its features. However, the tools in JDK X that rely on ASM can’t handle Class-file features that ASM doesn’t implement yet. As a result, the javac compiler can only safely emit features new in JDK X with JDK X+1, once ASM has caught up.

As you can see, that’s a vicious circle: tools that want to support the latest Class-file features can’t, because those features aren’t emitted yet, and new features can’t be emitted because no one supports them. And with the tightened release cadence of six months, the class-file format might evolve quite quickly, forcing third-party libraries to play catch-up fast, or their users are left behind until the new version drops.

That’s the issue JEP 457 tries to solve by providing an API for processing class files that evolves alongside the JDK itself, making new Class-file format features available directly at release.


The New Class-File API

The Class-file API is located in the newly added java.lang.classfile package. It’s defined by three main abstractions: elements, builders, and transforms.

Elements

An element describes some part of a class file. It’s immutable and can be as small as a single instruction or as large as the whole Class file, and everything in between, such as an attribute, a field, or a method.

Elements can further contain elements of their own, making them compound elements, like methods or classes.

Builders

For every type of compound element, there’s a matching builder equipped with specific building methods, such as ClassBuilder::withMethod. A builder also serves as a Consumer for the relevant element type.

Transforms

A transform is a function accepting an element and builder, guiding the process of how, or if, that element should be transformed into other elements.

Parsing Class Files with Patterns

Whereas ASM uses the visitor pattern to create its view of Class files, the new Class-File API goes a different way, relying on a feature first previewed in Java 17 and finalized in Java 21: pattern matching for switch.

The main type for reading Class files is ClassModel. It’s an immutable description of the Class file with accessor methods for metadata and things like fields, attributes, etc., which are lazily inflated, meaning most parts of a Class file aren’t parsed if not needed.

To create such a model, we can convert bytes into an instance:

java
ClassModel classModel = ClassFile.of().parse(bytes);
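The snippets in this article assume that `bytes` already holds the content of a Class file. As a minimal sketch using only long-standing JDK APIs, the bytes of an already loaded class can usually be read as a resource. The helper names `ClassBytes`, `read`, and `majorVersion` are made up for this illustration and aren't part of the Class-File API. The sketch also peeks at the header every Class file starts with: the magic number 0xCAFEBABE, followed by the minor and major version.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration, NOT part of the Class-File API
class ClassBytes {

  // Reads the Class file of an already loaded class as a resource
  static byte[] read(Class<?> type) {
    String resource = type.getName().replace('.', '/') + ".class";
    try (InputStream in = ClassLoader.getSystemResourceAsStream(resource)) {
      if (in == null) {
        throw new IllegalArgumentException("No Class file found for " + type);
      }
      return in.readAllBytes();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Header layout: u4 magic, u2 minor version, u2 major version
  static int majorVersion(byte[] bytes) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      if (in.readInt() != 0xCAFEBABE) {
        throw new IllegalArgumentException("Not a Class file");
      }
      in.readUnsignedShort();        // minor version, ignored here
      return in.readUnsignedShort(); // major version
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] bytes = read(String.class);
    System.out.println("Major version: " + majorVersion(bytes));
  }
}
```

The major version is the number that grows with every release (52 for Java 8, so 66 for Java 22), which is exactly the moving target a Class-file library has to keep up with.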

We can traverse the specific elements with plain loops:

java
ClassModel cm = ClassFile.of().parse(bytes);

for (FieldModel fm : cm.fields()) {
  System.out.printf("Field %s%n", fm.fieldName().stringValue());
}

However, the model being a series of Class elements, we can also utilize pattern matching for a more organized way of doing things:

java
ClassModel cm = ClassFile.of().parse(bytes);

for (ClassElement classElement : cm) {
  switch (classElement) {
    case MethodModel mm -> System.out.printf("Method %s%n",
                                             mm.methodName().stringValue());
    case FieldModel fm -> System.out.printf("Field %s%n",
                                            fm.fieldName().stringValue());
    default -> { /* NO-OP */ }
  }
}

A ClassElement, like a MethodModel, can be the source of more elements.

For example, to find all Classes whose fields and methods are accessed by the methods of the initial ClassModel, we need to go deeper:

java
ClassModel cm = ClassFile.of().parse(bytes);
Set<ClassDesc> deps = new HashSet<>();

for (ClassElement ce : cm) {

  // We're only interested in methods
  if (ce instanceof MethodModel mm) {

    for (MethodElement me : mm) {

      // Only if the current element provides more elements
      if (me instanceof CodeModel xm) {

        // Pattern Match all elements found
        for (CodeElement e : xm) {
          switch (e) {
            case InvokeInstruction i -> deps.add(i.owner().asSymbol());
            case FieldInstruction i -> deps.add(i.owner().asSymbol());
            default -> { /* NO-OP */ }
          }
        }
      }
    }
  }
}

Having to go 5 levels deep is quite unwieldy… let’s convert it into a more readable Stream<ClassElement> pipeline:

java
ClassModel cm = ClassFile.of().parse(bytes);

Set<ClassDesc> deps =
  cm.elementStream()
    // Only Method elements
    .flatMap(ce -> ce instanceof MethodModel mm ? mm.elementStream()
                                                : Stream.empty())

    // Only elements that provide more elements
    .flatMap(me -> me instanceof CodeModel com ? com.elementStream()
                                               : Stream.empty())

    // Pattern match the elements
    .<ClassDesc> mapMulti((xe, c) -> {
      switch (xe) {
        case InvokeInstruction i -> c.accept(i.owner().asSymbol());
        case FieldInstruction i -> c.accept(i.owner().asSymbol());
        default -> { /* NO-OP */ }
      }
    })

    // Start Stream processing and gather results
    .collect(Collectors.toSet());

With a Stream pipeline, the different steps could be refactored to methods, so method references would clean up the code even more.
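As a rough sketch of that refactoring idea, the repeated "instanceof or empty Stream" step can be moved into a small generic helper. The helper `only` and the class `StreamSteps` are hypothetical names I made up, and the example runs on plain objects so it doesn't depend on the preview API:

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper for illustration, NOT part of the Class-File API
class StreamSteps {

  // Keeps only elements of the given type, usable as a flatMap step
  static <T, R extends T> Function<T, Stream<R>> only(Class<R> type) {
    return element -> type.isInstance(element) ? Stream.of(type.cast(element))
                                               : Stream.empty();
  }

  static List<String> stringsIn(List<Object> mixed) {
    return mixed.stream()
                .flatMap(only(String.class))
                .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Object> mixed = List.of("foo", 42, "bar", 3.14);
    System.out.println(stringsIn(mixed)); // prints [foo, bar]
  }
}
```

Assuming the same helper, the pipeline above should shrink to steps like `.flatMap(only(MethodModel.class)).flatMap(MethodModel::elementStream)`, although I haven't compiled that variant against the preview API.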

Generating a Class File

Parsing Class files is only one part of the equation. Another is generating new Class files.

Let’s say we want to generate a class containing the following simple method:

java
void fooBar(boolean z, int x) {
  if (z) {
    foo(x);
  } else {
    bar(x);
  }
}

First, it’s ASM’s turn with its visitor-based approach:

java
// The actual class is irrelevant for this example
ClassWriter classWriter = ...;

MethodVisitor mv = classWriter.visitMethod(0, "fooBar", "(ZI)V", null, null);
mv.visitCode();
mv.visitVarInsn(ILOAD, 1);
Label label1 = new Label();
mv.visitJumpInsn(IFEQ, label1);
mv.visitVarInsn(ALOAD, 0);
mv.visitVarInsn(ILOAD, 2);
mv.visitMethodInsn(INVOKEVIRTUAL, "Foo", "foo", "(I)V", false);
Label label2 = new Label();
mv.visitJumpInsn(GOTO, label2);
mv.visitLabel(label1);
mv.visitVarInsn(ALOAD, 0);
mv.visitVarInsn(ILOAD, 2);
mv.visitMethodInsn(INVOKEVIRTUAL, "Foo", "bar", "(I)V", false);
mv.visitLabel(label2);
mv.visitInsn(RETURN);
mv.visitEnd();

To better understand what’s going on, here’s a mapping between the Java code and the resulting Bytecode:

[Figure: the Java code mapped side by side to the resulting Bytecode]

Next, let’s take a look at how the new Class-file API deals with creating the method. Once again, the API departs from the visitor pattern and uses a lambda-accepting builder instead:

java
ClassBuilder classBuilder = ...;

classBuilder.withMethod("fooBar",
                        MethodTypeDesc.of(CD_void, CD_boolean, CD_int),
                        flags,
                        methodBuilder -> methodBuilder.withCode(codeBuilder -> {
  Label label1 = codeBuilder.newLabel();
  Label label2 = codeBuilder.newLabel();

  codeBuilder.iload(1)
    .ifeq(label1)
    .aload(0)
    .iload(2)
    .invokevirtual(ClassDesc.of("Foo"),
                   "foo",
                   MethodTypeDesc.of(CD_void, CD_int))
    .goto_(label2)
    .labelBinding(label1)
    .aload(0)
    .iload(2)
    .invokevirtual(ClassDesc.of("Foo"),
                   "bar",
                   MethodTypeDesc.of(CD_void, CD_int))
    .labelBinding(label2)
    .return_();
}));

That’s way more straightforward and explicit. Instead of passing instructions via visitVarInsn, the builder has many convenient methods like aload or iload directly built-in.

The only one I really dislike is return_ with an underscore, but well, it’s a reserved keyword.
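One more difference worth spelling out: the ASM variant passes descriptors as raw strings like "(ZI)V", while the builder variant uses the `MethodTypeDesc` and `ClassDesc` types from `java.lang.constant`, which have been in the JDK since Java 12. The following sketch only shows how the two notations relate; the class name `Descriptors` and its constants are made up for the example:

```java
import java.lang.constant.ClassDesc;
import java.lang.constant.MethodTypeDesc;

import static java.lang.constant.ConstantDescs.CD_boolean;
import static java.lang.constant.ConstantDescs.CD_int;
import static java.lang.constant.ConstantDescs.CD_void;

// Hypothetical example class, only uses the stable java.lang.constant API
class Descriptors {

  // The symbolic form of void fooBar(boolean, int)
  static final MethodTypeDesc FOO_BAR_TYPE =
      MethodTypeDesc.of(CD_void, CD_boolean, CD_int);

  // The symbolic form of the class Foo in the unnamed package
  static final ClassDesc FOO = ClassDesc.of("Foo");

  public static void main(String[] args) {
    // The descriptor String ASM expects
    System.out.println(FOO_BAR_TYPE.descriptorString()); // prints (ZI)V

    // And the other way around: parsing a raw descriptor
    MethodTypeDesc parsed = MethodTypeDesc.ofDescriptor("(I)V");
    System.out.println(parsed.equals(MethodTypeDesc.of(CD_void, CD_int))); // prints true

    System.out.println(FOO.descriptorString()); // prints LFoo;
  }
}
```

The symbolic types validate their input when they're created, so a typo in a descriptor shows up at that point instead of in a broken Class file.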

Decoupling the builder from the visitor pattern also creates another opportunity: higher-level block scoping and local-variable index calculation.

Since the Class-file API handles block scoping for us, we no longer need to create labels or branch instructions manually, making the previous example actually readable for people not fluent in Bytecode:

java
ClassBuilder classBuilder = ...;

classBuilder.withMethod("fooBar",
                        MethodTypeDesc.of(CD_void, CD_boolean, CD_int),
                        flags,
                        methodBuilder -> methodBuilder.withCode(codeBuilder -> {
  codeBuilder.iload(codeBuilder.parameterSlot(0))
             .ifThenElse(
                b1 -> b1.aload(codeBuilder.receiverSlot())
                        .iload(codeBuilder.parameterSlot(1))
                        .invokevirtual(ClassDesc.of("Foo"),
                                       "foo",
                                       MethodTypeDesc.of(CD_void, CD_int)),
                b2 -> b2.aload(codeBuilder.receiverSlot())
                        .iload(codeBuilder.parameterSlot(1))
                        .invokevirtual(ClassDesc.of("Foo"), 
                                       "bar",
                                       MethodTypeDesc.of(CD_void, CD_int)))
             .return_();
}));

Labels and branches are still in the resulting Bytecode, as they are essential, but it’s all done automatically on our behalf! This results in more straightforward and maintainable code.
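To get a feeling for what the local-variable index calculation has to take care of, here's a small sketch of the underlying JVM rule: slot 0 holds `this` for instance methods, and `long` and `double` parameters occupy two slots each. The helper below is a hypothetical re-implementation of that rule for illustration, not how the API computes it internally:

```java
import java.lang.constant.MethodTypeDesc;

// Hypothetical helper for illustration, NOT part of the Class-File API
class Slots {

  // Calculates the local-variable slot of the parameter at paramIndex
  static int parameterSlot(MethodTypeDesc type, boolean isStatic, int paramIndex) {
    // Instance methods keep "this" in slot 0
    int slot = isStatic ? 0 : 1;

    for (int i = 0; i < paramIndex; i++) {
      String descriptor = type.parameterType(i).descriptorString();
      // long (J) and double (D) take up two slots, everything else one
      slot += descriptor.equals("J") || descriptor.equals("D") ? 2 : 1;
    }
    return slot;
  }

  public static void main(String[] args) {
    // void fooBar(boolean z, int x) as an instance method
    MethodTypeDesc fooBar = MethodTypeDesc.ofDescriptor("(ZI)V");
    System.out.println(parameterSlot(fooBar, false, 0)); // prints 1
    System.out.println(parameterSlot(fooBar, false, 1)); // prints 2

    // static void baz(long l, int x)
    MethodTypeDesc baz = MethodTypeDesc.ofDescriptor("(JI)V");
    System.out.println(parameterSlot(baz, true, 1)); // prints 2
  }
}
```

The first two results match the hard-coded `iload(1)` and `iload(2)` of the label-based variant. The last one shows why hard-coding gets error-prone as soon as a `long` or `double` is involved.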

Transforming Class Files

Class-file libraries are often used to combine the steps of reading and writing into transformation; a Class is read and localized changes applied, but most of it passes through unchanged. That’s why each builder has with... methods so that elements can just pass through without any transformation.

For example, let’s remove all methods from a Class whose names start with “debug” using a for loop:

java
ClassModel classModel = ClassFile.of().parse(bytes);

byte[] newBytes = ClassFile.of().build(classModel.thisClass().asSymbol(),
  classBuilder -> {
    for (ClassElement ce : classModel) { 
      if (ce instanceof MethodModel mm
          && mm.methodName().stringValue().startsWith("debug")) {
        continue;
      }

      classBuilder.with(ce);
    }
  });

For simple transformations, this seems fine. However, navigating the Class-tree and examining each element creates a lot of repetitive boilerplate, and we end up a few levels deep in nested code, like in the next example.

The following code swaps the invocations of methods on Class Foo with another Class called Bar. Formatting is a little off, but I wanted to have a chance to fit in the code box:

java
ClassFile cf = ClassFile.of();
ClassModel classModel = cf.parse(bytes);

byte[] newBytes = cf.build(classModel.thisClass().asSymbol(),
    classBuilder -> {
  for (ClassElement ce : classModel) {
    if (ce instanceof MethodModel mm) {
      classBuilder.withMethod(mm.methodName(), mm.methodType(),
          mm.flags().flagsMask(), methodBuilder -> {

            for (MethodElement me : mm) {
              if (me instanceof CodeModel codeModel) {
                methodBuilder.withCode(codeBuilder -> {

                  for (CodeElement e : codeModel) {
                    switch (e) {
                      case InvokeInstruction i
                          when i.owner().asInternalName().equals("Foo") ->
                        codeBuilder.invokeInstruction(i.opcode(),
                                                      ClassDesc.of("Bar"),
                                                      i.name().stringValue(),
                                                      i.typeSymbol(),
                                                      i.isInterface());
                      default -> codeBuilder.with(e);
                    }
                  }
                });
              } else {
                  methodBuilder.with(me);
              }
            }
          });
      } else {
        classBuilder.with(ce);
      }
  }
    });

To untangle such complex code, we can use a lambda-based approach of transforms, just like before.

Transforms are functional interfaces and named after the element they affect (e.g., ClassTransform). They accept a builder and an element, so they can either replace the element, drop it, or pass it through to the builder.

This way, the previous deeply nested example becomes more manageable:

java
ClassFile cf = ClassFile.of();
ClassModel classModel = cf.parse(bytes);

byte[] newBytes = cf.transform(classModel, (classBuilder, ce) -> {
  if (ce instanceof MethodModel mm) {
    classBuilder.transformMethod(mm, (methodBuilder, me) -> {
      if (me instanceof CodeModel cm) {
        methodBuilder.transformCode(cm, (codeBuilder, e) -> {
          switch (e) {
            case InvokeInstruction i
                when i.owner().asInternalName().equals("Foo") ->
              codeBuilder.invokeInstruction(i.opcode(),
                                            ClassDesc.of("Bar"), 
                                            i.name().stringValue(),
                                            i.typeSymbol(),
                                            i.isInterface());
            default -> codeBuilder.with(e);
          }
        });
      } else {
          methodBuilder.with(me);
      }
    });
  } else {
    classBuilder.with(ce);
  }
});

All for-loops are gone, removing a lot of boilerplate. Still, there’s a lot of nesting going on.

To simplify the code further, the instruction transformation can be refactored to a CodeTransform:

java
CodeTransform codeTransform = (codeBuilder, e) -> {
  switch (e) {
        case InvokeInstruction i when i.owner().asInternalName().equals("Foo") ->
            codeBuilder.invokeInstruction(i.opcode(),
                                          ClassDesc.of("Bar"),
                                          i.name().stringValue(),
                                          i.typeSymbol(),
                                          i.isInterface());
        default -> codeBuilder.accept(e);
    }
};

Then, the CodeTransform, a transform on code elements, is lifted into one on method elements:

java
MethodTransform methodTransform = MethodTransform.transformingCode(codeTransform);

The idea here is that the transformation is only applied to the related elements. In this case, the MethodTransform will only transform any Code, as it was lifted from a CodeTransform.

We don’t even have to stop here! The MethodTransform can be lifted again into a transform for Class elements:

java
ClassTransform classTransform = ClassTransform.transformingMethods(methodTransform);

After all that lifting, the previous example becomes quite straightforward:

java
// THE ACTUAL TRANSFORM
CodeTransform codeTransform = (codeBuilder, e) -> {
  switch (e) {
        case InvokeInstruction i when i.owner().asInternalName().equals("Foo") ->
            codeBuilder.invokeInstruction(i.opcode(),
                                          ClassDesc.of("Bar"),
                                          i.name().stringValue(),
                                          i.typeSymbol(),
                                          i.isInterface());
        default -> codeBuilder.accept(e);
    }
};

// LIFTING FROM CODE ELEMENTS TO METHOD ELEMENTS
MethodTransform methodTransform = MethodTransform.transformingCode(codeTransform);

// LIFTING FROM METHOD ELEMENTS TO CLASS ELEMENTS
ClassTransform classTransform = ClassTransform.transformingMethods(methodTransform);

// LOADING AND TRANSFORMING THE CLASS
ClassFile cf = ClassFile.of();
byte[] newBytes = cf.transform(cf.parse(bytes), classTransform);

Now that’s an improvement over the for-loop and even the initial lambda-based variant.

Transforms being lambdas enable another great functional feature: Composition.

java
ClassFile cc = ClassFile.of();
byte[] newBytes = cc.transform(cc.parse(bytes),
                               ClassTransform.transformingMethods(
                                MethodTransform.transformingCode(
                                  firstTransform.andThen(secondTransform))));

This way, we can build a series of simple transformations for common use-cases and compose them as needed.
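What composition means here is that every element the first transform hands to its builder becomes the input of the second one. To make that tangible without the preview API, here's a deliberately tiny toy model. The `Transform` interface and everything around it are hypothetical stand-ins that work on plain Strings, not the real `java.lang.classfile` types:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of transform composition; all names are hypothetical
// and NOT the real java.lang.classfile types
class ToyComposition {

  // A transform receives a builder (here: a plain Consumer) and an element
  interface Transform<E> {
    void accept(Consumer<E> builder, E element);

    // Whatever this transform hands to its builder becomes the input of "next"
    default Transform<E> andThen(Transform<E> next) {
      return (builder, element) ->
          this.accept(emitted -> next.accept(builder, emitted), element);
    }
  }

  // Runs every element through the transform and collects what was built
  static <E> List<E> apply(List<E> source, Transform<E> transform) {
    List<E> result = new ArrayList<>();
    for (E element : source) {
      transform.accept(result::add, element);
    }
    return result;
  }

  // Replaces the owner "Foo" with "Bar" and passes everything else through
  static final Transform<String> FOO_TO_BAR = (builder, call) ->
      builder.accept(call.startsWith("Foo.") ? "Bar." + call.substring(4) : call);

  // Drops debug calls by simply not handing them to the builder
  static final Transform<String> DROP_DEBUG = (builder, call) -> {
    if (!call.contains(".debug")) {
      builder.accept(call);
    }
  };

  public static void main(String[] args) {
    List<String> calls = List.of("Foo.foo", "Foo.debugState", "Baz.run");
    System.out.println(apply(calls, FOO_TO_BAR.andThen(DROP_DEBUG)));
    // prints [Bar.foo, Baz.run]
  }
}
```

Each transform stays small and only knows about its own concern, and the order of composition decides which one sees the elements first.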


Design Philosophy of the Class-File API

The new API breaks away from a lot of things we’re used to with other Class-manipulation libraries. It might seem unfamiliar at first, but it’s all aligned with the design goals and principles the OpenJDK team has defined for the new API:

  • Immutable Elements:
    To open the door for reliable and safe sharing, all Class-file elements (e.g., fields, methods, instructions, …) are represented by immutable objects.

  • Tree Structure:
    Class files are represented by a tree of elements, which might have elements of their own.

  • User-driven Navigation & Laziness:
    Navigating the tree is decided by where we, as the user, choose to go. Therefore, only the essential parts of the tree should be parsed to satisfy the current navigation requirements.

  • Unified Streaming & Materialized Views:
    The new API supports a streaming and materialized representation of a Class file, just like ASM. Thanks to immutability and laziness, though, the materialized view is less expensive than ASM’s.

  • Emergent Transformation:
    With Class-file parsing and generation APIs being aligned, transformation doesn’t require its own special mode and becomes an emergent property of the API.

  • Detail Hiding:
    Class files consist of many parts, like constant pool, bootstrap methods, etc., and it seems nonsensical to construct these manually all the time. So instead, the API will do the heavy lifting for us.

  • Modern Language Features:
    ASM was released 22 years ago, in 2002. Since then, Java has evolved significantly, including lambdas, records, sealed classes, pattern matching, etc., and the new API is using them all to create a more flexible, pleasant, and modern experience.


Conclusion

With Class-file manipulation and Bytecode generation, there have always been calls for an “official” library for multiple reasons, such as having a “guaranteed always up-to-date” library. And the faster six-month release cadence created two more reasons to consider.

First, we encounter newer Class-file formats more often, as they might change every six months.

Second, Java evolves faster than ever, so there definitely will be more format changes in shorter timeframes than before.

Thankfully, instead of trying to just “standardize” an existing solution like ASM, the OpenJDK team decided to design a modern and versatile API from the ground up.

Don’t misunderstand me here; there’s nothing wrong with standardizing an existing library for the JDK. It worked great for JSR-310 and Joda Time. However, trying to standardize a design from 22 years ago might not be the best solution going forward, and a more modern approach, even if unfamiliar at first, will be the better solution in the long run.

One thing that needs to be mentioned is that the new API does not intend to replace ASM or any other Class-file library by creating “one bytecode library to rule them all”. Each of them has their own goals, pros, and cons. Over time, though, I’m sure it will take a big bite out of the existing library share.

But looking at the ASM-related code in my projects, replacing it would be a herculean effort and will take quite some time, if it even makes sense at all. Raising the minimum Java version would also be required, which isn’t simple.

Nevertheless, JEP 457 is another welcome addition to the JDK, even though it’s still in preview. It solves the chicken-egg problem of Class-file format changes and gives us much more. The design goals and principles behind it create a powerful, easy-to-use, and less error-prone way of working with Class files.

