Better Code Snippets in JavaDoc

2024-03-06 · 19 min

Effective documentation is an essential staple of every software project. Still, way too often, we neglect it in the moment. If we’re lucky, our future selves will take care of it. But let’s be honest here for a moment… it won’t happen!

Consequently, any improvement that makes writing documentation more straightforward and easily accessible is an important and welcome one.

Let’s discuss some of the (relatively) recent enhancements to JavaDoc that power-charge it.

History of JavaDoc

Java has done an excellent job providing tools for documentation by making JavaDoc an early language feature. Instead of needing technical writers to keep the documentation in sync with the actual code – often with standalone authoring tools – the documentation lives directly next to the code it documents. This tightly coupled documentation system is then utilized to generate comprehensive documentation.

Technical Architecture

Writing documentation is done with the /** ... */ (slash double asterisk) multi-line comment tag. The extra * (asterisk) at the beginning distinguishes comments that should be processed by JavaDoc from “normal” multi-line comments (/* ... */).

JavaDoc comments usually consist of two parts: descriptive text and metadata. Let’s start with the latter first.

Metadata is provided with a multitude of @<field> tags:

java

/**
 * @param   i   an {@code int}.
 * @return  a string representation of the {@code int} argument.
 * @see     java.lang.Integer#toString(int, int)
 */
public static String valueOf(int i) {
    return Integer.toString(i);
}

The @<field> tags are rendered distinctively from the descriptive text, highlighting things like required arguments, return values, thrown Exceptions, and further documentation/methods that might be beneficial.

Typically, preceding the metadata is the textual description. It’s based on HTML, with all the implications that entails. For example, you might need to use <p> tags for new paragraphs, not just line breaks, for readability reasons.

To improve the description further and connect different parts of our documentation, we use certain inline @<field> tags. The inline {@link} tag is commonly used for, well, linking to other content, and the block tag @see as a standalone reference.

Here’s an example of JavaDoc:

java

/**
 * Returns a string whose value is this string, with all leading
 * and trailing {@linkplain Character#isWhitespace(int) white space}
 * removed.
 * <p>
 * If this {@code String} object represents an empty string,
 * or if all code points in this string are
 * {@linkplain Character#isWhitespace(int) white space}, then an empty string
 * is returned.
 * <p>
 * Otherwise, returns a substring of this string beginning with the first
 * code point that is not a {@linkplain Character#isWhitespace(int) white space}
 * up to and including the last code point that is not a
 * {@linkplain Character#isWhitespace(int) white space}.
 * <p>
 * This method may be used to strip
 * {@linkplain Character#isWhitespace(int) white space} from
 * the beginning and end of a string.
 *
 * @return  a string whose value is this string, with all leading
 *          and trailing white space removed
 *
 * @see Character#isWhitespace(int)
 *
 * @since 11
 */
public String strip() {
  //...
}

And here’s the output:

[Image: JavaDoc documentation for `String#strip()`]

We have good-looking documentation here, but there’s still room for improvement!

Structured Access to Documentation (Java 8)

Having a well-organized and defined documentation system offers benefits beyond simply producing documentation that is easier to understand and straightforward to consume, as you know what to expect. It also makes the documentation accessible not just to other developers but to automated systems as well.

Java 8 delivered important JEPs that immensely improved the process of working with and generating documentation.

JEP 172 introduced “DocLint”, which is, as you might have guessed from its name, a linter for JavaDoc. Prior to this, there was little to no validation, so any issue in the source documentation, like broken links or invalid/unescaped entities, ended up in the generated HTML.

The linter was actually built on another documentation-related Java 8 feature: the new DocTree API (JEP 105).

The DocTree API provides structured access to JavaDoc comments for any tool, not only the JDK. This enables a range of functionalities, including custom tags, beyond just linting.
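To make “structured access” a bit more tangible, here is a minimal sketch that parses a source string with the JDK's compiler API and lists the block tags found on each method. The class name `DocTreeSketch` and the helper `blockTagNames` are made up for illustration, and the sketch assumes it runs on a full JDK (on a JRE-only runtime, `ToolProvider.getSystemJavaCompiler()` returns null).

```java
import com.sun.source.doctree.BlockTagTree;
import com.sun.source.doctree.DocCommentTree;
import com.sun.source.doctree.DocTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.DocTrees;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the JDK.
final class DocTreeSketch {

    /** Returns the names of all block tags found on methods in the given source. */
    static List<String> blockTagNames(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // assumption: running on a JDK
        // Wrap the source string so the compiler can read it like a file.
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        List<String> names = new ArrayList<>();
        try {
            // Parsing is enough here; no full compilation is needed to read doc comments.
            Iterable<? extends CompilationUnitTree> units = task.parse();
            DocTrees docTrees = DocTrees.instance(task);
            for (CompilationUnitTree unit : units) {
                new TreePathScanner<Void, Void>() {
                    @Override
                    public Void visitMethod(MethodTree method, Void unused) {
                        DocCommentTree doc = docTrees.getDocCommentTree(getCurrentPath());
                        if (doc != null) {
                            for (DocTree tag : doc.getBlockTags()) {
                                if (tag instanceof BlockTagTree) {
                                    names.add(((BlockTagTree) tag).getTagName());
                                }
                            }
                        }
                        return super.visitMethod(method, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Fed with the `valueOf` example from earlier, the helper should report the tags `param`, `return`, and `see`, in that order.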

Improving HTML Output (Java 9)

While Java 8 focused on enhancing the toolchain for documentation, Java 9 aimed to refine the output itself, introducing two significant JEPs.

JEP 224 added HTML5-compliant output. Previously, the output adhered to HTML 4.01, a dated standard missing several contemporary features, notably in accessibility. From Java 10 onwards, HTML5 became the default output format.

JEP 225 added a client-side search box. At a cursory look, this feature might appear minor compared to others. But don’t underestimate the impact of embedding a search feature without needing any additional server-side infrastructure!

Documentation is often deployed as static content because that’s essentially what it is. Searching through such content traditionally requires server-side processing, which can significantly complicate the deployment process for an optimal user experience.

By adding client-side/JS-based search functionality out-of-the-box, the need for a more complex hosting solution is negated, and we still only need a bare webserver to host documentation. For instance, my own blog’s search is client-side-only, so it doesn’t need anything but the cheapest, tiniest cloud server out there.

Does a client-side search deliver the best results? Not by a long shot…

But does it significantly enhance user accessibility without complicating hosting? Absolutely!

The javadoc tool generates the search index and necessary client-side JavaScript by default. If we don’t want the search box, we can use the -noindex option.

To many Java developers, the improvements discussed thus far may not come as a surprise. We’ve all seen nicely formatted and navigable documentation for the JDK or popular frameworks and libraries, either on the web or rendered nicely directly in our IDEs.

One notable enhancement in the realm of documentation was JEP 221, the new Doclet API. Moving the “old” API from the com.sun.javadoc to the jdk.javadoc.doclet package was an overdue maintenance chore to reduce legacy code and outdated APIs. Although the older API remained accessible, it was marked as “deprecated for removal”.

Let’s pivot to discussing JavaDoc features that are newer and, perhaps, unfamiliar to you!

With Java’s accelerated release schedule, keeping pace with the plethora of new features and JEPs can be challenging, especially considering that many features evolve over several versions until they are delivered.

That’s why the next JavaDoc enhancement took 9 more versions after its predecessor! Although, to be fair, going from Java 9 to 18 (4.5 years) only took ~1.3x as long as from 8 to 9 (3.5 years).

Working with Code Snippets (before JEP 413)

Up to this point, adding code snippets to our documentation was a cumbersome and error-prone task:

java

/**
 * <pre>
 * &#64;Override
 * public List&#60;String&#62; toStringElements() {
 *     // ...
 * }
 * </pre>
 */

This simple snippet already has several issues:

As it’s HTML-based, the <pre> tag renders “as-is”, including indentation, and certain characters like < or & need to be escaped as HTML entities.
No indication of the language used, making syntax highlighting an impossible task, or we need to assume that only Java code exists.
No way to highlight or link certain parts of the code.

To mitigate at least a little bit, we could add the {@code} tag:

java

/**
 * <pre>{@code
 * {@literal @Override}
 * public List<String> toStringElements() {
 *     // ...
 * }
 * }</pre>
 */

We no longer need to escape HTML entities, except the @ of @Override, as it has a special meaning for JavaDoc. Still, not pretty, not simple, with multiple issues remaining!
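To get a feeling for the manual work the plain <pre> variant requires, here is a small sketch of the escaping we would otherwise do by hand. `PreEscaper` and `escapeForPre` are hypothetical names, and the sketch uses named entities (`&lt;`, `&gt;`) where the example above uses the equivalent numeric ones.

```java
// Hypothetical helper that mimics the escaping we'd otherwise do by hand.
final class PreEscaper {

    /** Escapes the characters that would break a snippet inside a plain <pre> block. */
    static String escapeForPre(String code) {
        return code
                .replace("&", "&amp;")   // must run first, or it would re-escape the entities below
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("@", "&#64;");  // '@' would otherwise start a JavaDoc tag
    }

    public static void main(String[] args) {
        System.out.println(escapeForPre("@Override\npublic List<String> toStringElements() {"));
    }
}
```

Every snippet pasted into a <pre> block needs this treatment, and it has to be undone mentally when reading the source comment.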

Java 18 released a great new feature in the form of JEP 413: the {@snippet} tag for simplifying code snippets.

The {@snippet} Tag (Java 18+)

Let’s take a quick peek at an example before going over the details:

java

/**
 * A demo of the new JavaDoc block tag for snippets.
 *
 * {@snippet id='demo' lang='java' :
 * public void snippetsSpecialCharacters() {
 *     System.out.println("No more escaping! @ & 🎉");
 * }
 * }
 */
public void newSnippetTag() { /* ... */ }

And here’s the output:

[Image: JavaDoc documentation using `{@snippet}`]

Looks way nicer and cleaner, is easier to type, and adds additional information. Unlike its older sibling {@code}, it respects indentation based on the location of the closing curly brace. If you indent your code deeper than the closing curly brace, the extra whitespace will end up in the output.

Without requiring the <pre> tag anymore, we also don’t need to escape anything (including @), as the content is no longer treated as HTML.

Other improvements include the gray background, making it easier to read, and the “copy to clipboard” button in the top right corner, improving DX.
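The indentation rule mentioned above can be tried out in plain Java. JEP 413 describes the whitespace handling in terms of `String::stripIndent`, so the following sketch assumes the Standard Doclet behaves the same way. The class `IndentSketch` is made up, and the input strings stand for a snippet body after the leading ` * ` of each comment line is gone.

```java
// Hypothetical demo class; assumes snippet content is stripped like String::stripIndent.
final class IndentSketch {

    /** Strips incidental indentation; the last line (before the closing brace) counts too. */
    static String strip(String snippetBody) {
        return snippetBody.stripIndent();
    }

    public static void main(String[] args) {
        // Closing brace sits two columns left of the code: two spaces survive per line.
        System.out.print(strip("    int x = 1;\n    int y = 2;\n  "));
        // Closing brace aligned with the code: all indentation is removed.
        System.out.print(strip("    int x = 1;\n    int y = 2;\n    "));
    }
}
```

The whitespace-only last line plays the role of the closing brace's position, which is the same trick text blocks use with their closing delimiter.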

Inline Snippets

Embedding code examples directly within documentation has been the norm, especially since there wasn’t another way before the {@snippet} tag.

To start a code snippet, just use the {@snippet} tag, add any of the optional attributes, and finally, add a : (colon). While the colon’s role might seem unusual at first, it is crucial, signaling that the content from the following line up to the corresponding closing curly bracket for the {@snippet} is the actual code snippet. That means the code must be balanced regarding curly braces, or the snippet block might end too early.
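Why unbalanced braces cut a snippet short becomes obvious with a simplified brace counter. This is only a sketch of the idea, not javadoc's actual parser: `BraceSketch` and `closingBrace` are hypothetical, and the counter naively treats every curly brace as significant.

```java
// Hypothetical, simplified model of how the end of an inline tag is found.
final class BraceSketch {

    /**
     * Returns the index of the '}' that closes a tag whose opening '{' was
     * already consumed before position start, or -1 if the braces never balance.
     */
    static int closingBrace(String text, int start) {
        int depth = 1; // the opening '{' of the tag itself
        for (int i = start; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(closingBrace("a{b}c}d", 0));   // balanced content, ends at the last brace
        System.out.println(closingBrace("if (x) }\ny();\n}", 0)); // stray '}' ends the tag early
    }
}
```

In the second call, the stray closing brace is taken as the end of the tag, so everything after it would no longer be part of the snippet.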

The tag has several attributes in the form of name=value pairs, with value needing to be wrapped by single or double quotes. The code itself is modifiable with supplemental @<field> tags that work as render interception points.

For inline snippets, the two attributes used in the previous example are: id and lang.

The first, id, is used to identify the snippet and serves as a reference anchor for both the documentation itself and the generated HTML output, becoming the id attribute of the corresponding <pre> tag.

The latter, lang, identifies the language of the code snippet, adding a CSS class language-<value of lang> to the <code> tag of the output.

Be aware that specifying the language doesn’t automatically add any form of syntax highlighting. Nevertheless, it serves as valuable metadata for other tools that may interact with or alter the documentation. By default, the documentation does not include any language-specific classes.

Regions

To understand how the render intercepting supplemental @<field> tags work, we must first look at the concept of “regions”.

A code snippet region defines a range of lines and creates a scope usable by other actions, like highlighting or referencing.

Regions are created by using the optional region=<name> attribute to start a named scope and are terminated with @end region="<name>". Specifying the name is optional, as anonymous regions are possible. Names are needed to create overlapping regions, which can be helpful in certain scenarios. But I would advise you to refrain from using them, as they tend to get convoluted and confusing.

Later, in the “External Snippets” section, you’ll learn more about using regions.

Markup Tags

One downside (so far) is that regions are still a 1:1 representation of the code. But documentation is supposed to convey more than “just” code, so wouldn’t it be nice to have a way to impact the rendering of the snippets?

Highlighting Code

The @highlight tag is either used to affect the current line or its current region scope:

java

/**
 * {@snippet :
 * public static void main(String... args) { // @highlight substring="main"
 *     for (var arg : args) { // @highlight region regex="\barg\b" type="highlighted"
 *         if (!arg.isBlank()) {
 *             System.out.println(arg);
 *         }
 *     }                      // @end
 * }
 * }
 */
public void highlight() { /* ... */ }

The generated output:

If the tag has no region attribute, it only affects the current line.

With a region attribute, it affects all code until the corresponding @end, in this case, an anonymous region.

The more interesting attributes are substring, regex, and type.

To select what is supposed to be highlighted, we use either substring for literal matches, or regex for more fine-grained control.

There are three different types of highlighting available, specified with the optional type attribute: bold (default), italic, and highlighted. The types correspond to CSS classes in the generated output.
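The difference between a literal match and the `\barg\b` regex from the example is easy to check with `java.util.regex`. The sketch assumes the regex attribute follows the same syntax as `java.util.regex.Pattern`; `HighlightRegexSketch` and `countMatches` are made-up names.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper to compare what different selectors would match.
final class HighlightRegexSketch {

    /** Counts how often the pattern matches across the given lines. */
    static long countMatches(String regex, List<String> lines) {
        Pattern pattern = Pattern.compile(regex);
        return lines.stream()
                .mapToLong(line -> pattern.matcher(line).results().count())
                .sum();
    }

    public static void main(String[] args) {
        // The lines covered by the highlighted region in the example above.
        List<String> region = List.of(
                "    for (var arg : args) {",
                "        if (!arg.isBlank()) {",
                "            System.out.println(arg);",
                "        }",
                "    }");
        System.out.println(countMatches("\\barg\\b", region)); // whole word only
        System.out.println(countMatches("arg", region));       // also hits "args"
    }
}
```

The word boundaries keep `args` in the loop header from being highlighted, which a plain substring="arg" would not.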

At this point, the new {@snippet} already solves many of the previous woes and even adds new functionality. An important issue remains: how do we modify the displayed code?

Modifying the Displayed Code

To convey the actual point of a code snippet, we often reduce it to its bare minimum, add pseudo/invalid code to make a point, use ellipses, etc. Yet, this approach clashes with the snippet being a fully valid code example that we (or another tool) can actually verify. The implications of this will become clearer when we delve into the topic of external code snippets.

Thankfully, the supplemental tags have us covered in the form of @replace:

java

/**
 * {@snippet :
 * class HelloWorld {
 *     public static void main(String... args) {
 *         System.out.println("Hello World!");  // @replace regex='".*"' replacement="..."
 *     }
 * }
 * }
 */
public void replace() { /* ... */ }

And the generated output:

The regex and replacement attributes call out to String::replaceAll, which means we also have access to all of its features, like using capture groups and the matched text for replacements.

java

/**
 * {@snippet :
 * class HelloWorld {
 *     public static void main(String... args) {
 *         System.out.println("Hello World!");  // @replace regex='"(.*)"' replacement='"I said, $1"'
 *     }
 * }
 * }
 */
public void replaceWithGroup() { /* ... */ }

Further, content is removed by using replacement="", or inserted by targeting a commented line with regex="// " replacement="".

Like @highlight, there’s a substring attribute for simpler 1:1 matching. Also, the same region-scoping rules apply.
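Since the replacements are described as going through String::replaceAll, the effect of the attribute values used in this section can be previewed in plain Java. `ReplaceSketch` is a hypothetical wrapper, and the commented-out line in the last call is an invented example.

```java
// Hypothetical preview helper; it only wraps String::replaceAll.
final class ReplaceSketch {

    /** Applies a regex/replacement pair to one line of snippet code. */
    static String apply(String line, String regex, String replacement) {
        return line.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        String line = "System.out.println(\"Hello World!\");";
        // regex='".*"' replacement="..."
        System.out.println(apply(line, "\".*\"", "..."));
        // regex='"(.*)"' replacement='"I said, $1"'
        System.out.println(apply(line, "\"(.*)\"", "\"I said, $1\""));
        // regex="// " replacement="" turns a commented line into visible code
        System.out.println(apply("// int hidden = 42;", "// ", ""));
    }
}
```

Trying a pattern this way before putting it into a markup comment helps to catch greedy matches or missing escapes early.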

Connecting the Dots

Another important aspect that elevates documentation from “good enough” to “awesome” is its interconnection. Navigating between related types and documentation is an essential part of discovery and accessibility.

Don’t worry, once again, {@snippet} has us covered.

JavaDoc already has the {@link} and {@linkplain} tags to be used in the descriptive text of a documentation comment. So, it was a natural evolution to include this functionality (with a few enhancements) in code snippets, too.

Like its inspiration, @link supports a target attribute, but it needs to be specified explicitly. Unlike {@link}, however, it also supports the same selection attributes we already discussed for @highlight and @replace, so only the relevant parts of the snippet get linked:

java

/**
 * {@snippet :
 * System.out.println("Link to println"); // @link substring="println" target="java.io.PrintStream#println(String)"
 * }
 */
public void link() { /* ... */ }

Here’s the output:

The linked target’s type must either be imported, or its fully-qualified name must be used.

The ability to connect code snippets further with other documentation increases their value immensely.

Got a special type that might be worth checking out? Link it!
A certain code construct is explained somewhere else? Link it!

External Snippets

The new {@snippet} tag already mitigates many of the previous issues, but not all of them. First, the code must be balanced regarding curly braces, as the first closing bracket matching the opening {@snippet tag finishes the content. And second, some things are still taboo in the content, like /** ... */.

That’s where external code snippets come into play.

An external snippet is simply a code fragment defined in another file. Three attributes reference such a fragment: file or class, and optionally region.

java

/**
 * This includes the whole file.
 * {@snippet file="ExternalSnippet.java"}
 */
public void wholeFile() { /* ... */ }

/**
 * Include a specific class.
 * {@snippet class="ExternalSnippet"}
 */
public void specificClass() { /* ... */ }

/**
 * And this only the requested region.
 * {@snippet file="ExternalSnippet.java" region="snippet01"}
 */
public void regionOnly() { /* ... */ }

The file attribute is self-explanatory. It looks up a snippet file, but where should it be located?

By default, javadoc searches a folder called snippet-files located in the package directory of the code that references the external snippet.

Alternatively, the javadoc option --snippet-path sets the location manually.

If the documented class is located at src/com/beliefdrivendesign/App.java, that means javadoc looks up src/com/beliefdrivendesign/snippet-files/ by default:

.
└─ src
   └─ com
      └─ beliefdrivendesign
         ├─ App.java
         └─ snippet-files
            ├─ ExternalSnippet.java
            └─ ...

To make this work, the --source-path option must be provided to javadoc.
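Expressed in code, the default lookup location is simply a sibling of the documented source file. The following is a tiny sketch with `java.nio.file.Path`; `SnippetPathSketch` and `defaultSnippetDir` are made-up names, and the sketch only models the default rule, not the --snippet-path option.

```java
import java.nio.file.Path;

// Hypothetical helper modelling the default snippet-files lookup.
final class SnippetPathSketch {

    /** Default lookup folder: a "snippet-files" directory next to the documented source file. */
    static Path defaultSnippetDir(Path sourceFile) {
        return sourceFile.resolveSibling("snippet-files");
    }

    public static void main(String[] args) {
        Path app = Path.of("src/com/beliefdrivendesign/App.java");
        // Directory that is searched, followed by the resolved snippet file.
        System.out.println(defaultSnippetDir(app));
        System.out.println(defaultSnippetDir(app).resolve("ExternalSnippet.java"));
    }
}
```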

The class attribute references a class by its name instead of the filename.

Using a whole file for an external code snippet is the default if region is omitted, but we wouldn’t gain much from that approach, as we’d need a lot of different snippet files.

Instead, we define regions in the snippet file, similar to the supplemental tags discussed before. Then, each region can be targeted individually for inclusion. This way, all code snippets can live in a valid class, and we can automatically verify the overall correctness and still only use a part of it for documentation purposes. With the help of the supplemental tags, the snippets are modified further to be more useful as documentation.

To mark a region for reference purposes, use @start and @end:

java

package com.benweidig;

public class ExternalSnippet {

    // @start region="snippet01"
    /**
     * The first snippet.
     */
    public void snippet01() {
        System.out.println("Hello World!");  // @replace regex='".*"' replacement='"..."'
    }
    // @end

    // @start region="snippet02"
    /**
     * The second snippet.
     */
    public void snippet02() {
        System.out.println("Unbalanced Hello World!"); // @end region="snippet02"
    }

    public void snippet03() { // @start region="snippet03"
        System.out.println("Hello World!"); // @replace regex='".*"' replacement='"..."'
    } // @end
    
}

Let’s reference the unbalanced snippet02:

java

/**
 * This will render an unbalanced snippet by class name.
 * {@snippet class="ExternalSnippet" region="snippet02"}
 */
public void unbalanced() { /* ... */ }

Here’s the output:

As before, ending a region can optionally contain its name, so overlapping regions are possible.
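To illustrate how named regions carve a snippet out of a larger file, here is a deliberately simplified region extractor. It is not how javadoc implements it: `RegionSketch` and `extract` are hypothetical, the sketch only understands `// @start` and `// @end` comments, it strips other markup comments without applying them, and it leaves indentation untouched.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified model of region handling in external snippets.
final class RegionSketch {

    private static final Pattern START = Pattern.compile("//\\s*@start\\s+region=\"([^\"]+)\"");
    private static final Pattern END = Pattern.compile("//\\s*@end(?:\\s+region=\"([^\"]+)\")?");
    private static final Pattern MARKUP = Pattern.compile("\\s*//\\s*@\\w+.*$");

    /** Returns the lines belonging to the named region, without markup comments. */
    static List<String> extract(List<String> lines, String region) {
        Deque<String> open = new ArrayDeque<>(); // currently open regions, innermost first
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            Matcher start = START.matcher(line);
            if (start.find()) {
                open.push(start.group(1));
            }
            if (open.contains(region)) {
                Matcher markup = MARKUP.matcher(line);
                boolean hasMarkup = markup.find();
                String code = hasMarkup ? markup.replaceFirst("") : line;
                // Lines consisting only of a markup comment are dropped entirely.
                if (!(hasMarkup && code.isBlank())) {
                    result.add(code);
                }
            }
            Matcher end = END.matcher(line);
            if (end.find()) {
                String name = end.group(1);
                if (name == null) {
                    open.poll();       // anonymous @end closes the innermost region
                } else {
                    open.remove(name); // named @end may close an outer, overlapping region
                }
            }
        }
        return result;
    }
}
```

Called with the lines of the `ExternalSnippet` class and the name `snippet02`, the sketch returns the doc comment, the method header, and the `println` line, but not the closing brace of the method. That is exactly the unbalanced fragment an inline snippet could not express.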

The main advantage of named regions is using a single file for multiple code snippets. Imagine, for instance, having a complex type with various use cases and edge cases we want to document comprehensively. Thanks to JEP 445, “Unnamed Classes and Instance Main Methods (Preview)”, the additional file can be free of any unnecessary boilerplate and still have a new method/region for each of the use cases and edge cases.

Even without this preview feature, enclosing our snippet code within a simple class does not significantly increase complexity for its intended use.

Now, we can structure our code snippets any way we want and verify them automatically, whether by compiling the code to ensure the API hasn’t changed or possibly by unit-testing them. Keep in mind, though, that the Standard Doclet won’t compile or test our snippets, and it requires external tools to make that happen.

This approach future-proofs the code snippets in the same way the actual code is. Any breaking change will automatically invalidate a snippet, forcing us to update the documentation.

No more outdated or broken code snippets!
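One possible way to do such a check without any build tool is the `javax.tools` compiler API. The following is a minimal sketch under a few assumptions: it runs on a full JDK, the snippet source is passed in as a string, and the names `SnippetCompileCheck` and `compiles` are invented for this example. In a real project, simply adding the snippet-files folder as a source directory of the test build would achieve the same.

```java
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical check; the Standard Doclet itself does not compile snippets.
final class SnippetCompileCheck {

    /** Compiles one in-memory source file and reports whether javac accepted it. */
    static boolean compiles(String className, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler(); // assumption: running on a JDK
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        try {
            // Write class files to a temp folder to keep them out of the project.
            Path out = Files.createTempDirectory("snippet-check");
            // Collect diagnostics instead of printing them to stderr.
            DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
            return compiler.getTask(null, null, diagnostics,
                    List.of("-d", out.toString(), "-proc:none"), null, List.of(file)).call();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A snippet that calls a method that no longer exists makes `compiles` return false, which is the signal to update the documentation.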

Nonetheless, the situation with inline snippets is somewhat more complex. We could use the Compiler and Compiler Tree APIs to parse our code into syntax trees and extract the documentation, detect code snippets, detect their language, package them into compilable units, etc…

So, if you want to verify your code snippets automatically, I’d recommend sticking to external snippets.
Or you could read the next section.

Hybrid Snippets

External snippets offer the benefit of being verifiable with minimal additional work. On the other hand, inline snippets are straightforward to implement and allow for the inclusion of documentation directly within the code without needing any additional tooling or files. Given that both approaches have their respective advantages and drawbacks, why not employ them together?

With a bit of added inconvenience, we can use a combination of inline and external snippets for the same documented element. When referencing an external snippet, we can still add the code inline, too:

java

/**
 * {@snippet class="ExternalSnippet" region="snippet03" :
 * public void snippet03() {
 *     System.out.println("...");
 * }}
 */
public void hybrid() {
    // ...
}

The inline code must match the external code exactly! That means you might need @start and @end as end-of-line comments. @replace does not have to be in the inline code if it reflects the end result.

Your first thought seeing this is most likely that code duplication is bad. However, the Standard Doclet verifies that the external snippet matches the inline code.

This approach maintains the verifiability benefits of an external snippet while also incorporating the code directly into the documentation. To minimize ongoing maintenance, though, I’d recommend converting external snippets to hybrid ones when a feature is finished, so you don’t need to constantly keep them aligned.

The hybrid approach seems to still have some issues and might not work as expected in certain cases. See JDK-8304408.

JEP 413 at a Glance

That was a lot of information and examples for what boils down to a single new JavaDoc tag accompanied by a few supplements. But in my opinion, this change is a significant one that will change how we write documentation going forward.

The new tag simultaneously simplified the way code is included/written in documentation comments and also power-charged it with additions like highlighting, replacements, and linking to other content.

Here’s a cheat sheet of the available tags and options related to {@snippet}:

Tag	Attribute	Description
`{@snippet}`	`class`	Reference external snippet by class name
	`file`	Reference external snippet by filename
	`id`	Snippet identifier. Becomes HTML `id`
	`lang`	Snippet language. Becomes CSS class `language-<lang>` on `<code>` tag.
	`region`	Region name of external snippet
`@start`	`region`	Starts a new named region
`@end`	`region`	Name of the region to end (optional)
`@highlight`	`substring`	Literal text to be highlighted
	`regex`	Regular expression for highlighting
	`region`	Defines the region scope to be used
	`type`	Type of highlighting (`bold`, `italic`, `highlighted`)
`@replace`	`substring`	Literal text to be replaced
	`regex`	Regular expression to be replaced
	`replacement`	The replacement text for matches
	`region`	Defines the region scope to be used
`@link`	`substring`	Literal text to be linked
	`regex`	Regular expression for linking
	`region`	Defines the region scope to be used
	`target`	Link target, equivalent to `{@link ...}`
	`type`	Type of the link. `link` or `linkplain`.

Conclusion

I’ve written about (the problems of) documenting your code before, and I still believe that the best documentation is not to need documentation in the first place. This can be achieved by designing and writing our code to be self-documenting by choosing better names, abstractions, and overall structure. However, that doesn’t work for all documentation needs.

How do you convey edge cases?
How do you add metadata and connect different parts?

That’s where JavaDoc truly excels. JavaDoc and the Doclet API blur the line that previously separated code and documentation, and make the documentation available to the toolchain, too. This allows for tighter integration, so the documentation can concentrate more on the “why” than just the “how” and “what” of our code without the fear of drifting apart.
