We use comments to help readers understand our source code by explaining our intentions, clarifying tricky parts, or annotating it. The intention is good, and comments are often useful. But we tend to comment too much on the wrong things and too little on the right things.

Comments in our source code have two audiences: humans and compilers.

Neither cares much about them. Compilers ignore them completely (except for special macro comments or type comments, which we set aside for now), and humans might only read them as a last resort. That’s why we need to make them count.

Even if we write pleasant and helpful comments, we can’t guarantee that they stay current and correct over their lifetime. A single code change might invalidate all the hard work that went into a good comment. Keeping comments up to date is a tedious task without instant gratification, so naturally, we don’t give it the attention it needs. We now have to maintain our comments as well as our code, or they will slowly rot over time.

The best comment is no comment at all

The best and easiest way to avoid a bad comment is not to write it at all. Instead of relying on comments to explain intentions and clarify hard-to-read source code, we must let the source code speak for itself.

This doesn’t mean we should stop commenting code altogether. But we need to write our source code in a way that makes (most) comments unnecessary. If our codebase is reasonable, free of hidden surprises, and written precisely and concisely, it becomes self-explanatory and, as a result, self-documenting.

Incomprehensible code equals incomprehensible comments. If we couldn’t express the reasoning behind our code with the code itself, what makes us think that we can communicate it better in English, or whatever language we comment in?

We often fall for writing clever code that does magic-like things, and we’re pleased with ourselves for being clever enough to do it that way. But such code is often hard for others to understand, or even for ourselves later on.

First, fix the code that doesn’t speak for itself. Then write comments about all the things our code can’t tell us by itself.

Different types of comments

Not all comments are crafted equally.

Some are useful, some are harmful, some are just necessary. We need to know when to comment and when not to, depending on the comment’s purpose.

Documentation

Documentation is one of the most essential parts of any project, and comments are a big part of it. Comments used as documentation are supposed to help a reader of our source code to understand it without actually reading the code. But too often, we end up commenting on almost every line or code block, just repeating the code itself with new words.

There wouldn’t be a need to explain our code if it were self-explanatory. If we think about documenting blocks of code, e.g., steps of data processing, we just shouldn’t.

Here’s an example of a simple for-loop whose comments add little information. If we refactor the code, they become unnecessary:

public List<String> processItems(List<String> items) {  

    List<String> processedItems = new ArrayList<>();
  
    for (String item : items) {

        // Filter items
        if (!item.startsWith("_")) {
            continue;
        }

        // Transform and add to list
        item = item.toUpperCase();
        processedItems.add(item);
  
        // Limit to 5 items
        if (processedItems.size() >= 5) {
            break;
        }
    }

    return processedItems;
}

A better way is to refactor our code into multiple methods with self-explanatory names:

public List<String> processItems(List<String> items) {

    List<String> processedItems = new ArrayList<>();

    for (String item : items) {

        if (shouldBeFiltered(item)) {
            continue;
        }

        processedItems.add(transform(item));

        if (limitReached(processedItems, 5)) {
            break;
        } 
    }

    return processedItems;
}
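
The helper methods are not shown above. A minimal sketch of what they might look like follows, assuming the same rules as the commented loop (keep items starting with "_", upper-case them, stop at five). The class name ItemProcessor and the method bodies are assumptions for illustration, not part of the original example:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical container class; the name is an assumption.
final class ItemProcessor {

    // An item is skipped unless it carries the "_" prefix,
    // mirroring the condition of the commented loop.
    static boolean shouldBeFiltered(String item) {
        return !item.startsWith("_");
    }

    static String transform(String item) {
        return item.toUpperCase();
    }

    static boolean limitReached(List<String> items, int limit) {
        return items.size() >= limit;
    }

    static List<String> processItems(List<String> items) {
        List<String> processedItems = new ArrayList<>();

        for (String item : items) {
            if (shouldBeFiltered(item)) {
                continue;
            }

            processedItems.add(transform(item));

            if (limitReached(processedItems, 5)) {
                break;
            }
        }

        return processedItems;
    }
}
```

Each helper is a one-liner, but its name carries the information that the comments carried before, and the name can’t drift away from the code it describes as easily as a comment can.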

That’s better, but we can make it even more readable. Rewritten with the Java Stream API, the source code becomes even more self-explanatory:

public List<String> processItems(List<String> items) {
    return items.stream()
                .filter(item -> item.startsWith("_"))
                .map(String::toUpperCase)
                .limit(5)
                .collect(Collectors.toList());
}

Even without knowing Java Streams, you can appreciate how much cleaner and more readable the source code is. Its intent is clear without a single comment.

One valid use for documentation comments is… documentation. Putting the requirements or documentation of a method/class/interface in a comment before the code can help us keep the actual intent in focus, but we still have to deal with staleness, obsolescence, and the added visual noise.

Here’s an example from Apache Commons Lang3:

/**  
 * <p>Checks if a CharSequence is not empty ("") and not null.</p>
 *  
 * <pre>
 * StringUtils.isNotEmpty(null)      = false
 * StringUtils.isNotEmpty("")        = false
 * StringUtils.isNotEmpty(" ")       = true
 * StringUtils.isNotEmpty("bob")     = true
 * StringUtils.isNotEmpty("  bob  ") = true
 * </pre>
 *  
 * @param cs  the CharSequence to check, may be null  
 * @return {@code true} if the CharSequence is not empty and not null  
 * @since 3.0 Changed signature from isNotEmpty(String) to isNotEmpty(CharSequence)  
 */  
public static boolean isNotEmpty(CharSequence cs) {  
    ...  
}

Because this comment is meant to be consumed without reading the actual source code, for example when an IDE displays it, it has to state all the behavior explicitly.

Clarification / Description

No one cares that we’ve just set an integer to a specific value, but they might be interested to know why.

Comment the intent of your code, not the actual code itself:

// Bad: Sets the value of size to 32
// Good: Underlying data pipeline only supports size 32
int size = 32;

We should use comments to clarify the intention of code that can’t speak for itself, after we’ve tried our best to make it self-explanatory by refactoring or rewriting.

Regular expressions are a good example. They can easily be overwhelming when you don’t deal with them regularly, and a variable name or named capture groups alone might not be enough to convey the full intent. But if you decide to comment, don’t just explain the different capture groups. Try to tell the reader why we need the groups, not how they work:

// BAD:
// Regex for customer names
Pattern nameRegex = Pattern.compile("([A-Za-z]+)\\s{1}([A-Z]\\.\\s{1})?([A-Za-z]+)");

// BETTER:
// Regex for validating customer full names (e.g. "Benjamin L. Weidig", "Ben Weidig")
Pattern nameRegex = Pattern.compile(
    "([A-Za-z]+)" +           // Firstname (mandatory)
    "\\s{1}" +                // Whitespace (mandatory, 1)
    "([A-Z]{1}\\.\\s{1})?" +  // Initial with dot and space (optional)
    "([A-Za-z]+)"             // Lastname (mandatory)
);
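
Named capture groups, mentioned above, can take over the "what" so that the comment is free to explain the "why". The following is a sketch only: the class CustomerNames, the group names, and the stated reason for extracting the last name are all assumptions made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names and the stated purpose are assumptions.
final class CustomerNames {

    // The last name is captured separately because (in this made-up scenario)
    // customer lists are sorted by last name. The group names document
    // the structure, so the comment only has to explain the reason.
    static final Pattern FULL_NAME = Pattern.compile(
        "(?<firstName>[A-Za-z]+)" +
        "\\s" +
        "(?<initial>[A-Z]\\.\\s)?" +
        "(?<lastName>[A-Za-z]+)"
    );

    static String lastNameOf(String fullName) {
        Matcher matcher = FULL_NAME.matcher(fullName);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a valid full name: " + fullName);
        }
        return matcher.group("lastName");
    }
}
```

Calling CustomerNames.lastNameOf("Benjamin L. Weidig") reads the group by name instead of by a numeric index, which keeps the calling code understandable even if groups are added or reordered later.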

Requirements

The worst use of comments is the enforcement of requirements.

Stating special requirements or unexpected behavior only in comments will eventually lead to our code blowing up, because we can’t guarantee that other users of our code, or even ourselves in the future, will actually read or remember them. Consumers might never even see the comments if they just include our code as a dependency.

Requirements need to be enforced by documentation and in the code, not by comments alone. If you like to add comments during development as a mental scaffold, like pseudo-code, remove them once you’re finished.
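
As a sketch of enforcing a requirement in code, consider the size restriction from the earlier comment example. The class BatchConfig, its factory method, and the constant are hypothetical names chosen for illustration, assuming the pipeline really accepts only one size:

```java
// Hypothetical configuration class; all names are assumptions.
final class BatchConfig {

    // Assumption: the underlying data pipeline only supports this size.
    static final int SUPPORTED_SIZE = 32;

    private final int size;

    private BatchConfig(int size) {
        this.size = size;
    }

    // The requirement is checked at creation time,
    // so a caller who never reads the comments still can't violate it.
    static BatchConfig ofSize(int size) {
        if (size != SUPPORTED_SIZE) {
            throw new IllegalArgumentException(
                "Unsupported size " + size + ", the pipeline only supports " + SUPPORTED_SIZE);
        }
        return new BatchConfig(size);
    }

    int size() {
        return size;
    }
}
```

A comment can still explain why the restriction exists, but the check fails fast with a descriptive message instead of relying on anyone having read it.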


Conclusion

A better commenting habit has many properties:

  • Intent, not action:
    We see the code in front of us, we don’t need a comment repeating it. Instead, we need to know why the code is there in the first place. Express the intention of your code. The level of detail depends on the complexity of the code, but maybe simplifying the code is a better long-term strategy than writing a lot of comments explaining it.

  • No broken English:
    Incorrect syntax, typos, and bad grammar make a comment easy to misunderstand and defeat its purpose. If English (or whatever language your team prefers to document in) is not your first language, do others a favor and use a spell-checker. Non-native speakers especially will thank you for it.

  • Precise and concise:
    Unless they serve as documentation, comments shouldn’t outweigh the lines of code. It’s hard to find and read code if all you see are comments. Comments need to be an addition to the code, not a replacement. They are read by people looking at the code, not by readers of a code-less manual.

  • Know your audience:
    The need for comments, whether for documentation or clarification, differs widely between the target audiences of our source code. Internally used code might need more clarification than documentation, while a third-party library needs far more documentation than our own code. Comment your code for the intended recipient and medium: a screen, not a stand-alone manual.