Formatting Strings With Java


We all know String.format(...). But there are other options. Java has multiple ways of formatting, aligning, padding, and justifying Strings.


History of String Formatting

Programming languages have a long tradition of formatting Strings.

Over 60 years ago, Fortran was released with the FORMAT keyword:

fortran
     WRITE OUTPUT TAPE 6, 601, IA, IB, IC, AREA  
601  FORMAT (4H A= ,I5,5H  B= ,I5,5H  C= ,I5,  
   &         8H  AREA= ,F10.2, 13H SQUARE UNITS)

The more commonly known C-style format Strings, used by C’s printf and Java’s String.format, originated in BCPL:

BCPL
WRITEF("%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", NUMQUEENS, COUNT)

Java didn’t ship with String.format from the start. It took eight years, until the release of Java 1.5, for it to be included. Before that, java.text.MessageFormat was the way to bend Strings to your will.


Format Specifier

All format specifiers start with % and consist of several optional parts, followed by the mandatory conversion.

The general syntax can be split into three different groups:

  • General, character, and numeric types %[argument_index$][flags][width][.precision]conversion
  • Dates and times %[argument_index$][flags][width]conversion
  • Argumentless specifiers %[flags][width]conversion

The conversions can also be separated into different categories, depending on the argument types they accept:

  Category | Applicable Types
-----------|--------------------------
   General | any
           |
 Character | char, Character
           | byte, Byte
           | short, Short
           | int, Integer (if Character.isValidCodePoint(int))
           |
  Integral | byte, Byte
           | short, Short
           | int, Integer
           | long, Long
           | BigInteger
           |
 Fl. Point | float, Float
           | double, Double
           | BigDecimal
           |
 Date/Time | long, Long
           | Calendar
           | Date
           | TemporalAccessor
           |
     Other | Percent
           | Line Separator

Most conversions come in a lowercase and an uppercase variant. The uppercase variant produces the same result, converted to uppercase. For simplicity, only the lowercase versions are listed in this article.

Unless noted otherwise, a null argument is converted to “null”.
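To make the two rules above concrete, here is a minimal sketch of the uppercase variants and of null handling. The class and method names are made up for this illustration, and Locale.ROOT is only passed to keep the upper-casing independent of the JVM default.

```java
import java.util.Locale;

class CaseAndNullDemo {

    // %S behaves like %s, but the result is converted to uppercase
    static String upperString() {
        return String.format(Locale.ROOT, "%S", "java");
    }

    // %x and %X differ only in the case of the hex digits
    static String lowerHex() {
        return String.format("%x", 255);
    }

    static String upperHex() {
        return String.format("%X", 255);
    }

    // The cast stops the compiler from treating null as the varargs array itself
    static String nullAsString() {
        return String.format("%s", (Object) null);
    }

    // %b is the exception: null becomes "false" instead of "null"
    static String nullAsBoolean() {
        return String.format("%b", (Object) null);
    }

    public static void main(String[] args) {
        System.out.println(upperString());   // JAVA
        System.out.println(lowerHex());      // ff
        System.out.println(upperHex());      // FF
        System.out.println(nullAsString());  // null
        System.out.println(nullAsBoolean()); // false
    }
}
```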

General conversions

%b | null defaults to "false"
   | if boolean or Boolean, String.valueOf(arg)
   | otherwise "true"
   |
%h | Integer.toHexString(arg.hashCode())
   |
%s | if java.util.Formattable is implemented, arg.formatTo is invoked
   | otherwise arg.toString is invoked

Examples:

java
String.format("%b", "value")
// ==> "true"

String.format("%h", 255)
// ==> "ff"

Characters

%c | null defaults to "null"
   | Converted to a Unicode character
   | (e.g. 0x2603 --> ☃)
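A short sketch of the character conversion, assuming nothing beyond String.format itself. The class and method names are hypothetical. It shows a char argument, an int code point, the uppercase variant, and what happens with an int that is not a valid code point.

```java
import java.util.IllegalFormatCodePointException;

class CharacterDemo {

    static String fromChar() {
        return String.format("%c", 'J');
    }

    // An int argument is interpreted as a Unicode code point
    static String fromCodePoint() {
        return String.format("%c", 0x2603);
    }

    // %C converts the resulting character to uppercase
    static String upperCased() {
        return String.format("%C", 'j');
    }

    // 0x110000 lies just above Character.MAX_CODE_POINT (0x10FFFF)
    static boolean rejectsInvalidCodePoint() {
        try {
            String.format("%c", 0x110000);
            return false;
        } catch (IllegalFormatCodePointException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromChar());                // J
        System.out.println(fromCodePoint());           // ☃
        System.out.println(upperCased());              // J
        System.out.println(rejectsInvalidCodePoint()); // true
    }
}
```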

Integrals

%d | Decimal integer
%o | Octal integer
%x | Hexadecimal integer

Examples:

java
String.format("%d", 128)
// ==> "128"

String.format("%o", 128)
// ==> "200"

String.format("%x", 128)
// ==> "80"

Floating-point

%e | Scientific notation
%f | Decimal number
%g | Decimal format, or scientific notation, depending on the value after rounding for the precision
%a | Hexadecimal floating-point with significand and exponent
   | BigDecimal is not supported

Examples:

java
String.format("%e", 3.141)
// ==> "3.141000e+00"

String.format("%f", 3.141)
// ==> "3.141000"

String.format("%g", 3.141)
// ==> "3.14100"

String.format("%a", 3.141)
// ==> "0x1.920c49ba5e354p1"

Date / time

Date and time format specifiers are prefixed with %t, followed by the specific part.

Many of the conversions are locale-specific and default to the default locale of the JVM if no alternative is provided.

%tF | ISO 8601, equals "%tY-%tm-%td"
%tc | Date/time, equals "%ta %tb %td %tT %tZ %tY",
    | e.g., "Fri Apr 10 20:17:36 CEST 2020".
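The date and time conversions are easiest to try with the java.time types, which implement TemporalAccessor. The following is a small sketch with a made-up class name. Locale.ENGLISH is passed explicitly so that the weekday and month names don't depend on the JVM default.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.util.Locale;

class DateTimeDemo {

    static final LocalDate DATE = LocalDate.of(2020, 4, 10);

    // %tF is the shorthand for the ISO 8601 date
    static String isoDate() {
        return String.format(Locale.ENGLISH, "%tF", DATE);
    }

    // The same output, assembled from the individual parts
    static String isoDateByParts() {
        return String.format(Locale.ENGLISH, "%1$tY-%1$tm-%1$td", DATE);
    }

    // %tA = full weekday name, %tB = full month name, %te = day without leading zero
    static String weekdayAndMonth() {
        return String.format(Locale.ENGLISH, "%1$tA, %1$tB %1$te", DATE);
    }

    // %tT is the 24-hour time, equal to "%tH:%tM:%tS"
    static String time() {
        return String.format(Locale.ENGLISH, "%tT", LocalTime.of(20, 17, 36));
    }

    public static void main(String[] args) {
        System.out.println(isoDate());         // 2020-04-10
        System.out.println(isoDateByParts());  // 2020-04-10
        System.out.println(weekdayAndMonth()); // Friday, April 10
        System.out.println(time());            // 20:17:36
    }
}
```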

Other conversions

%% | Literal '%'
%n | System-dependent line separator
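Both conversions take no argument, which a quick sketch can show. The class and method names are invented for this example, and Locale.ROOT is only used to keep the digits independent of the JVM default.

```java
import java.util.Locale;

class PercentAndNewlineDemo {

    // %% emits a literal percent sign and consumes no argument
    static String percent() {
        return String.format(Locale.ROOT, "%d%%", 50);
    }

    // %n expands to the line separator of the current platform
    static String twoLines() {
        return String.format("first%nsecond");
    }

    public static void main(String[] args) {
        System.out.println(percent()); // 50%
        System.out.println(twoLines().equals("first" + System.lineSeparator() + "second")); // true
    }
}
```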

Flags

Flags can be used to modify the conversion:

G  = General
C  = Character
I  = Integral
FP = Floating point
D  = Date/Time

x = supported, - = not supported, number = see footnote

 Flag | G | C | I | FP | D | Description
------|---|---|---|----|---|---------------------------------------------
 '-'  | x | x | x | x  | x | Left-justified
 '#'  | 1 | - | 3 | x  | - | Conversion-dependent alternate form
 '+'  | - | - | 4 | x  | - | Include sign
 ' '  | - | - | 4 | x  | - | Leading space for positive values
 '0'  | - | - | x | x  | - | Zero-padded
 ','  | - | - | 2 | 5  | - | Include locale-specific grouping separators
 '('  | - | - | 4 | 5  | - | Enclose negative values in parentheses

1: java.util.Formattable dependent
2: Only '%d'
3: Only '%o' and '%x'
4: '%d', '%o', and '%x' for java.math.BigInteger,
   '%d' for byte, Byte, short, Short, int, Integer, long, and Long
5: Only '%e', '%f', and '%g'
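The flags are easier to remember once we see them in action. Here is a minimal sketch that runs each flag through a small helper. The class FlagsDemo and its fmt method are hypothetical, and Locale.US is fixed so the grouping separator is predictable.

```java
import java.util.Locale;

class FlagsDemo {

    // Hypothetical helper: formats with a fixed locale for reproducible output
    static String fmt(String pattern, Object... args) {
        return String.format(Locale.US, pattern, args);
    }

    public static void main(String[] args) {
        System.out.println(fmt("%-6d|", 42));        // "42    |"
        System.out.println(fmt("%#x", 255));         // 0xff
        System.out.println(fmt("%#o", 8));           // 010
        System.out.println(fmt("%+d", 42));          // +42
        System.out.println(fmt("% d", 42));          // " 42"
        System.out.println(fmt("%05d", 42));         // 00042
        System.out.println(fmt("%,d", 1234567));     // 1,234,567
        System.out.println(fmt("%(d", -42));         // (42)
        // Flags can be combined with each other and with a precision
        System.out.println(fmt("%(,.2f", -1234.5));  // (1,234.50)
    }
}
```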

Width

The minimum width of the output can be specified for every conversion except the line separator. Shorter output is padded with spaces, longer output is not truncated:

java
String.format("%5d", 42)
// ==> "   42"

String.format("%-5d", 42)
// ==> "42   "

Precision

The behaviour depends on the conversion type.

For general argument types, the precision is the maximum number of characters written to the output.

For the floating-point conversions %a, %e, and %f, the precision is the number of digits after the decimal separator.

A special case is %g, where the precision is the total number of significant digits in the resulting magnitude after rounding.

For the other conversions (character, integral, date/time, and the percent and line separator conversions) a precision is not applicable, and specifying one throws an IllegalFormatPrecisionException.
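The rules above can be checked with a few calls. This is a sketch only: PrecisionDemo and its methods are names made up for this example, and Locale.US is fixed so the decimal separator is always a dot.

```java
import java.util.IllegalFormatPrecisionException;
import java.util.Locale;

class PrecisionDemo {

    // Hypothetical helper: formats with a fixed locale for reproducible output
    static String fmt(String pattern, Object... args) {
        return String.format(Locale.US, pattern, args);
    }

    // Integral conversions don't accept a precision
    static boolean rejectsPrecisionOnInteger() {
        try {
            fmt("%.2d", 42);
            return false;
        } catch (IllegalFormatPrecisionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fmt("%.3s", "formatting")); // for
        System.out.println(fmt("%.2f", 3.14159));      // 3.14
        System.out.println(fmt("%.2e", 31415.9));      // 3.14e+04
        // %g counts significant digits and switches notation for large values
        System.out.println(fmt("%.3g", 3.14159));      // 3.14
        System.out.println(fmt("%.3g", 12345.678));    // 1.23e+04
        // Width and precision can be combined
        System.out.println(fmt("%8.3f", 3.14159));     // "   3.142"
        System.out.println(rejectsPrecisionOnInteger()); // true
    }
}
```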

Argument index

The argument index indicates the position of the argument in the argument list, e.g., 1$ for the first, 2$ for the second, and so forth.

This way, we can use different format specifiers without changing the actual argument list.
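As a small illustration of explicit indices, the following sketch reorders and repeats arguments without touching the argument list. The class and method names are invented for this example.

```java
class ArgumentIndexDemo {

    // The arguments are passed as (first, last) but printed in reverse order
    static String lastNameFirst(String first, String last) {
        return String.format("%2$s, %1$s", first, last);
    }

    // A single argument can be referenced as often as needed
    static String repeated(String word) {
        return String.format("%1$s-%1$s-%1$s", word);
    }

    public static void main(String[] args) {
        System.out.println(lastNameFirst("Ada", "Lovelace")); // Lovelace, Ada
        System.out.println(repeated("la"));                   // la-la-la
    }
}
```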

If we want to reuse the previously used argument position, we can use < (\u003c) instead:

java
Calendar cal = Calendar.getInstance();
// Calendar months are zero-based, so 6 means July
cal.set(2020, 6, 28);

String s1 = String.format("My birthday: %1$tm %1$te,%1$tY", cal);
String s2 = String.format("My birthday: %1$tm %<te,%<tY", cal);

s1.equals(s2);
// ==> true

Formatting Options

java.lang.String.format

The simplest formatting option is String.format, which comes as two static methods. Both just delegate the work to java.util.Formatter:

java
static String format(String format,
                     Object... args) {
    return new Formatter().format(format, args).toString();
}

static String format(Locale l,
                     String format,
                     Object... args) {
    return new Formatter(l).format(format, args).toString();
}

If no java.util.Locale is provided, the JVM default is used.
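The effect of the Locale parameter shows best with grouping and decimal separators. Here is a minimal sketch, with a hypothetical LocaleDemo class, that formats the same number for two locales:

```java
import java.util.Locale;

class LocaleDemo {

    // Same format String, different locale-specific separators
    static String amount(Locale locale, double value) {
        return String.format(locale, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(amount(Locale.US, 1234567.891));      // 1,234,567.89
        System.out.println(amount(Locale.GERMANY, 1234567.891)); // 1.234.567,89
    }
}
```

Passing the Locale explicitly is usually the safer choice for output that is parsed by other programs, because the JVM default can differ between machines.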

java.util.Formatter

The java.util.Formatter class is the actual interpreter for printf-style format specifiers and does all the heavy lifting for our formatting needs.

Internally it all comes down to the java.lang.Appendable interface and a java.util.Locale.

With its multiple constructors, java.util.Formatter can have a different target for the formatted result:

java
// Backed by a new StringBuilder instance
Formatter()
Formatter(Locale l)

// Backed by the provided Appendable or implementation
Formatter(Appendable a)
Formatter(Appendable a, Locale l)
Formatter(PrintStream ps)


// Backed by a BufferedWriter, writes to File
Formatter(File file)
Formatter(File file, String charset)
Formatter(File file, String charset, Locale l)
Formatter(String fileName)
Formatter(String fileName, String csn)
Formatter(String fileName, String csn, Locale l)

// Backed by a BufferedWriter
Formatter(OutputStream os)
Formatter(OutputStream os, String csn)
Formatter(OutputStream os, String csn, Locale l)

Both format methods we know from java.lang.String are present, but instead of returning a new String, the Formatter instance is returned.

Depending on the backing java.lang.Appendable, additional calls to flush() and close() are needed, for example when writing to a file.
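Because format returns the Formatter itself, calls can be chained. The sketch below, with a made-up FormatterDemo class, writes two rows into a StringBuilder. The try-with-resources block isn't strictly needed for a StringBuilder, but it is a habit that pays off with file- or stream-backed instances.

```java
import java.util.Formatter;
import java.util.Locale;

class FormatterDemo {

    static String table() {
        StringBuilder sb = new StringBuilder();
        // Formatter is Closeable, so try-with-resources takes care of close()
        try (Formatter formatter = new Formatter(sb, Locale.US)) {
            formatter.format("%-6s|%6.2f%n", "apple", 1.5)
                     .format("%-6s|%6.2f%n", "kiwi", 12.25);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(table());
        // apple |  1.50
        // kiwi  | 12.25
    }
}
```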

System.out.printf

The method System.out.printf uses a java.util.Formatter internally, just like java.lang.String.

The big difference is that the calls are synchronized on System.out, which is a java.io.PrintStream.
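Since printf is defined on java.io.PrintStream, it works with any PrintStream, not only System.out. The following sketch uses a hypothetical PrintfDemo class and an in-memory stream, so the result can be inspected instead of ending up on the console:

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Locale;

class PrintfDemo {

    static String capture() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Auto-flushing PrintStream that writes into the in-memory buffer
        PrintStream out = new PrintStream(buffer, true);
        // Same call shape as System.out.printf(...)
        out.printf(Locale.US, "%s scored %d points%n", "Ada", 42);
        out.flush();
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(capture()); // Ada scored 42 points
    }
}
```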


java.text.MessageFormat

We have now learned about the different ways to use printf format specifiers. But Java also has a more natural-language-based way of formatting Strings.

Instead of a rather cryptic format String, simpler specifiers in curly braces are used:

{argument index, format type, style}

The argument index is mandatory. The format type and style are optional, but style can’t be used alone. It must always follow a format type.
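A few patterns show how the parts fit together. This is a sketch with a hypothetical MessageFormatDemo helper. It creates the MessageFormat with Locale.US so that number formatting is predictable. Note that the argument index is zero-based and that a literal single quote has to be doubled.

```java
import java.text.MessageFormat;
import java.util.Locale;

class MessageFormatDemo {

    // Hypothetical helper: formats with a fixed locale for reproducible output
    static String fmt(String pattern, Object... args) {
        return new MessageFormat(pattern, Locale.US).format(args);
    }

    public static void main(String[] args) {
        // Index only, arguments can appear in any order
        System.out.println(fmt("{1} has {0} unread messages", 3, "Ada"));
        // ==> Ada has 3 unread messages

        // Index, format type, and style
        System.out.println(fmt("Progress: {0,number,percent}", 0.25));
        // ==> Progress: 25%

        // A literal single quote is written as two single quotes
        System.out.println(fmt("It''s {0}", "done"));
        // ==> It's done
    }
}
```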

Format types and corresponding styles

The different types support different styles:

Type: none

  Style | Actual Subformat
--------|------------------
 (none) | null

Type: number

    Style | Actual Subformat
----------|-----------------------------------------------
   (none) | NumberFormat.getInstance(getLocale())
  integer | NumberFormat.getIntegerInstance(getLocale())
 currency | NumberFormat.getCurrencyInstance(getLocale())
  percent | NumberFormat.getPercentInstance(getLocale())

Type: date

  Style | Actual Subformat
--------|------------------------------------------------------------
 (none) | DateFormat.getDateInstance(DateFormat.DEFAULT, getLocale())
  short | DateFormat.getDateInstance(DateFormat.SHORT, getLocale())
 medium | DateFormat.getDateInstance(DateFormat.DEFAULT, getLocale())
   long | DateFormat.getDateInstance(DateFormat.LONG, getLocale())
   full | DateFormat.getDateInstance(DateFormat.FULL, getLocale())

In addition to these predefined styles, we can also provide our own subformat pattern:

   Type | Actual Subformat
--------|---------------------------------------------------------
 number | new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(getLocale()))
   date | new SimpleDateFormat(pattern, getLocale())
   time | new SimpleDateFormat(pattern, getLocale())
 choice | new ChoiceFormat(pattern)

Example:

java
MessageFormat.format("Price: {0,number,#.##} EUR", 3.555);
// ==> Price: 3.56 EUR
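The choice type is the least obvious of the subformats, so here is a sketch of it as well. The pattern follows the shape of the example in the MessageFormat documentation. The class and method names are made up, and Locale.US is fixed for the grouping separator.

```java
import java.text.MessageFormat;
import java.util.Locale;

class ChoiceDemo {

    // 0# matches 0, 1# matches 1, 1< matches everything greater than 1
    static final String PATTERN =
        "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    static String describe(int count) {
        return new MessageFormat(PATTERN, Locale.US).format(new Object[] { count });
    }

    public static void main(String[] args) {
        System.out.println(describe(0));    // There are no files.
        System.out.println(describe(1));    // There is one file.
        System.out.println(describe(1273)); // There are 1,273 files.
    }
}
```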

Conclusion

Now we know about the different ways to format Strings. Which one to choose depends on our requirements. java.text.MessageFormat is easier on the eyes, especially if we’re not fluent in printf. But String.format is widely used and understood, so we can’t go wrong with it. And the printf format specifiers aren’t only used by Java, so we learn a more universal skill.


