Formatting Strings With Java

We all know String.format(...). But there are other options: Java has multiple ways of formatting, aligning, padding, and justifying Strings.
History of String Formatting
Programming languages have a long tradition of formatting Strings.
Over 60 years ago, Fortran was released with the FORMAT keyword:
WRITE OUTPUT TAPE 6, 601, IA, IB, IC, AREA
601 FORMAT (4H A= ,I5,5H B= ,I5,5H C= ,I5,
& 8H AREA= ,F10.2, 13H SQUARE UNITS)
The more commonly known C-style format Strings, used by printf and Java's String.format, originated from BCPL:
WRITEF("%I2-QUEENS PROBLEM HAS %I5 SOLUTIONS*N", NUMQUEENS, COUNT)
Java didn't ship with String.format from the start.
It took eight years, until the release of Java 1.5, for it to be included.
Before that, java.text.MessageFormat was the way to bend Strings to your will.
Format Specifier
All format specifiers start with % and consist of multiple optional parts and the actual conversion.
The general syntax can be split into three different groups:
- General, character, and numeric types
%[argument_index$][flags][width][.precision]conversion
- Date and times
%[argument_index$][flags][width]conversion
- Argumentless specifiers
%[flags][width]conversion
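To see how the optional parts combine, here is a minimal sketch that uses an argument index, flags, a width, and a precision in one format String. The class and method names are made up for illustration, and Locale.US is assumed so the grouping separator is predictable.

```java
import java.util.Locale;

class SpecifierAnatomyDemo {

    // Hypothetical helper, not part of any library.
    static String describe(String label, double amount) {
        // %1$-8s     -> argument 1, flag '-', width 8, conversion s
        // %2$+,15.2f -> argument 2, flags '+' and ',', width 15, precision 2, conversion f
        return String.format(Locale.US, "%1$-8s|%2$+,15.2f|", label, amount);
    }

    public static void main(String[] args) {
        System.out.println(describe("total", 1234567.891));
        // prints "total   |  +1,234,567.89|"
    }
}
```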
The conversion argument can also be separated into different groups:
Category   | Applicable Types
-----------|--------------------------
General    | any
           |
Character  | char, Character
           | byte, Byte
           | short, Short
           | int, Integer (if Character.isValidCodePoint(int))
           |
Integral   | byte, Byte
           | short, Short
           | int, Integer
           | long, Long
           | BigInteger
           |
Fl. Point  | float, Float
           | double, Double
           | BigDecimal
           |
Date/Time  | long, Long
           | Calendar
           | Date
           | TemporalAccessor
           |
Other      | Percent
           | Line Separator
Most conversions come in a lowercase and an uppercase variant. The uppercase variant produces the same result as the lowercase one, converted to uppercase. For simplicity, only the lowercase versions are listed in this article.
If a null argument is supplied, it's converted to "null" unless noted otherwise.
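A small sketch of both rules, the uppercase variants and the handling of null. The class and method names are only illustrative, and Locale.ROOT is assumed to keep the case conversion locale-neutral.

```java
import java.util.Locale;

class UppercaseAndNullDemo {

    // Lowercase conversion: the argument is rendered as-is.
    static String lower(Object arg) {
        return String.format(Locale.ROOT, "%s", arg);
    }

    // Uppercase conversion: same result, converted to uppercase.
    static String upper(Object arg) {
        return String.format(Locale.ROOT, "%S", arg);
    }

    // Uppercase hexadecimal digits.
    static String hexUpper(int value) {
        return String.format(Locale.ROOT, "%X", value);
    }

    public static void main(String[] args) {
        System.out.println(lower(null));    // prints "null"
        System.out.println(upper("java"));  // prints "JAVA"
        System.out.println(hexUpper(255));  // prints "FF"
    }
}
```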
General conversions
%b | "false" if the argument is null
   | String.valueOf(arg) if boolean or Boolean, otherwise "true"
   |
%h | Integer.toHexString(arg.hashCode())
   |
%s | if java.util.Formattable is implemented, arg.formatTo is invoked
   | otherwise arg.toString is invoked
Examples:
String.format("%b", "value")
// ==> "true"
String.format("%h", 255)
// ==> "ff"
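The %s conversion prefers java.util.Formattable over toString. The following sketch shows one way a type could take part in formatting. The Version type is hypothetical, and this simplified formatTo honors only the '#' flag while ignoring width and precision.

```java
import java.util.Formattable;
import java.util.FormattableFlags;
import java.util.Formatter;

class FormattableDemo {

    // Hypothetical value type used only for illustration.
    static final class Version implements Formattable {
        private final int major;
        private final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        @Override
        public void formatTo(Formatter formatter, int flags, int width, int precision) {
            String text = major + "." + minor;
            // The '#' flag arrives as FormattableFlags.ALTERNATE.
            if ((flags & FormattableFlags.ALTERNATE) != 0) {
                text = "v" + text;
            }
            // Width and precision are ignored in this sketch.
            formatter.format("%s", text);
        }

        @Override
        public String toString() {
            // Not used by %s, because Formattable takes precedence.
            return "Version[" + major + "," + minor + "]";
        }
    }

    static String render(String format, int major, int minor) {
        return String.format(format, new Version(major, minor));
    }

    public static void main(String[] args) {
        System.out.println(render("%s", 1, 5));   // prints "1.5"
        System.out.println(render("%#s", 1, 5));  // prints "v1.5"
    }
}
```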
Characters
%c | null defaults to "null"
   | Converted to a Unicode character
   | (e.g. 0x2603 --> ☃)
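As a quick illustration, an int argument is treated as a Unicode code point, and an invalid code point is rejected. The class and method names below are assumptions for this sketch.

```java
import java.util.IllegalFormatCodePointException;

class CharConversionDemo {

    // Formats an int as the character for that Unicode code point.
    static String fromCodePoint(int codePoint) {
        return String.format("%c", codePoint);
    }

    // Returns true if the formatter rejects the value as a code point.
    static boolean isRejected(int codePoint) {
        try {
            String.format("%c", codePoint);
            return false;
        } catch (IllegalFormatCodePointException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(fromCodePoint(0x2603)); // the snowman character
        System.out.println(isRejected(-1));        // prints "true"
    }
}
```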
Integrals
%d | Decimal integer
%o | Octal integer
%x | Hexadecimal integer
Examples:
String.format("%d", 128)
// ==> "128"
String.format("%o", 128)
// ==> "200"
String.format("%x", 128)
// ==> "80"
Floating-point
%e | Scientific notation
%f | Decimal number
%g | Decimal format or scientific notation, depending on the precision and the rounded value
%a | Hexadecimal floating-point with significand and exponent
   | BigDecimal is not supported
Examples:
String.format("%e", 3.141)
// ==> "3.141000e+00"
String.format("%f", 3.141)
// ==> "3.141000"
String.format("%g", 3.141)
// ==> "3.14100"
String.format("%a", 3.141)
// ==> "0x1.920c49ba5e354p1"
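The switch between decimal format and scientific notation for %g is easiest to see with values of different magnitudes. A minimal sketch, using the default precision of 6 and assuming Locale.ROOT for a stable decimal separator:

```java
import java.util.Locale;

class GeneralScientificDemo {

    // Formats a value with %g and the default precision (6 significant digits).
    static String g(double value) {
        return String.format(Locale.ROOT, "%g", value);
    }

    public static void main(String[] args) {
        System.out.println(g(3.141));       // prints "3.14100" (decimal format)
        System.out.println(g(0.00001234));  // prints "1.23400e-05" (below 10^-4)
        System.out.println(g(1234567.0));   // prints "1.23457e+06" (at least 10^precision)
    }
}
```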
Date / time
Date and time format specifiers are prefixed with %t, followed by the specific part.
Many of the conversions are locale-specific and default to the default locale of the JVM if no alternative is provided.
%tF | ISO 8601 date, equals "%tY-%tm-%td"
%tc | Date/time, equals "%ta %tb %td %tT %tZ %tY",
    | e.g., "Sun Jul 20 16:17:00 EDT 1969".
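The date/time conversions also accept java.time types. Here is a short sketch with a LocalDateTime; the sample date and the helper names are made up for illustration. A LocalDateTime carries no time zone, so only zone-independent conversions are used.

```java
import java.time.LocalDateTime;
import java.util.Locale;

class DateTimeConversionDemo {

    static final LocalDateTime SAMPLE = LocalDateTime.of(2020, 7, 28, 20, 17, 36);

    // ISO 8601 date, same as "%tY-%tm-%td".
    static String isoDate(LocalDateTime t) {
        return String.format(Locale.ROOT, "%tF", t);
    }

    // 24-hour time, same as "%tH:%tM:%tS".
    static String time24(LocalDateTime t) {
        return String.format(Locale.ROOT, "%tT", t);
    }

    // Locale-specific month name, day without leading zero, four-digit year.
    static String longDate(LocalDateTime t) {
        return String.format(Locale.US, "%1$tB %1$te, %1$tY", t);
    }

    public static void main(String[] args) {
        System.out.println(isoDate(SAMPLE));   // prints "2020-07-28"
        System.out.println(time24(SAMPLE));    // prints "20:17:36"
        System.out.println(longDate(SAMPLE));  // prints "July 28, 2020"
    }
}
```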
Other conversions
%% | Literal '%'
%n | Platform-specific line separator
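Both argumentless conversions in one illustrative call (the method name is an assumption):

```java
class LiteralAndNewlineDemo {

    // "%%" emits a literal percent sign, "%n" the platform line separator.
    static String progress(int percent) {
        return String.format("%d%% done%n", percent);
    }

    public static void main(String[] args) {
        System.out.print(progress(42)); // prints "42% done" followed by a line break
    }
}
```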
Flags
Flags can be used to modify the conversion:
G = General
C = Character
I = Integral
F = Floating point
D = Date/Time
Flag | G | C | I | F | D | Description
-----|---|---|---|---|---|---------------------------------------------
'-'  | x | x | x | x | x | Left-justified
'#'  | 1 | - | 3 | x | - | Conversion-dependent alternate form
'+'  | - | - | 4 | x | - | Include sign
' '  | - | - | 4 | x | - | Leading space for positive values
'0'  | - | - | x | x | - | Zero-padded
','  | - | - | 2 | 5 | - | Include locale-specific grouping separators
'('  | - | - | 4 | 5 | - | Enclose negative values in parentheses
1: java.util.Formattable dependent
2: Only '%d'
3: Only '%o' and '%x'
4: '%d', '%o', and '%x' for java.math.BigInteger,
   '%d' only for byte, Byte, short, Short, int, Integer, long, and Long
5: Only '%e', '%f', and '%g'
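A few of the flags in action. This is a minimal sketch with a hypothetical helper; Locale.US is assumed so that the grouping separator is a comma.

```java
import java.util.Locale;

class FlagsDemo {

    // Thin wrapper that pins the locale for predictable output.
    static String f(String format, Object... args) {
        return String.format(Locale.US, format, args);
    }

    public static void main(String[] args) {
        System.out.println(f("%+d", 42));       // prints "+42"
        System.out.println(f("% d", 42));       // prints " 42"
        System.out.println(f("%05d", 42));      // prints "00042"
        System.out.println(f("%,d", 1234567));  // prints "1,234,567"
        System.out.println(f("%(d", -42));      // prints "(42)"
        System.out.println(f("%#x", 255));      // prints "0xff"
        System.out.println(f("%#o", 8));        // prints "010"
    }
}
```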
Width
The minimum width of the output can be specified, except for the line separator conversion:
String.format("%5d", 42)
// ==> "   42"
String.format("%-5d", 42)
// ==> "42   "
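Width is handy for lining up columns, and it never truncates: a value longer than the width is written in full. A small sketch with made-up helper names:

```java
class ColumnsDemo {

    // Left-justified name in 10 columns, right-justified quantity in 5.
    static String row(String name, int quantity) {
        return String.format("%-10s%5d", name, quantity);
    }

    // The width is only a minimum; longer values are not cut off.
    static String narrow(String value) {
        return String.format("%3s", value);
    }

    public static void main(String[] args) {
        System.out.println(row("apples", 3));   // 15 characters wide
        System.out.println(row("kiwis", 120));  // 15 characters wide
        System.out.println(narrow("12345"));    // prints "12345"
    }
}
```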
Precision
The behaviour depends on the conversion type.
For general argument types, the precision is the maximum number of characters written to the output.
For the floating-point conversions %a, %e, and %f, the precision is the number of digits after the decimal separator.
A special case is %g, where the precision is the total number of significant digits in the resulting magnitude after rounding.
For the other conversions (character, integral, date/time, and the percent and line separator conversions), a precision is not applicable, and specifying one throws an exception.
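The different meanings of the precision, side by side. This is a sketch with hypothetical helper names; Locale.ROOT is assumed for a stable decimal separator.

```java
import java.util.IllegalFormatPrecisionException;
import java.util.Locale;

class PrecisionDemo {

    static String f(String format, Object... args) {
        return String.format(Locale.ROOT, format, args);
    }

    // Returns true if the conversion refuses a precision.
    static boolean rejectsPrecision(String format, Object arg) {
        try {
            String.format(Locale.ROOT, format, arg);
            return false;
        } catch (IllegalFormatPrecisionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(f("%.3s", "abcdef"));           // prints "abc"
        System.out.println(f("%.2f", 3.14159));            // prints "3.14"
        System.out.println(f("%.3e", 31415.9));            // prints "3.142e+04"
        System.out.println(f("%.3g", 3.14159));            // prints "3.14"
        System.out.println(rejectsPrecision("%.2d", 42));  // prints "true"
    }
}
```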
Argument index
The argument index indicates the position of the argument in the argument list, e.g., 1$ for the first, 2$ for the second, and so forth.
This way, we can reorder or reuse arguments in the format String without changing the actual argument list.
If we want to reuse the argument of the previous format specifier, we can use < (\u003c) instead:
Calendar cal = Calendar.getInstance();
cal.set(2020, Calendar.JULY, 28); // months are zero-based, JULY == 6
String s1 = String.format("My birthday: %1$tm %1$te,%1$tY", cal);
String s2 = String.format("My birthday: %1$tm %<te,%<tY", cal);
s1.equals(s2);
// ==> true
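Argument indexes are especially useful when the order of the values changes between patterns, for example in translated messages. A minimal sketch with made-up names:

```java
class ArgumentIndexDemo {

    // The argument list stays the same; only the pattern decides the order.
    static String name(String pattern, String first, String last) {
        return String.format(pattern, first, last);
    }

    public static void main(String[] args) {
        System.out.println(name("%1$s %2$s", "Ada", "Lovelace"));   // prints "Ada Lovelace"
        System.out.println(name("%2$s, %1$s", "Ada", "Lovelace"));  // prints "Lovelace, Ada"
        // "%<s" repeats the previous argument and does not advance the ordinary index.
        System.out.println(name("%s %<s %s", "a", "b"));            // prints "a a b"
    }
}
```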
Formatting Options
java.lang.String.format
The simplest formatting options available are the two static String.format methods.
They just delegate the work to java.util.Formatter:
static String format(String format,
Object... args) {
return new Formatter().format(format, args).toString();
}
static String format(Locale l,
String format,
Object... args) {
return new Formatter(l).format(format, args).toString();
}
If no java.util.Locale is provided, the JVM default is used.
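Passing the Locale explicitly makes the output independent of the machine the code runs on. The following sketch formats the same value for two locales; the method name is an assumption.

```java
import java.util.Locale;

class LocaleDemo {

    // Grouping separators and two decimal places, rendered for the given locale.
    static String amount(Locale locale, double value) {
        return String.format(locale, "%,.2f", value);
    }

    public static void main(String[] args) {
        System.out.println(amount(Locale.US, 1234567.891));       // prints "1,234,567.89"
        System.out.println(amount(Locale.GERMANY, 1234567.891));  // prints "1.234.567,89"
    }
}
```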
java.util.Formatter
The java.util.Formatter class is the actual interpreter for printf format specifiers and does all the heavy lifting for our formatting needs.
Internally, it all comes down to a java.lang.Appendable and a java.util.Locale.
With its multiple constructors, java.util.Formatter can write the formatted result to different targets:
// Backed by a new StringBuilder instance
Formatter()
Formatter(Locale l)
// Backed by the provided Appendable or implementation
Formatter(Appendable a)
Formatter(Appendable a, Locale l)
Formatter(PrintStream ps)
// Backed by a BufferedWriter, writes to File
Formatter(File file)
Formatter(File file, String charset)
Formatter(File file, String charset, Locale l)
Formatter(String fileName)
Formatter(String fileName, String csn)
Formatter(String fileName, String csn, Locale l)
// Backed by a BufferedWriter
Formatter(OutputStream os)
Formatter(OutputStream os, String csn)
Formatter(OutputStream os, String csn, Locale l)
Both format methods we know from java.lang.String are present, but instead of returning a new String, they return the Formatter instance.
Depending on the backing java.lang.Appendable, additional calls to flush() and close() might be needed.
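Because format returns the Formatter itself, calls can be chained. A minimal sketch that writes into a StringBuilder; the class and method names are made up, and try-with-resources takes care of close():

```java
import java.util.Formatter;
import java.util.Locale;

class FormatterDemo {

    static String report() {
        StringBuilder target = new StringBuilder();
        // Formatter is Closeable, so try-with-resources closes it for us.
        try (Formatter formatter = new Formatter(target, Locale.ROOT)) {
            formatter.format("%s=%d", "a", 1)
                     .format(", %s=%d", "b", 2);
        }
        // The StringBuilder keeps the formatted content after close().
        return target.toString();
    }

    public static void main(String[] args) {
        System.out.println(report()); // prints "a=1, b=2"
    }
}
```

For a StringBuilder, closing has no visible effect, but for file- or stream-backed targets it is what flushes and releases the underlying resource.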
System.out.printf
The method System.out.printf uses a java.util.Formatter internally, just like String.format.
The big difference is that the result is written directly to System.out, which is a java.io.PrintStream, and the calls are synchronized on that stream.
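Since printf is a method of java.io.PrintStream, it works on any PrintStream, not only System.out. The sketch below captures the output in memory. The names are illustrative, and the text is plain ASCII so the platform default charset doesn't matter.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Locale;

class PrintfDemo {

    static String capture() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Same printf as on System.out, but the target is an in-memory buffer.
        try (PrintStream out = new PrintStream(buffer, true)) {
            out.printf(Locale.ROOT, "%s scored %d points", "Ada", 42);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.printf("%s scored %d points%n", "Ada", 42);
        System.out.println(capture()); // prints "Ada scored 42 points"
    }
}
```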
java.text.MessageFormat
We have now learned about the different ways to use printf format specifiers.
But Java also has a more natural-language-based way of formatting Strings.
Instead of a rather cryptic format String, simpler specifiers in curly braces are used:
{ argument index, format type, style }
The argument index is mandatory. The format type and style are optional, but style can’t be used alone. It must always follow a format type.
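A minimal sketch of a pattern that mixes a plain argument with a typed and styled one. The names are made up, and Locale.US is assumed so the grouping separator is predictable.

```java
import java.text.MessageFormat;
import java.util.Locale;

class MessageFormatDemo {

    static String summary(String user, int files) {
        // {0} has no format type; {1} uses type "number" with style "integer".
        MessageFormat format =
                new MessageFormat("{0} uploaded {1,number,integer} files", Locale.US);
        return format.format(new Object[] {user, files});
    }

    public static void main(String[] args) {
        System.out.println(summary("Ada", 1234)); // prints "Ada uploaded 1,234 files"
    }
}
```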
Format types and corresponding styles
The different types support different styles:
Type: none
Style | Actual Subformat
--------|------------------
(none) | null
Type: number
Style | Actual Subformat
----------|-----------------------------------------------
(none) | NumberFormat.getInstance(getLocale())
integer | NumberFormat.getIntegerInstance(getLocale())
currency | NumberFormat.getCurrencyInstance(getLocale())
percent | NumberFormat.getPercentInstance(getLocale())
Type: date
Style | Actual Subformat
--------|------------------------------------------------------------
(none) | DateFormat.getDateInstance(DateFormat.DEFAULT, getLocale())
short | DateFormat.getDateInstance(DateFormat.SHORT, getLocale())
medium | DateFormat.getDateInstance(DateFormat.DEFAULT, getLocale())
long | DateFormat.getDateInstance(DateFormat.LONG, getLocale())
full | DateFormat.getDateInstance(DateFormat.FULL, getLocale())
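The predefined number styles can be tried out with a small helper. This is a sketch under the assumption of Locale.US; the helper name is hypothetical, and the date styles are left out because their exact output depends on the locale data of the JDK in use.

```java
import java.text.MessageFormat;
import java.util.Locale;

class NumberStylesDemo {

    // Renders a single argument with the given pattern for Locale.US.
    static String render(String pattern, Object value) {
        return new MessageFormat(pattern, Locale.US).format(new Object[] {value});
    }

    public static void main(String[] args) {
        System.out.println(render("{0,number,integer}", 3.7));      // prints "4"
        System.out.println(render("{0,number,percent}", 0.25));     // prints "25%"
        System.out.println(render("{0,number,currency}", 1234.5));  // prints "$1,234.50"
    }
}
```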
In addition to these predefined styles, we can also provide our own subformat pattern:
Type | Actual Subformat
--------|---------------------------------------------------------
number | new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(getLocale()))
date | new SimpleDateFormat(pattern, getLocale())
time | new SimpleDateFormat(pattern, getLocale())
choice | new ChoiceFormat(pattern)
Example:
MessageFormat.format("Price: {0,number,#.##} EUR", 3.555);
// ==> Price: 3.56 EUR
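The choice type picks a sub-message based on a numeric argument, which is useful for plurals. The pattern below follows the well-known example from the MessageFormat documentation; the class and method names are illustrative, and Locale.US is assumed.

```java
import java.text.MessageFormat;
import java.util.Locale;

class ChoiceDemo {

    private static final String PATTERN =
            "There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.";

    // Selects the sub-message that matches the count.
    static String describe(int count) {
        return new MessageFormat(PATTERN, Locale.US).format(new Object[] {count});
    }

    public static void main(String[] args) {
        System.out.println(describe(0));     // prints "There are no files."
        System.out.println(describe(1));     // prints "There is one file."
        System.out.println(describe(1273));  // prints "There are 1,273 files."
    }
}
```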
Conclusion
Now we know about the different ways to format Strings. Which one to choose depends on our requirements.
java.text.MessageFormat is easier on the eyes, especially if you're not fluent in printf.
But String.format is widely used and understood, so we can't go wrong with it.
And the printf format specifiers aren't only used by Java, so we learn a more universal skill.