Languages Around The World

Formatting Messages

Overview

Messages are a concatenation of strings, numbers, and dates, and they present a complex formatting challenge: how to put together sequences of strings, numbers, dates, and other formats to create language-neutral messages. Localization is easier when the program does not hard-code message strings or concatenation sequences. ICU has two classes used to create language-neutral messages: MessageFormat and ChoiceFormat.

The MessageFormat class facilitates localization by preventing the concatenation of message strings. This class enables localizers to create more natural messages and avoid phrases like "3 file(s)". While the MessageFormat class formats message strings, the ChoiceFormat class enables users to attach a format to a range of numbers. The two classes enable localizers to change the content, format, and order of any text, as appropriate, for any language. Both of these classes parse as well as format. However, formatting is their main purpose.

MessageFormat

MessageFormat is a concrete class that enables users to produce concatenated, language-neutral messages. The methods supplied in this class are used to build all the messages that are seen by end users.

The MessageFormat class assembles messages from various fragments (such as text fragments, numbers, and dates) supplied by the program using ICU. Because of the MessageFormat class, the program does not need to know the order of the fragments. The class uses the formatting specifications for the fragments to assemble them into a message that is contained in a single string within a resource bundle. For example, MessageFormat enables you to print the phrase "Finished printing x out of y files..." in a manner that still allows for flexibility in translation.

Previously, an end user message was created as a sentence and handled as a string. This procedure created problems for localizers because the sentence structure, word order, number format and so on are very different from language to language. The language-neutral way to create messages keeps each part of the message separate and provides keys to the data. These keys are stored in ResourceBundles. Using these keys, the MessageFormat class can concatenate the parts of the message, localize them, and display a well-formed string to the end user.

MessageFormat takes a set of objects, formats them, and then inserts the formatted strings into the pattern at the appropriate places. ChoiceFormat, a class that inherits from NumberFormat, can be used in conjunction with MessageFormat to handle plurals, match numbers, and select from an array of items. Typically, the message format comes from resources and the arguments are set dynamically at runtime. The following code fragment produces output such as: "At 4:34:20 PM on 23-Mar-98, there was a disturbance in the Force on planet 7."

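    UErrorCode err = U_ZERO_ERROR;
    // The pattern refers to these arguments by index: {0}, {1}, {2}.
    Formattable arguments[] = {
       (int32_t)7,
       Formattable(Calendar::getNow(), Formattable::kIsDate),
       "a disturbance in the Force"
    };

    UnicodeString result;
    // The static format() appends the message to result and returns it.
    result = MessageFormat::format(
       "At {1,time} on {1,date}, there was {2} on planet {0,number,integer}.",
       arguments,
       3,
       result,
       err);
    // Check U_FAILURE(err) before using result.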
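
The idea that the pattern, not the program, decides where each fragment goes can be illustrated without ICU. The following is a minimal sketch in standard C++, not the MessageFormat implementation: `substitute` is a hypothetical helper that only handles plain `{n}` placeholders, and it assumes the arguments are already formatted as strings and that the pattern contains no format types or quoting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; this is not the ICU API.
// Replaces each "{n}" in pattern with args[n]. Assumes well-formed,
// purely numeric placeholders such as "{0}" and no quoting rules.
std::string substitute(const std::string& pattern,
                       const std::vector<std::string>& args) {
    std::string out;
    std::size_t pos = 0;
    while (pos < pattern.size()) {
        std::size_t open = pattern.find('{', pos);
        std::size_t close = (open == std::string::npos)
                                ? std::string::npos
                                : pattern.find('}', open);
        if (close == std::string::npos) {
            // No further complete placeholder: copy the rest verbatim.
            out.append(pattern, pos, std::string::npos);
            break;
        }
        // Copy the literal text that precedes the placeholder.
        out.append(pattern, pos, open - pos);
        // The text between the braces is the argument index.
        std::size_t index =
            std::stoul(pattern.substr(open + 1, close - open - 1));
        out += args.at(index);  // throws std::out_of_range on a bad index
        pos = close + 1;
    }
    return out;
}
```

With the arguments `{"3", "10"}`, the pattern "Finished printing {0} out of {1} files..." and a reordered pattern such as "Of {1} files, {0} have finished printing." both work with the same calling code, which is why the pattern can be left to the translator.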

ChoiceFormat

The ChoiceFormat class returns a fixed string based on a numeric value. The class can be used in conjunction with the MessageFormat class to handle plurals in messages.

ChoiceFormat enables users to attach a format to a range of numbers. The choice is specified with an ascending list of doubles, where each item specifies a half-open interval up to the next item as in the following:

X matches j if and only if limit[j] <= X < limit[j+1]

If there is no match, the first index is used when the number is too low and the last index is used when it is too high. The length of the format array must be the same as the length of the limits array. For example:

 
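    double limits[]  = {1,2,3,4,5,6,7};
    UnicodeString fmts[] = {"Sun","Mon","Tue","Wed","Thur","Fri","Sat"};

    double limits2[]  = {0, 1, 1};
    // FALSE: the limit belongs to its own range ("#" in a pattern).
    // TRUE: the limit belongs to the previous range ("<" in a pattern).
    UBool closures2[] = { FALSE, FALSE, TRUE };
    UnicodeString fmts2[] = {"no files", "one file", "many files"};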
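
The half-open interval rule and the fallback to the first or last index can be sketched in a few lines of standard C++. This is an illustration of the rule as stated above, not ICU's implementation; `selectIndex` and `choose` are hypothetical names, and the sketch assumes a non-empty, ascending limits array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of ICU. Returns the index j for which
// limits[j] <= x < limits[j+1]. Values below the first limit map to
// index 0 and values at or above the last limit map to the last index.
std::size_t selectIndex(const std::vector<double>& limits, double x) {
    assert(!limits.empty());
    std::size_t j = 0;
    // Advance while x has reached the start of the next interval.
    while (j + 1 < limits.size() && x >= limits[j + 1]) {
        ++j;
    }
    return j;
}

// Hypothetical helper: returns the string paired with the matching limit.
std::string choose(const std::vector<double>& limits,
                   const std::vector<std::string>& formats,
                   double x) {
    // Both arrays must have the same length.
    assert(limits.size() == formats.size());
    return formats[selectIndex(limits, x)];
}
```

With the day-of-week arrays from the example, `choose(limits, fmts, 3)` and `choose(limits, fmts, 3.5)` both select "Tue", while 0 falls back to "Sun" and 100 falls back to "Sat".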

ChoiceFormat objects can also be converted to and from patterns. A ChoiceFormat can be built programmatically from arrays such as those in the example above, or from a pattern like the following:

"1#Sun|2#Mon|3#Tue|4#Wed|5#Thur|6#Fri|7#Sat"

"0#are no files|1#is one file|1<are many files"

where:

 

<number> "#"  Specifies a limit value
<number> "<"  Specifies a limit of nextDouble(<number>)
<number> ">"  Specifies a limit of previousDouble(<number>)

Note: Each limit value is followed by a string and is terminated by a vertical bar character ("|"). The last string, however, is terminated by the end of the pattern.
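
To make the pattern syntax concrete, the following sketch splits a pattern into limits and strings using only the standard C++ library. It is not ICU's parser: `Choices`, `parsePattern`, and `formatChoice` are hypothetical names, `std::nextafter` stands in for nextDouble and previousDouble, and the sketch assumes a non-empty, well-formed pattern with no quoting or escaping.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// Hypothetical container for illustration; not an ICU type.
struct Choices {
    std::vector<double> limits;
    std::vector<std::string> formats;
};

// Splits "limit<sep>text|limit<sep>text|..." where <sep> is '#', '<' or '>'.
Choices parsePattern(const std::string& pattern) {
    const double inf = std::numeric_limits<double>::infinity();
    Choices out;
    std::size_t pos = 0;
    while (true) {
        // Each entry ends at '|'; the last one ends with the pattern.
        std::size_t bar = pattern.find('|', pos);
        std::string entry = pattern.substr(
            pos, bar == std::string::npos ? std::string::npos : bar - pos);
        std::size_t sep = entry.find_first_of("#<>");
        assert(sep != std::string::npos);  // assumes a well-formed entry
        double limit = std::stod(entry.substr(0, sep));
        if (entry[sep] == '<') {
            limit = std::nextafter(limit, inf);   // next double above
        } else if (entry[sep] == '>') {
            limit = std::nextafter(limit, -inf);  // next double below
        }
        out.limits.push_back(limit);
        out.formats.push_back(entry.substr(sep + 1));
        if (bar == std::string::npos) break;
        pos = bar + 1;
    }
    return out;
}

// Selects the string whose half-open interval contains x, falling back
// to the first or last entry when x is out of range.
std::string formatChoice(const Choices& c, double x) {
    std::size_t j = 0;
    while (j + 1 < c.limits.size() && x >= c.limits[j + 1]) ++j;
    return c.formats[j];
}
```

For the pattern "0#are no files|1#is one file|1<are many files", the third limit becomes the smallest double greater than 1, so exactly 1 selects "is one file" and any larger value, such as 1.5 or 2, selects "are many files".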

Programming Examples

There are several programming examples for the MessageFormat and ChoiceFormat classes in C and C++.



Copyright (c) 2000 - 2006 IBM and Others - Feedback: http://icu.sourceforge.net/contacts.html

User Guide for ICU v3.6 Generated 2006-08-31.