
ucol.h File Reference

Collator C API.


Defines

#define UCOL_PRIMARYMASK
#define UCOL_SECONDARYMASK
#define UCOL_TERTIARYMASK
#define UCOL_NULLORDER

Typedefs

typedef void* UCollator
A collator.

typedef enum UCollationResult UCollationResult
typedef enum UNormalizationMode UNormalizationMode
typedef enum UNormalizationOption UNormalizationOption
typedef enum UCollationStrength UCollationStrength

Enumerations

enum  UCollationResult { UCOL_EQUAL, UCOL_GREATER, UCOL_LESS }
Possible values for a comparison result.

enum  UNormalizationMode { UCOL_NO_NORMALIZATION, UCOL_DECOMP_CAN, UCOL_DECOMP_COMPAT, UCOL_DECOMP_CAN_COMP_COMPAT, UCOL_DECOMP_COMPAT_COMP_CAN, UCOL_DEFAULT_NORMALIZATION }
Possible collation normalization modes.

enum  UNormalizationOption { UCOL_IGNORE_HANGUL }
Possible normalization options.

enum  UCollationStrength { UCOL_PRIMARY, UCOL_SECONDARY, UCOL_TERTIARY, UCOL_IDENTICAL, UCOL_DEFAULT_STRENGTH }
Possible collation strengths.


Detailed Description

Collator C API.

The C API for Collator performs locale-sensitive string comparison. Use this API to build searching and sorting routines for natural language text.

As with other locale-sensitive APIs, use the function ucol_open() to obtain a pointer to the appropriate UCollator object for a given locale. If you need to understand the details of a particular collation strategy or if you need to modify that strategy.

The following example shows how to compare two strings using the UCollator for the default locale.

 // Compare two strings in the default locale
 UErrorCode success = U_ZERO_ERROR;
 UChar source[4], target[4];
 UCollator* myCollator = ucol_open(NULL, &success);
 if( U_SUCCESS(success) ) {
     u_uastrcpy(source, "abc");
     u_uastrcpy(target, "ABC");
     // ucol_strcoll returns a UCollationResult
     if( ucol_strcoll(myCollator, source, u_strlen(source), target, u_strlen(target)) == UCOL_LESS) {
         printf("abc is less than ABC\n");
     }else{
         printf("abc is greater than or equal to ABC\n");
     }
     ucol_close(myCollator); // release the collator when done
 }
 

You can set a Collator's strength property to determine the level of difference considered significant in comparisons. Four strengths are provided: UCOL_PRIMARY, UCOL_SECONDARY, UCOL_TERTIARY, and UCOL_IDENTICAL. The exact assignment of strengths to language features is locale dependent. For example, in Czech, "e" and "f" are considered primary differences, while "e" and "ê" (\u00EA) are secondary differences, "e" and "E" are tertiary differences, and "e" and "e" are identical. The following shows how both case and accents could be ignored for US English.

 // Get the Collator for US English and set its strength to UCOL_PRIMARY
 UErrorCode success = U_ZERO_ERROR;
 UChar source[4], target[4];
 UCollator* usCollator = ucol_open("en_US", &success);
 if( U_SUCCESS(success) ) {
     ucol_setStrength(usCollator, UCOL_PRIMARY);
     u_uastrcpy(source, "abc");
     u_uastrcpy(target, "ABC");
     if( ucol_strcoll(usCollator, source, u_strlen(source), target, u_strlen(target)) == UCOL_EQUAL) {
         printf("'abc' and 'ABC' strings are equivalent with strength UCOL_PRIMARY\n");
     }
     ucol_close(usCollator); // release the collator when done
 }
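
To make the idea of strength levels concrete without depending on the library, the following is a minimal sketch in plain C. It is not how ICU implements collation: ToyStrength and toy_compare are hypothetical names, the ordering covers ASCII letters only, and the rule that lowercase sorts before uppercase is an assumption of this sketch. Letters are compared ignoring case first (the primary level), and case is consulted only when every primary weight ties and the requested strength asks for it.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical strength levels for this sketch only (not ICU's enum). */
typedef enum { TOY_PRIMARY, TOY_TERTIARY } ToyStrength;

/* Returns <0, 0, or >0. Primary weight: the letter ignoring case.
   Tertiary weight: case, consulted only if all primary weights tie. */
int toy_compare(const char *a, const char *b, ToyStrength strength)
{
    size_t i;
    int tertiary = 0; /* first case difference seen so far, if any */

    for (i = 0; a[i] != '\0' && b[i] != '\0'; ++i) {
        int pa = tolower((unsigned char)a[i]);
        int pb = tolower((unsigned char)b[i]);
        if (pa != pb) {
            return pa - pb; /* a primary difference always decides */
        }
        if (tertiary == 0 && a[i] != b[i]) {
            /* assumption: lowercase sorts before uppercase */
            tertiary = islower((unsigned char)a[i]) ? -1 : 1;
        }
    }
    if (a[i] != '\0') return 1;  /* a is longer: primary difference */
    if (b[i] != '\0') return -1; /* b is longer: primary difference */
    return strength == TOY_PRIMARY ? 0 : tertiary;
}
```

With this sketch, toy_compare("abc", "ABC", TOY_PRIMARY) reports equality, while the same call with TOY_TERTIARY reports a difference, mirroring the behavior the UCOL_PRIMARY example above relies on.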
 

For comparing strings exactly once, the ucol_strcoll function provides the best performance. When sorting a list of strings, however, each string is generally compared multiple times. In this case, sort keys provide better performance. The ucol_getSortKey function converts a string to a series of bytes that can be compared bitwise against other sort keys using memcmp().
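
The pattern of computing each key once and then sorting with memcmp() can be sketched in plain C as follows. This is an illustration under stated assumptions, not library code: toy_sort_key is a hypothetical stand-in for ucol_getSortKey that merely lowercases ASCII, and KeyedString, compare_keys, sort_by_key, and KEY_MAX are names invented for this sketch.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define KEY_MAX 32

/* One record per string: the text plus its precomputed key. */
typedef struct {
    const char *text;
    unsigned char key[KEY_MAX];
    size_t keyLen; /* includes the terminating 0 byte */
} KeyedString;

/* Hypothetical stand-in for ucol_getSortKey(): lowercases ASCII so the
   key ignores case. cap must be at least 1. Returns the key length
   including the terminating 0 byte. */
static size_t toy_sort_key(const char *s, unsigned char *out, size_t cap)
{
    size_t n = 0;
    while (s[n] != '\0' && n + 1 < cap) {
        out[n] = (unsigned char)tolower((unsigned char)s[n]);
        ++n;
    }
    out[n] = 0;
    return n + 1;
}

/* qsort callback: byte-wise comparison of the precomputed keys. */
static int compare_keys(const void *pa, const void *pb)
{
    const KeyedString *a = (const KeyedString *)pa;
    const KeyedString *b = (const KeyedString *)pb;
    size_t n = a->keyLen < b->keyLen ? a->keyLen : b->keyLen;
    int r = memcmp(a->key, b->key, n);
    if (r != 0) return r;
    return (a->keyLen > b->keyLen) - (a->keyLen < b->keyLen);
}

/* Builds each key once, then sorts using only memcmp on the keys. */
void sort_by_key(KeyedString *items, size_t count)
{
    size_t i;
    for (i = 0; i < count; ++i) {
        items[i].keyLen = toy_sort_key(items[i].text, items[i].key, KEY_MAX);
    }
    qsort(items, count, sizeof items[0], compare_keys);
}
```

The design point is that the expensive conversion runs once per string, while the comparison callback, which qsort() may invoke many times per element, does nothing but a byte comparison.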

Note: UCollators with different locale, collation strength, and decomposition mode settings can return different sort orders for the same set of strings. Locales have specific collation rules, and the way in which secondary and tertiary differences are taken into account, for example, will result in a different sorting order for the same strings.

See also:
UCollationResult, UNormalizationMode, UCollationStrength, UCollationElements

Definition in file ucol.h.


Define Documentation

#define UCOL_PRIMARYMASK ()

#define UCOL_SECONDARYMASK ()

#define UCOL_TERTIARYMASK ()

#define UCOL_NULLORDER ()


Typedef Documentation

typedef void* UCollator

A collator.

For usage in C programs.

Definition at line 96 of file ucol.h.

typedef enum UCollationResult UCollationResult

Definition at line 116 of file ucol.h.

typedef enum UNormalizationMode UNormalizationMode

Definition at line 142 of file ucol.h.

typedef enum UNormalizationOption UNormalizationOption

Definition at line 149 of file ucol.h.

typedef enum UCollationStrength UCollationStrength

Definition at line 187 of file ucol.h.


Enumeration Type Documentation

enum UCollationResult

Possible values for a comparison result.

Enumeration values:
UCOL_EQUAL   string a == string b.
UCOL_GREATER   string a > string b.
UCOL_LESS   string a < string b.

Definition at line 108 of file ucol.h.
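
When a comparison result feeds a standard C sorting callback, it can be mapped to the negative, zero, or positive int that qsort() and bsearch() expect. The sketch below does this with a switch so that it does not assume any particular numeric values for the constants. ExResult and ex_result_to_int are hypothetical names used for illustration; real code would switch on UCOL_LESS, UCOL_GREATER, and UCOL_EQUAL instead.

```c
/* Hypothetical mirror of the three documented outcomes; the real
   constants are UCOL_EQUAL, UCOL_GREATER and UCOL_LESS from ucol.h. */
typedef enum { EX_EQUAL, EX_GREATER, EX_LESS } ExResult;

/* Maps an outcome to the negative/zero/positive int that qsort() and
   bsearch() callbacks must return, without assuming numeric values. */
int ex_result_to_int(ExResult r)
{
    switch (r) {
    case EX_LESS:    return -1;
    case EX_GREATER: return 1;
    default:         return 0; /* EX_EQUAL */
    }
}
```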

enum UNormalizationMode

Possible collation normalization modes.

Enumeration values:
UCOL_NO_NORMALIZATION   No decomposition/composition.
UCOL_DECOMP_CAN   Canonical decomposition.
UCOL_DECOMP_COMPAT   Compatibility decomposition.
UCOL_DECOMP_CAN_COMP_COMPAT   Canonical decomposition followed by canonical composition.
UCOL_DECOMP_COMPAT_COMP_CAN   Compatibility decomposition followed by canonical composition.
UCOL_DEFAULT_NORMALIZATION   Default normalization.

Definition at line 128 of file ucol.h.

enum UNormalizationOption

Possible normalization options.

Enumeration values:
UCOL_IGNORE_HANGUL   Do not normalize Hangul.

Definition at line 145 of file ucol.h.

enum UCollationStrength

Possible collation strengths.

Enumeration values:
UCOL_PRIMARY   Primary collation strength.
UCOL_SECONDARY   Secondary collation strength.
UCOL_TERTIARY   Tertiary collation strength.
UCOL_IDENTICAL   Identical collation strength.
UCOL_DEFAULT_STRENGTH   Default collation strength.

Definition at line 175 of file ucol.h.


Generated at Mon Jun 5 12:52:58 2000 for ICU1.5 by doxygen 1.0.0 written by Dimitri van Heesch, © 1997-1999