com.ibm.icu.lang
Interface UProperty


public interface UProperty

Selection constants for Unicode properties.

These constants are used in functions like UCharacter.hasBinaryProperty(int) to select one of the Unicode properties.

The properties APIs are intended to reflect Unicode properties as defined in the Unicode Character Database (UCD) and Unicode Technical Reports (UTR).

For details about the properties see http://www.unicode.org.

For names of Unicode properties see the UCD file PropertyAliases.txt.

Important: If ICU is built with UCD files from a Unicode version earlier than 3.2, properties marked "(new)" are unavailable or only partially available. Check UCharacter.getUnicodeVersion() to confirm which version is in use.

Since:
March 8 2002
Author:
Syn Wee Quek
See Also:
UCharacter

Field Summary
static int ALPHABETIC
          Binary property Alphabetic.
static int ASCII_HEX_DIGIT
          Binary property ASCII_Hex_Digit (0-9 A-F a-f).
static int BIDI_CONTROL
          Binary property Bidi_Control.
static int BIDI_MIRRORED
          Binary property Bidi_Mirrored.
static int BINARY_LIMIT
          One more than the last constant for binary Unicode properties.
static int BINARY_START
          First constant for binary Unicode properties.
static int DASH
          Binary property Dash.
static int DEFAULT_IGNORABLE_CODE_POINT
          Binary property Default_Ignorable_Code_Point (new).
static int DEPRECATED
          Binary property Deprecated (new).
static int DIACRITIC
          Binary property Diacritic.
static int EXTENDER
          Binary property Extender.
static int FULL_COMPOSITION_EXCLUSION
          Binary property Full_Composition_Exclusion.
static int GRAPHEME_BASE
          Binary property Grapheme_Base (new).
static int GRAPHEME_EXTEND
          Binary property Grapheme_Extend (new).
static int GRAPHEME_LINK
          Binary property Grapheme_Link (new).
static int HEX_DIGIT
          Binary property Hex_Digit.
static int HYPHEN
          Binary property Hyphen.
static int ID_CONTINUE
          Binary property ID_Continue.
static int ID_START
          Binary property ID_Start.
static int IDEOGRAPHIC
          Binary property Ideographic.
static int IDS_BINARY_OPERATOR
          Binary property IDS_Binary_Operator (new).
static int IDS_TRINARY_OPERATOR
          Binary property IDS_Trinary_Operator (new).
static int JOIN_CONTROL
          Binary property Join_Control.
static int LOGICAL_ORDER_EXCEPTION
          Binary property Logical_Order_Exception (new).
static int LOWERCASE
          Binary property Lowercase.
static int MATH
          Binary property Math.
static int NONCHARACTER_CODE_POINT
          Binary property Noncharacter_Code_Point.
static int QUOTATION_MARK
          Binary property Quotation_Mark.
static int RADICAL
          Binary property Radical (new).
static int SOFT_DOTTED
          Binary property Soft_Dotted (new).
static int TERMINAL_PUNCTUATION
          Binary property Terminal_Punctuation.
static int UNIFIED_IDEOGRAPH
          Binary property Unified_Ideograph (new).
static int UPPERCASE
          Binary property Uppercase.
static int WHITE_SPACE
          Binary property White_Space.
static int XID_CONTINUE
          Binary property XID_Continue.
static int XID_START
          Binary property XID_Start.
 

Field Detail

ALPHABETIC

public static final int ALPHABETIC

Binary property Alphabetic.

Property for UCharacter.isUAlphabetic(), which differs from the general-category test in UCharacter.isLetter().

Lu + Ll + Lt + Lm + Lo + Other_Alphabetic.


BINARY_START

public static final int BINARY_START

First constant for binary Unicode properties.

ASCII_HEX_DIGIT

public static final int ASCII_HEX_DIGIT

Binary property ASCII_Hex_Digit (0-9 A-F a-f).
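
The range in parentheses can be written out directly. The following is a minimal sketch that does not use ICU; the class and method names are hypothetical. It accepts only the 22 ASCII characters listed, so fullwidth forms such as U+FF21, which the broader Hex_Digit property covers, are rejected.

```java
final class AsciiHexDigitSketch {
    /** Returns true only for the ASCII characters 0-9, A-F and a-f. */
    static boolean isAsciiHexDigit(int codePoint) {
        return (codePoint >= '0' && codePoint <= '9')
            || (codePoint >= 'A' && codePoint <= 'F')
            || (codePoint >= 'a' && codePoint <= 'f');
    }
}
```

With ICU4J available, the same question would normally be asked through the library by passing ASCII_HEX_DIGIT as the property selector.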

BIDI_CONTROL

public static final int BIDI_CONTROL

Binary property Bidi_Control.

Format controls which have specific functions in the Bidi Algorithm.


BIDI_MIRRORED

public static final int BIDI_MIRRORED

Binary property Bidi_Mirrored.

Characters that may change display in RTL text.

Property for UCharacter.isMirrored().

See the Unicode Bidirectional Algorithm (UTR 9).


DASH

public static final int DASH

Binary property Dash.

Variations of dashes.


DEFAULT_IGNORABLE_CODE_POINT

public static final int DEFAULT_IGNORABLE_CODE_POINT

Binary property Default_Ignorable_Code_Point (new).

Indicates that a code point is ignorable in most processing.

Cf+Cc+Cs+Other_Default_Ignorable_Code_Point-White_Space


DEPRECATED

public static final int DEPRECATED

Binary property Deprecated (new).

The usage of deprecated characters is strongly discouraged.


DIACRITIC

public static final int DIACRITIC

Binary property Diacritic.

Characters that linguistically modify the meaning of another character to which they apply.


EXTENDER

public static final int EXTENDER

Binary property Extender.

Characters that extend the value or shape of a preceding alphabetic character, e.g. length and iteration marks.


FULL_COMPOSITION_EXCLUSION

public static final int FULL_COMPOSITION_EXCLUSION

Binary property Full_Composition_Exclusion.

CompositionExclusions.txt + Singleton Decompositions + Non-Starter Decompositions.


GRAPHEME_BASE

public static final int GRAPHEME_BASE

Binary property Grapheme_Base (new).

For programmatic determination of grapheme cluster boundaries.

[0..10FFFF]-Cc-Cf-Cs-Co-Cn-Zl-Zp-Grapheme_Link-Grapheme_Extend


GRAPHEME_EXTEND

public static final int GRAPHEME_EXTEND

Binary property Grapheme_Extend (new).

For programmatic determination of grapheme cluster boundaries.

Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link
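
The Grapheme_Base and Grapheme_Extend formulas are set expressions over general categories and other properties. The sketch below shows how the additions and subtractions might compose, assuming the JDK's java.lang.Character category data rather than ICU's. Other_Grapheme_Extend and Grapheme_Link are not derivable from categories, so they are passed in as caller-supplied predicates. All names here are hypothetical.

```java
import java.util.function.IntPredicate;

final class GraphemeCategorySketch {
    /** Me+Mn+Mc+Other_Grapheme_Extend-Grapheme_Link */
    static boolean isGraphemeExtend(int cp, IntPredicate otherGraphemeExtend,
                                    IntPredicate graphemeLink) {
        if (graphemeLink.test(cp)) {
            return false; // subtraction: Grapheme_Link is removed from the set
        }
        int type = Character.getType(cp);
        return type == Character.ENCLOSING_MARK
            || type == Character.NON_SPACING_MARK
            || type == Character.COMBINING_SPACING_MARK
            || otherGraphemeExtend.test(cp);
    }

    /** [0..10FFFF]-Cc-Cf-Cs-Co-Cn-Zl-Zp-Grapheme_Link-Grapheme_Extend */
    static boolean isGraphemeBase(int cp, IntPredicate otherGraphemeExtend,
                                  IntPredicate graphemeLink) {
        if (cp < 0 || cp > 0x10FFFF) {
            return false; // outside the code point range
        }
        switch (Character.getType(cp)) {
            case Character.CONTROL:             // Cc
            case Character.FORMAT:              // Cf
            case Character.SURROGATE:           // Cs
            case Character.PRIVATE_USE:         // Co
            case Character.UNASSIGNED:          // Cn
            case Character.LINE_SEPARATOR:      // Zl
            case Character.PARAGRAPH_SEPARATOR: // Zp
                return false;
            default:
                return !graphemeLink.test(cp)
                    && !isGraphemeExtend(cp, otherGraphemeExtend, graphemeLink);
        }
    }
}
```

Because the JDK and ICU may carry data from different Unicode versions, results for individual code points can differ from what ICU reports for GRAPHEME_BASE and GRAPHEME_EXTEND.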


GRAPHEME_LINK

public static final int GRAPHEME_LINK

Binary property Grapheme_Link (new).

For programmatic determination of grapheme cluster boundaries.


HEX_DIGIT

public static final int HEX_DIGIT

Binary property Hex_Digit.

Characters commonly used for hexadecimal numbers.


HYPHEN

public static final int HYPHEN

Binary property Hyphen.

Dashes used to mark connections between pieces of words, plus the Katakana middle dot.


ID_CONTINUE

public static final int ID_CONTINUE

Binary property ID_Continue.

Characters that can continue an identifier.

ID_Start+Mn+Mc+Nd+Pc


ID_START

public static final int ID_START

Binary property ID_Start.

Characters that can start an identifier.

Lu+Ll+Lt+Lm+Lo+Nl
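
The ID_Start and ID_Continue formulas are unions of general categories, with ID_Continue building on ID_Start. As a rough illustration, the sketch below evaluates only the category terms listed for these two properties, using the JDK's java.lang.Character data instead of ICU. The class and method names are hypothetical, and the JDK's Unicode version may differ from the one ICU was built with.

```java
final class IdentifierCategorySketch {
    /** Lu+Ll+Lt+Lm+Lo+Nl */
    static boolean isIdStartByCategory(int codePoint) {
        switch (Character.getType(codePoint)) {
            case Character.UPPERCASE_LETTER: // Lu
            case Character.LOWERCASE_LETTER: // Ll
            case Character.TITLECASE_LETTER: // Lt
            case Character.MODIFIER_LETTER:  // Lm
            case Character.OTHER_LETTER:     // Lo
            case Character.LETTER_NUMBER:    // Nl
                return true;
            default:
                return false;
        }
    }

    /** ID_Start+Mn+Mc+Nd+Pc */
    static boolean isIdContinueByCategory(int codePoint) {
        if (isIdStartByCategory(codePoint)) {
            return true;
        }
        switch (Character.getType(codePoint)) {
            case Character.NON_SPACING_MARK:       // Mn
            case Character.COMBINING_SPACING_MARK: // Mc
            case Character.DECIMAL_DIGIT_NUMBER:   // Nd
            case Character.CONNECTOR_PUNCTUATION:  // Pc
                return true;
            default:
                return false;
        }
    }
}
```

Under these formulas a digit or an underscore can continue an identifier but cannot start one, while a letter can do both.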


IDEOGRAPHIC

public static final int IDEOGRAPHIC

Binary property Ideographic.

CJKV ideographs.


IDS_BINARY_OPERATOR

public static final int IDS_BINARY_OPERATOR

Binary property IDS_Binary_Operator (new).

For programmatic determination of Ideographic Description Sequences.


IDS_TRINARY_OPERATOR

public static final int IDS_TRINARY_OPERATOR

Binary property IDS_Trinary_Operator (new).


JOIN_CONTROL

public static final int JOIN_CONTROL

Binary property Join_Control.

Format controls for cursive joining and ligation.


LOGICAL_ORDER_EXCEPTION

public static final int LOGICAL_ORDER_EXCEPTION

Binary property Logical_Order_Exception (new).

Characters that do not use logical order and require special handling in most processing.


LOWERCASE

public static final int LOWERCASE

Binary property Lowercase.

Same as UCharacter.isULowercase(), different from UCharacter.isLowerCase().

Ll+Other_Lowercase


MATH

public static final int MATH

Binary property Math.

Sm+Other_Math


NONCHARACTER_CODE_POINT

public static final int NONCHARACTER_CODE_POINT

Binary property Noncharacter_Code_Point.

Code points that are explicitly defined as illegal for the encoding of characters.
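
For illustration, the noncharacter set can be tested with plain arithmetic. This sketch assumes the set as defined in current Unicode versions: the 32 code points U+FDD0..U+FDEF plus the last two code points of each of the 17 planes, 66 in total. An ICU build based on older UCD files may report a smaller set. The class and method names are hypothetical.

```java
final class NoncharacterSketch {
    /** True for U+FDD0..U+FDEF and for every code point ending in FFFE or FFFF. */
    static boolean isNoncharacter(int cp) {
        if (cp < 0 || cp > 0x10FFFF) {
            return false; // not a code point at all
        }
        return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE;
    }

    /** Counts noncharacters across the whole code point range. */
    static int countNoncharacters() {
        int count = 0;
        for (int cp = 0; cp <= 0x10FFFF; cp++) {
            if (isNoncharacter(cp)) {
                count++;
            }
        }
        return count;
    }
}
```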


QUOTATION_MARK

public static final int QUOTATION_MARK

Binary property Quotation_Mark.


RADICAL

public static final int RADICAL

Binary property Radical (new).

For programmatic determination of Ideographic Description Sequences.


SOFT_DOTTED

public static final int SOFT_DOTTED

Binary property Soft_Dotted (new).

Characters with a "soft dot", like i or j.

An accent placed on these characters causes the dot to disappear.


TERMINAL_PUNCTUATION

public static final int TERMINAL_PUNCTUATION

Binary property Terminal_Punctuation.

Punctuation characters that generally mark the end of textual units.


UNIFIED_IDEOGRAPH

public static final int UNIFIED_IDEOGRAPH

Binary property Unified_Ideograph (new).

For programmatic determination of Ideographic Description Sequences.


UPPERCASE

public static final int UPPERCASE

Binary property Uppercase.

Same as UCharacter.isUUppercase(), different from UCharacter.isUpperCase().

Lu+Other_Uppercase


WHITE_SPACE

public static final int WHITE_SPACE

Binary property White_Space.

Same as UCharacter.isUWhiteSpace(), different from UCharacter.isSpace() and UCharacter.isWhitespace().

Space characters+TAB+CR+LF-ZWSP-ZWNBSP
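
The formula above can be approximated without ICU. The sketch below assumes that "space characters" means the general categories Zs, Zl and Zp, which is an interpretation rather than something this page states, and it uses the JDK's java.lang.Character category data. The class and method names are hypothetical.

```java
final class WhiteSpaceSketch {
    /** Space characters+TAB+CR+LF-ZWSP-ZWNBSP, with "space characters" read as Zs, Zl, Zp. */
    static boolean isWhiteSpaceByFormula(int cp) {
        if (cp == 0x200B || cp == 0xFEFF) {
            return false; // subtraction: ZWSP and ZWNBSP are excluded
        }
        if (cp == '\t' || cp == '\r' || cp == '\n') {
            return true;  // TAB, CR, LF are added explicitly
        }
        int type = Character.getType(cp);
        return type == Character.SPACE_SEPARATOR       // Zs
            || type == Character.LINE_SEPARATOR        // Zl
            || type == Character.PARAGRAPH_SEPARATOR;  // Zp
    }
}
```

One visible consequence is that the no-break space U+00A0 counts as white space under this formula, whereas java.lang.Character.isWhitespace deliberately excludes it. That kind of divergence is why the property is documented as different from the other white-space tests.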


XID_CONTINUE

public static final int XID_CONTINUE

Binary property XID_Continue.

ID_Continue modified to allow closure under normalization forms NFKC and NFKD.


XID_START

public static final int XID_START

Binary property XID_Start.

ID_Start modified to allow closure under normalization forms NFKC and NFKD.


BINARY_LIMIT

public static final int BINARY_LIMIT

One more than the last constant for binary Unicode properties.
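
BINARY_START and BINARY_LIMIT together describe a half-open range, so every binary property selector can be visited with an ordinary for loop. The sketch below shows that idiom with stand-in constants and a stand-in hasBinaryProperty method built on java.lang.Character. The DEMO_ selectors, their numeric values and the method body are hypothetical and are not ICU's; with ICU4J the loop would use the UProperty constants and the library's own property lookup.

```java
import java.util.ArrayList;
import java.util.List;

final class BinaryRangeSketch {
    // Hypothetical stand-ins for the UProperty constants; values are arbitrary.
    static final int BINARY_START = 0;
    static final int DEMO_LETTER = 0;
    static final int DEMO_DIGIT = 1;
    static final int DEMO_SPACE = 2;
    static final int BINARY_LIMIT = 3; // one more than the last selector

    /** Stand-in lookup: answers a yes/no question for one selector. */
    static boolean hasBinaryProperty(int codePoint, int property) {
        switch (property) {
            case DEMO_LETTER: return Character.isLetter(codePoint);
            case DEMO_DIGIT:  return Character.isDigit(codePoint);
            case DEMO_SPACE:  return Character.isSpaceChar(codePoint);
            default:
                throw new IllegalArgumentException("not a binary property selector: " + property);
        }
    }

    /** Collects every selector in [BINARY_START, BINARY_LIMIT) that holds for the code point. */
    static List<Integer> propertiesOf(int codePoint) {
        List<Integer> result = new ArrayList<>();
        for (int p = BINARY_START; p < BINARY_LIMIT; p++) {
            if (hasBinaryProperty(codePoint, p)) {
                result.add(p);
            }
        }
        return result;
    }
}
```

Writing the loop bound as p < BINARY_LIMIT means the loop does not need to change when new binary properties are added before the limit.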



Copyright (c) 2001 IBM Corporation and others.