#include <rbt_data.h>
Public Methods
TransliterationRuleData (UErrorCode& status)
TransliterationRuleData (const TransliterationRuleData&)
~TransliterationRuleData ()
const UnicodeSet* | lookupSet (UChar standIn) const
int32_t | lookupSegmentReference (UChar c) const
Return the zero-based index of the segment represented by the given character, or -1 if none.
UChar | getSegmentStandin (int32_t ref) const
Return the character used to stand for the given segment reference.
Public Attributes
TransliterationRuleSet | ruleSet
Rule table.
Hashtable* | variableNames
Map variable name (String) to variable (UnicodeString).
UnicodeSet** | setVariables
Map category variable (UChar) to set (UnicodeSet).
UChar | setVariablesBase
The character that represents setVariables[0].
int32_t | setVariablesLength
The length of setVariables.
UChar | segmentBase
The character that represents segment 1.
RuleBasedTransliterator (RBT) objects hold a const pointer to a TransliterationRuleData (TRD) object that they do not own. A TRD object is essentially the parsed rules in compact, usable form. The TRD objects themselves are held for the life of the process in a static cache owned by Transliterator.
This class's API is somewhat asymmetric. There is a method to define a variable, but no way to define a set. This is because the parser collects the sets in a UVector, and that vector is then copied into a fixed-size array here. Once this is done, no new sets may be defined. In practice there is no need to define any, since generating the data and using it are discrete phases. When the set data must be accessed during the parse phase, another data structure handles this. See the parsing code for more details.
Definition at line 34 of file rbt_data.h.
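The two-phase design described above (a growable collection while parsing, a fixed-size array afterwards) could be sketched roughly as follows. This is a standalone illustration under stated assumptions, not ICU code: `CharSet`, `ParserState`, `FrozenSets`, and `freeze` are hypothetical names, `std::vector` stands in for UVector, and `std::set<char16_t>` stands in for UnicodeSet.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for UnicodeSet; the real class is far richer.
using CharSet = std::set<char16_t>;

// Parse phase: sets accumulate in a growable container.
struct ParserState {
    std::vector<CharSet> sets;
};

// Use phase: a fixed-size array plus its length; nothing can be appended.
struct FrozenSets {
    std::unique_ptr<CharSet[]> setVariables;
    int32_t setVariablesLength = 0;
};

// Transfer the parser's sets into a fixed-size array and empty the parser state.
FrozenSets freeze(ParserState&& parser) {
    FrozenSets out;
    const std::size_t n = parser.sets.size();
    out.setVariables.reset(new CharSet[n]);
    for (std::size_t i = 0; i < n; ++i) {
        out.setVariables[i] = std::move(parser.sets[i]);
    }
    out.setVariablesLength = static_cast<int32_t>(n);
    parser.sets.clear();
    return out;
}
```

Because `FrozenSets` exposes no way to grow the array, any code that needs the sets during parsing has to consult the parser's own container instead, which matches the split between the two phases.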
UChar getSegmentStandin (int32_t ref) const
Return the character used to stand for the given segment reference. The reference must be in the range 1..9.
Definition at line 106 of file rbt_data.h.
int32_t lookupSegmentReference (UChar c) const
Return the zero-based index of the segment represented by the given character, or -1 if none. Note that the return value is zero-based (0..8), even though segments are notated "$1".."$9".
TransliterationRuleSet ruleSet
Rule table. May be empty.
Definition at line 43 of file rbt_data.h.
UChar segmentBase
The character that represents segment 1. Characters segmentBase through segmentBase + 8 represent segments 1 through 9.
Definition at line 83 of file rbt_data.h.
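The relationship between segmentBase, one-based segment references, and zero-based segment indices comes down to simple offset arithmetic. A minimal standalone sketch, assuming only the behavior documented on this page (the `SegmentMap` struct is hypothetical and this is not the ICU source):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the segment stand-in mapping.
struct SegmentMap {
    char16_t segmentBase;  // character that represents segment 1

    // Stand-in character for a one-based segment reference (1..9).
    char16_t getSegmentStandin(int32_t ref) const {
        assert(ref >= 1 && ref <= 9);
        return static_cast<char16_t>(segmentBase + ref - 1);
    }

    // Zero-based segment index (0..8) for a stand-in character, or -1 if none.
    int32_t lookupSegmentReference(char16_t c) const {
        const int32_t i =
            static_cast<int32_t>(c) - static_cast<int32_t>(segmentBase);
        return (i >= 0 && i < 9) ? i : -1;
    }
};
```

Note the off-by-one between the two directions: a reference of 1 maps to segmentBase itself, and looking that character up again yields index 0.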
UnicodeSet** setVariables
Map category variable (UChar) to set (UnicodeSet). Variables that correspond to a set of characters are mapped from variable name to a stand-in character in variableNames. The stand-in then serves as an index into this array to look up the actual UnicodeSet object. In addition, the stand-in is stored in the rule text to represent the set of characters. setVariables[i] represents the character (setVariablesBase + i).
Definition at line 65 of file rbt_data.h.
UChar setVariablesBase
The character that represents setVariables[0]. Characters setVariablesBase through setVariablesBase + setVariablesLength - 1 represent UnicodeSet objects.
Definition at line 72 of file rbt_data.h.
int32_t setVariablesLength
The length of setVariables.
Definition at line 77 of file rbt_data.h.
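The set lookup implied by setVariables, setVariablesBase, and setVariablesLength can be sketched as a bounds-checked array index. This is a standalone model rather than the ICU implementation: `SetTable` and `CharSet` are hypothetical, a `std::vector` replaces the raw array and its separate length, and returning a null pointer for out-of-range characters is an assumption, since this page does not state what lookupSet returns in that case.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for UnicodeSet.
using CharSet = std::set<char16_t>;

struct SetTable {
    std::vector<CharSet> setVariables;  // setVariables[i] <-> setVariablesBase + i
    char16_t setVariablesBase;          // character that represents setVariables[0]

    // Return the set for a stand-in character, or nullptr if the character
    // lies outside setVariablesBase .. setVariablesBase + length - 1.
    const CharSet* lookupSet(char16_t standIn) const {
        const int32_t i = static_cast<int32_t>(standIn) -
                          static_cast<int32_t>(setVariablesBase);
        const int32_t length = static_cast<int32_t>(setVariables.size());
        return (i >= 0 && i < length)
                   ? &setVariables[static_cast<std::size_t>(i)]
                   : nullptr;
    }
};
```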
Hashtable* variableNames
Map variable name (String) to variable (UnicodeString). A variable name corresponds to zero or more characters, stored in a UnicodeString in this hash. One or more of these characters may also correspond to a UnicodeSet, in which case the character in the UnicodeString is a stand-in: it is an index for a secondary lookup in setVariables. The stand-in also represents the UnicodeSet in the stored rules.
Definition at line 54 of file rbt_data.h.
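The two-step lookup described here (name to string, then each character of that string to an optional set) might look like the following. It is a hedged, standalone sketch: `RuleDataModel`, `CharSet`, and `resolve` are hypothetical names, standard containers replace Hashtable, UnicodeString, and UnicodeSet, and the base character 0xF000 is an arbitrary value chosen for illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for UnicodeSet.
using CharSet = std::set<char16_t>;

struct RuleDataModel {
    std::unordered_map<std::string, std::u16string> variableNames;
    std::vector<CharSet> setVariables;
    char16_t setVariablesBase = 0xF000;  // arbitrary value for illustration

    // Secondary lookup: stand-in character -> set, or nullptr for a literal.
    const CharSet* lookupSet(char16_t c) const {
        if (c < setVariablesBase) return nullptr;
        const std::size_t i = static_cast<std::size_t>(c - setVariablesBase);
        return i < setVariables.size() ? &setVariables[i] : nullptr;
    }

    // Primary lookup: variable name -> value, then classify each character.
    // Each entry is the set a character stands for, or nullptr if literal.
    // An unknown name yields an empty result.
    std::vector<const CharSet*> resolve(const std::string& name) const {
        std::vector<const CharSet*> out;
        const auto it = variableNames.find(name);
        if (it == variableNames.end()) return out;
        for (char16_t c : it->second) out.push_back(lookupSet(c));
        return out;
    }
};
```

A variable whose value is the single character 0xF000 would resolve to one non-null entry pointing at `setVariables[0]`, while a variable holding plain text resolves to all-null entries.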