ot
class Character
#include "ot/base/Character.h"
Represents a Unicode character using an internal sequence of one or more CharType characters.
It provides optimized routines for converting Unicode characters into a sequence of one or more CharType characters and for decoding a multi-character sequence into a Unicode code-point UCS4Char value.
The Character class also contains a number of convenience methods for querying the characteristics of the encoded Unicode character. These routines, such as isHexDigit() and isSpace(), are simply wrappers for functions in the Unicode class. They have counterparts in the standard C++ library, but the standard library routines rely on the capabilities of a locale, which may not be available for Unicode. The Unicode class does not suffer from this drawback.
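As a rough illustration of why no locale is needed, the ASCII-range tests can be written as plain comparisons on the code-point value. This is a sketch only, not the library's code: the namespace sketch, the free functions and the UCS4Char alias are stand-ins chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Assumption: a 32-bit unsigned type standing in for the library's UCS4Char.
using UCS4Char = std::uint32_t;

// ASCII range U+0000-U+007F.
constexpr bool isAscii(UCS4Char ch) { return ch <= 0x7Fu; }

// ASCII decimal digit 0-9.
constexpr bool isDigit(UCS4Char ch) { return ch >= 0x30u && ch <= 0x39u; }

// ASCII hexadecimal digit [0-9], [A-F], [a-f].
constexpr bool isHexDigit(UCS4Char ch) {
    return isDigit(ch) || (ch >= 0x41u && ch <= 0x46u) ||
           (ch >= 0x61u && ch <= 0x66u);
}

// Usage: the answers depend only on the code point, never on a locale.
inline void usage() {
    assert(isAscii(0x41u));      // 'A'
    assert(isHexDigit(0x66u));   // 'f'
    assert(!isDigit(0x0663u));   // ARABIC-INDIC DIGIT THREE is not ASCII
}

}  // namespace sketch
```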
Constructor/Destructor Summary |
Character()
Default constructor. |
Character(const Character& rhs)
Copy constructor. |
Character(UCS4Char ch)
Constructs a Character with an internal CharType sequence equivalent to the Unicode character represented by the value of ch. |
Character(const CharType* pSeqStart, size_t len)
Constructs a Character given a pointer to the first member of a multi-character sequence and its maximum length. |
Method Summary |
void |
appendToString(String& str) const
Appends the multi-character sequence controlled by this Character to the passed String str. |
const CharType* |
data() const
Returns a pointer to the controlled CharType character sequence buffer. |
const CharType |
first() const
Returns the first CharType character in the controlled sequence. |
bool |
isAscii() const
Tests if the Unicode character represented by this Character is in the ASCII range U+0000-U+007F. |
bool |
isDigit() const
Tests if the Unicode character represented by this Character represents an ASCII decimal digit 0-9. |
bool |
isEOF() const
Tests if this Character is equal to the special Character: Character::EndOfFileCharacter. |
bool |
isHexDigit() const
Tests if the Unicode character represented by this Character represents an ASCII hexadecimal digit [0-9], [A-F], [a-f]. |
bool |
isSpace() const
Tests if the Unicode character represented by this Character represents white-space according to common Windows and Unix conventions. |
const size_t |
length() const
Returns the number of CharType characters in the controlled character sequence. |
bool |
operator!=(const Character& rhs) const
Inequality operator. |
bool |
operator!=(CharType c) const
Inequality operator. |
Character& |
operator=(const Character& rhs)
Assignment operator. |
bool |
operator==(const Character& rhs) const
Equality operator. |
bool |
operator==(CharType c) const
Equality operator. |
String |
toString() const
Returns the multi-character sequence controlled by this Character as a String. |
UCS4Char |
toUnicode() const
Converts the controlled multi-character sequence into a 32-bit Unicode code-point value. |
Public Static Data Members |
EndOfFileCharacter
Character EndOfFileCharacter
Character representing the 'end of file' condition.
This is a special Character that can be returned from functions that read a single Character when the end of file condition has been reached.
Constructor/Destructor Detail |
Character
Character()
-
Default constructor.
Creates a Character that is equivalent to the special Character::EndOfFileCharacter.
Character
Character(const Character& rhs)
-
Copy constructor.
Constructs a Character with the same value as rhs.
- Parameters:
rhs
-
the Character to copy
Character
Character(UCS4Char ch)
-
Constructs a Character with an internal CharType sequence equivalent to the Unicode character represented by the value of ch.
- Exceptions:
IllegalCharacterException
-
if ch is not a legal Unicode character in the range U+0000-U+10FFFF.
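The conversion from a code point to a CharType sequence might look like the following sketch. It assumes that CharType is char and that the internal encoding is UTF-8, neither of which is stated on this page; encodeUtf8 is a hypothetical helper, and std::out_of_range stands in for IllegalCharacterException.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

namespace sketch {

using CharType = char;           // assumption: 8-bit code units (UTF-8)
using UCS4Char = std::uint32_t;  // assumption: 32-bit code-point type

// Writes the UTF-8 form of ch into out (room for 4 units required) and
// returns the number of units written. Only the range U+0000-U+10FFFF is
// checked; how the library treats surrogate values is not stated in the
// source, so this sketch does not special-case them.
inline std::size_t encodeUtf8(UCS4Char ch, CharType* out) {
    assert(out != nullptr);
    if (ch > 0x10FFFFu) throw std::out_of_range("code point above U+10FFFF");
    if (ch < 0x80u) {
        out[0] = static_cast<CharType>(ch);
        return 1;
    }
    if (ch < 0x800u) {
        out[0] = static_cast<CharType>(0xC0u | (ch >> 6));
        out[1] = static_cast<CharType>(0x80u | (ch & 0x3Fu));
        return 2;
    }
    if (ch < 0x10000u) {
        out[0] = static_cast<CharType>(0xE0u | (ch >> 12));
        out[1] = static_cast<CharType>(0x80u | ((ch >> 6) & 0x3Fu));
        out[2] = static_cast<CharType>(0x80u | (ch & 0x3Fu));
        return 3;
    }
    out[0] = static_cast<CharType>(0xF0u | (ch >> 18));
    out[1] = static_cast<CharType>(0x80u | ((ch >> 12) & 0x3Fu));
    out[2] = static_cast<CharType>(0x80u | ((ch >> 6) & 0x3Fu));
    out[3] = static_cast<CharType>(0x80u | (ch & 0x3Fu));
    return 4;
}

}  // namespace sketch
```

Under these assumptions U+0041 occupies one CharType and U+20AC (EURO SIGN) occupies three, which is the "one or more CharType characters" the constructor description refers to.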
Character
Character(const CharType* pSeqStart,
size_t len)
-
Constructs a Character given a pointer to the first member of a multi-character sequence and its maximum length.
A multi-character sequence consists of one or more CharType characters that, taken together, represent a single Unicode character.
The sequence, including the first CharType character and any trailing characters, is copied into the internal multi-character sequence.
- Parameters:
pSeqStart
-
a pointer to the first character of a multi-character sequence that represents a single Unicode character.
len
-
the number of CharType characters that are legally addressable within the array starting at pSeqStart
- Exceptions:
NullPointerException
-
if pSeqStart is null.
IllegalCharacterException
-
if the array starting at pSeqStart does not represent a valid Unicode character in the internal encoding
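The role of len as an upper bound can be sketched as follows, again assuming UTF-8 with CharType as char. sequenceLength is a hypothetical helper; std::invalid_argument and std::out_of_range stand in for NullPointerException and IllegalCharacterException. The check is deliberately simplified and does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

namespace sketch {

using CharType = char;  // assumption: 8-bit code units (UTF-8)

// Returns the number of code units occupied by the sequence that starts at
// pSeqStart, reading no more than len units.
inline std::size_t sequenceLength(const CharType* pSeqStart, std::size_t len) {
    if (pSeqStart == nullptr) throw std::invalid_argument("null sequence pointer");
    if (len == 0) throw std::out_of_range("empty sequence");

    // The lead unit alone determines how many units the character needs.
    const unsigned char lead = static_cast<unsigned char>(pSeqStart[0]);
    std::size_t need = 0;
    if (lead < 0x80) need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else throw std::out_of_range("invalid lead unit");

    // Never read past the addressable range supplied by the caller.
    if (need > len) throw std::out_of_range("sequence truncated");

    // Every trailing unit must have the bit pattern 10xxxxxx.
    for (std::size_t i = 1; i < need; ++i) {
        if ((static_cast<unsigned char>(pSeqStart[i]) & 0xC0) != 0x80)
            throw std::out_of_range("invalid trailing unit");
    }
    return need;
}

// Usage: len may exceed the character's own length; only 3 units are used.
inline void usage() {
    const CharType euro[] = "\xE2\x82\xAC!";  // U+20AC followed by '!'
    assert(sequenceLength(euro, sizeof euro - 1) == 3);
}

}  // namespace sketch
```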
appendToString
void appendToString(String& str) const
-
Appends the multi-character sequence controlled by this Character to the passed String str.
- Parameters:
str
-
the String which will have this Character appended
data
const CharType* data() const
-
Returns a pointer to the controlled CharType character sequence buffer.
- Returns:
-
a pointer to the controlled character sequence.
- See also:
-
length()
first
const CharType first() const
-
Returns the first CharType character in the controlled sequence.
- Returns:
-
the first CharType character in the controlled sequence.
- Exceptions:
IllegalCharacterException
-
if this Character does not represent a valid Unicode character in the range U+0000-U+10FFFF.
isAscii
bool isAscii() const
-
Tests if the Unicode character represented by this Character is in the ASCII range U+0000-U+007F.
- Returns:
-
true if this Character is in the ASCII range; false otherwise.
- See also:
-
UnicodeCharacterType::IsAscii()
isDigit
bool isDigit() const
-
Tests if the Unicode character represented by this Character represents an ASCII decimal digit 0-9.
- Returns:
-
true if this Character is a decimal digit [0-9]; false otherwise.
- See also:
-
UnicodeCharacterType::IsDigit()
isEOF
bool isEOF() const
-
Tests if this Character is equal to the special Character: Character::EndOfFileCharacter.
Functions that read a character stream and return a Character need a method to indicate that the end of stream has been reached. To achieve this they return a special Character with a unique value that is different from all valid Unicode characters.
- Returns:
-
true if this Character is equal to the Character::EndOfFileCharacter; false otherwise.
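A typical read loop built on this sentinel is sketched below. The Character and Reader classes here are minimal stand-ins written for the example (ASCII only), not the library's types; only the isEOF() test and the default-constructed end-of-file value mirror what this page documents.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace sketch {

// Stand-in for the documented class: one ASCII unit or the EOF sentinel.
class Character {
public:
    Character() : m_eof(true), m_unit(0) {}  // default value is the sentinel
    explicit Character(char c) : m_eof(false), m_unit(c) {}
    bool isEOF() const { return m_eof; }
    char first() const { return m_unit; }
private:
    bool m_eof;
    char m_unit;
};

// Hypothetical reader: returns successive characters, then the sentinel.
class Reader {
public:
    explicit Reader(const std::string& text) : m_text(text), m_pos(0) {}
    Character read() {
        if (m_pos >= m_text.size()) return Character();
        return Character(m_text[m_pos++]);
    }
private:
    std::string m_text;
    std::size_t m_pos;
};

// Counts characters by reading until the sentinel is returned.
inline std::size_t countCharacters(Reader& reader) {
    std::size_t n = 0;
    for (Character ch = reader.read(); !ch.isEOF(); ch = reader.read()) ++n;
    return n;
}

// Usage: three characters are read before the sentinel appears.
inline void usage() {
    Reader reader("abc");
    assert(countCharacters(reader) == 3);
}

}  // namespace sketch
```

Because the sentinel differs from every valid character, the loop needs no separate status flag or out-parameter.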
isHexDigit
bool isHexDigit() const
-
Tests if the Unicode character represented by this Character represents an ASCII hexadecimal digit [0-9], [A-F], [a-f].
- Returns:
-
true if this Character is a hexadecimal digit; false otherwise.
- See also:
-
UnicodeCharacterType::IsHexDigit()
isSpace
bool isSpace() const
-
Tests if the Unicode character represented by this Character represents white-space according to common Windows and Unix conventions.
Space characters are:
-
'\t' U+0009 HORIZONTAL TABULATION
-
'\n' U+000A NEW LINE
-
'\f' U+000C FORM FEED
-
'\r' U+000D CARRIAGE RETURN
-
' ' U+0020 SPACE
- Returns:
-
true if this Character is a space character; false otherwise.
- See also:
-
UnicodeCharacterType::IsSpace()
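The list above translates directly into a code-point test. The sketch below is illustrative only; the namespace, the free function and the UCS4Char alias are assumptions made for the example.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Assumption: a 32-bit unsigned type standing in for the library's UCS4Char.
using UCS4Char = std::uint32_t;

// True only for the five code points listed in the documentation.
constexpr bool isSpace(UCS4Char ch) {
    return ch == 0x0009u || ch == 0x000Au || ch == 0x000Cu ||
           ch == 0x000Du || ch == 0x0020u;
}

// Usage: characters outside the documented list are rejected.
inline void usage() {
    assert(isSpace(0x0020u));    // SPACE
    assert(!isSpace(0x000Bu));   // vertical tab is not in the list
    assert(!isSpace(0x00A0u));   // NO-BREAK SPACE is not in the list
}

}  // namespace sketch
```

Note that the documented list omits the vertical tab '\v' (U+000B), which std::isspace accepts in the "C" locale.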
length
const size_t length() const
-
Returns the number of CharType characters in the controlled character sequence.
- Returns:
-
the length of the controlled character sequence.
- See also:
-
data()
operator!=
bool operator!=(const Character& rhs) const
-
Inequality operator.
Tests if the Unicode character represented by this Character is not the same Unicode character as rhs.
- Returns:
-
false if the Unicode character represented by this Character is equal to the Unicode character rhs; true otherwise
operator!=
bool operator!=(CharType c) const
-
Inequality operator.
Tests if the internal multi-character sequence has a length other than 1 or the first member is not equal to c.
- Returns:
-
true if the Unicode character represented by this Character is not equal to the single CharType character c; false otherwise
operator=
Character& operator=(const Character& rhs)
-
Assignment operator.
Sets this Character equal to rhs.
- Returns:
-
a reference to this Character.
operator==
bool operator==(const Character& rhs) const
-
Equality operator.
Tests if the Unicode character represented by this Character is the same Unicode character as rhs.
- Returns:
-
true if the Unicode character represented by this Character is equal to the Unicode character rhs; false otherwise
operator==
bool operator==(CharType c) const
-
Equality operator.
Tests if the internal multi-character sequence has a length of 1 and the first member is equal to c.
- Returns:
-
true if the Unicode character represented by this Character is equal to the CharType character c; false otherwise
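The comparison rule for a single CharType can be sketched with a small stand-in type. The struct below is hypothetical and assumes a sequence of at most four char units; it mirrors only the documented rule that equality requires a length of 1 and a matching first member, with inequality as its negation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace sketch {

using CharType = char;  // assumption: 8-bit code units

// Stand-in holding a copy of the multi-character sequence.
struct Character {
    CharType units[4];
    std::size_t len;

    Character(const CharType* p, std::size_t n) : units(), len(n) {
        assert(p != nullptr && n >= 1 && n <= 4);
        std::memcpy(units, p, n);
    }

    // Equal only if the sequence is a single unit that matches c.
    bool operator==(CharType c) const { return len == 1 && units[0] == c; }

    // True if the length is not 1 or the first unit differs from c.
    bool operator!=(CharType c) const { return !(*this == c); }
};

// Usage: a multi-unit character never equals a single CharType, even when
// its first unit has the same value.
inline void usage() {
    Character a("A", 1);
    Character euro("\xE2\x82\xAC", 3);
    assert(a == 'A');
    assert(euro != '\xE2');
}

}  // namespace sketch
```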
toString
String toString() const
-
Returns the multi-character sequence controlled by this Character as a String.
- Returns:
-
a String with the same sequence of CharType characters.
toUnicode
UCS4Char toUnicode() const
-
Converts the controlled multi-character sequence into a 32-bit Unicode code-point value.
- Returns:
-
the Unicode character represented by this Character as a 32-bit value.
- Exceptions:
IllegalCharacterException
-
if this Character does not represent a valid Unicode character in the range U+0000-U+10FFFF.
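The reverse conversion, from the controlled sequence back to a code point, might look like this sketch. It assumes a UTF-8 sequence whose length already matches its lead unit (for example because it was validated at construction); decodeUtf8 is a hypothetical helper and std::out_of_range stands in for IllegalCharacterException.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

namespace sketch {

using CharType = char;           // assumption: 8-bit code units (UTF-8)
using UCS4Char = std::uint32_t;  // assumption: 32-bit code-point type

// Decodes exactly len units (1 to 4) of UTF-8 into a code point.
inline UCS4Char decodeUtf8(const CharType* seq, std::size_t len) {
    assert(seq != nullptr && len >= 1 && len <= 4);

    // Payload bits carried by the lead unit for each sequence length.
    static const unsigned char leadMask[5] = {0x00, 0x7F, 0x1F, 0x0F, 0x07};

    UCS4Char ch = static_cast<unsigned char>(seq[0]) & leadMask[len];
    for (std::size_t i = 1; i < len; ++i) {
        // Each trailing unit contributes its low six bits.
        ch = (ch << 6) | (static_cast<unsigned char>(seq[i]) & 0x3Fu);
    }
    if (ch > 0x10FFFFu) throw std::out_of_range("code point above U+10FFFF");
    return ch;
}

// Usage: the three-unit sequence E2 82 AC decodes to U+20AC.
inline void usage() {
    assert(decodeUtf8("\xE2\x82\xAC", 3) == 0x20ACu);
}

}  // namespace sketch
```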