
UnicodeConverterCPP Class Reference

#include <convert.h>


Public Members

 UnicodeConverterCPP ()
Creates a Unicode conversion object that defaults to LATIN1 <-> Unicode encoding.

 UnicodeConverterCPP (const char* name, UErrorCode& err)
Creates a Unicode conversion object by specifying the codepage name.

 UnicodeConverterCPP (const UnicodeString& name, UErrorCode& err)
Creates a UnicodeConverter object with the name specified as a Unicode string.

 UnicodeConverterCPP (int32_t codepageNumber, UConverterPlatform platform, UErrorCode& err)
Creates a Unicode conversion object using the codepage ID number.

 ~UnicodeConverterCPP ()
void fromUnicodeString (char* target, int32_t& targetSize, const UnicodeString& source, UErrorCode& err) const
Transcodes the source UnicodeString to the target string in a codepage encoding with the specified Unicode converter.

void toUnicodeString (UnicodeString& target, const char* source, int32_t sourceSize, UErrorCode& err) const
Transcodes the source string in codepage encoding to the target string in Unicode encoding.

void fromUnicode (char*& target, const char* targetLimit, const UChar*& source, const UChar* sourceLimit, int32_t * offsets, UBool flush, UErrorCode& err)
Transcodes an array of Unicode characters to an array of codepage characters.

void toUnicode (UChar*& target, const UChar* targetLimit, const char*& source, const char* sourceLimit, int32_t * offsets, UBool flush, UErrorCode& err)
Converts an array of codepage characters into an array of Unicode characters.

int8_t getMaxBytesPerChar (void) const
Returns the maximum number of bytes used by a character.

int8_t getMinBytesPerChar (void) const
Returns the minimum byte length for characters in this codepage.

UConverterType getType (void) const
Gets the type of conversion associated with the converter.

void getStarters (UBool starters[256], UErrorCode& err) const
Gets the "starter" bytes for converters of type MBCS. Fills in U_ILLEGAL_ARGUMENT_ERROR if the converter is not MBCS.

void getSubstitutionChars (char* subChars, int8_t& len, UErrorCode& err) const
Fills in the output parameter, subChars, with the substitution characters as multiple bytes.

void setSubstitutionChars (const char* subChars, int8_t len, UErrorCode& err)
Sets the substitution characters used when converting from Unicode to a codepage.

void resetState (void)
Resets the state of stateful conversion to the default state.

const char* getName ( UErrorCode& err) const
Gets the name of the converter (zero-terminated).

int32_t getCodepage (UErrorCode& err) const
Gets a codepage number associated with the converter.

UConverterToUCallback getMissingCharAction (void) const
Returns the action currently set for when a character from a codepage is missing.

UConverterFromUCallback getMissingUnicodeAction (void) const
Returns the action currently set for when a Unicode character is missing.

void setMissingCharAction (UConverterToUCallback action, UErrorCode& err)
Sets the action taken when a character from a codepage is missing.

void setMissingUnicodeAction (UConverterFromUCallback action, UErrorCode& err)
Sets the action taken when a Unicode character is missing.

void getDisplayName (const Locale& displayLocale, UnicodeString& displayName) const
Returns the localized name of the UnicodeConverter. If for any reason it is unavailable, the internal name will be returned instead.

UConverterPlatform getCodepagePlatform (UErrorCode& err) const
Returns the T_UnicodeConverter_platform (ICU defined enum) of a UnicodeConverter.

UnicodeConverterCPP& operator= (const UnicodeConverterCPP& that)
UBool operator== (const UnicodeConverterCPP& that) const
UBool operator!= (const UnicodeConverterCPP& that) const
 UnicodeConverterCPP (const UnicodeConverterCPP& that)
void fixFileSeparator (UnicodeString& source) const
Fixes the backslash character mismapping.

UBool isAmbiguous (void) const
Determines whether the converter contains ambiguous mappings of the same character.


Static Public Members

const char* const* getAvailableNames (int32_t& num, UErrorCode& err)
Returns the available names.

int32_t flushCache (void)
Iterates through every cached converter and frees all the unused ones.


Detailed Description

Definition at line 15 of file convert.h.


Member Function Documentation

UnicodeConverterCPP::UnicodeConverterCPP ()

Creates a Unicode conversion object that defaults to LATIN1 <-> Unicode encoding.

Returns:
An object Handle if successful or a NULL if the creation failed
Stable:

UnicodeConverterCPP::UnicodeConverterCPP (const char * name, UErrorCode & err)

Creates a Unicode conversion object by specifying the codepage name.

The name string is in ASCII format.

Parameters:
name   the pointer to a char[] object containing a codepage name. (I)
err   Error status (I/O). U_ILLEGAL_ARGUMENT_ERROR will be returned if the string is empty. If the internal program does not work correctly, for example, if there's no such codepage, U_INTERNAL_PROGRAM_ERROR will be returned.
Returns:
An object Handle if successful or a NULL if the creation failed
Stable:

UnicodeConverterCPP::UnicodeConverterCPP (const UnicodeString & name, UErrorCode & err)

Creates a UnicodeConverter object with the name specified as a Unicode string.

The name should be limited to the ASCII-7 alphanumerics. Dash and underscore characters are allowed for readability, but are ignored in the search.

Parameters:
name   name of the uconv table as a Unicode string (I)
err   Error status (I/O). U_ILLEGAL_ARGUMENT_ERROR will be returned if the string is empty. If the internal program does not work correctly, for example, if there's no such codepage, U_INTERNAL_PROGRAM_ERROR will be returned.
Returns:
the created Unicode converter object
Stable:

UnicodeConverterCPP::UnicodeConverterCPP (int32_t codepageNumber, UConverterPlatform platform, UErrorCode & err)

Creates a Unicode conversion object using the codepage ID number.

Parameters:
codepageNumber   a codepage number (I)
err   Error status (I/O). U_ILLEGAL_ARGUMENT_ERROR will be returned if the string is empty. If the internal program does not work correctly, for example, if there's no such codepage, U_INTERNAL_PROGRAM_ERROR will be returned.
Returns:
An object Handle if successful or a NULL if failed
Stable:

UnicodeConverterCPP::~UnicodeConverterCPP ()

void UnicodeConverterCPP::fromUnicodeString (char * target, int32_t & targetSize, const UnicodeString & source, UErrorCode & err) const

Transcodes the source UnicodeString to the target string in a codepage encoding with the specified Unicode converter.

For example, if a Unicode to/from JIS converter is specified, the source string in Unicode will be transcoded to JIS encoding. The result will be stored in JIS encoding.

Parameters:
source   the source Unicode string
target   the target string in codepage encoding
targetSize   I/O parameter. Input: the number of bytes available in the "target" buffer. Output: the number of bytes copied to it.
err   the error status code. U_MEMORY_ALLOCATION_ERROR will be returned if the internal process buffer cannot be allocated for transcoding. U_ILLEGAL_ARGUMENT_ERROR is returned if the converter is null or the source or target string is empty.
Draft:
backslash versus Yen sign in shift-JIS
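
The in/out use of targetSize described above can be illustrated with a small stand-alone sketch. It does not call ICU: encodeLatin1 and Status are hypothetical names introduced only for this illustration, and a fixed Latin-1 mapping with '?' as the fallback byte is assumed in place of a real converter.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for an error status; this is not ICU's UErrorCode.
enum class Status { Ok, BufferOverflow };

// Encodes UTF-16 code units as Latin-1 bytes (hypothetical helper).
// On input, targetSize is the capacity of target in bytes.
// On output, targetSize is the number of bytes actually written.
// Code units above 0xFF are written as '?' (an assumption of this sketch).
inline Status encodeLatin1(char* target, int32_t& targetSize,
                           const std::u16string& source) {
    assert(target != nullptr && targetSize >= 0);
    const int32_t capacity = targetSize;
    int32_t written = 0;
    Status status = Status::Ok;
    for (char16_t unit : source) {
        if (written == capacity) {   // no room left: report and stop
            status = Status::BufferOverflow;
            break;
        }
        target[written++] = unit <= 0xFF
            ? static_cast<char>(static_cast<unsigned char>(unit))
            : '?';
    }
    targetSize = written;            // output: number of bytes copied
    return status;
}
```

A caller sets the size variable to the buffer capacity before the call and reads the byte count from the same variable afterwards, so the variable has to be reset before the buffer is reused.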

void UnicodeConverterCPP::toUnicodeString (UnicodeString & target, const char * source, int32_t sourceSize, UErrorCode & err) const

Transcodes the source string in codepage encoding to the target string in Unicode encoding.

For example, if a Unicode to/from JIS converter is specified, the source string in JIS encoding will be transcoded to Unicode encoding. The result will be stored in Unicode encoding.

Parameters:
source   the source string in codepage encoding
target   the target string in Unicode encoding
sourceSize   the size of the source buffer, in bytes (input only)
err   the error status code. U_MEMORY_ALLOCATION_ERROR will be returned if the internal process buffer cannot be allocated for transcoding. U_ILLEGAL_ARGUMENT_ERROR is returned if the converter is null or the source or target string is empty.
Stable:

void UnicodeConverterCPP::fromUnicode (char *& target, const char * targetLimit, const UChar *& source, const UChar * sourceLimit, int32_t * offsets, UBool flush, UErrorCode & err)

Transcodes an array of Unicode characters to an array of codepage characters.

The source pointer is an I/O parameter: it starts out pointing at the place to begin translating, and ends up pointing after the first sequence of the bytes that it encounters that are semantically invalid. If T_UnicodeConverter_setMissingCharAction is called with an action other than STOP before a call is made to this API, consumed and source should point to the same place (unless target ends with an incomplete sequence of bytes and flush is FALSE).

Parameters:
target   I/O parameter. Input: points to the beginning of the buffer to copy codepage characters to. Output: points to after the last codepage character copied to target.
targetLimit   the pointer to the end of the target array
source   the source Unicode character array
sourceLimit   the pointer to the end of the source array
flush   TRUE if the buffer is the last buffer and the conversion will finish in this call, FALSE otherwise. (future feature pending)
err   the error status. U_ILLEGAL_ARGUMENT_ERROR will be returned if the converter is null.
Draft:
backslash versus Yen sign in shift-JIS

void UnicodeConverterCPP::toUnicode (UChar *& target, const UChar * targetLimit, const char *& source, const char * sourceLimit, int32_t * offsets, UBool flush, UErrorCode & err)

Converts an array of codepage characters into an array of Unicode characters.

The source pointer is an I/O parameter: it starts out pointing at the place to begin translating, and ends up pointing after the first sequence of the bytes that it encounters that are semantically invalid. If T_UnicodeConverter_setMissingUnicodeAction is called with an action other than STOP before a call is made to this API, consumed and source should point to the same place (unless target ends with an incomplete sequence of bytes and flush is FALSE).

Parameters:
target   I/O parameter. Input: points to the beginning of the buffer to copy Unicode characters to. Output: points to after the last UChar copied to target.
targetLimit   the pointer to the end of the target array
source   the source codepage character array
sourceLimit   the pointer to the end of the source array
flush   TRUE if the buffer is the last buffer and the conversion will finish in this call, FALSE otherwise. (future feature pending)
err   the error status code. U_ILLEGAL_ARGUMENT_ERROR will be returned if the converter is null, targetLimit < target, or sourceLimit < source.
Stable:
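
The pointer convention shared by fromUnicode and toUnicode (target and source passed by reference and advanced by the call, with separate limit pointers) lends itself to converting large inputs through a small fixed buffer. The sketch below shows that loop shape only. It is not ICU code: widenLatin1 and convertAll are hypothetical names, a plain Latin-1 to UTF-16 widening stands in for a real converter, and offsets, flush and err are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in: widens Latin-1 bytes to UTF-16 code units.
// target and source are taken by reference and advanced as output is
// produced and input is consumed.
// Returns true if all input was consumed, false if the target filled first.
inline bool widenLatin1(char16_t*& target, const char16_t* targetLimit,
                        const char*& source, const char* sourceLimit) {
    assert(target <= targetLimit && source <= sourceLimit);
    while (source < sourceLimit) {
        if (target == targetLimit) {
            return false;            // caller must drain the target buffer
        }
        *target++ = static_cast<unsigned char>(*source++);
    }
    return true;
}

// Converts a whole input through a 4-unit chunk buffer, appending each
// filled chunk to the result and resuming where the last call stopped.
inline std::u16string convertAll(const std::string& in) {
    char16_t chunk[4];
    std::u16string out;
    const char* src = in.data();
    const char* srcLimit = src + in.size();
    bool done = false;
    while (!done) {
        char16_t* dst = chunk;       // reset the output position per chunk
        done = widenLatin1(dst, chunk + 4, src, srcLimit);
        out.append(chunk, static_cast<std::size_t>(dst - chunk));
    }
    return out;
}
```

Because the source pointer is updated in place, the loop never has to track how much input has been consumed. Only the target pointer is reset between calls.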

int8_t UnicodeConverterCPP::getMaxBytesPerChar (void) const

Returns the maximum number of bytes used by a character.

This varies between 1 and 4.

Returns:
the maximum number of bytes per codepage character
Stable:

int8_t UnicodeConverterCPP::getMinBytesPerChar (void) const

Returns the minimum byte length for characters in this codepage.

This is either 1 or 2 for all supported codepages.

Returns:
the minimum number of bytes per codepage character
Stable:

UConverterType UnicodeConverterCPP::getType (void) const

Gets the type of conversion associated with the converter, e.g. SBCS, MBCS, DBCS, UTF8, UTF16_BE, UTF16_LE, ISO_2022, EBCDIC_STATEFUL, LATIN_1.

Returns:
the type of the converter
Stable:

void UnicodeConverterCPP::getStarters (UBool starters[256], UErrorCode & err) const

Gets the "starter" bytes for converters of type MBCS. Fills in U_ILLEGAL_ARGUMENT_ERROR if the converter is not MBCS.

Fills in an array of booleans, with the value of the byte as the offset into the array. On return, if TRUE is found at offset 0x20, it means that the byte 0x20 is a starter byte in this converter.

Parameters:
starters   an array of size 256 to be filled in
err   the error status code. U_ILLEGAL_ARGUMENT_ERROR if the converter is not MBCS.
See also:
ucnv_getType()
Stable:
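
As a rough illustration of how a 256-entry starter table indexed by byte value might be consumed, the sketch below counts characters in a byte string. It does not call ICU. fillStarters and countChars are hypothetical names, the lead-byte ranges are illustrative values chosen for this example, and every starter byte is assumed to begin a two-byte sequence, which a real MBCS converter does not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fills a 256-entry table, indexed by byte value, for a HYPOTHETICAL
// double-byte codepage whose lead bytes are 0x81-0x9F and 0xE0-0xFC.
inline void fillStarters(bool starters[256]) {
    assert(starters != nullptr);
    for (int b = 0; b < 256; ++b) {
        starters[b] = (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
    }
}

// Counts characters in a byte string, assuming that a starter byte begins
// a two-byte sequence and any other byte is a complete character.
inline std::size_t countChars(const std::string& bytes,
                              const bool starters[256]) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++count) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        i += starters[b] ? 2 : 1;    // skip the trail byte after a starter
    }
    return count;
}
```

The cast to unsigned char matters: indexing the table with a plain char could produce a negative offset for bytes of 0x80 and above on platforms where char is signed.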

void UnicodeConverterCPP::getSubstitutionChars (char * subChars, int8_t & len, UErrorCode & err) const

Fills in the output parameter, subChars, with the substitution characters as multiple bytes.

Parameters:
subChars   the substitution characters
len   the number of bytes of the substitution character array
err   the error status code. U_ILLEGAL_ARGUMENT_ERROR will be returned if the converter is null. If the substitution character array is too small, a U_INDEX_OUTOFBOUNDS_ERROR will be returned.
Stable:

void UnicodeConverterCPP::setSubstitutionChars (const char * subChars, int8_t len, UErrorCode & err)

Sets the substitution characters used when converting from Unicode to a codepage.

The substitution is specified as a string of 1-4 bytes, and may contain a null byte. The fill-in parameter err will get the error status on return.

Parameters:
subChars   the substitution character array to be set
len   the number of bytes of the substitution character array
err   the error status code. U_ILLEGAL_ARGUMENT_ERROR if the converter is null, or if the number of bytes provided is not in the codepage's range (e.g. length 1 for UCS-2).
Stable:
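
The length rule above (1 to 4 bytes, within the codepage's byte range, embedded null bytes allowed) can be sketched as a simple validation step. This is an illustration only and not the library's implementation: SubstitutionState and setSubstitution are hypothetical names, and a boolean return value stands in for the U_ILLEGAL_ARGUMENT_ERROR status.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the converter state relevant to substitution.
struct SubstitutionState {
    int8_t minBytesPerChar;   // e.g. 2 for a UCS-2-like encoding
    int8_t maxBytesPerChar;
    char   subChars[4];
    int8_t subCharLen;
};

// Stores the substitution bytes if the length is valid for the codepage.
// Returns false (standing in for U_ILLEGAL_ARGUMENT_ERROR) otherwise and
// leaves the previous substitution untouched.
inline bool setSubstitution(SubstitutionState& state,
                            const char* subChars, int8_t len) {
    assert(state.minBytesPerChar <= state.maxBytesPerChar);
    if (subChars == nullptr || len < 1 || len > 4) {
        return false;
    }
    if (len < state.minBytesPerChar || len > state.maxBytesPerChar) {
        return false;
    }
    // memcpy rather than strcpy, so embedded null bytes are preserved.
    std::memcpy(state.subChars, subChars, static_cast<std::size_t>(len));
    state.subCharLen = len;
    return true;
}
```

Passing the length explicitly, instead of relying on zero termination, is what allows a substitution sequence such as 0x00 0x3F to be stored.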

void UnicodeConverterCPP::resetState (void)

Resets the state of stateful conversion to the default state.

This is used in the case of error to restart a conversion from a known default state.

Stable:

const char * UnicodeConverterCPP::getName (UErrorCode & err) const

Gets the name of the converter (zero-terminated).

The name will be the internal name of the converter.

Parameters:
converter   the Unicode converter
err   the error status code. U_INDEX_OUTOFBOUNDS_ERROR if the converterNameLen is too small to contain the name.
Stable:

int32_t UnicodeConverterCPP::getCodepage (UErrorCode & err) const

Gets a codepage number associated with the converter.

This is not guaranteed to be the one used to create the converter. Some converters do not represent IBM registered codepages and return zero for the codepage number. The error code fill-in parameter indicates if the codepage number is available.

Parameters:
err   the error status code. U_ILLEGAL_ARGUMENT_ERROR will be returned if the converter is null or if the converter's data table is null.
Returns:
If any error occurs, null will be returned.
Stable:

UConverterToUCallback UnicodeConverterCPP::getMissingCharAction (void) const

Returns the action currently set for when a character from a codepage is missing.

(Currently STOP or SUBSTITUTE).

Returns:
the action constant when a Unicode character cannot be converted to a codepage equivalent
Stable:

UConverterFromUCallback UnicodeConverterCPP::getMissingUnicodeAction (void) const

Returns the action currently set for when a Unicode character is missing.

(Currently STOP or SUBSTITUTE).

Returns:
the action constant when a codepage character cannot be converted to a Unicode equivalent
Stable:

void UnicodeConverterCPP::setMissingCharAction (UConverterToUCallback action, UErrorCode & err)

Sets the action taken when a character from a codepage is missing.

(Currently STOP or SUBSTITUTE).

Parameters:
action   the action constant if an equivalent codepage character is missing
Stable:

void UnicodeConverterCPP::setMissingUnicodeAction (UConverterFromUCallback action, UErrorCode & err)

Sets the action taken when a Unicode character is missing.

(Currently T_UnicodeConverter_MissingUnicodeAction is either STOP or SUBSTITUTE; SKIP, CLOSEST_MATCH and ESCAPE_SEQ may be added in the future.)

Parameters:
action   the action constant if an equivalent Unicode character is missing
err   the error status code
Stable:
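
The difference between the STOP and SUBSTITUTE actions can be illustrated without ICU. In the sketch below, MissingAction and encodeAscii are hypothetical names, and a 7-bit ASCII encoder stands in for a real codepage. The real API takes callback values of type UConverterToUCallback or UConverterFromUCallback, not an enum.

```cpp
#include <cassert>
#include <string>

// Hypothetical action constants used only for this illustration.
enum class MissingAction { Stop, Substitute };

// Encodes UTF-16 code units as 7-bit ASCII.
// With Substitute, each unmappable unit is replaced by `substitution`.
// With Stop, conversion ends at the first unmappable unit, the function
// returns false, and `out` holds what was converted up to that point.
inline bool encodeAscii(const std::u16string& source, MissingAction action,
                        char substitution, std::string& out) {
    // The substitution byte must itself be valid in the target encoding.
    assert(static_cast<unsigned char>(substitution) < 0x80);
    out.clear();
    for (char16_t unit : source) {
        if (unit < 0x80) {
            out.push_back(static_cast<char>(unit));
        } else if (action == MissingAction::Substitute) {
            out.push_back(substitution);
        } else {
            return false;            // Stop: keep the partial result
        }
    }
    return true;
}
```

With Substitute the output always covers the whole input, at the cost of losing the unmappable characters. With Stop the caller sees where conversion ended and can decide how to proceed.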

void UnicodeConverterCPP::getDisplayName (const Locale & displayLocale, UnicodeString & displayName) const

Returns the localized name of the UnicodeConverter. If for any reason it is unavailable, the internal name will be returned instead.

Parameters:
displayLocale   the valid Locale, from which we want to localize
displayName   a UnicodeString that is going to be filled in.
Stable:

UConverterPlatform UnicodeConverterCPP::getCodepagePlatform (UErrorCode & err) const

Returns the T_UnicodeConverter_platform (ICU defined enum) of a UnicodeConverter.

Parameters:
err   the error code status
Returns:
the codepage's platform
Stable:

UnicodeConverterCPP& UnicodeConverterCPP::operator= (const UnicodeConverterCPP & that)

UBool UnicodeConverterCPP::operator== (const UnicodeConverterCPP & that) const

UBool UnicodeConverterCPP::operator!= (const UnicodeConverterCPP & that) const

UnicodeConverterCPP::UnicodeConverterCPP (const UnicodeConverterCPP & that)

void UnicodeConverterCPP::fixFileSeparator (UnicodeString & source) const

Fixes the backslash character mismapping.

For example, in SJIS, the backslash character in the ASCII portion is also used to represent the yen currency sign. When mapping from Unicode character 0x005C, it's unclear whether to map the character back to yen or backslash in SJIS. This function will take the input buffer and replace all the yen sign characters with backslash. This is necessary when the user tries to open a file with the input buffer on Windows.

Parameters:
source   the input buffer to be fixed
Draft:
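
The replacement described above amounts to rewriting every yen sign (U+00A5) as a backslash (U+005C). A minimal sketch on a std::u16string, standing in for UnicodeString, follows. yenToBackslash is a hypothetical name, and the sketch applies the replacement unconditionally, whereas the library function may first check the converter, which is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Replaces every yen sign (U+00A5) with a backslash (U+005C) in place.
// std::u16string is used here as a stand-in for UnicodeString.
inline void yenToBackslash(std::u16string& text) {
    std::replace(text.begin(), text.end(), u'\u00A5', u'\\');
    // Postcondition: no yen signs remain in the buffer.
    assert(text.find(u'\u00A5') == std::u16string::npos);
}
```

After the call, a path that was decoded with yen signs as separators uses backslashes throughout and can be handed to a Windows file API.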

UBool UnicodeConverterCPP::isAmbiguous (void) const

Determines whether the converter contains ambiguous mappings of the same character.

Returns:
TRUE if the converter contains ambiguous mapping of the same character, FALSE otherwise.
Draft:

const char *const * UnicodeConverterCPP::getAvailableNames (int32_t & num, UErrorCode & err) [static]

Returns the available names.

Lazily evaluated; the library owns the storage.

Parameters:
num   the number of available converters
err   the error code status
Returns:
the name array
Stable:

int32_t UnicodeConverterCPP::flushCache (void) [static]

Iterates through every cached converter and frees all the unused ones.

Returns:
the number of cached converters successfully deleted
Stable:

The documentation for this class was generated from the following file: convert.h
Generated at Mon Jun 5 12:53:27 2000 for ICU1.5 by doxygen 1.0.0 written by Dimitri van Heesch, © 1997-1999