com.ibm.text
Class UnicodeDecompressor

java.lang.Object
  |
  +--com.ibm.text.UnicodeDecompressor
All Implemented Interfaces:
com.ibm.text.SCSU

public final class UnicodeDecompressor
extends java.lang.Object
implements com.ibm.text.SCSU

A decompression engine implementing the Standard Compression Scheme for Unicode (SCSU) as outlined in Unicode Technical Report #6.

USAGE

The static methods on UnicodeDecompressor may be used in a straightforward manner to decompress simple strings:

  byte [] compressed = ... ; // get compressed bytes from somewhere
  String result = UnicodeDecompressor.decompress(compressed);
 

The static methods have a fairly large memory footprint. For finer-grained control over memory usage, UnicodeDecompressor offers more powerful APIs allowing iterative decompression:

  // Decompress an array "bytes" of length "len" using a buffer of 512 chars
  // to the Writer "out"

  UnicodeDecompressor myDecompressor         = new UnicodeDecompressor();
  final int           BUFSIZE                = 512; // locals cannot be static
  char []             charBuffer             = new char [ BUFSIZE ];
  int                 charsWritten           = 0;
  int []              bytesRead              = new int [1];
  int                 totalBytesDecompressed = 0;
  int                 totalCharsWritten      = 0;

  do {
    // do the decompression
    charsWritten = myDecompressor.decompress(bytes, totalBytesDecompressed, 
                                             len, bytesRead,
                                             charBuffer, 0, BUFSIZE);

    // do something with the current set of chars
    out.write(charBuffer, 0, charsWritten);

    // update the no. of bytes decompressed
    totalBytesDecompressed += bytesRead[0];

    // update the no. of chars written
    totalCharsWritten += charsWritten;

  } while(totalBytesDecompressed < len);

  myDecompressor.reset(); // reuse decompressor
 

Decompression is performed according to the standard set forth in Unicode Technical Report #6.
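To give a feel for what the decompressor does in single-byte mode, here is a minimal, self-contained sketch of a small subset of SCSU as described in UTR #6. It is not this class's implementation. The class name ScsuSingleByteSketch and its decode method are hypothetical, and the sketch assumes the default dynamic window offsets from the report. It handles only pass-through bytes, bytes mapped through the active dynamic window, and the SC0..SC7 window-select tags; every other tag is rejected.

```java
final class ScsuSingleByteSketch {
    // Default dynamic window offsets for windows 0..7 (assumed from UTR #6).
    static final int[] DEFAULT_DYNAMIC_OFFSETS = {
        0x0080, 0x00C0, 0x0400, 0x0600, 0x0900, 0x3040, 0x30A0, 0xFF00
    };

    // SC0..SC7 occupy 0x10..0x17 and select dynamic window 0..7.
    static final int SC0 = 0x10;

    /** Decodes the subset described above; throws on any tag it does not handle. */
    static String decode(byte[] in) {
        int window = 0; // dynamic window 0 is active at the start of a stream
        StringBuilder out = new StringBuilder();
        for (byte raw : in) {
            int b = raw & 0xFF;
            if (b >= 0x80) {
                // High bytes index into the active dynamic window.
                out.append((char) (DEFAULT_DYNAMIC_OFFSETS[window] + (b - 0x80)));
            } else if (b >= 0x20 || b == 0x00 || b == 0x09 || b == 0x0A || b == 0x0D) {
                // Printable ASCII and NUL, TAB, LF, CR pass through unchanged.
                out.append((char) b);
            } else if (b >= SC0 && b <= SC0 + 7) {
                window = b - SC0;
            } else {
                throw new IllegalArgumentException(
                    "tag not handled by this sketch: 0x" + Integer.toHexString(b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // SC2 selects the window at 0x0400, then six bytes index into it.
        byte[] sample = {
            0x12, (byte) 0x9C, (byte) 0xBE, (byte) 0xC1,
            (byte) 0xBA, (byte) 0xB2, (byte) 0xB0
        };
        // Prints the Cyrillic letters U+041C U+043E U+0441 U+043A U+0432 U+0430.
        System.out.println(decode(sample));
    }
}
```

Because window 0 starts at 0x0080, Latin-1 text without unusual control characters decodes byte for byte in this subset, which is why short Western European strings need no tag bytes at all.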

Version:
1.5 05 Aug 99
Author:
Stephen F. Booth
See Also:
UnicodeCompressor

Field Summary
static int ARMENIANINDEX
           
static int COMPRESSIONOFFSET
           
static int GREEKINDEX
           
static int HALFWIDTHKATAKANAINDEX
           
static int HIRAGANAINDEX
           
static int INVALIDCHAR
           
static int INVALIDWINDOW
           
static int IPAEXTENSIONINDEX
           
static int KATAKANAINDEX
           
static int LATININDEX
           
static int MAXINDEX
           
static int NUMSTATICWINDOWS
           
static int NUMWINDOWS
           
static int RESERVEDINDEX
           
static int SCHANGE0
           
static int SCHANGE1
           
static int SCHANGE2
           
static int SCHANGE3
           
static int SCHANGE4
           
static int SCHANGE5
           
static int SCHANGE6
           
static int SCHANGE7
           
static int SCHANGEU
           
static int SDEFINE0
           
static int SDEFINE1
           
static int SDEFINE2
           
static int SDEFINE3
           
static int SDEFINE4
           
static int SDEFINE5
           
static int SDEFINE6
           
static int SDEFINE7
           
static int SDEFINEX
           
static int SINGLEBYTEMODE
           
static int[] sOffsets
          Static compression window offsets
static int[] sOffsetTable
          For window offset mapping
static int SQUOTE0
           
static int SQUOTE1
           
static int SQUOTE2
           
static int SQUOTE3
           
static int SQUOTE4
           
static int SQUOTE5
           
static int SQUOTE6
           
static int SQUOTE7
           
static int SQUOTEU
           
static int SRESERVED
           
static int UCHANGE0
           
static int UCHANGE1
           
static int UCHANGE2
           
static int UCHANGE3
           
static int UCHANGE4
           
static int UCHANGE5
           
static int UCHANGE6
           
static int UCHANGE7
           
static int UDEFINE0
           
static int UDEFINE1
           
static int UDEFINE2
           
static int UDEFINE3
           
static int UDEFINE4
           
static int UDEFINE5
           
static int UDEFINE6
           
static int UDEFINE7
           
static int UDEFINEX
           
static int UNICODEMODE
           
static int UQUOTEU
           
static int URESERVED
           
 
Constructor Summary
UnicodeDecompressor()
          Create a UnicodeDecompressor.
 
Method Summary
static java.lang.String decompress(byte[] buffer)
          Decompress a byte array into a String.
static char[] decompress(byte[] buffer, int start, int limit)
          Decompress a byte array into a Unicode character array.
 int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit, int[] bytesRead, char[] charBuffer, int charBufferStart, int charBufferLimit)
          Decompress a byte array into a Unicode character array.
 void reset()
          Reset the decompressor to its initial state.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

COMPRESSIONOFFSET

public static final int COMPRESSIONOFFSET

NUMWINDOWS

public static final int NUMWINDOWS

NUMSTATICWINDOWS

public static final int NUMSTATICWINDOWS

INVALIDWINDOW

public static final int INVALIDWINDOW

INVALIDCHAR

public static final int INVALIDCHAR

SINGLEBYTEMODE

public static final int SINGLEBYTEMODE

UNICODEMODE

public static final int UNICODEMODE

MAXINDEX

public static final int MAXINDEX

RESERVEDINDEX

public static final int RESERVEDINDEX

LATININDEX

public static final int LATININDEX

IPAEXTENSIONINDEX

public static final int IPAEXTENSIONINDEX

GREEKINDEX

public static final int GREEKINDEX

ARMENIANINDEX

public static final int ARMENIANINDEX

HIRAGANAINDEX

public static final int HIRAGANAINDEX

KATAKANAINDEX

public static final int KATAKANAINDEX

HALFWIDTHKATAKANAINDEX

public static final int HALFWIDTHKATAKANAINDEX

SDEFINEX

public static final int SDEFINEX

SRESERVED

public static final int SRESERVED

SQUOTEU

public static final int SQUOTEU

SCHANGEU

public static final int SCHANGEU

SQUOTE0

public static final int SQUOTE0

SQUOTE1

public static final int SQUOTE1

SQUOTE2

public static final int SQUOTE2

SQUOTE3

public static final int SQUOTE3

SQUOTE4

public static final int SQUOTE4

SQUOTE5

public static final int SQUOTE5

SQUOTE6

public static final int SQUOTE6

SQUOTE7

public static final int SQUOTE7

SCHANGE0

public static final int SCHANGE0

SCHANGE1

public static final int SCHANGE1

SCHANGE2

public static final int SCHANGE2

SCHANGE3

public static final int SCHANGE3

SCHANGE4

public static final int SCHANGE4

SCHANGE5

public static final int SCHANGE5

SCHANGE6

public static final int SCHANGE6

SCHANGE7

public static final int SCHANGE7

SDEFINE0

public static final int SDEFINE0

SDEFINE1

public static final int SDEFINE1

SDEFINE2

public static final int SDEFINE2

SDEFINE3

public static final int SDEFINE3

SDEFINE4

public static final int SDEFINE4

SDEFINE5

public static final int SDEFINE5

SDEFINE6

public static final int SDEFINE6

SDEFINE7

public static final int SDEFINE7

UCHANGE0

public static final int UCHANGE0

UCHANGE1

public static final int UCHANGE1

UCHANGE2

public static final int UCHANGE2

UCHANGE3

public static final int UCHANGE3

UCHANGE4

public static final int UCHANGE4

UCHANGE5

public static final int UCHANGE5

UCHANGE6

public static final int UCHANGE6

UCHANGE7

public static final int UCHANGE7

UDEFINE0

public static final int UDEFINE0

UDEFINE1

public static final int UDEFINE1

UDEFINE2

public static final int UDEFINE2

UDEFINE3

public static final int UDEFINE3

UDEFINE4

public static final int UDEFINE4

UDEFINE5

public static final int UDEFINE5

UDEFINE6

public static final int UDEFINE6

UDEFINE7

public static final int UDEFINE7

UQUOTEU

public static final int UQUOTEU

UDEFINEX

public static final int UDEFINEX

URESERVED

public static final int URESERVED

sOffsetTable

public static final int[] sOffsetTable
For window offset mapping

sOffsets

public static final int[] sOffsets
Static compression window offsets
Constructor Detail

UnicodeDecompressor

public UnicodeDecompressor()
Create a UnicodeDecompressor. Sets all windows to their default values.
See Also:
reset()
Method Detail

decompress

public static java.lang.String decompress(byte[] buffer)
Decompress a byte array into a String.
Parameters:
buffer - The byte array to decompress.
Returns:
A String containing the decompressed characters.
See Also:
decompress(byte [], int, int)

decompress

public static char[] decompress(byte[] buffer,
                                int start,
                                int limit)
Decompress a byte array into a Unicode character array.
Parameters:
buffer - The byte array to decompress.
start - The start of the byte run to decompress.
limit - The limit of the byte run to decompress.
Returns:
A character array containing the decompressed characters.
See Also:
decompress(byte [])

decompress

public int decompress(byte[] byteBuffer,
                      int byteBufferStart,
                      int byteBufferLimit,
                      int[] bytesRead,
                      char[] charBuffer,
                      int charBufferStart,
                      int charBufferLimit)
Decompress a byte array into a Unicode character array. This method either completely fills the output buffer or consumes the entire input.
Parameters:
byteBuffer - The byte buffer to decompress.
byteBufferStart - The start of the byte run to decompress.
byteBufferLimit - The limit of the byte run to decompress.
bytesRead - A one-element array. If not null, on return its first element contains the number of bytes read from byteBuffer.
charBuffer - A buffer to receive the decompressed data. This buffer must be at least two characters long.
charBufferStart - The starting offset to which to write decompressed data.
charBufferLimit - The limiting offset for writing decompressed data.
Returns:
The number of Unicode characters written to charBuffer.
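The calling contract described above (start and limit offsets, the bytesRead out-parameter, and the returned character count) can be exercised without the library using a stand-in. The sketch below is an illustration only: ChunkDecoder, TOY and drain are hypothetical names, the toy decoder copies one byte to one char and does not implement SCSU, and the limits are assumed to be exclusive end offsets, as the usage example in the class description suggests. The zero-progress guard is an addition of this sketch, not documented behavior.

```java
final class ChunkedDecompressSketch {
    /** Hypothetical stand-in mirroring the seven-argument decompress signature. */
    interface ChunkDecoder {
        int decompress(byte[] byteBuffer, int byteBufferStart, int byteBufferLimit,
                       int[] bytesRead,
                       char[] charBuffer, int charBufferStart, int charBufferLimit);
    }

    /** Toy decoder: one byte becomes one char. NOT SCSU; it only exercises the loop. */
    static final ChunkDecoder TOY =
        (src, start, limit, bytesRead, dst, dstStart, dstLimit) -> {
            // Stop at whichever comes first: end of input or end of output space.
            int n = Math.min(limit - start, dstLimit - dstStart);
            for (int i = 0; i < n; i++) {
                dst[dstStart + i] = (char) (src[start + i] & 0xFF);
            }
            if (bytesRead != null) {
                bytesRead[0] = n;
            }
            return n;
        };

    /** Drains bytes[0..len) through the decoder in bufSize-char chunks; returns the call count. */
    static int drain(ChunkDecoder decoder, byte[] bytes, int len,
                     StringBuilder out, int bufSize) {
        char[] charBuffer = new char[bufSize];
        int[] bytesRead = new int[1];
        int consumed = 0;
        int calls = 0;
        do {
            int written = decoder.decompress(bytes, consumed, len, bytesRead,
                                             charBuffer, 0, bufSize);
            out.append(charBuffer, 0, written);
            calls++;
            // Guard added by this sketch: avoid spinning if a call makes no progress.
            if (written == 0 && bytesRead[0] == 0 && consumed < len) {
                throw new IllegalStateException("decoder made no progress");
            }
            consumed += bytesRead[0];
        } while (consumed < len);
        return calls;
    }

    public static void main(String[] args) {
        byte[] input = { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd' };
        StringBuilder out = new StringBuilder();
        int calls = drain(TOY, input, input.length, out, 4);
        System.out.println(out + " in " + calls + " calls");
    }
}
```

Advancing the start offset by bytesRead[0] rather than by the number of characters written matters with a real decompressor, since compressed bytes and decoded characters do not correspond one to one.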

reset

public void reset()
Reset the decompressor to its initial state.


Copyright (c) 2001 IBM Corporation and others.