ZLIB Manual



ZLIB general purpose compression library version 1.2.1. This manual has been converted to Texinfo format by Marco Maggi marcomaggi@tiscalinet.it, with the addition of small bits.

Copyright © 1995-2004 Jean–loup Gailly and Mark Adler.
Copyright © 2004 Marco Maggi.

Permission is granted to make and distribute verbatim copies of this document provided the copyright notice and this permission notice are preserved on all copies.

Permission is granted to copy and distribute modified versions of this document under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one.




1 Introduction

The ZLIB compression library provides in–memory compression and decompression functions, including integrity checks of the uncompressed data. This version of the library supports only one compression method (deflation) but other algorithms will be added later and will have the same stream interface.

Compression can be done in a single step if the buffers are large enough (for example if an input file is memory mapped with mmap()), or can be done by repeated calls of the compression function. In the latter case, the application must provide more input and/or consume the output (providing more output space) before each call.

The compressed data format used by the in–memory functions is the ZLIB format, which is a ZLIB wrapper documented in RFC 1950, wrapped around a deflate stream, which is itself documented in RFC 1951.

The library also supports reading and writing files in gzip (.gz) format with an interface similar to that of the standard stream C library (stdio.h) using the functions that start with gz. The GZIP format is different from the ZLIB format. GZIP is a GZIP wrapper, documented in RFC 1952, wrapped around a deflate stream.

The ZLIB format was designed to be compact and fast for use in memory and on communications channels. The GZIP format was designed for single-file compression on file systems, has a larger header than ZLIB to maintain directory information, and uses a different, slower check method than ZLIB.

This library does not provide any functions to write GZIP files in memory. However such functions could be easily written using ZLIB's deflate function, the documentation in the GZIP RFC, and the examples in gzio.c.

The library does not install any signal handler. The decoder checks the consistency of the compressed data, so the library should never crash even in case of corrupted input.



2 Utility functions

The following utility functions are implemented on top of the basic stream–oriented functions. To simplify the interface, some default options are assumed (compression level and memory usage, standard memory allocation functions). The source code of these utility functions can easily be modified if you need special options.



2.1 Data Types

— Typedef: Byte

On most platforms this is an alias for unsigned char.

— Typedef: uLong

An alias for unsigned long.

— Struct Pointer: gzFile

Declared as a void pointer. It is used to reference compressed file descriptors.

— Typedef: z_off_t

Used to represent offsets in the file content. On some systems it is an alias for off_t, on other systems it is an alias for long.



2.2 Buffer functions

compress() and uncompress() can be used to process a whole file at once if the input file is memory mapped.



2.2.1 Compression functions

— Function: int compress (Bytef * dstPtr, uLongf * dstLenVar, const Bytef * srcPtr, uLong srcLen)

Compresses the source buffer into the destination buffer. compress() is a wrapper for the triplet deflateInit(), deflate(), deflateEnd().

srcLen must be the byte length of the source buffer; srcPtr must be the pointer to the source buffer, at least srcLen bytes wide.

Upon entry: dstLenVar must reference a variable holding the total size of the destination buffer, which (to avoid Z_BUF_ERROR) should be at least the value returned by compressBound(). dstPtr must be the pointer to the destination buffer, at least *dstLenVar bytes wide.

Upon exit: the variable referenced by dstLenVar is modified to hold the actual size of the compressed buffer.

The return value is:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_BUF_ERROR
if there was not enough room in the output buffer;

other error codes may be returned to signal invalid data.

— Function: int compress2 (Bytef * dstPtr, uLongf * dstLenVar, const Bytef * srcPtr, uLong srcLen, int level)

Like compress() but allows the user to select a compression level. level has the same meaning as in deflateInit() (see Deflate Functions for details). The return values are the same as in compress(); Z_STREAM_ERROR is returned if level has an invalid value.

— Function: uLong compressBound (uLong sourceLen)

Returns an upper bound on the compressed size after compress() or compress2() on sourceLen bytes. It would be used before a compress() or compress2() call to allocate the destination buffer.



2.2.2 Decompression functions

— Function: int uncompress (Bytef * dstPtr, uLongf * dstLenVar, const Bytef * srcPtr, uLong srcLen)

Decompresses the source buffer into the destination buffer. uncompress() is a wrapper for the triplet inflateInit(), inflate(), inflateEnd().

srcLen must be the byte length of the source buffer; srcPtr must be a pointer to the source buffer, at least srcLen bytes wide.

Upon entry: dstLenVar must reference a variable holding the total size of the destination buffer. dstPtr must be the pointer to the destination buffer, at least *dstLenVar bytes wide.

Upon exit: the variable referenced by dstLenVar is modified to hold the actual size of the uncompressed data.

The return value is:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_BUF_ERROR
if there was not enough room in the output buffer;
Z_DATA_ERROR
if the input data was corrupted or incomplete;

other codes may be returned to signal invalid data.

To avoid the Z_BUF_ERROR: the size of the uncompressed data must have been saved previously by the compressor and transmitted to the decompressor by some mechanism outside the scope of this compression library. That size then can be used to allocate the output buffer.



2.2.3 Buffer functions examples

Simple data compression.

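     Bytef *     src_p;
     Bytef *     dst_p;
     Bytef *     tmp_p;
     uLong       src_l;
     uLong       dst_l;
     int         e;
     
     /* Checks on the results of malloc() are omitted for brevity. */
     src_l = get_source_data_length();
     src_p = (Bytef *) malloc((size_t) src_l);
     
     fill_buffer_with_source_data(src_l, src_p);
     
     /* Allocate the worst-case output size so that compress()
        cannot fail with Z_BUF_ERROR. */
     dst_l = compressBound(src_l);
     dst_p = (Bytef *) malloc((size_t) dst_l);
     
     e = compress(dst_p, &dst_l, src_p, src_l);
     free(src_p);
     if (Z_OK == e)
       {
         /* Shrink the buffer to the actual compressed size; keep the
            original block if realloc() fails. */
         tmp_p = (Bytef *) realloc(dst_p, (size_t) dst_l);
         if (NULL != tmp_p)
           {
             dst_p = tmp_p;
           }
         use_compressed_data(dst_l, dst_p);
       }
     free(dst_p);
     
     if (Z_OK != e)
       {
         /* handle the error */
       }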



2.3 Compressed file functions



2.3.1 Opening and closing a compressed file

— Function: gzFile gzopen (const char * path, const char * mode)

This function opens a gzip file for reading or writing. The mode parameter is as in fopen() (rb or wb) but can also include a compression level (wb9) or a strategy: f for filtered data as in wb6f, h for Huffman-only compression as in wb1h, or R for run-length encoding as in wb1R (see Deflate Functions for details on the strategy parameter).

The function can be used to read a file which is not in gzip format; in this case gzread() will directly read from the file without decompression.

The return value is NULL if the file could not be opened or if there was insufficient memory to allocate the (de)compression state; errno can be checked to distinguish the two cases: if errno is zero, the ZLIB error is Z_MEM_ERROR.

— Function: gzFile gzdopen (int fd, const char * mode)

This function associates a gzFile with the file descriptor fd. File descriptors are obtained from calls like open(), dup(), creat(), pipe() or fileno() (if the file has been previously opened with fopen()). The mode parameter is as in gzopen().

The next call of gzclose() on the returned gzFile will also close the file descriptor fd, just like:

          fclose(fdopen(fd, mode))
     

closes the file descriptor fd. If you want to keep fd open, use:

          gzdopen(dup(fd), mode)
     

gzdopen() returns NULL if there was insufficient memory to allocate the (de)compression state.

— Function: int gzclose (gzFile file)

Flushes all pending output if necessary, closes the compressed file and deallocates all the (de)compression state. The return value is the ZLIB error number (see File Errors for details).



2.3.2 Writing data to a compressed file

— Function: int gzwrite (gzFile file, voidpc buf, unsigned len)

Writes the given number of uncompressed bytes into the compressed file. The return value is the number of uncompressed bytes actually written (0 in case of error).

— Function: int gzprintf (gzFile file, const char * format, ...)

Converts, formats, and writes the args to the compressed file under control of the format string, as in fprintf(). The return value is the number of uncompressed bytes actually written (0 in case of error).

The number of uncompressed bytes written is limited to 4095. The caller should ensure that this limit is not exceeded. If it is exceeded, gzprintf() will return an error (0) with nothing written. In this case there may also be a buffer overflow with unpredictable consequences; this is possible only if ZLIB was compiled with the insecure functions sprintf() or vsprintf() because the secure snprintf() or vsnprintf() functions were not available.

— Function: int gzputs (gzFile file, const char * s)

Writes the given null-terminated string to the compressed file, excluding the terminating null character. The return value is the number of characters written, or -1 in case of error.

— Function: int gzputc (gzFile file, int c)

Writes c, converted to an unsigned char, into the compressed file. The return value is the value that was written, or -1 in case of error.

— Function: int gzflush (gzFile file, int flush)

Flushes all pending output into the compressed file. The parameter flush is as in the deflate() function. The return value is the ZLIB error number (see File Errors for details). The return value is Z_OK if the flush parameter is Z_FINISH and all output could be flushed.

This function should be called only when strictly necessary because it can degrade compression.



2.3.3 Reading data from a compressed file

— Function: int gzread (gzFile file, voidp buf, unsigned len)

Reads the given number of uncompressed bytes from the compressed file. If the input file was not in gzip format, this function copies the given number of bytes into the buffer.

The return value is the number of uncompressed bytes actually read: 0 for end of file, -1 for error.

— Function: char * gzgets (gzFile file, char * buf, int len)

Reads bytes from the compressed file until len-1 characters are read, or a newline character is read and transferred to buf, or an end–of–file condition is encountered. The string is then terminated with a null character. The return value is buf, or Z_NULL in case of error.

— Function: int gzgetc (gzFile file)

Reads one byte from the compressed file. The return value is this byte or -1 in case of end–of–file or error.

— Function: int gzungetc (int c, gzFile file)

Push one character back onto the stream to be read again later. Only one character of push–back is allowed. gzungetc() returns the character pushed, or -1 on failure. gzungetc() will fail if a character has been pushed but not read yet, or if c is -1. The pushed character will be discarded if the stream is repositioned with gzseek() or gzrewind().

— Function: int gzeof (gzFile file)

Returns 1 when end–of–file has previously been detected reading the given input stream, otherwise zero.



2.3.4 Moving the pointer

— Function: z_off_t gzseek (gzFile file, z_off_t offset, int whence)

Sets the starting position for the next read or write operation on the given compressed file. The offset represents a number of bytes in the uncompressed data stream. The whence parameter is defined as in lseek(); the value SEEK_END is not supported.

If the file is opened for reading, this function is emulated but can be extremely slow. If the file is opened for writing, only forward seeks are supported; the function then compresses a sequence of zeroes up to the new starting position.

The return value is the resulting offset location as measured in bytes from the beginning of the uncompressed stream, or -1 in case of error: in particular if the file is opened for writing and the new starting position would be before the current position.

— Function: int gzrewind (gzFile file)

Rewinds the given file. This function is supported only for reading. A call to this function is equivalent to:

          (int) gzseek(file, 0L, SEEK_SET);
     
— Function: z_off_t gztell (gzFile file)

Returns the starting position for the next read or write operation on the given compressed file. This position represents a number of bytes in the uncompressed data stream.

A call to this function is equivalent to:

          gzseek(file, 0L, SEEK_CUR);
     



2.3.5 Configuring a file descriptor

— Function: int gzsetparams (gzFile file, int level, int strategy)

Dynamically update the compression level or strategy. See the description of deflateInit2() for the meaning of these parameters.

gzsetparams() returns Z_OK if success, or Z_STREAM_ERROR if the file was not opened for writing.



2.3.6 Examining errors in file operations

— Function: const char * gzerror (gzFile file, int * errnum)

Returns the error message for the last error which occurred on the given compressed file. The variable referenced by errnum is set to the ZLIB error number. If an error occurred in the file system and not in the compression library, *errnum is set to Z_ERRNO and the application may consult the standard variable errno to get the exact error code.

— Function: void gzclearerr (gzFile file)

Clears the error and end–of–file flags for file. This is analogous to the clearerr() function in stdio.h. This is useful for continuing to read a GZIP file that is being written concurrently.

     #include <errno.h>
     #include <string.h>
     #include <zlib.h>
     
     
     gzFile      descriptor;
     const char * errorString;
     int         errorCode;
     
     ...
     
     errorString = gzerror(descriptor, &errorCode);
     if (Z_ERRNO == errorCode)
       {
          errorString = strerror(errno);
       }



3 Basic functions



3.1 Compressing a stream of data



3.1.1 Basic steps of stream compression

The basic steps of stream compression are shown in the following code.

     #define BUFFER_SIZE        ...
     
     z_stream     stream;
     int          compression_level, flush_param, result;
     uInt         output_buffer_size_in_bytes = BUFFER_SIZE;
     Bytef        output_buffer[BUFFER_SIZE];
     
     
     /* Select memory allocation and release functions, and an
        optional parameter for them. */
     stream.zalloc   = Z_NULL;
     stream.zfree    = Z_NULL;
     stream.opaque   = Z_NULL;
     
     /* Acquire resources and configure the stream. */
     compression_level = Z_DEFAULT_COMPRESSION;
     result = deflateInit(&stream, compression_level);
     if (Z_OK != result)
       {
         /* Handle the error and stop here. */
       }
     
     /* Register the input buffer and the output buffer. */
     stream.avail_in  = ...; /* number of bytes in the input buffer */
     stream.next_in   = ...; /* pointer to the next byte of input */
     stream.avail_out = output_buffer_size_in_bytes;
     stream.next_out  = output_buffer;
     
     /* Compress data until the input buffer is empty. */
     flush_param = Z_NO_FLUSH;
     while (stream.avail_in)
       {
         result = deflate(&stream, flush_param);
         if (Z_OK != result)
           {
             break;
           }
     
         /* Consume only the bytes that deflate() actually produced. */
         consume_compressed_data(output_buffer,
                                 output_buffer_size_in_bytes - stream.avail_out);
     
         /* Reinitialise the stream structure output buffer reference. */
         stream.avail_out = output_buffer_size_in_bytes;
         stream.next_out  = output_buffer;
       }
     
     /* Finish the stream: call deflate() with Z_FINISH for as long as
        it returns Z_OK. */
     flush_param = Z_FINISH;
     while (Z_OK == result)
       {
         result = deflate(&stream, flush_param);
         if ((Z_OK != result) && (Z_STREAM_END != result))
           {
             break;
           }
     
         consume_compressed_data(output_buffer,
                                 output_buffer_size_in_bytes - stream.avail_out);
     
         /* Reinitialise the stream structure output buffer reference. */
         stream.avail_out = output_buffer_size_in_bytes;
         stream.next_out  = output_buffer;
       }
     
     /* Test for errors. */
     if (Z_STREAM_END != result)
       {
         /* Handle the error. Do not forget to call deflateEnd() to free
            resources. */
       }
     
     /* Free resources. */
     deflateEnd(&stream);



3.1.2 Compressing streams of bytes

Input/output stream processing is a software layer that requires two synchronisation steps: one between the input source and the layer, and one between the layer and the output sink. Compression and decompression are kinds of processing that require data to accumulate in the middle layer.

Streams are just an abstraction: we write code to process input data block by block; so it is possible that:

3.1.2.1 Input and output buffers

We have to select a policy to handle input and output buffers. ZLIB reports how many bytes were consumed from the input buffer and how many free bytes are left in the output buffer.

We have two alternatives for input buffers:

We have two alternatives for output buffers, too:

3.1.2.2 Ending processing

When all the data from the input source has been processed, or if an unrecoverable error occurs while reading, we have to end the operations. Some data may still be in the middle layer, so the library has to provide a way to flush it to the output buffer. This may require more than one processing action, until all the data is flushed.



3.1.3 Deflate Functions

— Function: int deflateInit (z_streamp strm, int level)

Initializes the internal stream state for compression.

The fields zalloc, zfree and opaque must be initialized by the caller beforehand. If zalloc and zfree are set to Z_NULL, the function updates them to use default allocation functions.

The compression level must be Z_DEFAULT_COMPRESSION, or between 0 and 9: 1 gives best speed, 9 gives best compression, 0 gives no compression at all (the input data is simply copied a block at a time). Z_DEFAULT_COMPRESSION requests a default compromise between speed and compression (currently equivalent to level 6).

The return value is Z_OK if success, Z_MEM_ERROR if there was not enough memory, Z_STREAM_ERROR if level is not a valid compression level, Z_VERSION_ERROR if the ZLIB library version (zlibVersion()) is incompatible with the version assumed by the caller (ZLIB_VERSION). The field msg is set to NULL if there is no error message.

This function does not perform any compression: this will be done by deflate().

— Function: int deflate (z_streamp strm, int flush)

This function compresses as much data as possible, and stops when the input buffer becomes empty or the output buffer becomes full. It may introduce some output latency (reading input without producing any output) except when forced to flush.

The detailed semantics are as follows. The function performs one or both of the following actions.

Before the call to deflate(): the application should ensure that at least one of the actions is possible, by providing more input and/or consuming more output, and updating avail_in or avail_out accordingly; avail_out should never be zero before the call.

The application can consume the compressed output when it wants, for example when the output buffer is full (avail_out == 0), or after each call. If deflate() returns Z_OK with avail_out equal to zero, it must be called again after making room in the output buffer because there might be more output pending.

If the parameter flush is set to Z_SYNC_FLUSH, all pending output is flushed to the output buffer and the output is aligned on a byte boundary, so that the decompressor can get all input data available so far (in particular avail_in is zero after the call if enough output space has been provided before the call). Flushing may degrade compression for some compression algorithms and so it should be used only when necessary.

If flush is set to Z_FULL_FLUSH, all output is flushed as with Z_SYNC_FLUSH, and the compression state is reset so that decompression can restart from this point if previous compressed data has been damaged or if random access is desired. Using Z_FULL_FLUSH too often can seriously degrade the compression.

If the function returns with avail_out == 0, it must be called again with the same value of the flush parameter and more output space (updated avail_out), until the flush is complete (the function returns with non-zero avail_out). In the case of a Z_FULL_FLUSH or Z_SYNC_FLUSH, make sure that avail_out is greater than six to avoid repeated flush markers due to avail_out == 0 on return.

If the parameter flush is set to Z_FINISH, pending input is processed, pending output is flushed and the return value is Z_STREAM_END if there was enough output space; if the return value is Z_OK, this function must be called again with Z_FINISH and more output space (updated avail_out) but no more input data, until it returns with Z_STREAM_END or an error.

After the function has returned Z_STREAM_END, the only possible operations on the stream are deflateReset() or deflateEnd().

Z_FINISH can be used immediately after deflateInit() if all the compression is to be done in a single step. In this case: avail_out must be at least the value returned by deflateBound(). If deflate() does not return Z_STREAM_END, then it must be called again as described above.

deflate() sets strm->adler to the Adler32 checksum of all input read so far (that is, total_in bytes).

deflate() may update data_type if it can make a good guess about the input data type (Z_ASCII or Z_BINARY). If in doubt, the data is considered binary. This field is only for information purposes and does not affect the compression algorithm in any manner.

deflate() returns Z_OK if some progress has been made (more input processed or more output produced), Z_STREAM_END if all input has been consumed and all output has been produced (only when flush is set to Z_FINISH), Z_STREAM_ERROR if the stream state was inconsistent (for example if next_in or next_out was NULL), Z_BUF_ERROR if no progress is possible (for example avail_in or avail_out was zero). Note that Z_BUF_ERROR is not fatal, and deflate() can be called again with more input and more output space to continue compressing.

— Function: int deflateEnd (z_streamp strm)

All dynamically allocated data structures for this stream are freed. This function discards any unprocessed input and does not flush any pending output.

The return value is Z_OK if success, Z_STREAM_ERROR if the stream state was inconsistent, Z_DATA_ERROR if the stream was freed prematurely (some input or output was discarded). In the error case: msg may be set but then points to a static string (which must not be deallocated).



3.2 Decompressing a stream of data

— Function: int inflateInit (z_streamp strm)

Initializes the internal stream state for decompression.

The fields next_in, avail_in, zalloc, zfree and opaque must be initialized by the caller beforehand. If next_in is not Z_NULL and avail_in is large enough (the exact value depends on the compression method), inflateInit() determines the compression method from the ZLIB header and allocates all data structures accordingly; otherwise the allocation will be deferred to the first call of inflate(). If zalloc and zfree are set to Z_NULL, inflateInit() updates them to use default allocation functions.

inflateInit() returns:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_VERSION_ERROR
if the ZLIB library version is incompatible with the version assumed by the caller.

The msg field is set to null if there is no error message. inflateInit() does not perform any decompression apart from reading the ZLIB header if present: this will be done by inflate() (so next_in and avail_in may be modified, but next_out and avail_out are unchanged).

— Function: int inflate (z_streamp strm, int flush)

Decompresses as much data as possible, and stops when the input buffer becomes empty or the output buffer becomes full. It may introduce some output latency (reading input without producing any output) except when forced to flush.

The detailed semantics are as follows. inflate() performs one or both of the following actions:

Before the call of inflate, the application should ensure that at least one of the actions is possible, by providing more input and/or consuming more output, and updating the next_* and avail_* values accordingly. The application can consume the uncompressed output when it wants, for example when the output buffer is full (avail_out == 0), or after each call of inflate(). If inflate() returns Z_OK and with zero avail_out, it must be called again after making room in the output buffer because there might be more output pending.

The flush parameter of inflate() can be Z_NO_FLUSH, Z_SYNC_FLUSH, Z_FINISH, or Z_BLOCK. Z_SYNC_FLUSH requests that inflate() flush as much output as possible to the output buffer. Z_BLOCK requests that inflate() stop if and when it gets to the next deflate block boundary. When decoding the ZLIB or GZIP format, this will cause inflate() to return immediately after the header and before the first block. When doing a raw inflate, inflate() will go ahead and process the first block, and will return when it gets to the end of that block, or when it runs out of data.

The Z_BLOCK option assists in appending to or combining deflate streams. Also to assist in this, on return inflate() will set strm->data_type to the number of unused bits in the last byte taken from strm->next_in, plus 64 if inflate() is currently decoding the last block in the deflate stream, plus 128 if inflate() returned immediately after decoding an end-of-block code or decoding the complete header up to just before the first byte of the deflate stream. The end-of-block will not be indicated until all of the uncompressed data from that block has been written to strm->next_out. The number of unused bits may in general be greater than seven, except when bit 7 of data_type is set, in which case the number of unused bits will be less than eight.

inflate() should normally be called until it returns Z_STREAM_END or an error. However if all decompression is to be performed in a single step (a single call of inflate()), the parameter flush should be set to Z_FINISH. In this case all pending input is processed and all pending output is flushed; avail_out must be large enough to hold all the uncompressed data. (The size of the uncompressed data may have been saved by the compressor for this purpose.) The next operation on this stream must be inflateEnd() to deallocate the decompression state. The use of Z_FINISH is never required, but can be used to inform inflate() that a faster approach may be used for the single inflate() call.

In this implementation, inflate() always flushes as much output as possible to the output buffer, and always uses the faster approach on the first call. So the only effect of the flush parameter in this implementation is on the return value of inflate(), as noted below, or when it returns early because Z_BLOCK is used.

If a preset dictionary is needed after this call (see inflateSetDictionary() below), inflate sets strm->adler to the Adler32 checksum of the dictionary chosen by the compressor and returns Z_NEED_DICT; otherwise it sets strm->adler to the Adler32 checksum of all output produced so far (that is, total_out bytes) and returns Z_OK, Z_STREAM_END or an error code as described below. At the end of the stream, inflate() checks that its computed Adler32 checksum is equal to that saved by the compressor and returns Z_STREAM_END only if the checksum is correct.

inflate() will decompress and check either ZLIB–wrapped or GZIP–wrapped deflate data. The header type is detected automatically. Any information contained in the GZIP header is not retained, so applications that need that information should instead use raw inflate, see inflateInit2(), or inflateBack() and perform their own processing of the gzip header and trailer.

inflate() returns:

Z_OK
if some progress has been made (more input processed or more output produced);
Z_STREAM_END
if the end of the compressed data has been reached and all uncompressed output has been produced;
Z_NEED_DICT
if a preset dictionary is needed at this point;
Z_DATA_ERROR
if the input data was corrupted (input stream not conforming to the zlib format or incorrect check value);
Z_STREAM_ERROR
if the stream structure was inconsistent (for example if next_in or next_out was NULL);
Z_MEM_ERROR
if there was not enough memory;
Z_BUF_ERROR
if no progress is possible or if there was not enough room in the output buffer when Z_FINISH is used.

Note that Z_BUF_ERROR is not fatal, and inflate() can be called again with more input and more output space to continue decompressing. If Z_DATA_ERROR is returned, the application may then call inflateSync() to look for a good compression block if a partial recovery of the data is desired.

— Function: int inflateEnd (z_streamp strm)

All dynamically allocated data structures for this stream are freed. This function discards any unprocessed input and does not flush any pending output.

inflateEnd() returns Z_OK if success, Z_STREAM_ERROR if the stream state was inconsistent. In the error case: msg may be set but then points to a static string (which must not be deallocated).



3.3 Checking library version

— Function: const char * zlibVersion (void)

The application can compare the return value of this function and ZLIB_VERSION for consistency. If the first character differs, the library code actually used is not compatible with the zlib.h header file used by the application. This check is automatically made by deflateInit() and inflateInit().



4 Advanced functions

The functions described in this chapter are needed only in some special applications.



4.1 Advanced Deflate

— Function: int deflateInit2 (z_streamp strm, int level, int method, int windowBits, int memLevel, int strategy)

This is another version of deflateInit() with more compression options. The fields next_in, zalloc, zfree and opaque must be initialized by the caller beforehand.

The method parameter is the compression method. It must be Z_DEFLATED in this version of the library.

The windowBits parameter is the base two logarithm of the window size (the size of the history buffer). It should be in the range 8..15 for this version of the library. Larger values of this parameter result in better compression at the expense of memory usage. The default value is 15 if deflateInit() is used instead.

windowBits can also be -8..-15 for raw deflate. In this case, -windowBits determines the window size. deflate() will then generate raw deflate data with no ZLIB header or trailer, and will not compute an Adler32 check value.

windowBits can also be greater than 15 for optional GZIP encoding. Add 16 to windowBits to write a simple GZIP header and trailer around the compressed data instead of a ZLIB wrapper. The GZIP header will have no file name, no extra data, no comment, no modification time (set to zero), no header crc, and the operating system will be set to 255 (unknown).

The memLevel parameter specifies how much memory should be allocated for the internal compression state. memLevel==1 uses minimum memory but is slow and reduces compression ratio; memLevel==9 uses maximum memory for optimal speed. The default value is 8. See zconf.h for total memory usage as a function of windowBits and memLevel.

The strategy parameter is used to tune the compression algorithm. Use the value:

Z_DEFAULT_STRATEGY
for normal data;
Z_FILTERED
for data produced by a filter (or predictor);
Z_HUFFMAN_ONLY
to force Huffman encoding only (no string match);
Z_RLE
to limit match distances to one (run–length encoding).

Filtered data consists mostly of small values with a somewhat random distribution. In this case, the compression algorithm is tuned to compress them better. The effect of Z_FILTERED is to force more Huffman coding and less string matching; it is somewhat intermediate between Z_DEFAULT_STRATEGY and Z_HUFFMAN_ONLY. Z_RLE is designed to be almost as fast as Z_HUFFMAN_ONLY, but to give better compression for PNG image data. The strategy parameter affects only the compression ratio, not the correctness of the compressed output, even if it is not set appropriately.

Return values are:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_STREAM_ERROR
if a parameter is invalid (such as an invalid method).

The msg field of the stream structure is set to null if there is no error message. deflateInit2() does not perform any compression: this will be done by deflate().

— Function: int deflateSetDictionary (z_streamp strm, const Bytef * dictionary, uInt dictLength)

Initializes the compression dictionary from the given byte sequence without producing any compressed output. This function must be called immediately after deflateInit(), deflateInit2() or deflateReset(), before any call of deflate(). The compressor and decompressor must use exactly the same dictionary (see inflateSetDictionary()).

The dictionary should consist of strings (byte sequences) that are likely to be encountered later in the data to be compressed, with the most commonly used strings preferably put towards the end of the dictionary. Using a dictionary is most useful when the data to be compressed is short and can be predicted with good accuracy; the data can then be compressed better than with the default empty dictionary.

Depending on the size of the compression data structures selected by deflateInit() or deflateInit2(), a part of the dictionary may in effect be discarded, for example if the dictionary is larger than the window size selected by deflateInit() or deflateInit2(). Thus the strings most likely to be useful should be put at the end of the dictionary, not at the front.

Upon return of this function, strm->adler is set to the Adler32 value of the dictionary; the decompressor may later use this value to determine which dictionary has been used by the compressor. (The Adler32 value applies to the whole dictionary even if only a subset of the dictionary is actually used by the compressor.) If a raw deflate was requested, then the Adler32 value is not computed and strm->adler is not set.

deflateSetDictionary() returns Z_OK if success, or Z_STREAM_ERROR if a parameter is invalid (such as NULL dictionary) or the stream state is inconsistent (for example if deflate() has already been called for this stream or if the compression method is bsort). deflateSetDictionary() does not perform any compression: this will be done by deflate().

— Function: int deflateCopy (z_streamp dest, z_streamp source)

Sets the destination stream as a complete copy of the source stream.

This function can be useful when several compression strategies will be tried, for example when there are several ways of pre–processing the input data with a filter. The streams that will be discarded should then be freed by calling deflateEnd(). Note that deflateCopy() duplicates the internal compression state which can be quite large, so this strategy is slow and can consume lots of memory.

Return values are:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_STREAM_ERROR
if the source stream state was inconsistent (such as zalloc being NULL).

The msg field of the stream structure is left unchanged in both source and destination.

— Function: int deflateReset (z_streamp strm)

This function is equivalent to deflateEnd() followed by deflateInit(), but does not free and reallocate all the internal compression state. The stream will keep the same compression level and any other attributes that may have been set by deflateInit2().

deflateReset() returns Z_OK if success, or Z_STREAM_ERROR if the source stream state was inconsistent (such as zalloc or state being NULL).

— Function: int deflateParams (z_streamp strm, int level, int strategy)

Dynamically update the compression level and compression strategy. The interpretation of level and strategy is as in deflateInit2(). This can be used to switch between compression and straight copy of the input data, or to switch to a different kind of input data requiring a different strategy. If the compression level is changed, the input available so far is compressed with the old level (and may be flushed); the new level will take effect only at the next call of deflate().

Before the call of deflateParams(), the stream state must be set as for a call of deflate(), since the currently available input may have to be compressed and flushed. In particular, strm->avail_out must be non-zero.

Return values are:

Z_OK
if success;
Z_STREAM_ERROR
if the source stream state was inconsistent or if a parameter was invalid;
Z_BUF_ERROR
if strm->avail_out was zero.

— Function: uLong deflateBound (z_streamp strm, uLong sourceLen)

Returns an upper bound on the compressed size after deflation of sourceLen bytes. It must be called after deflateInit() or deflateInit2(). This would be used to allocate an output buffer for deflation in a single pass, and so would be called before deflate().

— Function: int deflatePrime (z_streamp strm, int bits, int value)

Inserts bits in the deflate output stream. The intent is that this function is used to start off the deflate output with the bits leftover from a previous deflate stream when appending to it. As such, this function can only be used for raw deflate, and must be used before the first deflate() call after a deflateInit2() or deflateReset(). bits must be less than or equal to 16, and that many of the least significant bits of value will be inserted in the output.

deflatePrime() returns Z_OK if success, or Z_STREAM_ERROR if the source stream state was inconsistent.



4.2 Advanced Inflate

— Function: int inflateInit2 (z_streamp strm, int windowBits)

This is another version of inflateInit() with an extra parameter. The fields next_in, avail_in, zalloc, zfree and opaque must be initialized by the caller before the call.

The windowBits parameter is the base two logarithm of the maximum window size (the size of the history buffer). It should be in the range 8..15 for this version of the library. The default value is 15 if inflateInit() is used instead. windowBits must be greater than or equal to the windowBits value provided to deflateInit2() while compressing, or it must be equal to 15 if deflateInit2() was not used. If a compressed stream with a larger window size is given as input, inflate() will return with the error code Z_DATA_ERROR instead of trying to allocate a larger window.

windowBits can also be -8..-15 for raw inflate. In this case, -windowBits determines the window size. inflate() will then process raw deflate data, not looking for a ZLIB or GZIP header, not generating a check value, and not looking for any check values for comparison at the end of the stream. This is for use with other formats that use the deflate compressed data format such as ZIP. Those formats provide their own check values. If a custom format is developed using the raw deflate format for compressed data, it is recommended that a check value such as an Adler32 or a crc32 be applied to the uncompressed data as is done in the ZLIB, GZIP, and ZIP formats. For most applications, the ZLIB format should be used as is. Note that the comments above on the use of windowBits in deflateInit2() apply to the magnitude of windowBits.

windowBits can also be greater than 15 for optional GZIP decoding. Add 32 to windowBits to enable ZLIB and GZIP decoding with automatic header detection, or add 16 to decode only the GZIP format (the ZLIB format will return a Z_DATA_ERROR).

Return values are:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_STREAM_ERROR
if a parameter is invalid (such as a windowBits value outside the supported range).

The msg field of the stream structure is set to NULL if there is no error message.

inflateInit2() does not perform any decompression apart from reading the ZLIB header if present (so next_in and avail_in may be modified, but next_out and avail_out are unchanged).

— Function: int inflateSetDictionary (z_streamp strm, const Bytef * dictionary, uInt dictLength)

Initializes the decompression dictionary from the given uncompressed byte sequence. This function must be called immediately after a call of inflate() if this call returned Z_NEED_DICT. The dictionary chosen by the compressor can be determined from the Adler32 value returned by this call of inflate(). The compressor and decompressor must use exactly the same dictionary (see deflateSetDictionary()).

Return values are:

Z_OK
if success;
Z_STREAM_ERROR
if a parameter is invalid (such as NULL dictionary) or the stream state is inconsistent;
Z_DATA_ERROR
if the given dictionary doesn't match the expected one (incorrect Adler32 value).

inflateSetDictionary() does not perform any decompression: this will be done by subsequent calls of inflate().

— Function: int inflateSync (z_streamp strm)

Skips invalid compressed data until a full flush point (see above the description of deflate() with Z_FULL_FLUSH) can be found, or until all available input is skipped. No output is provided.

inflateSync() returns:

Z_OK
if a full flush point has been found;
Z_BUF_ERROR
if no more input was provided;
Z_DATA_ERROR
if no flush point has been found;
Z_STREAM_ERROR
if the stream structure was inconsistent.

In the success case, the application may save the current value of total_in, which indicates where valid compressed data was found. In the error case, the application may repeatedly call inflateSync(), providing more input each time, until success or end of the input data.

— Function: int inflateCopy (z_streamp dest, z_streamp source)

Sets the destination stream as a complete copy of the source stream.

This function can be useful when randomly accessing a large stream. The first pass through the stream can periodically record the inflate state, allowing restarting inflate at those points when randomly accessing the stream.

inflateCopy() returns:

Z_OK
if success;
Z_MEM_ERROR
if there was not enough memory;
Z_STREAM_ERROR
if the source stream state was inconsistent (such as zalloc being NULL).

The msg field of the stream structure is left unchanged in both source and destination.

— Function: int inflateReset (z_streamp strm)

This function is equivalent to inflateEnd() followed by inflateInit(), but does not free and reallocate all the internal decompression state. The stream will keep attributes that may have been set by inflateInit2().

inflateReset() returns:

Z_OK
if success;
Z_STREAM_ERROR
if the source stream state was inconsistent (such as zalloc or state being NULL).

— Function: int inflateBackInit (z_stream * strm, int windowBits, unsigned char * window)

Initialize the internal stream state for decompression using inflateBack() calls.

The fields zalloc, zfree and opaque in strm must be initialized before the call. If zalloc and zfree are Z_NULL, then the default library–derived memory allocation routines are used. windowBits is the base two logarithm of the window size, in the range 8..15. window is a caller supplied buffer of that size. Except for special applications where it is assured that deflate() was used with small window sizes, windowBits must be 15 and a 32K byte window must be supplied to be able to decompress general deflate streams.

See inflateBack() for the usage of these routines.

inflateBackInit() will return:

Z_OK
on success;
Z_STREAM_ERROR
if any of the parameters are invalid;
Z_MEM_ERROR
if the internal state could not be allocated;
Z_VERSION_ERROR
if the version of the library does not match the version of the header file.

— Function Prototype Typedef: in_func
          typedef unsigned (*in_func) OF((void FAR *, unsigned char FAR * FAR *));
     
— Function Prototype Typedef: out_func
          typedef int (*out_func) OF((void FAR *, unsigned char FAR *, unsigned));
     
— Function: int inflateBack (z_stream * strm, in_func in, void * in_desc, out_func out, void * out_desc)

Does a raw inflate with a single call using a call–back interface for input and output. This is more efficient than inflate() for file I/O applications in that it avoids copying between the output and the sliding window by simply making the window itself the output buffer. This function trusts the application to not change the output buffer passed by the output function, at least until inflateBack() returns.

inflateBackInit() must be called first to allocate the internal state and to initialize the state with the user–provided window buffer. inflateBack() may then be used multiple times to inflate a complete, raw deflate stream with each call. inflateBackEnd() is then called to free the allocated state.

A raw deflate stream is one with no ZLIB or GZIP header or trailer. This routine would normally be used in a utility that reads ZIP or GZIP files and writes out uncompressed files. The utility would decode the header and process the trailer on its own, hence this routine expects only the raw deflate stream to decompress. This is different from the normal behavior of inflate(), which expects either a ZLIB or GZIP header and trailer around the deflate stream.

inflateBack() uses two subroutines supplied by the caller that are then called by inflateBack() for input and output. inflateBack() calls those routines until it reads a complete deflate stream and writes out all of the uncompressed data, or until it encounters an error. The function's parameters and return types are defined above in the in_func and out_func typedefs.

inflateBack() will call in(in_desc, &buf) which should return the number of bytes of provided input, and a pointer to that input in buf. If there is no input available, in must return zero (buf is ignored in that case) and inflateBack() will return a buffer error.

inflateBack() will call out(out_desc, buf, len) to write the uncompressed data buf[0..len-1]. out should return zero on success, or non–zero on failure. If out returns non–zero, inflateBack() will return with an error. Neither in nor out are permitted to change the contents of the window provided to inflateBackInit(), which is also the buffer that out uses to write from. The length written by out will be at most the window size. Any non–zero amount of input may be provided by in.

For convenience, inflateBack() can be provided input on the first call by setting strm->next_in and strm->avail_in. If that input is exhausted, then in will be called. Therefore strm->next_in must be initialized before calling inflateBack(). If strm->next_in is Z_NULL, then in will be called immediately for input. If strm->next_in is not Z_NULL, then strm->avail_in must also be initialized, and then if strm->avail_in is not zero, input will initially be taken from strm->next_in[0 .. strm->avail_in - 1].

The in_desc and out_desc parameters of inflateBack() are passed as the first parameter of in and out respectively when they are called. These descriptors can optionally be used to pass any information that the caller–supplied in and out functions need to do their job.

On return, inflateBack() will set strm->next_in and strm->avail_in to pass back any unused input that was provided by the last in call.

The return values of inflateBack() can be:

Z_STREAM_END
on success;
Z_BUF_ERROR
if in or out returned an error;
Z_DATA_ERROR
if there was a format error in the deflate stream (in which case strm->msg is set to indicate the nature of the error);
Z_STREAM_ERROR
if the stream was not properly initialized.

In the case of Z_BUF_ERROR, an input or output error can be distinguished using strm->next_in which will be Z_NULL only if in returned an error. If strm->next_in is not Z_NULL, then the Z_BUF_ERROR was due to out returning non–zero (in will always be called before out, so strm->next_in is assured to be defined if out returns non–zero). Note that inflateBack() cannot return Z_OK.

— Function: int inflateBackEnd (z_stream * strm)

All memory allocated by inflateBackInit() is freed. inflateBackEnd() returns Z_OK on success, or Z_STREAM_ERROR if the stream state was inconsistent.

— Function: uLong zlibCompileFlags (void)

Return flags indicating compile–time options.

Type sizes: each field is two bits wide, 00 means the size is 16 bits, 01 means the size is 32 bits, 10 means 64, 11 means other. Position of bits follows:

1.0
Size of uInt (bit one and bit zero).
3.2
Size of uLong.
5.4
Size of voidpf (pointer).
7.6
Size of z_off_t.

Compiler, assembler, and debug options:

8
DEBUG
9
ASMV or ASMINF, use ASM code;
10
ZLIB_WINAPI, exported functions use the WINAPI calling convention;
11
0 (reserved).

One–time table building (smaller code, but not thread–safe if true):

12
BUILDFIXED, build static block decoding tables when needed;
13
DYNAMIC_CRC_TABLE, build CRC calculation tables when needed;
14,15
0 (reserved).

Library content (indicates missing functionality):

16
NO_GZCOMPRESS, gz* functions cannot compress (to avoid linking deflate code when not needed);
17
NO_GZIP, deflate can't write GZIP streams, and inflate can't detect and decode GZIP streams (to avoid linking crc code);
18,19
0 (reserved).

Operation variations (changes in library functionality):

20
PKZIP_BUG_WORKAROUND, slightly more permissive inflate;
21
FASTEST, deflate algorithm with only one, lowest compression level;
22,23
0 (reserved).

The sprintf() variant used by gzprintf() (zero is best):

24
0 = vs*, 1 = s* – 1 means limited to 20 arguments after the format;
25
0 = *nprintf, 1 = *printf – 1 means gzprintf() not secure!
26
0 = returns value, 1 = void – 1 means inferred string length returned;

Remainder: 27-31: 0 (reserved).



5 Constants

Z_NO_FLUSH 0
Z_PARTIAL_FLUSH 1
Z_SYNC_FLUSH 2
Z_FULL_FLUSH 3
Z_FINISH 4
Z_BLOCK 5
Allowed flush values; see deflate() and inflate() for details (Z_PARTIAL_FLUSH will be removed, use Z_SYNC_FLUSH instead).
Z_OK 0
Z_STREAM_END 1
Z_NEED_DICT 2
Z_ERRNO (-1)
Z_STREAM_ERROR (-2)
Z_DATA_ERROR (-3)
Z_MEM_ERROR (-4)
Z_BUF_ERROR (-5)
Z_VERSION_ERROR (-6)
Return codes for the compression/decompression functions. Negative values are errors, positive values are used for special but normal events.
Z_NO_COMPRESSION 0
Z_BEST_SPEED 1
Z_BEST_COMPRESSION 9
Z_DEFAULT_COMPRESSION (-1)
Compression levels.
Z_FILTERED 1
Z_HUFFMAN_ONLY 2
Z_RLE 3
Z_DEFAULT_STRATEGY 0
Compression strategy; see deflateInit2() for details.
Z_BINARY 0
Z_ASCII 1
Z_UNKNOWN 2
Possible values of the data_type field (though see inflate()).
Z_DEFLATED 8
The deflate() compression method (the only one supported in this version).
Z_NULL 0
For initializing the zalloc, zfree and opaque fields.
zlib_version
An alias to zlibVersion() for compatibility with versions less than 1.0.2.



6 The stream structure

— Struct Typedef: z_stream

The stream state shared between the application and the library. The fields are described below.

Bytef * next_in
Next input byte.
uInt avail_in
Number of bytes available at next_in.
uLong total_in
Total number of input bytes read so far.
Bytef * next_out
Next output byte should be put there.
uInt avail_out
Remaining free space at next_out.
uLong total_out
Total number of bytes output so far.
char * msg
Last error message, NULL if no error.
struct internal_state FAR * state
Not visible by applications.
alloc_func zalloc
Used to allocate the internal state.
free_func zfree
Used to free the internal state.
voidpf opaque
Private data object passed to zalloc and zfree.
int data_type
Best guess about the data type: ASCII or binary.
uLong adler
Adler32 value of the uncompressed data.
uLong reserved
Reserved for future use.

— Struct Pointer: z_streamp

The pointer to the stream structure.

The application must update next_in and avail_in when avail_in has dropped to zero. It must update next_out and avail_out when avail_out has dropped to zero. The application must initialize zalloc, zfree and opaque before calling the init function. All other fields are set by the compression library and must not be updated by the application.

The opaque value provided by the application will be passed as the first parameter for calls of zalloc and zfree. This can be useful for custom memory management. The compression library attaches no meaning to the opaque value.

zalloc must return Z_NULL if there is not enough memory for the object. If ZLIB is used in a multi–threaded application, zalloc and zfree must be thread safe.

On 16–bit systems, the functions zalloc and zfree must be able to allocate exactly 65536 bytes, but will not be required to allocate more than this if the symbol MAXSEG_64K is defined (see zconf.h). Warning: on MSDOS, pointers returned by zalloc for objects of exactly 65536 bytes must have their offset normalized to zero. The default allocation function provided by this library ensures this (see zutil.c). To reduce memory requirements and avoid any allocation of 64K objects, at the expense of compression ratio, compile the library with -DMAX_WBITS=14 (see zconf.h).

The fields total_in and total_out can be used for statistics or progress reports. After compression, total_in holds the total size of the uncompressed data and may be saved for use in the decompressor (particularly if the decompressor wants to decompress everything in a single step).



7 Checksum functions

These functions are not related to compression but are exported anyway because they might be useful in applications using the compression library.

— Function: uLong adler32 (uLong adler, const Bytef * buf, uInt len)

Update a running Adler-32 checksum with the bytes buf[0..len-1] and return the updated checksum. If buf is NULL, this function returns the required initial value for the checksum.

An Adler-32 checksum is almost as reliable as a CRC32 but can be computed much faster. Usage example:

     uLong adler = adler32(0L, Z_NULL, 0);
     
     while (read_buffer(buffer, length) != EOF)
       {
         adler = adler32(adler, buffer, length);
       }
     if (adler != original_adler) error();

— Function: uLong crc32 (uLong crc, const Bytef * buf, uInt len)

Update a running crc with the bytes buf[0..len-1] and return the updated crc. If buf is NULL, this function returns the required initial value for the crc. Pre– and post–conditioning (one's complement) is performed within this function so it shouldn't be done by the application.

Usage example:

     uLong crc = crc32(0L, Z_NULL, 0);
     
     while (read_buffer(buffer, length) != EOF)
       {
         crc = crc32(crc, buffer, length);
       }
     if (crc != original_crc) error();



8 Miscellaneous functions

deflateInit() and inflateInit() are macros to allow checking the ZLIB version and the compiler's view of z_stream.

— Function: const char * zError (int err)

Converts an error code to string; exported for compress(), compress2() and uncompress().

— Function: int inflateSyncPoint (z_streamp z)

No description.

— Function: const uLongf * get_crc_table (void)

No description.



Appendix A Software License

Copyright © 1995–2004 Jean–loup Gailly and Mark Adler.

This software is provided “as–is”, without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software.

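Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions:

1. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required.
2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.
3. This notice may not be removed or altered from any source distribution.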

Jean–loup Gailly jloup@gzip.org
Mark Adler madler@alumni.caltech.edu



Appendix B Bibliography and references

The data format used by the ZLIB library is described by RFC (Request for Comments) 1950 to 1952 in the files:

ftp://ds.internic.net/rfc/rfc1950.txt
ZLIB format;
ftp://ds.internic.net/rfc/rfc1951.txt
deflate format;
ftp://ds.internic.net/rfc/rfc1952.txt
gzip format.

Visit http://ftp.cdrom.com/pub/infozip/zlib/ for the official ZLIB web page.



Appendix C An entry for each concept