asdlGen
Reference Manual
: Pickles
: Pickle Format Details
The format of the data is described in detail below for description writers that use the reader and writer properties to replace the default pickling code for whatever reason.
Since ASDL data structures have a tree-like form, they can be represented linearly with a simple prefix encoding. It is easy to generate functions that convert to and from the linear form. A pre-order walk of the data structure is all that's needed. The walk is implemented as recursively defined functions for each type in an ASDL definition. Each function visits a node of that type and recursively walks the rest of the tree. Functions that write a value take as their first argument the value to write. The second argument is the stream that is to be written to. Functions that read values take the stream they are to read the value from as their single argument and return the value read.
Since ASDL integers are intended to be of infinite precision, they are represented with a variable-length, signed-magnitude encoding. The eighth bit of each byte indicates whether that byte is the last byte of the integer encoding. The seventh bit of the most significant byte is used to determine the sign of the value. Numbers in the range of -63 to 63 can be encoded in one byte. Numbers outside this range require an extra byte for every seven bits of precision needed. Bytes are written out in little-endian order, so the most significant byte, which carries the sign bit, is written last. If most integer values tend to be near zero, this encoding may use less space than a fixed-precision representation.
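The integer encoding can be sketched in Python as follows. This is a minimal illustration, not the code asdlGen generates. The names `write_int`, `read_int`, and `encode_int` are hypothetical. The text above does not say which value of the eighth bit marks the last byte, so the sketch assumes that a set eighth bit (0x80) means more bytes follow and a clear eighth bit marks the last byte.

```python
import io


def write_int(n, out):
    """Write n to the binary stream `out` (value first, stream second)."""
    negative = n < 0
    mag = abs(n)
    # Emit the low seven bits of the magnitude first (little-endian order).
    # ASSUMPTION: a set eighth bit (0x80) means "more bytes follow".
    while mag > 63:
        out.write(bytes([(mag & 0x7F) | 0x80]))
        mag >>= 7
    # The last byte is the most significant one: six magnitude bits,
    # the sign in the seventh bit (0x40), and the eighth bit clear.
    out.write(bytes([mag | (0x40 if negative else 0)]))


def read_int(inp):
    """Read one integer from the binary stream `inp` and return it."""
    mag = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended inside an integer")
        b = chunk[0]
        if b & 0x80:
            # Not the last byte: seven more magnitude bits.
            mag |= (b & 0x7F) << shift
            shift += 7
        else:
            # Last byte: six magnitude bits plus the sign bit.
            mag |= (b & 0x3F) << shift
            return -mag if b & 0x40 else mag


def encode_int(n):
    """Convenience helper: return the encoding of n as a bytes object."""
    buf = io.BytesIO()
    write_int(n, buf)
    return buf.getvalue()
```

Under these assumptions, `encode_int(5).hex()` gives `05`, `encode_int(-5).hex()` gives `45`, and `encode_int(300).hex()` gives `ac02`: the value 300 needs a second byte because it falls outside the range -63 to 63.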
Strings are represented with a length-header that describes how many more 8-bit bytes follow for the string, followed by the data for the string in bytes. The length-header is encoded with the same arbitrary precision integer encoding described previously.
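A string pickler following this layout might look like the sketch below. The names `write_string` and `read_string` are hypothetical, the integer helpers assume that a set eighth bit means more bytes follow, and the use of UTF-8 to turn text into bytes is an assumption of this example: the format itself only speaks of 8-bit bytes.

```python
import io


def write_int(n, out):
    """Variable-length signed-magnitude integer (assumed bit polarity)."""
    negative = n < 0
    mag = abs(n)
    while mag > 63:
        out.write(bytes([(mag & 0x7F) | 0x80]))  # ASSUMPTION: 0x80 = more bytes
        mag >>= 7
    out.write(bytes([mag | (0x40 if negative else 0)]))


def read_int(inp):
    mag = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended inside an integer")
        b = chunk[0]
        if b & 0x80:
            mag |= (b & 0x7F) << shift
            shift += 7
        else:
            mag |= (b & 0x3F) << shift
            return -mag if b & 0x40 else mag


def write_string(s, out):
    """Write the byte length of s, then the bytes themselves."""
    data = s.encode("utf-8")  # ASSUMPTION: text is stored as UTF-8
    write_int(len(data), out)
    out.write(data)


def read_string(inp):
    """Read a length-header, then that many bytes, and return the text."""
    length = read_int(inp)
    data = inp.read(length)
    if len(data) != length:
        raise EOFError("stream ended inside a string")
    return data.decode("utf-8")
```

With these definitions, writing `"hi"` produces the three bytes `02 68 69`: a one-byte length-header followed by the two data bytes.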
Identifiers are represented as if they were strings.
Product types are represented sequentially without any tag. The fields of the product type are packed from left to right.
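As an illustration, consider a hypothetical product type `point = (int x, int y)`. A pickler for it writes the two fields back to back with nothing in front of them. The type and the function names below are invented for this sketch, and the integer helpers assume that a set eighth bit means more bytes follow.

```python
import io
from collections import namedtuple

# Hypothetical ASDL definition: point = (int x, int y)
Point = namedtuple("Point", ["x", "y"])


def write_int(n, out):
    """Variable-length signed-magnitude integer (assumed bit polarity)."""
    negative = n < 0
    mag = abs(n)
    while mag > 63:
        out.write(bytes([(mag & 0x7F) | 0x80]))  # ASSUMPTION: 0x80 = more bytes
        mag >>= 7
    out.write(bytes([mag | (0x40 if negative else 0)]))


def read_int(inp):
    mag = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended inside an integer")
        b = chunk[0]
        if b & 0x80:
            mag |= (b & 0x7F) << shift
            shift += 7
        else:
            mag |= (b & 0x3F) << shift
            return -mag if b & 0x40 else mag


def write_point(p, out):
    """No tag: just the fields, left to right."""
    write_int(p.x, out)
    write_int(p.y, out)


def read_point(inp):
    """Read the fields back in the same left-to-right order."""
    x = read_int(inp)
    y = read_int(inp)
    return Point(x, y)
```

Here `Point(3, -4)` pickles to the two bytes `03 44`, one per field.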
Sum types begin with a unique tag to identify the constructor, followed sequentially by the fields of the constructor. Tag values are assigned based on the order of constructor definition in the description. The first constructor has a tag value of one. Fields are packed left to right based on the order in the definition. If there are any attribute values associated with the type, they are packed left to right after the tag but before other constructor fields. The tag is encoded with the same arbitrary precision integer encoding described previously.
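The layout of a sum type can be sketched with a hypothetical definition `expr = Num(int value) | Add(expr left, expr right) attributes (int line)`. `Num` is defined first and so gets tag 1, `Add` gets tag 2, and the `line` attribute sits between the tag and the constructor fields. The classes and function names are invented for this example, and the integer helpers assume that a set eighth bit means more bytes follow.

```python
import io
from dataclasses import dataclass


def write_int(n, out):
    """Variable-length signed-magnitude integer (assumed bit polarity)."""
    negative = n < 0
    mag = abs(n)
    while mag > 63:
        out.write(bytes([(mag & 0x7F) | 0x80]))  # ASSUMPTION: 0x80 = more bytes
        mag >>= 7
    out.write(bytes([mag | (0x40 if negative else 0)]))


def read_int(inp):
    mag = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended inside an integer")
        b = chunk[0]
        if b & 0x80:
            mag |= (b & 0x7F) << shift
            shift += 7
        else:
            mag |= (b & 0x3F) << shift
            return -mag if b & 0x40 else mag


@dataclass
class Num:  # first constructor: tag 1
    line: int  # attribute shared by all constructors
    value: int


@dataclass
class Add:  # second constructor: tag 2
    line: int
    left: object
    right: object


def write_expr(e, out):
    """Pre-order walk: tag, then attributes, then fields left to right."""
    if isinstance(e, Num):
        write_int(1, out)
        write_int(e.line, out)
        write_int(e.value, out)
    elif isinstance(e, Add):
        write_int(2, out)
        write_int(e.line, out)
        write_expr(e.left, out)
        write_expr(e.right, out)
    else:
        raise TypeError("not an expr: %r" % (e,))


def read_expr(inp):
    """Read the tag and attribute, then dispatch on the tag."""
    tag = read_int(inp)
    line = read_int(inp)
    if tag == 1:
        return Num(line, read_int(inp))
    if tag == 2:
        left = read_expr(inp)
        right = read_expr(inp)
        return Add(line, left, right)
    raise ValueError("unknown expr tag %d" % tag)
```

For `Add(1, Num(1, 2), Num(1, 3))` this writes the bytes `02 01 01 01 02 01 01 03`: the outer tag and attribute, then each operand in turn with its own tag, attribute, and value.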
Sequence types are represented with an integer length-header followed by that many values of that type. The length-header is encoded with the same arbitrary precision integer encoding described previously.
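A sequence pickler can be written once and parameterized by the element functions, as in the sketch below. Passing the element writer or reader as an extra argument is a design choice of this example rather than something the format prescribes. The names `write_seq` and `read_seq` are hypothetical, and the integer helpers assume that a set eighth bit means more bytes follow.

```python
import io


def write_int(n, out):
    """Variable-length signed-magnitude integer (assumed bit polarity)."""
    negative = n < 0
    mag = abs(n)
    while mag > 63:
        out.write(bytes([(mag & 0x7F) | 0x80]))  # ASSUMPTION: 0x80 = more bytes
        mag >>= 7
    out.write(bytes([mag | (0x40 if negative else 0)]))


def read_int(inp):
    mag = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended inside an integer")
        b = chunk[0]
        if b & 0x80:
            mag |= (b & 0x7F) << shift
            shift += 7
        else:
            mag |= (b & 0x3F) << shift
            return -mag if b & 0x40 else mag


def write_seq(items, out, write_item):
    """Write the element count, then each element in order."""
    write_int(len(items), out)
    for item in items:
        write_item(item, out)


def read_seq(inp, read_item):
    """Read the element count, then that many elements."""
    count = read_int(inp)
    return [read_item(inp) for _ in range(count)]
```

Calling `write_seq([1, 2, 3], out, write_int)` writes `03 01 02 03`, and an empty sequence is just the single length byte `00`.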
Optional values are preceded by an integer header that is either one or zero. A zero indicates that the value is empty (NONE, nil, or NULL) and no more data follows. A one indicates that the next value is the value of the optional value. The header is encoded with the same arbitrary precision integer encoding described previously.
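Optional values can be handled in the same style. The sketch below uses Python's `None` for the empty case. The names `write_option` and `read_option` are hypothetical, rejecting headers other than zero or one is a choice made for this example, and the integer helpers assume that a set eighth bit means more bytes follow.

```python
import io


def write_int(n, out):
    """Variable-length signed-magnitude integer (assumed bit polarity)."""
    negative = n < 0
    mag = abs(n)
    while mag > 63:
        out.write(bytes([(mag & 0x7F) | 0x80]))  # ASSUMPTION: 0x80 = more bytes
        mag >>= 7
    out.write(bytes([mag | (0x40 if negative else 0)]))


def read_int(inp):
    mag = 0
    shift = 0
    while True:
        chunk = inp.read(1)
        if not chunk:
            raise EOFError("stream ended inside an integer")
        b = chunk[0]
        if b & 0x80:
            mag |= (b & 0x7F) << shift
            shift += 7
        else:
            mag |= (b & 0x3F) << shift
            return -mag if b & 0x40 else mag


def write_option(value, out, write_item):
    """Header 0 for an empty value; header 1 followed by the value otherwise."""
    if value is None:
        write_int(0, out)
    else:
        write_int(1, out)
        write_item(value, out)


def read_option(inp, read_item):
    """Return None for header 0, or the value that follows header 1."""
    header = read_int(inp)
    if header == 0:
        return None
    if header == 1:
        return read_item(inp)
    raise ValueError("bad option header %d" % header)
```

An empty optional integer is written as the single byte `00`, while the value 7 is written as `01 07`.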