Converting EDI documents to business objects

To convert an EDI document to a business object, the EDI data handler loops through the attributes of the top-level business object definition. It obtains the name of the business object to create, then processes the attributes recursively, in the order in which attributes appear in the top-level business object and its children, assigning element values from the EDI document to the business object.

The EDI data handler processes an EDI document into a business object as follows:

  1. The data handler sets any properties that were passed in through the optional configuration object. This information is passed in through the config argument of the getBO() method.
  2. The data handler initializes itself to prepare for reading the EDI document. For more information, see Initializing the data handler.
  3. If the data handler does not receive a business object from the caller, it must create one based on the business object name it finds in the name-handler lookup file. For more information, see Determining the name of the business object.
  4. Once the data handler has access to an instance of the top-level business object, the data handler populates this business object and its children with data from the EDI document. For more information, see Populating the business object.
  5. When the data handler completes the conversion, it returns the top-level business object to the caller. The data handler returns the entire hierarchy, the top-level business object and all its child objects.

Initializing the data handler

To initialize itself to convert an EDI document to a business object, the EDI data handler takes the following steps:

  1. Check that the Reader object that contains the serialized data supports the mark() operation.
  2. Begin parsing the EDI document to obtain the first segment name, the separators, the transaction ID, and the DUNS number.

Each of these steps is described in more detail in the subsections that follow.

Checking the Reader object

The EDI data handler must be able to mark a particular position within the EDI document and then subsequently return to that position. Because the EDI document is passed to the EDI data handler as a Reader object, this Reader object must be able to support the mark() operation.

As its first initialization step, therefore, the EDI data handler checks that the Reader object that it receives supports the mark() operation. If not, the data handler logs an error and generates an exception. It is recommended that all serialized data be passed into the EDI data handler in a StringReader object.

Note:
For more information about a Reader object and the mark() operation, see the Notes section in the description of getBO() - public.
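
The check described in this section can be sketched with the standard java.io API. This is a minimal illustration, not the data handler's own code: the class and method names (ReaderCheck, requireMarkSupport, peek) are hypothetical, and only Reader.markSupported(), mark(), read(), and reset() are standard Java calls.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReaderCheck {
    // Hypothetical helper: rejects a Reader that cannot mark a position.
    static Reader requireMarkSupport(Reader in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("Reader must support mark()");
        }
        return in;
    }

    // Reads up to n characters, then returns to the position where it started.
    static String peek(Reader in, int n) {
        try {
            in.mark(n);
            char[] buf = new char[n];
            int read = in.read(buf, 0, n);
            in.reset();
            return read <= 0 ? "" : new String(buf, 0, read);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // StringReader supports mark(), which is why it is the recommended type.
        Reader in = requireMarkSupport(new StringReader("ISA*00*0000000000"));
        System.out.println(peek(in, 3)); // prints "ISA"
        System.out.println(peek(in, 3)); // prints "ISA" again, because of reset()
    }
}
```

Because peek() resets the Reader, the same three characters can be examined more than once, which is the behavior the data handler relies on when it inspects the start of the document.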

Determining the document separators to read

To convert an EDI document to a business object, the EDI data handler must correctly read separators in the EDI document. The data handler parses the document to obtain these separators. Because the first three characters of the EDI document are known, the data handler parses these characters first. It reads the first three characters to determine whether they represent the UNA service string advice or the name of the initial segment. The subsections that follow describe each case.

Checking for the UNA service string advice

The UNA service string advice is the optional first element in EDI documents that follow the EDIFACT standard. This service string consists of six characters in the following order:

Component separator
Element separator
Decimal mark
Release character
Repeat separator (syntax version 4 only)
Segment separator

If the first 3 characters of the EDI document are "UNA", the data handler uses the values that the UNA service string specifies to interpret the EDI document. These separator values take precedence over any other separator settings in the EDI document, including any in the UNA or UNB positional-information attributes of the child meta-object.
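
As a minimal sketch of this check, the following Java class reads the six characters that follow "UNA" in the order listed above. The class name UnaSeparators and the sample document string are illustrative assumptions, not part of the data handler. In documents that do not use syntax version 4, the fifth position does not carry a repeat separator.

```java
import java.util.Optional;

class UnaSeparators {
    final char component;
    final char element;
    final char decimalMark;
    final char release;
    final char repeat;   // meaningful for syntax version 4 only
    final char segment;

    private UnaSeparators(String six) {
        component = six.charAt(0);
        element = six.charAt(1);
        decimalMark = six.charAt(2);
        release = six.charAt(3);
        repeat = six.charAt(4);
        segment = six.charAt(5);
    }

    // Returns the separators if the document starts with a UNA service
    // string advice; otherwise returns an empty Optional.
    static Optional<UnaSeparators> parse(String document) {
        if (document.length() < 9 || !document.startsWith("UNA")) {
            return Optional.empty();
        }
        return Optional.of(new UnaSeparators(document.substring(3, 9)));
    }

    public static void main(String[] args) {
        // Hypothetical document start, used only for illustration.
        UnaSeparators s = parse("UNA:+.? 'UNB+UNOA:1").get();
        System.out.println("element=" + s.element + " segment=" + s.segment);
    }
}
```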

Note:
For an EDI document with a UNA service string advice, the data handler obtains the transaction ID and the DUNS number from the UNA positional-information attribute of the child meta-object. For more information, see the next section.

Obtaining positional information

If the first 3 characters of the EDI document are not "UNA", the data handler assumes that they represent the name of the initial segment. The data handler assumes that the initial segments are part of the header and have names that are exactly three characters in length. If no UNA service string advice exists, the data handler must obtain document separators from the EDI document itself. The data handler continues parsing the EDI document, performing the following tasks:

The positional-information attribute specifies the information with the series of tags that Table 46 shows.

Table 46. EDI document information in the positional-information attribute

EDI document information | Attribute tag | Description | Required?
Segment separator | length | Specifies the length of the first segment (in number of characters), including the segment name but excluding the segment separator. | Yes
Segment count | seg_count | Specifies the relative position of the field that contains the number of segments written to the EDI document (during business-object-to-EDI conversion). For information about the use of this tag, see Converting business objects to EDI documents. | Yes
Composite separator | cs | Provides the relative position of the composite separator. | No
Repeat separator | rs | Provides the relative position of the repeat separator. | No
Transaction identifier | tid | Provides the relative position of the transaction ID. | Yes
DUNS number | duns | Provides the relative position of the DUNS number. | Yes
Version/release number | version | Provides the relative position of the functional group/message version number. | No

As Table 46 indicates, the positional-information attribute must provide values for the seg_count, length, tid, and duns tags and, optionally, the version tag. Specifying values for the cs and rs tags is not required. However, if either of these tags is omitted and the data handler parses EDI documents that contain composites, the data handler uses the precedence shown in Table 47 to obtain a value.

Table 47. Default values for composite and repeat separators

Precedence step | Composite separator | Repeat separator
1. Obtain the value of the corresponding meta-object attribute. | SEPARATOR_COMPOSIT | SEPARATOR_REPEAT
2. If the associated meta-object attribute is not set, use a hard-coded default. | colon (:) | caret (^)
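
The precedence for an omitted cs or rs tag can be sketched as a small lookup. This is an illustration only: the class SeparatorDefaults, the resolve method, and the use of a Map to stand in for the child meta-object are assumptions. The example resolves the repeat separator; the composite separator would follow the same pattern with its own attribute and the colon default.

```java
import java.util.Map;

class SeparatorDefaults {
    // Returns the separator from the positional tag if one was found;
    // otherwise from the meta-object attribute; otherwise the default.
    static char resolve(Character fromTag, Map<String, String> metaObject,
                        String attributeName, char hardCodedDefault) {
        if (fromTag != null) {
            return fromTag;
        }
        String configured = metaObject.get(attributeName);
        if (configured != null && !configured.isEmpty()) {
            return configured.charAt(0);
        }
        return hardCodedDefault;
    }

    public static void main(String[] args) {
        // No rs tag and no SEPARATOR_REPEAT attribute: the caret is used.
        System.out.println(resolve(null, Map.of(), "SEPARATOR_REPEAT", '^'));
        // No rs tag, but the attribute is set: the attribute value is used.
        System.out.println(
            resolve(null, Map.of("SEPARATOR_REPEAT", "~"), "SEPARATOR_REPEAT", '^'));
    }
}
```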

The cs, rs, tid, and duns tags use the following format to indicate relative position within an EDI document:

tagname=seg_name+elem_pos+compos_pos
 

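where:

seg_name is the name of the segment that contains the value.
elem_pos is the position of the element within that segment, counting from 1 after the segment name. For example, tid=ST+1 refers to the first element of the ST segment.
compos_pos is the position of the value within a composite element. The examples in this section omit it.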

The seg_count tag uses the following format to indicate its relative position within an EDI document:

seg_count=seg_name+elem_pos
 

where seg_name and elem_pos are as described above; that is, the seg_count specification never includes a compos_pos value.

Note:
Do not include spaces between the seg_name, elem_pos, and compos_pos values and the plus signs.
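
The tag formats above can be read with a few lines of string handling. The following sketch is not the data handler's parser; the class PositionalInfo and its members are hypothetical, and it assumes that tags are separated by semicolons, as in the sample attribute values in this section. A composite position of 0 is used here to mean that compos_pos was not specified.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PositionalInfo {
    // One relative position: seg_name+elem_pos or seg_name+elem_pos+compos_pos.
    static final class Position {
        final String segName;
        final int elemPos;
        final int composPos; // 0 when no compos_pos is given

        Position(String segName, int elemPos, int composPos) {
            this.segName = segName;
            this.elemPos = elemPos;
            this.composPos = composPos;
        }

        static Position parse(String spec) {
            String[] parts = spec.split("\\+");
            if (parts.length < 2 || parts.length > 3) {
                throw new IllegalArgumentException("Bad position: " + spec);
            }
            int compos = parts.length == 3 ? Integer.parseInt(parts[2]) : 0;
            return new Position(parts[0], Integer.parseInt(parts[1]), compos);
        }
    }

    // Splits "tag=value;tag=value" into an ordered map of tag names to values.
    static Map<String, String> parseTags(String attribute) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (String pair : attribute.split(";")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + pair);
            }
            tags.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return tags;
    }

    public static void main(String[] args) {
        Map<String, String> tags =
            parseTags("length=77;tid=ST+1;duns=ISA+6;seg_count=SE+1");
        Position duns = Position.parse(tags.get("duns"));
        System.out.println(duns.segName + " element " + duns.elemPos); // ISA element 6
    }
}
```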

Figure 31 lists a sample EDI document that uses the X.12 standard. To improve readability of this EDI document, the example inserts newline characters at the end of each segment.

Figure 31. Sample EDI document in X.12 standard


ISA*00*0000000000*02*XXXX*cw*ldtp3*cw*ld*970106*1525*U*00200*0000000100*0*P*<
GS*AA*ldtp3*ld*20010424*1525*142*X*004010
ST*846*001420001
SE*2*001420001
GE*1*142
IEA*1*0000000100
 

To obtain the positional information from an EDI document that follows the X.12 standard, the EDI data handler takes the following steps:

  1. Locate an attribute in the data handler's child meta-object that matches the name of the first segment.

    For the sample EDI document in Figure 31, the child meta-object needs to contain an ISA attribute (because the name of the first segment is ISA).

  2. From this meta-object attribute, obtain the positional information.

    In the current example, the ISA attribute contains the following positional information:

    length=77;tid=ST+1;duns=ISA+6;seg_count=SE+1
     
    

    or, if the version tag is specified in dbfile.txt:

    length=77;tid=ST+1;version=GS+8;duns=ISA+6;seg_count=SE+1
     
    
  3. Continue parsing the first segment of the document to determine the segment separator. The data handler assumes that the segment separator immediately follows the first segment: it reads the number of characters that the length tag specifies and takes the next character as the segment separator.
    Note:
    Because of the algorithm that the data handler uses to locate the segment separator, this separator cannot be an alphanumeric character. In step 2, the length tag specifies a segment length of 77 characters, which indicates that the segment separator in the Figure 31 document is the newline character. Therefore, the data handler interprets each newline character as a segment separator.
  4. Continue parsing the document to determine the transaction ID, based on the tid tag in the positional-information attribute of the child meta-object.

    The Figure 31 document follows the X.12 standard. This example does not include any composites. Therefore, its ISA attribute (in step 2) does not include the composite (cs) or repeat (rs) separators. This attribute does include the tid tag, which specifies that the transaction ID occurs as the first element in the segment named ST; therefore, the transaction ID for the Figure 31 document is 846.

  5. Continue parsing the document to find the version (if the optional version tag is specified in dbfile.txt).

    In Figure 31, the version is the eighth element of the GS segment: 004010.

  6. Parse the document to find the DUNS number. If the data handler cannot find the DUNS number, it logs an error and generates an exception.

    In step 2, the duns tag specifies that the DUNS number occurs as the sixth element in the segment named ISA; therefore, the DUNS number for the Figure 31 document is ldtp3.
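
The steps above can be traced against the Figure 31 document with a short sketch. The class X12HeaderScan and its methods are hypothetical and simplified. In particular, the sketch assumes that the element separator is the character that directly follows the three-character name of the first segment, which this section does not state.

```java
import java.util.regex.Pattern;

class X12HeaderScan {
    // The Figure 31 document, with a newline after every segment.
    static final String DOC = String.join("\n",
            "ISA*00*0000000000*02*XXXX*cw*ldtp3*cw*ld*970106*1525*U*00200*0000000100*0*P*<",
            "GS*AA*ldtp3*ld*20010424*1525*142*X*004010",
            "ST*846*001420001",
            "SE*2*001420001",
            "GE*1*142",
            "IEA*1*0000000100") + "\n";

    // The segment separator is the character that follows the first
    // `length` characters of the document (length=77 for Figure 31).
    static char segmentSeparator(String doc, int length) {
        return doc.charAt(length);
    }

    // Returns element elemPos (1-based, not counting the segment name)
    // of the first segment named segName.
    static String element(String doc, char segSep, char elemSep,
                          String segName, int elemPos) {
        String segPattern = Pattern.quote(String.valueOf(segSep));
        String elemPattern = Pattern.quote(String.valueOf(elemSep));
        for (String segment : doc.split(segPattern)) {
            String[] fields = segment.split(elemPattern, -1);
            if (fields[0].equals(segName) && elemPos < fields.length) {
                return fields[elemPos];
            }
        }
        throw new IllegalStateException(
                "Element " + elemPos + " not found in segment " + segName);
    }

    public static void main(String[] args) {
        char segSep = segmentSeparator(DOC, 77);
        char elemSep = DOC.charAt(3); // assumption: follows the segment name
        System.out.println(element(DOC, segSep, elemSep, "ST", 1));  // tid=ST+1      -> 846
        System.out.println(element(DOC, segSep, elemSep, "GS", 8));  // version=GS+8  -> 004010
        System.out.println(element(DOC, segSep, elemSep, "ISA", 6)); // duns=ISA+6    -> ldtp3
        System.out.println(element(DOC, segSep, elemSep, "SE", 1));  // seg_count=SE+1 -> 2
    }
}
```

The exception thrown when an element is missing mirrors the behavior described in step 6, where a missing DUNS number is treated as an error.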

To obtain the positional information from an EDI document that follows the EDIFACT standard, the data handler takes the same steps as described for parsing an EDI document that follows the X.12 standard. The only major differences are:

The following line is a fragment of an EDI document that follows the EDIFACT standard:

Figure 32. Sample EDI document fragment with composite separator

ST*st_child_value_1*,*st_grand_child_val_11,st_grand_child_val_12^
 st_grand_child_val_13,st_grand_child_val_14*st_child_value_4*
 st_grand_child_val_21,st_grand_child_val_22
 

If the first segment of the document that contains this fragment is named "UNB", the child meta-object contains a UNB attribute that includes the following cs tag:

cs=ST+2;
 

This cs tag specifies that the composite separator occurs as the second element in the segment named ST; therefore, the data handler interprets the comma (,) as the composite separator. Suppose that the EDI document in which this fragment occurs does not specify a repeat separator, so it uses the default value, the caret (^). In that case, the UNB attribute in the child meta-object that this document uses does not need to contain an rs tag. With no rs tag, the data handler assumes that the repeat separator has its default value and interprets each caret (^) that it encounters as the repeat separator.

To define a non-default repeat separator, the EDI document must include the non-default character in a field (usually in the header) and the positional-information attribute in the associated child meta-object must include the rs tag to indicate the location of this field.
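
The effect of the composite and repeat separators on the Figure 32 fragment can be sketched as follows. The class CompositeSplit is hypothetical. The sketch joins the wrapped lines of Figure 32 into one string (an assumption about the figure's layout), takes the asterisk as the element separator, and ignores release characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class CompositeSplit {
    // The Figure 32 fragment as a single line.
    static final String FRAGMENT =
            "ST*st_child_value_1*,*st_grand_child_val_11,st_grand_child_val_12^"
            + "st_grand_child_val_13,st_grand_child_val_14*st_child_value_4*"
            + "st_grand_child_val_21,st_grand_child_val_22";

    // Splits one element into its repeats, and each repeat into the
    // components of a composite.
    static List<List<String>> splitElement(String element, char repeatSep,
                                           char compositeSep) {
        String repeatPattern = Pattern.quote(String.valueOf(repeatSep));
        String compositePattern = Pattern.quote(String.valueOf(compositeSep));
        List<List<String>> repeats = new ArrayList<>();
        for (String repeat : element.split(repeatPattern, -1)) {
            repeats.add(Arrays.asList(repeat.split(compositePattern, -1)));
        }
        return repeats;
    }

    public static void main(String[] args) {
        String[] fields = FRAGMENT.split(Pattern.quote("*"), -1);
        char compositeSep = fields[2].charAt(0); // cs=ST+2 points at the comma
        char repeatSep = '^';                    // default, because no rs tag exists
        System.out.println(splitElement(fields[3], repeatSep, compositeSep));
        // [[st_grand_child_val_11, st_grand_child_val_12], [st_grand_child_val_13, st_grand_child_val_14]]
    }
}
```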

Determining the name of the business object

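A data handler can receive serialized data in one of two ways: together with a business object that the caller supplies, or without a business object, in which case the data handler must create one.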

Note:
If the data handler receives a business object, it skips to the steps described in Populating the business object.

If the data handler does not receive a business object, the data handler must determine the type of the business object to create. The data handler calls the name handler, which takes the following steps:

  1. Open the EDI name-handler lookup file based on the name given in the NameHandlerFile attribute of the child meta-object. This name-handler lookup file must already exist. If this open fails, the name handler generates an exception. For more information, see Creating the name-handler lookup file.
  2. Check if the name-handler lookup file has been modified since the last time it was read. If it has, read the contents into the in-memory name-handler lookup table again.
  3. Based on the transaction ID and the DUNS number (which were determined in the initialization phase), look up the name of the top-level EDI business object that is associated with this EDI document in the name-handler lookup table.

If the lookup of the business object name fails, the data handler logs an error and generates an exception. If the lookup is successful, the data handler creates a business object of the specified type to contain the data.
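
The lookup in step 3 can be pictured as an in-memory table keyed by transaction ID and DUNS number. The following sketch is an assumption about the shape of such a table, not the default name handler's implementation. The class NameLookup and the business object name in the example are hypothetical, and reading the lookup file is omitted because its format is not described in this section.

```java
import java.util.HashMap;
import java.util.Map;

class NameLookup {
    private final Map<String, String> table = new HashMap<>();

    // Adds one entry, as if it had been read from the lookup file.
    void register(String transactionId, String duns, String businessObjectName) {
        table.put(key(transactionId, duns), businessObjectName);
    }

    // Returns the business object name, or fails if no entry matches.
    String lookup(String transactionId, String duns) {
        String name = table.get(key(transactionId, duns));
        if (name == null) {
            throw new IllegalStateException("No business object for transaction "
                    + transactionId + " and DUNS number " + duns);
        }
        return name;
    }

    private static String key(String transactionId, String duns) {
        return transactionId + "|" + duns;
    }

    public static void main(String[] args) {
        NameLookup names = new NameLookup();
        names.register("846", "ldtp3", "Hypothetical_EDI_846"); // illustrative name
        System.out.println(names.lookup("846", "ldtp3"));
    }
}
```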

Note:
These steps describe the behavior of the default name handler that is delivered with the EDI data handler. For information on how to create a custom name handler, see Customizing the EDI data handler.

Populating the business object

Once the EDI separators have been determined and the top-level business object has been created, the data handler takes the following steps to populate it with the serialized data:

  1. If the DefaultVerb meta-object attribute is set, the data handler sets the verb in the business object to the value that DefaultVerb specifies. The delivered value for DefaultVerb is Create. Otherwise, the data handler assumes that no verb needs to be set.
  2. The data handler determines whether the business object contains any child meta-objects (those whose names are listed in the cw_mo_ tag of the business object application-specific information). The data handler does not populate these attributes of the business object. For more information about the cw_mo_ tag, see Implementing conversion from a business object.
  3. The data handler loops through the remaining attributes in the top-level business object definition. Based on the cardinality of each attribute, the data handler determines what part of the EDI document the attribute represents. For more information, see Determining the attribute associated with the EDI data.
  4. Once the data handler identifies the attribute associated with the current EDI data, it can take the appropriate steps to write the EDI data to this attribute. The data handler parses the EDI data based on the separators (which were determined in the initialization phase). For more information, see Parsing the EDI document.

Once the data handler has populated all attributes of the top-level business object, it can perform an optional check to ensure that all the EDI data has been parsed.

Determining the attribute associated with the EDI data

The structure of the business objects that hold EDI data is determined by the EDI document specification. (For information on how to create this business object structure, see Creating business object definitions for EDI documents.) The EDI data handler uses the cardinality of each attribute to determine which part of the EDI document the attribute represents. Based on this cardinality, the data handler takes the following actions:

Parsing the EDI document

The EDI data handler parses information in the EDI document based on the separators it identified in the initialization phase. These separators delimit the individual pieces of data, which the data handler then matches to the appropriate attribute. Table 49 shows the parsing tasks that the data handler performs for the different EDI business objects.

Table 49. Parsing tasks for EDI business objects

Application-specific information | Parsing task
type=header, type=trailer | The data handler finds the position in the business object that corresponds to the next segment that appears in the document, and parses that segment to populate the child business object.
name=segment_name (no type tag) | The data handler assumes that the business object represents a segment and parses the current segment to populate the business object.
type=loop | The name of the first segment contained in the loop should be specified in the application-specific information. The data handler parses the EDI document for these loop segments and adds the data to the business object.
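
The choice among the parsing tasks in Table 49 can be sketched as a simple classification of the application-specific information. The class ParsingTaskSelector is hypothetical, and the sketch assumes that tags in the application-specific information are separated by semicolons, which this section does not state.

```java
class ParsingTaskSelector {
    enum Task { HEADER_OR_TRAILER, SEGMENT, LOOP }

    // Chooses a parsing task from the type and name tags.
    static Task select(String appSpecificInfo) {
        String type = null;
        String name = null;
        for (String pair : appSpecificInfo.split(";")) {
            String[] kv = pair.trim().split("=", 2);
            if (kv.length != 2) {
                continue; // not a tag=value pair
            }
            if (kv[0].equals("type")) {
                type = kv[1];
            } else if (kv[0].equals("name")) {
                name = kv[1];
            }
        }
        if ("header".equals(type) || "trailer".equals(type)) {
            return Task.HEADER_OR_TRAILER;
        }
        if ("loop".equals(type)) {
            return Task.LOOP;
        }
        if (type == null && name != null) {
            return Task.SEGMENT; // name tag with no type tag
        }
        throw new IllegalArgumentException(
                "Unrecognized application-specific information: " + appSpecificInfo);
    }

    public static void main(String[] args) {
        System.out.println(select("type=header")); // HEADER_OR_TRAILER
        System.out.println(select("name=ST"));     // SEGMENT
        System.out.println(select("type=loop"));   // LOOP
    }
}
```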

Copyright IBM Corp. 1997, 2003