HL7 messages

An HL7 message consists of the following data elements.

Message type

An HL7 message type is a unique identifier for the business purpose of a message. Every message must contain a message type ID as a way to announce the purpose of the message. For example, ADT is the message type ID for Patient Administration.

However, the message type does not uniquely classify the structure of a message. One message type can have more than one message structure.

The message type is advertised in the message header segment.

Message event

The message event, sometimes called a trigger, is a unique identifier for the context in which a message is generated. The message event consists of an uppercase letter and two digits. For example, A01 is for admission/visit notification and A61 is for changing consulting doctor. Both A01 and A61 are used with ADT messages.

The event type is advertised in the message header segment.

Message structure

The message structure is a data structure used to express an association of a message type with an event for a class of messages. Each message structure also has a unique ID.

Structurally, it consists of a well-defined list of HL7 segments. Segments can be optional and can repeat. There is no limit on how many times a segment can repeat.

Segments can be aggregated to form a segment group, which can also repeat. In the standard specification, a segment group is indicated by {} or [], where {} signifies repetition and [] signifies optionality.

Because the message structure definition allows {} or [] to enclose a single segment, it is possible to interpret a segment group as consisting of a single segment. For the purpose of discussing the data handler and related ISBOs, however, we reserve the term segment group for an aggregation of two or more segments enclosed by {} or [].

The relative position of segments and segment groups in a message structure is well defined. At the message structure level, the segment is the atomic data type.

Message structures are defined by both message type and event. One message type can be associated with more than one event, but one event can be associated with exactly one message type. Furthermore, some events of a given message type share the same message structure. For example, message type ADT with either event A01 or event A04 uses message structure ADT_A01.
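The association between message type, event, and message structure can be pictured as a lookup table. The following is only a minimal sketch: the names MESSAGE_STRUCTURES and message_structure are hypothetical, and the table holds just the pairs mentioned in this section rather than the full list from the standard.

```python
# Hypothetical lookup table; only pairs named in this section are listed.
MESSAGE_STRUCTURES = {
    ("ADT", "A01"): "ADT_A01",
    ("ADT", "A04"): "ADT_A01",  # A04 shares the structure of A01
    ("ADT", "A61"): "ADT_A61",
}


def message_structure(message_type: str, event: str) -> str:
    """Return the message structure ID for a message type and event pair."""
    try:
        return MESSAGE_STRUCTURES[(message_type, event)]
    except KeyError:
        raise ValueError(f"no structure known for {message_type}^{event}") from None


print(message_structure("ADT", "A04"))  # ADT_A01
```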

Table 46. Message structure of ADT_A61

ADT^A61^ADT_A61 ADT Message
MSH Message header
EVN Event type
[ PD1 ] Additional demographics
[ {ROL} ] Role
[ PV2 ] Patient visit--additional information

Segment

A segment is a well-defined list of attributes, each of an HL7 data type. Every segment starts with a three-character segment ID made up of uppercase letters and digits, for example MSH or PV2. Segment attributes can be optional and can repeat. The maximum number of times an attribute can repeat is specified. Various tables define the valid values of certain attributes.

Note:
The relative position of a segment attribute is significant; it is part of how the segment is defined.

Every segment ends with a segment terminator, the ASCII carriage return character (hex 0D).

HL7 calls a segment attribute a field. The standard designates a delimiter, whose value is defined by the user on a per-message basis, to separate fields.

When a field repeats, the standard designates another user-defined delimiter, whose value is also defined on a per-message basis, to separate the repeating instances. This delimiter is called the repetition delimiter.
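As an illustration of how the field and repetition delimiters partition a segment, here is a minimal sketch. The function name split_segment and the sample ZZZ segment are made up for this example, and the commonly used delimiter values "|" and "~" are assumed as defaults. The MSH segment is deliberately not handled, because its first field is the field separator itself.

```python
def split_segment(segment: str, field_sep: str = "|", rep_sep: str = "~"):
    """Split one segment into its ID and a list of fields.

    Each field is returned as a list of its repeating instances.
    """
    segment = segment.rstrip("\r")  # drop the segment terminator (hex 0D)
    seg_id, *raw_fields = segment.split(field_sep)
    fields = [raw.split(rep_sep) for raw in raw_fields]
    return seg_id, fields


seg_id, fields = split_segment("ZZZ|a|b~c||d\r")
print(seg_id)  # ZZZ
print(fields)  # [['a'], ['b', 'c'], [''], ['d']]
```

A field that is not present comes back as a single empty string, which keeps the positions of the later fields intact.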

Table 47. MSH segment specification

SEQ  LEN  DT   OPT  RP#  TBL#       ITEM#  Element Name
1    1    ST   R                    00001  Field separator
2    4    ST   R                    00002  Encoding characters
3    180  HD   O         0361       00003  Sending application
4    180  HD   O         0362       00004  Sending facility
5    180  HD   O         0361       00005  Receiving application
6    180  HD   O         0362       00006  Receiving facility
7    26   TS   R                    00007  Date/time of message
8    40   ST   O                    00008  Security
9    13   CM   R         0076/0003  00009  Message type
10   20   ST   R                    00010  Message control ID
11   3    PT   R                    00011  Processing ID
12   60   VID  R         0104       00012  Version ID
13   15   NM   O                    00013  Sequence number
14   180  ST   O                    00014  Continuation pointer
15   2    ID   O         0155       00015  Accept acknowledgement type
16   2    ID   O         0155       00016  Application acknowledgement type
17   3    ID   O         0399       00017  Country code
18   16   ID   O    Y    0211       00018  Character set
19   250  CE   O                    00692  Principal language of message
20   20   ID   O         0356       01317  Alternate character set handling
21   10   ID   O    Y    0449       01598  Conformance statement ID
Note:
In Table 47, "LEN" stands for the maximum length in number of characters; "DT" stands for data type; "OPT" stands for optionality; "RP#" stands for the maximum number of times the field can repeat; "TBL#" stands for the ID of the table that contains the finite list of legitimate values for validation; "ITEM#" stands for the unique ID of the data element in the whole standard; "Element Name" stands for a brief descriptive name of the field.

The structures of many segments are statically defined and can be shared by many message structures. There are also some segments that share the same segment ID but have different structures, with attributes that differ in name and data type. QPD, QED, RCP, and QAK belong to this category of segments.

Data types

HL7 defines a long list of data types. Some are defined as primitive types, while others are defined as complex types. A complex data type consists of more than one attribute of a primitive type. HL7 calls the attributes of a complex data type components.

For example, the following HD data type uses three components, namespace ID, universal ID, and universal ID type.

<namespace ID (IS)>^<universal ID (ST)>^<universal ID type (ID)>

The standard designates a delimiter, whose value is defined by the user on a per-message basis, to separate components.

A component of a complex data type can itself be of a complex data type. HL7 calls the attributes of such a nested complex data type subcomponents. The nested complex data type must have only attributes of primitive type.

The standard also designates a delimiter to separate subcomponents, whose value is likewise defined by the user on a per-message basis.

When a data type is used as a component of another data type, its component delimiter is demoted to the subcomponent delimiter.
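The two-level nesting of components and subcomponents can be sketched as two successive splits. This is an illustrative sketch only: split_field is a hypothetical name, and the commonly used delimiter values "^" and "&" are assumed as defaults. The sample value is the OBX-3 content that appears in Figure 5 later in this document.

```python
def split_field(value: str, component_sep: str = "^", subcomponent_sep: str = "&"):
    """Split one field instance into components, and each component into subcomponents."""
    return [
        component.split(subcomponent_sep)
        for component in value.split(component_sep)
    ]


print(split_field("5&WAV^^99SVL"))  # [['5', 'WAV'], [''], ['99SVL']]
```

The first component holds two subcomponents separated by the demoted delimiter, the second component is not present, and the third is a single primitive value.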

HL7 also designates an escape character for escaping characters in the data that are identical to any of the delimiters. The escape character is defined by the user on a per-message basis.

Most data types are statically specified as distinct data structures in the standard. A few data types, notably the CM and * data types, exhibit dynamic behavior and warrant special attention.

Custom made data types

CM is also called the composite data type. It is a custom construction from previously defined data types, made for each unique situation: in a different segment, or in a different field of the same segment. A number of data structures are associated with the CM data type.

The data structures for the CM types, which can differ widely from one another, are fixed at the time the segments they belong to are defined. The structure of a CM data type is therefore known at design time.

In earlier versions of the HL7 standard, CM was meant to be a custom-made data type, defined locally at the deployment site. For this reason, the CM data type could represent an endless number of structures built from different combinations and permutations of already defined data types.

Version 2.4 of HL7 has only a finite number of data structures that are labeled with the CM data type.

Polymorphic data type

This is the data type that is marked with the * symbol or with the "varies" data type label, instead of the usual two or three capital letters.

A field of this data type can take any one of the already defined data types, and its actual data type is declared elsewhere. For example, OBX-2 declares the data type of OBX-5, which is the actual data carrier of the polymorphic data type. For ease of discussion, we call the field that announces the data type the data type announcer, and the field that contains the actual data the data carrier.

The data type announcer and the data carrier can coexist in the same segment, or they can reside in different segments. For example, in the OBX segment the OBX-2 field is the data type announcer and the OBX-5 field is the data carrier. The RDF segment contains a field of the RCD data type, and each instance of the RCD field acts as a data type announcer for a data carrier in the RDT segment. The relationship between the RDF and RDT segments is analogous to the relationship between a relational database's table description metadata and its rows of actual data.

When the data type announcer and the data carrier reside in the same segment, their relative position can vary: the data type announcer can precede the data carrier, or the other way around. Section 8.5.2.4 in chapter 8 contains an example in which the data type announcer MFE-5 trails the data carrier MFE-4.

There is no fixed number of fields separating the data type announcer from the data carrier. Their relationship is shown in the following diagram.

Figure 4. Data carrier and data type announcer relationship in polymorphic data type, where "..." represents other fields in the segment.

The range of the variant data type is known and finite, even though it is polymorphic.
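For the OBX case, where the announcer and the carrier sit in the same segment, the relationship can be sketched as a dispatch on the announced type. This is a minimal sketch, not the adapter's implementation: the CONVERTERS table, the function name read_obx_value, and the sample segment are hypothetical, only two data types are covered, and "|" is assumed as the field delimiter.

```python
from decimal import Decimal

# Hypothetical converters, keyed by the data type code found in OBX-2.
CONVERTERS = {
    "NM": Decimal,  # numeric
    "ST": str,      # string
}


def read_obx_value(segment: str, field_sep: str = "|"):
    """Convert OBX-5 (data carrier) according to OBX-2 (data type announcer)."""
    fields = segment.rstrip("\r").split(field_sep)
    announced_type = fields[2]  # OBX-2
    raw_value = fields[5]       # OBX-5
    try:
        convert = CONVERTERS[announced_type]
    except KeyError:
        raise ValueError(f"unsupported data type {announced_type!r}") from None
    return announced_type, convert(raw_value)


print(read_obx_value("OBX|1|NM|code||37.5"))  # ('NM', Decimal('37.5'))
```

Because the range of possible types is known and finite, a fixed table of converters is enough; an unknown code is reported rather than guessed.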

Delimiter and enhanced data model

The delimiters for fields, repetitions, components, and subcomponents are all user-defined. They are announced in the message header of each individual message.
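Reading the delimiters from the message header can be sketched as follows. The sketch assumes the layout given in Table 47: MSH-1 is the single character that follows the segment ID, and MSH-2 holds four encoding characters, taken here in the order component, repetition, escape, subcomponent. The names Delimiters and read_delimiters, and the sample header, are hypothetical.

```python
from typing import NamedTuple


class Delimiters(NamedTuple):
    field: str
    component: str
    repetition: str
    escape: str
    subcomponent: str


def read_delimiters(msh_segment: str) -> Delimiters:
    """Read the per-message delimiters from the start of an MSH segment."""
    if not msh_segment.startswith("MSH") or len(msh_segment) < 8:
        raise ValueError("not a usable MSH segment")
    field = msh_segment[3]  # MSH-1, the field separator
    # MSH-2, the four encoding characters (order assumed as stated above)
    component, repetition, escape, subcomponent = msh_segment[4:8]
    return Delimiters(field, component, repetition, escape, subcomponent)


print(read_delimiters("MSH|^~\\&|SENDER"))
```

Every later split of the message would then use these values instead of hard-coded characters.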

The segment delimiter is defined as the ASCII carriage return (hex 0D). When data contains characters identical to any of these delimiters, an escape sequence is used to mark the character.

Component delimiters can be demoted to subcomponent delimiters when a data type is used in another data type's components. The use of repetition delimiters in most data elements is contextual, depending on the nature of the data element, with certain exceptions.

The repetition delimiter is used between repeating instances of a segment field. The character commonly used is "~". Components of a data type are generally not expected to repeat, except in the MA, NA, and QIP data types.

NA is defined as component:

<value1> ^ <value2> ^ <value3> ^ <value4> ^ ...
 

The data type of each of its components is NM. NA is used to express an array of numbers. The "..." signifies that NA can in theory contain infinitely many values. It therefore represents a repetition of components, and the repetition delimiter is the component delimiter. This is illustrated in the following example.

Figure 5. The area in bold is an example of how NA is used in the OBX segment.

OBX|3|NA|5&WAV^^99SVL|1|0^1^2^3^4^5^6^7^8^7^6^5^4^3^2^1^0^-1^-2^-3^-4^-5^-6^-7^-8||||||F|...<cr>
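Reading the NA value in Figure 5 amounts to one split on the component delimiter. The following is a minimal sketch, with the hypothetical function name parse_na and the commonly used delimiter "^" assumed; the trailing "...<cr>" of the figure is left out of the sample.

```python
def parse_na(value: str, component_sep: str = "^") -> list:
    """Parse an NA value into a list of numbers, one per repeating component."""
    return [float(part) for part in value.split(component_sep)]


obx = ("OBX|3|NA|5&WAV^^99SVL|1|"
       "0^1^2^3^4^5^6^7^8^7^6^5^4^3^2^1^0^-1^-2^-3^-4^-5^-6^-7^-8||||||F|")
samples = parse_na(obx.split("|")[5])  # OBX-5 carries the array
print(len(samples))  # 25
print(samples[:3])   # [0.0, 1.0, 2.0]
```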

MA is defined as component:

<sample 1 from channel 1 (NM)> ^ <sample 1 from channel 2 (NM)> ^ <sample 1 from channel 3 (NM)> ... ~
<sample 2 from channel 1 (NM)> ^ <sample 2 from channel 2 (NM)> ^ <sample 2 from channel 3 (NM)> ... ~ ...

The pattern in this structure is that a cluster of components, delineated by the component delimiter "^", repeats, and the repetitions are delineated by the repetition delimiter "~".

This definition example shows three channels per component cluster. It is possible to have more channels per cluster, depending on the implementation requirements.

Figure 6. The area in bold is an example of how MA is used in the OBX segment

OBX|3|MA|5&WAV^^99SVL|1|0^1^2^3^4^5^6^7^8^7^6^5^4^3^2^1^0^-1^-2^-3^-4^-5^-6^-7^-8~0^1^2^3^4^5^6^7^8^7^6^5^4^3^2^1^0^-1^-2^-3^-4^-5^-6^-7^-8~0^1^2^3^4^5^6^7^8^7^6^5^4^3^2^1^0^-1^-2^-3^-4^-5^-6^-7^-8||||||F|...<cr>

Unlike NA, the size of the component cluster is finite.
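An MA value can be read with two nested splits: first on the repetition delimiter to obtain the component clusters, then on the component delimiter to obtain the channel values. The sketch below uses the hypothetical name parse_ma and a short made-up value. It also assumes that every cluster must contain the same number of channels, which is one reading of the finite cluster size noted above.

```python
def parse_ma(value: str, component_sep: str = "^", repetition_sep: str = "~") -> list:
    """Parse an MA value into a list of samples, each a list of channel values."""
    clusters = [
        [float(part) for part in cluster.split(component_sep)]
        for cluster in value.split(repetition_sep)
    ]
    # Assumption: all clusters carry the same number of channels.
    if len({len(cluster) for cluster in clusters}) > 1:
        raise ValueError("component clusters differ in size")
    return clusters


print(parse_ma("1^2^3~4^5^6"))  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
```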

QIP is defined as component:

<segment field name (ST)> ^ <value1 (ST) & value2 (ST) & value3 (ST) ...>

The definition of this data type has two components, the first being a conventional component, and the second having infinitely repeatable subcomponents.

Figure 7. The areas in bold are the repeating subcomponents of QIP

|@PID.5.1^EVANS&Param2&Param3|
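The QIP value in Figure 7 can be taken apart as one conventional component followed by a list of repeating subcomponents. This is a minimal sketch with the hypothetical name parse_qip and the commonly used delimiters "^" and "&" assumed.

```python
def parse_qip(value: str, component_sep: str = "^", subcomponent_sep: str = "&"):
    """Split a QIP value into its field name and its repeating values."""
    field_name, _, values = value.partition(component_sep)
    return field_name, values.split(subcomponent_sep) if values else []


print(parse_qip("@PID.5.1^EVANS&Param2&Param3"))
# ('@PID.5.1', ['EVANS', 'Param2', 'Param3'])
```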

As shown, there are three kinds of repetition in data type elements. NA has repeatable components; QIP has repeatable subcomponents; and MA has repeatable custom-defined component clusters. Each of these repetitions uses a different delimiter.

Assuming that a component can repeat with the component delimiter, applying the component delimiter demotion rule to component repetition reclassifies these repetitions: subcomponent repetition can be considered to be in the same category as component repetition.

To arrive at a unified data model of HL7 data elements, the traditional HL7 data model has to be expanded. The enhanced data model allows the repetition of components, where the older data model does not.

The enhanced model adds a data element called the component cluster to the data type element. Furthermore, both the component and the component cluster data elements are allowed to repeat.

The following diagram depicts the hierarchy of the data elements in this enhanced data model.

Figure 8. Data element hierarchy in the enhanced data model

The following table summarizes the delimiter for each data element.

Table 48. Delimiters for each data element

Data element       Repeatable  Delimiter                                Description
Data type          Yes         Field repetition delimiter (commonly ~)  Delineates repeating instances of the data type element.
Component cluster  Yes         Field repetition delimiter (commonly ~)  Delineates repeating instances of the component cluster. The data type that contains the component cluster must not itself be repeatable.
Component          Yes         Component delimiter (commonly ^)         Delineates repeating instances of the component.
Subcomponent       Yes         Subcomponent delimiter (commonly &)      Delineates repeating instances of the subcomponent.

Internationalization

HL7 allows messages to be encoded using various character sets and encodings, on a per-message basis.

The following table details the supported character sets.

Table 49. List of character set identifiers and canonical names used in the adapter.

HL7 character set  Canonical name            Description
ASCII              (Regular ASCII7)          The printable 7-bit ASCII character set. (This is the default if this field is omitted.)
8859/1             ISO8859_1                 The printable characters from the ISO 8859/1 character set.
8859/2             ISO8859_2                 The printable characters from the ISO 8859/2 character set.
8859/3             ISO8859_3                 The printable characters from the ISO 8859/3 character set.
8859/4             ISO8859_4                 The printable characters from the ISO 8859/4 character set.
8859/5             ISO8859_5                 The printable characters from the ISO 8859/5 character set.
8859/6             ISO8859_6                 The printable characters from the ISO 8859/6 character set.
8859/7             ISO8859_7                 The printable characters from the ISO 8859/7 character set.
8859/8             ISO8859_8                 The printable characters from the ISO 8859/8 character set.
8859/9             ISO8859_9                 The printable characters from the ISO 8859/9 character set.
ISO IR14           JIS0201                   Code for Information Exchange (one byte) (JIS X 0201-1976). Note that the code contains a space, i.e. "ISO IR14".
ISO IR87           JIS0208                   Code for the Japanese graphic character set for information exchange (JIS X 0208 1990).
ISO IR159          JIS0212                   Code for the supplementary Japanese graphic character set for information exchange (JIS X 0212 1990).
UNICODE            UTF-8 (same as ISO10646)  The world wide character standard from ISO/IEC 10646-1-1993.
ISO-IR xxxx                                  Other character sets, following the naming convention laid out in ISO 2375. This is just a different representation of the same character set IDs listed above.

The "Canonical name" column gives the corresponding canonical name supported by the BIA_Healthcare Adapter.

HL7 defines the default character set as a single byte character set which should always have ISO IR6 (ISO 646) or ISO IR14 (JIS X 0201 1976) in the G0 area.

Escape sequence for multi-character set data

A message can be encoded using more than one character set and encoding scheme. An escape sequence is used to flag a switch to a different character set, as well as the return to the default character set. The following example shows the data pattern when an escape sequence is used for data encoded in an alternate character set.

<Escape sequence> 
    <encoded data using alternate character set>
 

Returning from the alternate character set to the default character set uses the same technique, with the escape sequence of the default character set.

<Escape sequence> 
   <encoded data using alternate character set> <Default escape sequence>
 

ISO 2022-1994 defines how the escape sequence is structured. In the ISO 2022-1994 standard, the escape sequence is a sequence of bit patterns used to identify a character set. ISO 2022-1994 uses the decimal xx/yy notation when expressing escape sequences. This document expresses the bit sequence of an escape sequence as a sequence of bytes in hexadecimal notation. The following quote from the HL7 standard specification describes where and when to use the escape sequence.

"Each repetition of a PN, XPN, XON, XCN, or XAD field is assumed to begin with the default character set. If another character set is to be used, the HL7 defined escape sequence used to announce that character set must be at the beginning of the repetition, and the HL7 defined escape sequence used to start the default character set must be at the end of the repetition. Note also that several character sets may be intermixed within a single repetition as long as the repetition ends with a return to the default character set."

Modes of escape sequence for the multi-character set

HL7 provides two modes of specifying and handling escape sequences: ISO 2022 mode and HL7 defined mode.

Both modes employ the same ISO 2022-1994 technique for specifying the escape sequence. They differ in how the escape character is represented in the escape sequence.

ISO 2022 mode escape sequences

This mode uses the ASCII escape character (ESC) instead of the HL7 defined escape character. The following table provides the list of escape sequences for character sets supported by HL7.

Table 50.

Escape sequence Character set used in HL7
ESC 2842 ISO-IR6 G0 or ASCII (ISO 646 : ASCII)
ESC 2D41 ISO-IR100 (ISO 8859 : Latin Alphabet 1)
ESC 2D42 ISO-IR101 (ISO 8859 : Latin Alphabet 2)
ESC 2D43 ISO-IR109 (ISO 8859 : Latin Alphabet 3)
ESC 2D44 ISO-IR110 (ISO 8859 : Latin Alphabet 4)
ESC 2D4C ISO-IR144 or 8859/5 (ISO 8859 : Cyrillic)
ESC 2D47 ISO-IR127 or 8859/6 (ISO 8859 : Arabic)
ESC 2D46 ISO-IR126 or 8859/7 (ISO 8859 : Greek)
ESC 2D48 ISO-IR138 or 8859/8 (ISO 8859 : Hebrew)
ESC 2D4D ISO-IR148 or 8859/9 (ISO 8859 : Latin Alphabet 5)
ESC 284A ISO-IR14 (JIS X 0201 -1976: Romaji)
ESC 2949 ISO-IR13 (JIS X 0201 : Katakana)
ESC 2442 ISO-IR87 (JIS X 0208 : Kanji, hiragana and katakana)
ESC 242844 ISO-IR159 (JIS X 0212 : Supplementary Kanji)

Escape sequences in HL7 defined mode

When another character set is to be used, the HL7 defined escape sequence must be used at the beginning of the repetition, and the HL7 defined escape sequence used to start the default character set must be at the end of the repetition.

The escape sequence consists of the escape character, followed by an escape code ID of one character, zero or more data characters, and another occurrence of the escape character. The escape character is defined by the user in the message.

The following table lists the escape sequence for each character set using "\" as the HL7 defined escape character.

Table 51. List of escape sequences for HL7-supported character sets

Escape sequence Character set used in HL7
\C2842\ ISO-IR6 G0 or ASCII (ISO 646 : ASCII)
\C2D41\ ISO-IR100 (ISO 8859 : Latin Alphabet 1)
\C2D42\ ISO-IR101 (ISO 8859 : Latin Alphabet 2)
\C2D43\ ISO-IR109 (ISO 8859 : Latin Alphabet 3)
\C2D44\ ISO-IR110 (ISO 8859 : Latin Alphabet 4)
\C2D4C\ ISO-IR144 or 8859/5 (ISO 8859 : Cyrillic)
\C2D47\ ISO-IR127 or 8859/6 (ISO 8859 : Arabic)
\C2D46\ ISO-IR126 or 8859/7 (ISO 8859 : Greek)
\C2D48\ ISO-IR138 or 8859/8 (ISO 8859 : Hebrew)
\C2D4D\ ISO-IR148 or 8859/9 (ISO 8859 : Latin Alphabet 5)
\C284A\ ISO-IR14 (JIS X 0201 -1976: Romaji)
\C2949\ ISO-IR13 (JIS X 0201 : Katakana)
\M2442\ ISO-IR87 (JIS X 0208 : Kanji, hiragana and katakana)
\M242844\ ISO-IR159 (JIS X 0212 : Supplementary Kanji)
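Table 51 lends itself to a simple lookup when scanning a field for character set switches. The sketch below is illustrative only: the names HL7_CHARSET_ESCAPES and charset_switches are hypothetical, "\" is assumed to be the escape character defined in the message, and the text "abc" in the sample merely stands in for data encoded in the alternate character set.

```python
import re

# Escape code (without the surrounding escape characters) to character set,
# copied from Table 51.
HL7_CHARSET_ESCAPES = {
    "C2842": "ISO-IR6",
    "C2D41": "ISO-IR100",
    "C2D42": "ISO-IR101",
    "C2D43": "ISO-IR109",
    "C2D44": "ISO-IR110",
    "C2D4C": "ISO-IR144",
    "C2D47": "ISO-IR127",
    "C2D46": "ISO-IR126",
    "C2D48": "ISO-IR138",
    "C2D4D": "ISO-IR148",
    "C284A": "ISO-IR14",
    "C2949": "ISO-IR13",
    "M2442": "ISO-IR87",
    "M242844": "ISO-IR159",
}

# Assumes "\" as the HL7 defined escape character.
ESCAPE_PATTERN = re.compile(r"\\([CM][0-9A-F]+)\\")


def charset_switches(text: str) -> list:
    """List the character sets announced by escape sequences, in order of appearance."""
    return [
        HL7_CHARSET_ESCAPES.get(code, "unknown")
        for code in ESCAPE_PATTERN.findall(text)
    ]


print(charset_switches("\\C2D46\\abc\\C2842\\"))  # ['ISO-IR126', 'ISO-IR6']
```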

Hexadecimal escape sequences and local sequence

HL7 permits the transmission of binary data encoded in the form \Xdddd...\, where each dd denotes a byte value written as a pair of hexadecimal digits, using the ASCII characters 0-9 and A-F. This is the hexadecimal escape sequence.
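Extracting the binary payload of hexadecimal escape sequences can be sketched with a regular expression. This is a minimal sketch under stated assumptions: "\" is taken as the escape character, the name hex_payloads is hypothetical, and the sample text is made up.

```python
import re

# One or more pairs of hexadecimal digits between \X and the closing escape character.
HEX_ESCAPE = re.compile(r"\\X((?:[0-9A-Fa-f]{2})+)\\")


def hex_payloads(text: str) -> list:
    """Return the bytes carried by each hexadecimal escape sequence in the text."""
    return [bytes.fromhex(digits) for digits in HEX_ESCAPE.findall(text)]


print(hex_payloads("abc\\X0D0A\\def"))  # [b'\r\n']
```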

Alternatively, upon mutual agreement between the parties engaged in HL7 communication, users can also encode their data using a custom escape sequence of the form \Zdddd...\, where the dddd are valid characters permitted in the TX data type.

Other encoding schemes

Besides character set encoding, other encoding schemes are used in the HL7 message standard.

Sometimes messages refer to data in other systems. HL7 lets the user access that data in two ways: by reference and by value. When data is passed by reference, the Reference Pointer (RP) data type lets the receiving system locate the referenced data without the data being physically transferred across the wire.

When the receiving system needs an actual copy of the data, the data must be transferred to the receiving system. Often this data comes from a third-party application and does not follow the HL7 message construction rules. A special encoding scheme is required to fit this kind of data into an HL7 message.

There are two ways to encode these special kinds of data: Hex and Base64.

The following table is an exact copy of HL7 Table 0290 for the value to ASCII lookup.

Table 52. HL7 table 0290 for the binary to ASCII value in base 64 MIME encoding scheme

Value Code   Value Code   Value Code   Value Code
0     A      17    R      34    i      51    z
1     B      18    S      35    j      52    0
2     C      19    T      36    k      53    1
3     D      20    U      37    l      54    2
4     E      21    V      38    m      55    3
5     F      22    W      39    n      56    4
6     G      23    X      40    o      57    5
7     H      24    Y      41    p      58    6
8     I      25    Z      42    q      59    7
9     J      26    a      43    r      60    8
10    K      27    b      44    s      61    9
11    L      28    c      45    t      62    +
12    M      29    d      46    u      63    /
13    N      30    e      47    v
14    O      31    f      48    w      (pad) =
15    P      32    g      49    x
16    Q      33    h      50    y
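The value to code lookup can be sketched as follows: every three bytes of input are cut into four 6-bit values, and each value is replaced by its code from the table. The names BASE64_CODES and encode_base64 are hypothetical, and the sketch is meant only to show how the table is used; the standard library's base64 module does the same job.

```python
import string

# Codes for values 0 through 63, in table order: A-Z, a-z, 0-9, +, /
BASE64_CODES = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_base64(data: bytes) -> str:
    """Encode bytes by looking up each 6-bit value in the code table."""
    out = []
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        bits = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        codes = [BASE64_CODES[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Output positions that hold no input bits become the pad character.
        keep = len(chunk) + 1
        out.append("".join(codes[:keep]) + "=" * (4 - keep))
    return "".join(out)


print(BASE64_CODES[3], BASE64_CODES[54])  # D 2
print(encode_base64(b"HL7"))              # SEw3
```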

Presentation data

HL7 also allows the exchange of Formatted Text (FT) data intended for display rendering. It is often used to transmit reports meant to be read by humans rather than machines.

The presentation data contains embedded presentation instructions. As in HTML, a group of predefined tags instructs the receiving HL7 application how to render the data. Although the presentation instructions are well defined, the locations at which they are embedded in the FT string are driven entirely by the design of the report and are therefore unknown ahead of time.

Message construction rules

Construct the segments in the order defined for the message. Each segment is constructed as follows:

  1. The first three characters are the segment ID code
  2. Each data field in the sequence is inserted in the segment in the following manner:
    1. A field separator is placed in the segment
    2. If the value is not present, no further characters are required
    3. If the value is present, but null, the characters "" (two consecutive double quotation marks) are placed in the field
    4. Otherwise, place the characters of the value in the segment. As many characters can be included as the maximum defined for the data field. It is not necessary, and is undesirable, to pad fields to fixed lengths, although padding to fixed lengths is permitted
    5. If the field definition calls for a field to be broken into components, the following rules are used
      1. If more than one component is included, they are separated by the component separator
      2. Components that are present but null are represented by the characters ""
      3. Components that are not present are treated by including no characters in the component
      4. Components that are not present at the end of a field need not be represented by component separators. For example, the two data fields are equivalent:
        |ABC^DEF^^| and |ABC^DEF|
         
        
    6. If the component definition calls for a component to be broken into subcomponents, the following rules are used:
      1. If more than one subcomponent is included, they are separated by the subcomponent separator
      2. Subcomponents that are present but null are represented by the characters ""
      3. Subcomponents that are not present are treated by including no characters in the subcomponent
      4. Subcomponents that are not present at the end of a component need not be represented by subcomponent separators. For example, the two data components are equivalent:
        ^XXX&YYY&&^ and ^XXX&YYY^
         
        
    7. If the field definition permits repetition of a field, the repetition separator is used only if more than one occurrence is transmitted. In such a case, the repetition separator is placed between occurrences. (If three occurrences are transmitted, two repetition separators are used.)
      In the example below, two occurrences of telephone number are being sent:
      |234-7120~599-1288B1234|
       
      

    Repeat step 2 while there are any fields present to be sent. If all the data fields remaining in the segment definition are not present, there is no requirement to include any more delimiters.
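The construction rules above can be sketched for fields and components as follows. This is a simplified sketch, not the adapter's implementation: build_segment, NULL, and the ZZZ segment are hypothetical names, None stands for "not present", subcomponents and repetitions are left out, and the commonly used delimiters "|" and "^" are assumed.

```python
NULL = '""'  # a value that is present but null


def _join_trimmed(parts, separator):
    """Join parts, writing nothing for None and dropping trailing absent parts."""
    texts = ["" if part is None else part for part in parts]
    while texts and texts[-1] == "":
        texts.pop()
    return separator.join(texts)


def build_segment(segment_id, fields, field_sep="|", component_sep="^"):
    """Build one segment; each field is None, a string, or a list of components."""
    rendered = []
    for field in fields:
        if isinstance(field, (list, tuple)):
            rendered.append(_join_trimmed(field, component_sep))
        else:
            rendered.append("" if field is None else field)
    body = _join_trimmed(rendered, field_sep)
    return segment_id + (field_sep + body if body else "")


print(build_segment("ZZZ", [["ABC", "DEF", None, None], NULL, None]))
# ZZZ|ABC^DEF|""
```

Trailing components and trailing fields that are not present produce no delimiters, while a null value is written as two double quotation marks.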

The following rules apply to receiving HL7 messages and converting their contents to data values:

  1. Ignore segments, fields, components, subcomponents, and extra repetitions of a field that are present but were not expected
  2. Treat segments that were expected but are not present as consisting entirely of fields that are not present
  3. Treat fields and components that are expected but were not included in a segment as not present
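On the receiving side, the rules above can be sketched for the fields of a single segment. The function name read_fields and the sample ZZZ segments are hypothetical, "|" is assumed as the field delimiter, and None is used to mark a field that is not present.

```python
def read_fields(segment: str, expected_count: int, field_sep: str = "|") -> list:
    """Return exactly expected_count field values, with None for 'not present'."""
    _, *values = segment.rstrip("\r").split(field_sep)
    values = values[:expected_count]                 # ignore unexpected extra fields
    values += [""] * (expected_count - len(values))  # missing fields are not present
    return [value if value != "" else None for value in values]


print(read_fields("ZZZ|a||c|extra", 3))  # ['a', None, 'c']
print(read_fields("ZZZ|a", 3))           # ['a', None, None]
```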

Copyright IBM Corp. 1997, 2003