You must complete a number of offline tasks to create support for a new character set on the online system.
Character sets are chosen on the basis of the kinds of letters and symbols required. Once these requirements are understood, you can choose the appropriate character sets. To learn more about character sets, see Character Data Representation Architecture Reference and Registry.
A character set is referred to by a number called a coded character set identifier (CCSID). The TPF system contains characters in one or more CCSIDs (or TPFCCSIDs).
When the CCSID of the TPF system matches that of another system, no translation is necessary. If the character sets are different, a translation mechanism must exist to transform each character from the remote CCSID to the corresponding character of the TPF system CCSID.
Suppose the TPF system uses CCSID 500 and the RS/6000 uses CCSID 819. To communicate with a single-byte TPF system that uses CCSID 500, translations between 819 and 500 must be available: 819 to 500 for the TPF system and 500 to 819 for the remote server.
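As a rough illustration of what this pair of translations does, the following Python sketch uses Python's built-in `cp500` and `latin-1` codecs as stand-ins for CCSID 500 and CCSID 819. It is not TPF code, and the function names are hypothetical; it only shows the direction of each translation.

```python
# Stand-ins (assumption): Python codec "cp500" for CCSID 500 and
# "latin-1" (ISO 8859-1) for CCSID 819. Function names are hypothetical.
TPF_CODEC = "cp500"
REMOTE_CODEC = "latin-1"


def remote_to_tpf(data: bytes) -> bytes:
    """Translate 819 to 500: the direction the TPF system needs."""
    return data.decode(REMOTE_CODEC).encode(TPF_CODEC)


def tpf_to_remote(data: bytes) -> bytes:
    """Translate 500 to 819: the direction the remote server needs."""
    return data.decode(TPF_CODEC).encode(REMOTE_CODEC)


if __name__ == "__main__":
    inbound = b"Hello"                # bytes as sent by the CCSID 819 system
    on_tpf = remote_to_tpf(inbound)   # the same text encoded in CCSID 500
    print(on_tpf.hex())
    print(tpf_to_remote(on_tpf))
```

Each function handles one direction only, which mirrors the point that each side needs its own table.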
CCSIDs that IBM specifies are in the CSNM table. Code set name table CSNM is part of the DLM CSNM. This table joins aspects of CCSIDs through calls to the CSNMC macro. The code set name for a CCSID is the name for the CCSID in the Character Data Representation Architecture Reference and Registry, the CDRA code set registry. The CCSID itself is often the numerical part of the registry name. Each CCSID has a two-letter code associated with it. This code is used with other two-letter codes to specify a translation. The STYLE parameter of the CSNMC macro indicates whether the CCSID is single-byte (S), double-byte (D), or mixed-byte (M). The SINGLE and DOUBLE parameters are required for mixed-byte CCSIDs. These parameters indicate the CCSIDs for the single-byte and double-byte components of a mixed-byte CCSID.
Figure 56. An Example of the CSNM Table of CCSIDs
    CSNMC CODESET='IBM-037',CODE='EA',CCSID=037,STYLE=S
    CSNMC CODESET='IBM-284',CODE='EJ',CCSID=284,STYLE=S
    CSNMC CODESET='IBM-500',CODE='EO',CCSID=500,STYLE=S
    CSNMC CODESET='ISO8859-1',CODE='I1',CCSID=819,STYLE=S
    CSNMC CODESET='IBM-850',CODE='AA',CCSID=850,STYLE=S
    CSNMC CODESET='IBM-1027',CODE='EX',CCSID=1027,STYLE=S
    CSNMC CODESET='IBM-1047',CODE='EY',CCSID=1047,STYLE=S
    CSNMC CODESET='IBM-300',CODE='EN',CCSID=300,STYLE=D
    CSNMC CODESET='IBM-300',CODE='EN',CCSID=4396,STYLE=D
    CSNMC CODESET='IBM-930',CODE='EU',CCSID=930,STYLE=M,
          SINGLE=290,DOUBLE=300
    CSNMC CODESET='IBM-932',CODE='AB',CCSID=932,STYLE=M,
          SINGLE=897,DOUBLE=301
    CSNMC CODESET='IBM-939',CODE='EV',CCSID=5035,STYLE=M,
          SINGLE=1027,DOUBLE=300
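The parameter rules described above (STYLE is S, D, or M; SINGLE and DOUBLE are required for mixed-byte CCSIDs) can be checked offline before the table is assembled. The Python sketch below is only an illustration under that assumption: `parse_csnmc` and `check_entry` are hypothetical helper names, not part of the TPF system, and each statement is assumed to have been joined onto a single line first.

```python
import re

# One CSNMC statement per string; text copied from the CSNM example,
# with continuation lines joined (an assumption of this sketch).
STATEMENTS = [
    "CSNMC CODESET='IBM-500',CODE='EO',CCSID=500,STYLE=S",
    "CSNMC CODESET='IBM-930',CODE='EU',CCSID=930,STYLE=M,SINGLE=290,DOUBLE=300",
]

# KEYWORD=value, where the value is either quoted or runs to the next comma.
PARAM = re.compile(r"(\w+)=('[^']*'|[^,\s]+)")


def parse_csnmc(statement):
    """Return the keyword parameters of one CSNMC statement as a dict."""
    if not statement.lstrip().startswith("CSNMC"):
        raise ValueError("not a CSNMC statement: %r" % statement)
    return {key: value.strip("'") for key, value in PARAM.findall(statement)}


def check_entry(params):
    """List the problems found in one parsed entry (empty list if none)."""
    problems = []
    style = params.get("STYLE")
    if style not in ("S", "D", "M"):
        problems.append("STYLE must be S, D, or M")
    if style == "M":
        for key in ("SINGLE", "DOUBLE"):
            if key not in params:
                problems.append(key + " is required when STYLE=M")
    return problems


if __name__ == "__main__":
    for statement in STATEMENTS:
        entry = parse_csnmc(statement)
        print(entry["CCSID"], entry["CODE"], check_entry(entry) or "ok")
```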
Choosing CCSID 500 for the TPF system and CCSID 819 for the RS/6000 system, for example, pairs the two-letter codes EO (500) and I1 (819). A translation table named EOI1 would translate 500 to 819. However, translation is done on inbound data and no translation is done on outbound data, so the table that the TPF system needs is I1EO (819 to 500).
Changes to the CSNM table are not needed unless you are adding a CCSID for a code page that is entirely new.
Translations are defined using the CSNM table by specifying the code set registry name, the two-letter code, the CCSID, and the STYLE parameter. A translation table is required that corresponds to this CCSID specification.
The tables for many CCSIDs are shipped as part of the IBM C language support product. The name of the code page for a character set consists of the product identifier, EDCU, followed by the two-letter code of the character set. Two code page tables are joined to create a translation table. The names of the translation tables provided by IBM C language are of the form EDCUxxyy, where xx is the two-letter code of the source code page and yy is the two-letter code of the target code page. For a table that translates inbound data, xx is the remote code and yy is the TPF system code (for example, EDCUI1EO). These tables are used as input to the GENXLT program.
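The naming scheme can be sketched in a few lines of Python. This is an illustration only: the dictionary repeats a few two-letter codes from the CSNM example, and the function names are hypothetical.

```python
# Two-letter codes copied from the CSNM example (Figure 56).
CODE_BY_CCSID = {37: "EA", 284: "EJ", 500: "EO", 819: "I1", 850: "AA", 1047: "EY"}

LE_PREFIX = "EDCU"  # product identifier used in the IBM C language table names


def table_code(source_ccsid, target_ccsid):
    """Join the source and target two-letter codes, for example I1 + EO."""
    return CODE_BY_CCSID[source_ccsid] + CODE_BY_CCSID[target_ccsid]


def inbound_table_for_tpf(tpf_ccsid, remote_ccsid):
    """Inbound data arrives in the remote CCSID and is translated to the TPF CCSID."""
    return table_code(remote_ccsid, tpf_ccsid)


def le_table_name(source_ccsid, target_ccsid):
    """Name of the IBM C language translation table for this direction."""
    return LE_PREFIX + table_code(source_ccsid, target_ccsid)


if __name__ == "__main__":
    print(inbound_table_for_tpf(tpf_ccsid=500, remote_ccsid=819))
    print(le_table_name(819, 500))
```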
On a TPF system, iconv runs online. The translation table that iconv uses must therefore be built as a DLM, and its name must conform to TPF naming standards.
The EDCGNXLT process uses TPF code sets and the remote code sets to create translation tables. Figure 57 shows a part of the EDCUEOI1 GENXLT translation table for IBM-500 to ISO8859-1 CCSID 819.
Figure 57. An Example of a GENXLT Translation Table
    0x00 0x00 <NUL>
    0x01 0x01 <SOH>
    0x02 0x02 <STX>
    0x03 0x03 <ETX>
    0x04 0x9c <SEL>
    0x05 0x09 <tab>
    0x06 0x86 <RNL>
    0x07 0x7f <DEL>
    0x08 0x97 <GE>
    0x09 0x8d <SPS>
    0x0a 0x8e <RPT>
    0x0b 0x0b <vertical-tab>
    0x0c 0x0c <form-feed>
    0x0d 0x0d <carriage-return>
    ·
    ·
    ·
    0x7f 0x22 <quotation-mark>
    0x80 0xd8 <O-slash>
    0x81 0x61 <a>
    0x82 0x62 <b>
    0x83 0x63 <c>
    0x84 0x64 <d>
    0x85 0x65 <e>
    0x86 0x66 <f>
    0x87 0x67 <g>
    0x88 0x68 <h>
    0x89 0x69 <i>
    0x8a 0xab <left-angle-quotes>
    0x8b 0xbb <right-angle-quotes>
    0x8c 0xf0 <eth>
    0x8d 0xfd <y-acute>
    0x8e 0xfe <thorn>
    0x8f 0xb1 <plus-minus>
    0x90 0xb0 <degree>
    0x91 0x6a <j>
    0x92 0x6b <k>
    0x93 0x6c <l>
    ·
    ·
    ·
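To make the layout of this source table concrete, the following Python sketch reads lines in the `source target <comment>` form and applies the resulting mapping to a byte string. It is a minimal illustration, not the GENXLT program; the function names are hypothetical and the sample holds only a few rows from Figure 57.

```python
from typing import Dict

# A few rows copied from Figure 57: source byte, target byte, comment.
SAMPLE = """\
0x7f 0x22 <quotation-mark>
0x81 0x61 <a>
0x82 0x62 <b>
0x83 0x63 <c>
"""


def parse_table(text: str) -> Dict[int, int]:
    """Build a source-byte to target-byte mapping from table source lines."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith("0x"):
            continue  # skip blank lines and anything that is not a mapping row
        table[int(fields[0], 16)] = int(fields[1], 16)
    return table


def translate(data: bytes, table: Dict[int, int]) -> bytes:
    """Translate byte by byte; raises KeyError if a byte has no mapping."""
    return bytes(table[byte] for byte in data)


if __name__ == "__main__":
    mapping = parse_table(SAMPLE)
    print(translate(bytes([0x81, 0x82, 0x83]), mapping))
```

A complete single-byte table would have 256 rows, so the `KeyError` path would only be reached with a partial sample like this one.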
The EDCGNXLT program generates a translation table, which must be built into a DLM for online use. The IBM-supplied translation DLMs are called CPGx (where x is 0, 1, ..., 9). (User-supplied translation DLMs can be used in the JCL that runs the EDCGNXLT job as well.) The name of the CPGx (or user-supplied) DLM is paired with the original set of two-letter codes and placed in a table in a segment called CPGS. For example, in the IBM-supplied table (see Figure 63), I1EO is paired with CPG0, AAEY with CPG2, and ABEU with CPG9. Each xxyy pair is used at run time to find the name of the CPGx DLM that is required to perform the translation. The pairings for new translation tables are added to this table; CPGS is then recompiled, linked, and loaded, along with the new DLM that contains the translation tables.
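The run-time lookup and the step of adding a pairing can be pictured with a small Python model. This is a sketch under stated assumptions, not the CPGS segment itself: the end-of-table marker follows Figure 63, the function names are hypothetical, and the user pairing `EOI1`/`UXL1` is an invented example of a name in the user name space.

```python
END_MARKER = ("0000", "0000")  # end-of-table entry, as in Figure 63

# Two IBM-supplied pairings named in the text, followed by the end marker.
CPGS_TABLE = [
    ("I1EO", "CPG0"),
    ("ABEU", "CPG9"),
    END_MARKER,
]


def find_dlm(pair, table=CPGS_TABLE):
    """Return the DLM name for an xxyy pair, or None if it is not in the table."""
    for codes, dlm in table:
        if (codes, dlm) == END_MARKER:
            break
        if codes == pair:
            return dlm
    return None


def add_pair(table, pair, dlm):
    """Return a new table with the pairing inserted before the end marker."""
    entries = [entry for entry in table if entry != END_MARKER]
    entries.append((pair, dlm))
    entries.append(END_MARKER)
    return entries


if __name__ == "__main__":
    print(find_dlm("I1EO"))
    # Hypothetical user-supplied pairing; "UXL1" is an invented DLM name.
    extended = add_pair(CPGS_TABLE, "EOI1", "UXL1")
    print(find_dlm("EOI1", extended))
```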
When choosing mixed-byte character sets, it is critical to use CCSIDs that are compatible. The fact that a given CCSID refers to a single-byte character set does not mean that the CCSID can be used wherever a single-byte character set is needed. For example, consider Table 25, which displays compatible Japanese to ASCII or EBCDIC CCSIDs.
Table 25. Compatible Single-, Double-, and Mixed-Byte CCSIDs
Character Set | Single CCSID | Double CCSID | Mixed CCSID |
---|---|---|---|
Japanese to ASCII | 897 | 301 | 932 |
Japanese to EBCDIC | 290 | 300 | 930 |
The 897 CCSID cannot be used in place of 290 because 897 is not compatible with the 930 CCSID. The 930 CCSID is constructed so that it is compatible with 290, not with 897. If you use incompatible CCSIDs, connections from the TPF system to remote systems may be refused. The purpose of the SINGLE and DOUBLE parameters is to help keep the component CCSIDs consistent. It is the user's responsibility to define mixed-byte CCSIDs with the CSNMC macro using single-byte and double-byte CCSIDs associated with the mixed-byte CCSIDs. For more information see Character Data Representation Architecture Reference and Registry.
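The consistency rule can be expressed as a simple check. The Python sketch below is illustrative only: it hard-codes the two rows of Table 25, and `is_compatible` is a hypothetical helper, not a TPF or CDRA facility.

```python
# Mixed CCSID -> (single-byte CCSID, double-byte CCSID), copied from Table 25.
MIXED_COMPONENTS = {
    932: (897, 301),  # Japanese to ASCII
    930: (290, 300),  # Japanese to EBCDIC
}


def is_compatible(mixed, single, double):
    """True only if both components are the ones listed for the mixed CCSID."""
    return MIXED_COMPONENTS.get(mixed) == (single, double)


if __name__ == "__main__":
    print(is_compatible(930, 290, 300))  # components listed for 930
    print(is_compatible(930, 897, 300))  # 897 belongs with 932, not 930
```

A mixed CCSID that is not in the dictionary is reported as incompatible, so the table would have to be extended from the CDRA registry before other CCSIDs are checked.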
Some sites may require code sets in addition to those provided with the TPF system product. Information and procedures for additional code sets are obtained from the MVS Language Environment (LE) product.
The first step is to identify the additional code sets that you need. To do this, review the following publications for the characteristics of the various code sets that are available and select compatible code sets.
To create a translation table, you must find the code pages for the code sets corresponding to the application requester (on the TPF system) and the application server (on the remote server). These code pages reside either in the data sets for GENXLT support for LE or on the CD-ROM that accompanies the CDRA registry. If the code pages are on the CD-ROM, bring them online according to the instructions that follow (see Creating Translation Tables That Do Not Exist in the LE Data Sets). Once the code pages are online, the procedure for creating the translation table is the same.
When the data sets that contain the desired translation tables are found in LE GENXLT support, the code page data sets do not need to be loaded from the CDRA registry CD-ROM. The data sets are identified by the two-letter codes that identify each code page. These two-letter codes are found in the CDRA registry. For instance, the two-letter code EO identifies code page 500 and the two-letter code I1 identifies code page 819. The two-letter codes are joined (source-target) to identify a translation table (for example, EDCUI1EO, where EDCU is a common file name prefix for translation tables).
After you locate the data set with the correct translation table, use the EDCGNXLT load module to create an object file that can be used on the online TPF system. EDCGNXLT is called using JCL, as shown in Figure 58.
Figure 58. A GENXLT Procedure for Preparing Translation Files
    ********************************************************************
    *                                                                  *
    *  LANGUAGE ENVIRONMENT FOR MVS & VM                               *
    *                                                                  *
    *  EDCGNXLT--- INVOKE THE GENXLT UTILITY                           *
    *                                                                  *
    *  RELEASE LEVEL: vv.rr.mm (VERSION.RELEASE.MODIFICATION LEVEL)    *
    *                                                                  *
    ********************************************************************
    *
    *
    INFILE=,                    < INPUT DATA SET CONTAINING A SOURCE TRANSLATION FILE
    REGSIZ='6144K',             < GENXLT REGION SIZE
    OPT=,                       < GENXLT OPTIONS
    OUTFILE=,                   < OUTPUT DATA SET FOR GENERATED OBJECT
    LIBPRFX='PROD.CEE.V1R4M0'   < PREFIX FOR LIBRARY DSN CONTAINING GENXLT MODULE
    *------------------------------------------------------------
    * EDCGNXLT STEP:
    *   INVOKE EDCGNXLT MODULE TO READ THE SOURCE TRANSLATION FILE
    *   AND PRODUCE AN OBJECT SUITABLE TO BE LINKED AND LOADED ON TPF.
    *------------------------------------------------------------
    EDCGNXLT EXEC PGM=EDCGNXLT,REGION=&REGSIZ,PARM='&OPT'
    STEPLIB  DD DSNAME=&LIBPRFX..SCEERUN,DISP=SHR
    SYSIN    DD DSN=&INFILE,DISP=SHR
    SYSPUNCH DD DSN=ACP.DRVE.TEST.OB2(CPG0),DISP=OLD,
                DCB=(BLKSIZE=400,DSORG=PO)
    SYSPRINT DD SYSOUT=*
In the previous example, the INFILE data set is an input translation table that corresponds to the code pages of the application server on the remote system and the application requester on the TPF system (EDCUI1EO).
GENXLT support is documented in the Language Environment Programming Guide.
The following example shows a connection from the TPF system to a server on an RS/6000 system. The TPF system code page is 500 and the RS/6000 code page is 819. The translation table used as input on the INFILE statement in the JCL is found in the Language Environment Programming Guide. I1 identifies code page 819 and EO identifies code page 500, so the input translation table is EDCUI1EO. For the TPF system, the standard file name prefix EDCU is dropped and the object module is called I1EO. This prevents conflicts with any name spaces in the Language Environment. Figure 59 shows the modified JCL.
Figure 59. A GENXLT Procedure for Preparing Translation Files Filled Out
    ********************************************************************
    *                                                                  *
    *  LANGUAGE ENVIRONMENT FOR MVS & VM                               *
    *                                                                  *
    *  EDCGNXLT--- INVOKE THE GENXLT UTILITY                           *
    *                                                                  *
    *  RELEASE LEVEL: vv.rr.mm (VERSION.RELEASE.MODIFICATION LEVEL)    *
    *                                                                  *
    ********************************************************************
    *
    *
    INFILE=,                    < INPUT DATA SET CONTAINING A SOURCE TRANSLATION FILE
    REGSIZ='6144K',             < GENXLT REGION SIZE
    OPT=,                       < GENXLT OPTIONS
    OUTFILE=,                   < OUTPUT DATA SET FOR GENERATED OBJECT
    LIBPRFX='PROD.CEE.V1R4M0'   < PREFIX FOR LIBRARY DSN CONTAINING GENXLT MODULE
    *------------------------------------------------------------
    * EDCGNXLT STEP:
    *   INVOKE EDCGNXLT MODULE TO READ THE SOURCE TRANSLATION FILE
    *   AND PRODUCE AN OBJECT SUITABLE TO BE LINKED AND LOADED ON TPF.
    *------------------------------------------------------------
    EDCGNXLT EXEC PGM=EDCGNXLT,REGION=&REGSIZ,PARM='&OPT'
    STEPLIB  DD DSNAME=&LIBPRFX..SCEERUN,DISP=SHR
    SYSIN    DD DSN='INPUT.TRANSLATE.TABLE.SCEEGXLT(EDCUI1EO)',DISP=SHR
    SYSPUNCH DD DSN='MY.OBJECT.DATA.OB(I1EO)',DISP=OLD,
                DCB=(BLKSIZE=400,DSORG=PO)
    SYSPRINT DD SYSOUT=*
The object I1EO that contains the translation table is placed in user data set MY.OBJECT.DATA.OB by the EDCGNXLT program.
The translation table object files are built into individual DLMs. The names of the DLMs and object files must not match (so that the correct entry point is available). The CPG0 DLM name is reserved. The TPF system uses DLMs named CPGx and DLM build scripts named CPGxBS. The names of added DLMs should be in the user file name space.
The translation table DLM is created using a build script and the CBLD tool, a part of SIP. The build script JCL must refer to the CSTRTD and CENTPT modules, followed by the name of the translation table object module. Figure 60 shows a typical translation table DLM build script.
Figure 60. Build Script for a Single-Byte Translation Table Object File
    ################################################################
    DLM CPG1vv        # Include start-up code for DLM
    #Object File      Function
    #-----------      --------
    CENTPT            # return entry point address
    I1EO              # GNXLT translation table 500-819
When the CBLD tool reads this build script as input, it produces JCL with an input to the linkage editor as shown in Figure 61.
Figure 61. DLM Build Script JCL for a Single-Byte Translation Table Object File
    //PLKED.SYSIN DD *
     INCLUDE OBJLIB(CSTRTD40)
     INCLUDE OBJLIB(CENTPT)
     INCLUDE OBJLIB(I1EO)
    /*
    //LKED.SYSLMOD DD DISP=OLD,DSN=ACP.DEVP.TEST.LK(CPG1vv)
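The relationship between the build script in Figure 60 and the INCLUDE statements in Figure 61 can be imitated with a short Python sketch. This is not the CBLD tool and makes no claim about how CBLD works internally. It assumes that the DLM statement is what contributes the CSTRTD40 start-up object shown in Figure 61, and the function name is hypothetical.

```python
# Build script text modeled on Figure 60.
BUILD_SCRIPT = """\
################################################################
DLM CPG1vv        # Include start-up code for DLM
#Object File      Function
#-----------      --------
CENTPT            # return entry point address
I1EO              # GNXLT translation table
"""

# Taken from Figure 61; assumed here to be added for the DLM statement.
STARTUP_OBJECT = "CSTRTD40"


def includes_from_script(script):
    """Return (DLM name, list of linkage editor INCLUDE statements)."""
    dlm_name = None
    objects = []
    for raw in script.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        words = line.split()
        if words[0] == "DLM":
            dlm_name = words[1]
            objects.append(STARTUP_OBJECT)
        else:
            objects.append(words[0])
    return dlm_name, ["INCLUDE OBJLIB(%s)" % name for name in objects]


if __name__ == "__main__":
    name, statements = includes_from_script(BUILD_SCRIPT)
    print(name)
    print("\n".join(statements))
```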
An additional object module, CHCS, is required when the complex converter is used. The complex converter is a program that handles conversion between mixed-byte code pages. For example, conversion between the Japanese mixed-byte code pages 939 and 930 requires the complex converter. The complex converter is not used when single-byte code pages (for example, 500 to 819) are translated. Figure 62 shows the complex converter build script.
Figure 62. Build Script for a Mixed-Byte Translation Table Object File
    ################################################################
    DLM CPG1vv        # Include start-up code for DLM
    #Object File      Function
    #-----------      --------
    CENTPT            # return entry point address
    CHCS              # complex converter
    EVEU              # GNXLT translation table 939-930
Before you load the translation table DLM to an online TPF system, add the name of the DLM and the code page identifications to two system structures: the table of translation table object names and DLMs in the CPGS segment (Figure 63) and the CSNM table (Figure 64).
Figure 63. Table of Translation Table Object Names and DLMs
    table_entry table[table_len] =
    {
        {"I1EO","CPG0"},    /*single 819 to single 500*/
        {"EVEU","CPG1"},    /*mixed 939 to mixed 930*/
        {"AAEY","CPG2"},    /*single 850 to 1047 */
        {"AAEO","CPG3"},    /*single 850 to single 500*/
        {"ABEX","CPG5"},    /*mixed 932 to single 1027*/
        {"ABEL","CPG6"},    /*mixed 932 to single 290*/
        {"ABEN","CPG7"},    /*mixed 932 to double 300*/
        {"ACEX","CPG8"},    /*single 897 to single 1027*/
        {"ABEU","CPG9"},    /*mixed 932 to mixed 930*/
        {"0000","0000"},    /*end of table*/
    };
Figure 64. CSNM Table Showing Code Set Information
    CSNMC CODESET='IBM-500',CODE='EO',CCSID=500,STYLE=S
    CSNMC CODESET='IBM-819',CODE='I1',CCSID=819,STYLE=S
    CSNMC CODESET='IBM-930',CODE='EU',CCSID=930,STYLE=M,
          SINGLE=290,DOUBLE=300
    CSNMC CODESET='IBM-939',CODE='EV',CCSID=5035,STYLE=M,
          SINGLE=1027,DOUBLE=300
The same information for any code sets that are used in a new translation table must appear in the CSNM table, whether a code set is in the table initially or is added by a user. Required information is available from the CDRA registry and the Language Environment Programming Guide.
Once these system structures are updated, the translation table DLMs can be linked and loaded using the TPF loader (TPFLDR) like any other DLM. System operators use the ZSQLD command to add or modify relational database definitions with CCSID and TPFCCSID parameters identifying the new translation tables.
A CD-ROM that contains all possible translation tables is shipped with the Character Data Representation Architecture (CDRA) registry. Using the instructions provided in the CDRA registry, load a translation table from the CD-ROM. This translation table must be processed into a form that can be read by GENXLT. Figure 65 shows an example of a tool to put the translation table into the correct form.
Figure 65. Sample Tool to Convert CD-ROM Translation Table Data
    /* */
    arg fn ft fm
    'pipe < 'fn ft fm ,
       '| fblock 1 ',                        /*one char wide          */
       '| spec number from 0 1.3 1-1 4 ',    /*add record number      */
       '| spec 1.3 d2x 1 4.1 9 ',            /*convert it to hex      */
       '| spec 7.2 x2c 1.1 9.1 2 ',          /*convert those to chars */
       '| Specs 1-* C2X 1 ',                 /*convert it all to text */
       '| spec /0x/ 1 1.2 next /0x/ 10 3.2 next /<comment>/ 15 ',
       '| > 'fn ' twocolum a '
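For readers who are not familiar with CMS Pipelines, the following Python sketch shows roughly the same transformation. It is based on one reading of the pipeline stages above and is not a tested replacement for them. It assumes that the input is a raw binary table in which the byte at offset n is the target code point for source code point n; the function name is hypothetical.

```python
def to_two_column(table: bytes) -> list:
    """Turn a raw byte table into 'source target <comment>' text lines.

    The offset of each byte is the source code point and the byte value is
    the target code point. The column positions (1, 10, and 15) follow the
    last spec stage of the pipeline.
    """
    lines = []
    for source, target in enumerate(table):
        source_text = "0x%02X" % source   # columns 1-4
        target_text = "0x%02X" % target   # columns 10-13
        lines.append(source_text.ljust(9) + target_text.ljust(5) + "<comment>")
    return lines


if __name__ == "__main__":
    # Three-byte stand-in for a real 256-byte table.
    for line in to_two_column(bytes([0x00, 0x01, 0x9C])):
        print(line)
```

The `<comment>` text is a fixed placeholder, as in the pipeline, and would normally be replaced by the character name.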
Once the translation table is online and in a form acceptable to the EDCGNXLT program, the procedure continues as though the translation table had been found in the data sets for GENXLT support.
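Object modules for a single-byte translation table DLM:
CSTRTD
CENTPT
I1EO (or the name of the object created by GENXLT).
Object modules for a translation table DLM that uses the complex converter:
CSTRTD
CENTPT
CHCS
I1EO (or the name of the object created by GENXLT).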