Advanced Topics in Parsing

This section covers parsing multiple strings, a special case that combines string and positional patterns, parsing with DBCS characters, and flowcharts that give a conceptual view of parsing.

Parsing Multiple Strings

Only ARG and PARSE ARG can have more than one source string. To parse multiple strings, you can specify multiple comma-separated templates. Here is an example:

parse arg template1, template2, template3

This instruction consists of the keywords PARSE ARG and three comma-separated templates. (For an ARG instruction, the source strings to parse come from arguments you specify when you call a program or CALL a subroutine or function.) Each comma is an instruction to the parser to move on to the next string.

Example:

/* Parsing multiple strings in a subroutine                      */
num='3'
musketeers="Porthos Athos Aramis D'Artagnon"
CALL Sub num,musketeers  /* Passes num and musketeers to sub     */
SAY total; say fourth /* Displays: "4" and "D'Artagnon"          */
EXIT

Sub:
parse arg subtotal, . . . fourth
total=subtotal+1
RETURN

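Note that when a REXX program is started as a command, only one argument string is recognized. Multiple argument strings can be passed for parsing when a program or routine is invoked with the CALL instruction or a function call, as in the preceding example.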

If there are more templates than source strings, each variable in a leftover template receives a null string. If there are more source strings than templates, the language processor ignores leftover source strings. If a template is empty (two commas in a row) or contains no variable names, parsing proceeds to the next template and source string.
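
The following is a minimal sketch of these rules, not a definitive example. The routine name Show and the argument values are made up for illustration. The brackets in the SAY output only make null strings visible.

```rexx
/* Sketch: empty template and more templates than source strings */
call Show 'one', 'two', 'three'
exit

Show:
/* The empty second template (two commas in a row) skips 'two'.  */
/* The fourth template has no source string to parse.            */
parse arg first, , third, fourth
say 'first=['first']'      /* Displays: first=[one]              */
say 'third=['third']'      /* Displays: third=[three]            */
say 'fourth=['fourth']'    /* Displays: fourth=[]                */
return
```

If Show instead used only `parse arg first`, the second and third argument strings would be ignored.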

Combining String and Positional Patterns: A Special Case

There is a special case in which absolute and relative positional patterns do not work identically. Parsing with a template containing a string pattern skips over the data in the source string that matches the pattern (see page ***). But a template containing the sequence:

string pattern, variable name, relative positional pattern

does not skip over the matching data. A relative positional pattern moves relative to the first character matching a string pattern. As a result, assignment includes the data in the source string that matches the string pattern.

/* Template containing string pattern, then variable name, then  */
/*  relative positional pattern does not skip over any data.     */
string='REstructured eXtended eXecutor'
parse var string var1 3 junk 'X' var2 +1 junk 'X' var3 +1 junk
say var1||var2||var3 /* Concatenates variables; displays: "REXX" */

Here is how this template works:

|var1  3|   |junk 'X'|   |var2 +1|   |junk  'X'|   |var3 +1 |  | junk |
*---*---*   *---*----*   *---*---*   *----*----*   *---*----*  *--*---*
    *           *            *            *            *          *
Put         Starting     Starting     Starting     Starting    Starting
characters  at 3, put    with first   with char-   with        with char-
1 through   characters   'X' put 1    acter after  second 'X'  acter
2 in var1.  up to (not   (+1)         first 'X'    put 1 (+1)  after sec-
(Stopping   including)   character    put up to    character   ond 'X'
point is    first 'X'    in var2.     second 'X'   in var3.    put rest
3.)         in junk.                  in junk.                 in junk.

var1='RE'   junk=        var2='X'     junk=        var3='X'    junk=
            'structured               'tended e'              'ecutor'
             e'
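
For contrast, the sketch below parses the same source string three times with a string pattern followed by a variable. Only what comes after the variable changes. The variable names a, b, and c and the absolute position 20 are chosen for illustration only.

```rexx
/* Sketch: what follows the variable decides whether the data    */
/* matching the string pattern is skipped.                       */
string='REstructured eXtended eXecutor'
parse var string 'X' a        /* nothing follows the variable    */
parse var string 'X' b +1     /* relative positional pattern     */
parse var string 'X' c 20     /* absolute positional pattern     */
say '['a']'   /* Displays: [tended eXecutor]  ('X' skipped)      */
say '['b']'   /* Displays: [X]                ('X' included)     */
say '['c']'   /* Displays: [tend]             ('X' skipped)      */
```

The first 'X' is character 15 of the source string. Assignment to a and c starts at character 16, but assignment to b starts at character 15.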

Parsing with DBCS Characters

Parsing with DBCS characters generally follows the same rules as parsing with SBCS characters. Literal strings and symbols can contain DBCS characters, but numbers must be in SBCS characters. See PARSE for examples of DBCS parsing.

Details of Steps in Parsing

The three figures that follow present parsing conceptually. They do not cover error cases.

The figures include terms whose definitions are as follows:

string start
is the beginning of the source string (or substring).
string end
is the end of the source string (or substring).
length
is the length of the source string.
match start
is in the source string and is the first character of the match.
match end
is in the source string. For a string pattern, it is the first character after the end of the match. For a positional pattern, it is the same as match start.
match position
is in the source string. For a string pattern, it is the first matching character. For a positional pattern, it is the position of the matching character.
token
is a distinct syntactic element in a template, such as a variable, a period, a pattern, or a comma.
value
is the numeric value of a positional pattern. This can be either a constant or the resolved value of a variable.

Figure 50. Conceptual Overview of Parsing
             *----------------------------------------*
             V                                        |
   *--------------------------------*                 |
   |START                           |                 |
   |Token is first one in template. |                 |
   |Length=length(source string)    |                 |
   |Match start=1. Match end=1.     |                 |
   *---------*----------------------*                 |
*----------> |                                        |
|            V                                        |
|  *-------------------*yes *--------------------*    |
|  |End of template?   *--->|Parsing complete.   |    |
|  *---------*---------*    *--------------------*    |
|            V no                                     |
|  *-------------------*                              |
|  |CALL Find Next     |                              |
|  | Pattern.          |                              |
|  *---------*---------*                              |
|            V                                        |
|  *-------------------*                              |
|  |CALL Word Parsing. |                              |
|  *---------*---------*                              |
|            V                                        |
|  *-------------------*                              |
|  |Step to next token.|                              |
|  *---------*---------*                              |
|            V                                        |
|  *-------------------* yes *--------------------*   |
|  |Token a comma?     *---->|Set next source     |   |
|  *---------*---------*     |string and template.*---*
|            | no            *--------------------*
*------------*
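
As a rough illustration of the loop in Figure 50, the sketch below uses one string pattern. Finding the pattern splits the source string into two substrings, and word parsing then assigns from each substring. The source string and variable names are assumptions made for this example.

```rexx
/* Sketch: a pattern splits the source string; word parsing then */
/* assigns the variables within each substring.                  */
name='Dumas, Alexandre pere'
parse var name last ', ' first rest
say last      /* Displays: "Dumas"      (data before the ', ')   */
say first     /* Displays: "Alexandre"  (first word after it)    */
say rest      /* Displays: "pere"       (rest of the substring)  */
```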

Figure 51. Conceptual View of Finding Next Pattern
       *------------------------------------------------*
       V                                                |
*-------------*    *--------------------------------*   |
|Start:       |yes |String start=match end.         |   |
|End of       *--->|Match start=length + 1.         |   |
|template?    |    |Match end=length + 1. Return.   |   |
*-----*-------*    *--------------------------------*   |
      V no                                              |
*-------------*    *--------------------------------*   |
|Token period |yes |                                |   |
|or variable? *--->|Step to next token.             *---*
*-----*-------*    *--------------------------------*
      V no
*-------------*    *---------*    *----------*   *-----------------------------------*
|Token a plus?|yes |Variable |yes |Resolve   |   |String start=match start.          |
|             *--->|form?    *--->|its value.*-->|Match start=min(length + 1,        |
*-----*-------*    *----*----*    *----------*^  | match start + value).             |
      | no              | no                  |  |Match end=match start. Return.     |
      V                 *---------------------*  *-----------------------------------*
*-------------*    *---------*    *----------*   *-----------------------------------*
|Token a      |yes |Variable |yes |Resolve   |   |String start=match start.          |
|minus?       *--->|form?    *--->|its value.*-->|Match start=max(1, match           |
*-----*-------*    *----*----*    *----------*^  | start - value).                   |
      | no              | no                  |  |Match end=match start. Return.     |
      V                 *---------------------*  *-----------------------------------*
*-------------*    *---------*    *----------*   *-----------------------------------*
|Token an     |yes |Variable |yes |Resolve   |   |String start=match end.            |
|equal?       *--->|form?    *--->|its value.*-->|Match start=min(length+1, value).  |
*-----*-------*    *----*----*    *----------*^  |Match end=match start. Return.     |
      | no              | no                  |  *-----------------------------------*
      V                 *---------------------*
*-------------*    *-----------------------------------*
|Token a      |yes |String start=match end.            |
|number?      *--->|Match start=min(length+1, value).  |
*-----*-------*    |Match end=match start. Return.     |
      V no         *-----------------------------------*
*-------------*
|Token a lit- |yes
|eral string? *--------------------------*
*-----*-------*                          |
      | no                               |
      V                                  V
*-------------*    *----------*   *---------------*    *-----------------------------*
|Token a var- |yes |Resolve   |   |Match found in |yes |String start=match end.      |
|iable string?*--->|its value.*-->|rest of string?*--->|Match start=match position.  |
*-----*-------*    *----------*   *------*--------*    |Match end=match position +   |
      | no                               | no          | pattern length.  Return.    |
      |                                  V             *-----------------------------*
      |                  *--------------------------------*
      |                  |String start=match end.         |
      |                  |Match start=length + 1.         |
      |                  |Match end=length + 1. Return.   |
      V                  *--------------------------------*
*-------------*          *--------------------------------*
|Token a      |yes       |Match start=length + 1.         |
| comma?      *--------->|Match end=length + 1. Return.   |
*-------------*          *--------------------------------*
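
The sketch below exercises three paths of Figure 51: a plus that would run past the end of the source string, a minus that would run past its beginning, and a literal string with no match. The source string and variable names are assumptions for illustration. The brackets only make null strings visible.

```rexx
/* Sketch: positional patterns stay within 1 through length + 1  */
string='abcdef'                 /* length is 6                   */
parse var string a +10 b        /* match start=min(7, 1+10)=7    */
say '['a'] ['b']'               /* Displays: [abcdef] []         */
parse var string 4 c -10 d      /* match start=max(1, 4-10)=1    */
say '['c'] ['d']'               /* Displays: [def] [abcdef]      */
parse var string 'z' e          /* no match: start=length + 1    */
say '['e']'                     /* Displays: []                  */
```

In the second PARSE, the minus moves the match to or before the string start, so c receives the data through the end of the source string and d is assigned starting again from position 1.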

Figure 52. Conceptual View of Word Parsing
*-------------------------*    *------------------------*
|Start:  Match end <=     |no  |                        |
|        string start?    *--->|String end=match start. |
*-----------*-------------*    *------------------------*
            V yes
*-------------------------*
|String end=length + 1.   |
*-----------*-------------*
            V
*----------------------------------------------------------------------*
|Substring=substr(source string,string start,(string end-string start))|
|Token=previous pattern.                                               |
*-----------*----------------------------------------------------------*
            V <-----------------------------------------------*
*-------------------------*no                                 |
|Any more tokens?         *-------------*                     |
*-----------*-------------*             |                     |
            V yes                       |                     |
*-------------------------*             |                     |
|Step to next token.      |             |                     |
*-----------*-------------*             |                     |
            V                           V                     |
*-------------------------*no  *------------------------*     |
|Token a variable or a    *--->|Return.                 |     |
|period?                  |    *------------------------*     |
*-----------*-------------*                                   |
            V yes                                             |
*-------------------------*no                                 |
|Any more tokens?         *-------------*                     |
*-----------*-------------*             |                     |
            V yes                       V                     |
*-------------------------*    *------------------------*     |
|Next token a variable or | no |Assign rest of substring|     |
|period?                  *--->|to variable.            |     |
*-----------*-------------*    *-------------*----------*     |
            V yes                            *--------------->|
*-------------------------* no *------------------------*     |
|Any substring left?      *--->|Assign null string to   |     |
*-----------*-------------*    |variable.               |     |
            V yes              *-------------*----------*     |
*-------------------------*                  *--------------->|
|Strip any leading blanks.|                                   |
*-----------*-------------*                                   |
            V                                                 |
*-------------------------* no *------------------------*     |
|Any substring left?      *--->|Assign null string to   |     |
*-----------*-------------*    |variable.               |     |
            |                  *-------------*----------*     |
            V yes                            *--------------->|
*-------------------------* no *------------------------*     |
|Blank found in substring?*--->|Assign rest of substring|     |
|                         |    |to variable.            |     |
*-----------*-------------*    *-------------*----------*     |
            V yes                            *--------------->|
*-----------------------------------------------------------* |
|Assign word from substring to variable and step past blank.| |
*-------------------*---------------------------------------* |
                    *-----------------------------------------*
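
The word-parsing steps above can be sketched with templates that contain only variables. The source strings and variable names are assumptions for illustration. The brackets in the SAY output show which blanks survive.

```rexx
/* Sketch: word parsing strips leading blanks before each word,  */
/* but the last variable receives the rest of the substring.     */
string='  alpha   beta  gamma '
parse var string w1 w2 rest
say '['w1']'     /* Displays: [alpha]                            */
say '['w2']'     /* Displays: [beta]                             */
say '['rest']'   /* Displays: [ gamma ]                          */

/* When the substring runs out, remaining variables are null.    */
parse value 'solo' with x y z
say '['x'] ['y'] ['z']'   /* Displays: [solo] [] []              */
```

Only one of the two blanks between "beta" and "gamma" is stepped past, so rest keeps a leading blank as well as the trailing blank.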