C/C++ Language Support User's Guide

Defining a Grammar

This section lists the syntax rules for defining a grammar to IPRSE.

Special Characters for Grammar Syntax

The following characters have special syntax definitions for grammars:

blank ( )
Parameter delimiter

braces ({})
Defines alternative parameters

brackets ([])
Defines optional parameters

comma (,)
Parameter delimiter

slash (/)
Parameter delimiter

vertical bar (|)
Delimiter for alternative parameters within { }

parentheses (())
Defines a list of parameters

period (.)
Special delimiter that ties 2 or more parameters together

plus sign (+)
Optional parameter character

asterisk (*)
Optional string or wildcard list

hyphen (-)
Keyword delimiter

equal sign (=)
Keyword delimiter

less-than sign (<)
Grammar option delimiter. When used as a grammar token suffix, translates the matching input value to upper case (see Translating Input String Values to Upper Case)

greater-than sign (>)
Grammar option delimiter

Specifying Parser Options in the Grammar

At the beginning of the grammar you can specify the following parser options:

IPRSE_NOSTRICT

IPRSE_STRICT

IPRSE_NOMIXED_CASE

IPRSE_MIXED_CASE

Each option must be enclosed in angle brackets (<>), with no spaces between the angle brackets and the option, and all options must precede any other grammar tokens. For example, the following grammar specifies the IPRSE_NOSTRICT and IPRSE_MIXED_CASE options:

const char grammar[] =
    "<IPRSE_NOSTRICT><IPRSE_MIXED_CASE>"
    "ZXXXX NUMbers-d* LETters-c*";

The <tpfparse.h> header file defines the following symbolic names for grammar parser options:

#define IPRSE_STRICT_GRAMMAR        "<IPRSE_STRICT>"
#define IPRSE_NOSTRICT_GRAMMAR      "<IPRSE_NOSTRICT>"
#define IPRSE_MIXED_CASE_GRAMMAR    "<IPRSE_MIXED_CASE>"
#define IPRSE_NOMIXED_CASE_GRAMMAR  "<IPRSE_NOMIXED_CASE>"

Thus, the preceding grammar example can also be coded as follows:

const char grammar[] =
    IPRSE_NOSTRICT_GRAMMAR IPRSE_MIXED_CASE_GRAMMAR
    "ZXXXX NUMbers-d* LETters-c*";

Parser options specified at the beginning of a grammar override any conflicting options specified in the IPRSE_parse function options parameter. For example, in the following code fragment the IPRSE_MIXED_CASE option specified in the grammar (the second parameter) overrides the IPRSE_NOMIXED_CASE option specified in the options parameter (the fourth parameter):

int count = IPRSE_parse("a b c",                      /* input string */
                        "<IPRSE_MIXED_CASE> A B C",        /* grammar */
                        &result,
                        IPRSE_ALLOC | IPRSE_NOMIXED_CASE,  /* options */
                        error_header);

All grammar options must be specified at the beginning of the grammar; there are no default grammar options. If a grammar specifies two conflicting options (<IPRSE_NOSTRICT> and <IPRSE_STRICT>, or <IPRSE_NOMIXED_CASE> and <IPRSE_MIXED_CASE>), the last option specified overrides any previous ones.
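
The rule that the last conflicting option wins can be illustrated with a small stand-alone sketch. This is not the IPRSE parser itself: the grammar_opts structure and the scan_grammar_options function are hypothetical names introduced only for this illustration. The sketch scans the leading <...> options of a grammar string and records the final setting of each pair, using -1 when the grammar does not set the option (in which case the options parameter of the IPRSE_parse function would apply).

```c
#include <string.h>

/* Hypothetical result type: -1 = not set in the grammar,
 * 0 = the NO... form was specified, 1 = the option was specified. */
struct grammar_opts {
    int strict;
    int mixed_case;
};

/* Scan the leading <OPTION> tokens of a grammar string.
 * Returns a pointer to the first character after the options,
 * or NULL if an option is malformed or not recognized. */
static const char *scan_grammar_options(const char *grammar,
                                        struct grammar_opts *out)
{
    static const struct {
        const char *name;
        int is_strict;   /* 1 = strict pair, 0 = mixed-case pair */
        int value;
    } known[] = {
        { "IPRSE_STRICT",       1, 1 },
        { "IPRSE_NOSTRICT",     1, 0 },
        { "IPRSE_MIXED_CASE",   0, 1 },
        { "IPRSE_NOMIXED_CASE", 0, 0 },
    };

    out->strict = -1;
    out->mixed_case = -1;

    while (*grammar == '<') {
        const char *end = strchr(grammar, '>');
        size_t len, i;
        int found = 0;

        if (end == NULL)
            return NULL;                    /* missing closing bracket */
        len = (size_t)(end - grammar - 1);  /* length of the option name */

        for (i = 0; i < sizeof known / sizeof known[0]; i++) {
            if (strlen(known[i].name) == len &&
                strncmp(grammar + 1, known[i].name, len) == 0) {
                /* A later option simply overwrites an earlier one. */
                if (known[i].is_strict)
                    out->strict = known[i].value;
                else
                    out->mixed_case = known[i].value;
                found = 1;
                break;
            }
        }
        if (!found)
            return NULL;                    /* unknown option or extra spaces */
        grammar = end + 1;
    }
    return grammar;
}
```

Because the option name is compared exactly, a space between an angle bracket and the option name causes the sketch to reject the option, which mirrors the rule stated earlier in this section.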

Parameters

In the grammar, there can be two types of parameters: positional and keyword. Any positional parameters must come before the keywords.

Input strings must either match characters or match character types, depending on the type of parameter in the grammar. In the grammar, characters are letters (A-Z) or digits (0-9), and character types are a, c, d, u, w, x, *, and +, which are described later in this section. The sections on positional parameters and keyword parameters explain how matching characters and character types are used by the grammar for the parameter.

The following are general rules for parameters in the grammar:

Positional Parameters

A positional parameter is a parameter that must be entered in a specific position of the input string syntax and before the keyword parameters.

The following are examples of positional parameters:

1) Grammar: ZDSMG ACTION SDA
   Input string: ZDSMG ACTION SDA
2) Grammar: P1 cc.dd (xx)*
   Input string: P1 AB.01 F0.E1.D2

A positional parameter can:

  1. Match specific characters

    Examples:

    1) Grammar: ABC
       Input string must be: ABC
    
    2) Grammar: Abc
       Input string can be: A or AB or ABC
    
  2. Match character types: a, c, d, u, w, x, + and *

    Examples:

    1) Grammar: acd*
       Input string can be: AX1 or 0Y12 or ZZ123 and so on
    
    2) Grammar: d+++
       Input string can be one to four digits: 1 or 21 or 345 or 5555
       and so on
    
  3. Be a list.

    Example:

    Grammar: acd.xw
    Input string can be:
     For the whole list:  0A1.F*
     For the first item:  AB9.
     For the second item:  .0A
    
  4. Be a wildcard list.

    Example:

    Grammar: (cd)*
    Input string: A1, or F2.C3, or D4.E5.Z2, or any combination of cd
    repeated up to 20 times. Each item in the input string must be delimited
    from the following item by a period (.).
    

Keyword Parameters

A keyword parameter is a parameter whose value is determined by having a value assigned to the keyword name.

Keyword parameters must be entered after all positional parameters (if there are any). The keyword parameters themselves can be entered in any order. The following is an example of keyword and positional parameters:

ZSIPC ALTER INTERVAL TIME-xx PRIM=ccc ALTERN-xx

There are two forms of keywords: regular and self-defining. They differ in their grammar form and in their range of possible values. Regular keywords can take a range of many values, whereas self-defining keywords have only 2 possible values.

Regular Keywords

Regular keywords are keywords that can have a range of values.

A regular keyword parameter can:

  1. Match specific characters

    The keyword name must match specific characters.

    Example:

    Grammar is: Key=cd.x
     
    Input string for keyword name portion only:
     K
     KE
     KEY
    
  2. Match character types: a, c, d, u, w, x, + and *

    The keyword value can use character types.

    Examples:

    1) Grammar is: Key=cd
       Keyword value: cd
     
    Input string for cd can be:
     A1
     M1
     B5
     and so on...
    
    2) Grammar is: KEY=d+++
       Keyword value: d+++
     
    Input string for d+++
    can be 1 to 4 digits: 1 or 21, or 345, or 5555 and so on.
    
  3. Be a list.

    Only the keyword value can be a list.

    Example:

    Grammar is: KEYword=acd.xw
    Keyword value is a list: acd.xw
     
    Input string can be:
     For the whole list:  0A1.F*
     For the first item:  AB9.
     For the second item:  .0A
    

Self-defining Keywords

Self-defining keywords are keywords that return 1 of only 2 values: Y when the keyword is entered, or N when the keyword is entered with the NO prefix.

Self-defining keywords have the following form:

   (NO)Keyword

Use matching characters for self-defining keywords.

Examples:

1) Grammar is: (NO)WAlk
   Input can be:
     NOWALK - gives a value of N.
     WALK - gives a value of Y.
     NOWA - gives a value of N.
     and so on....
2) Grammar: (NO)ERASE
   Possible input strings:
     ERASE - the value assigned is Y.
     NOERASE - the value assigned is N.

Translating Input String Values to Upper Case

When the IPRSE_parse function is called with the IPRSE_MIXED_CASE option, the parser accepts input values in lower case. The values returned in the parser results can be translated to upper case by adding a less-than sign (<) at the end of each grammar token for which the matching value should be translated; otherwise the values returned in the parser results are the same as in the input string. In the following example, assume that the IPRSE_MIXED_CASE option is in effect.

Grammar: c c< c
Input: x y z
Result values: x Y z
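
The effect of the less-than sign suffix can be sketched as follows. The copy_result_value function is a hypothetical helper and does not show how the parser stores its results. It copies a matched input value into a caller-supplied buffer and translates the value to upper case only when the grammar token ends with a less-than sign.

```c
#include <ctype.h>
#include <string.h>

/* Copy value into out (at most out_size - 1 characters plus the
 * terminating null). If the grammar token ends with '<', the copy
 * is translated to upper case; otherwise it is copied unchanged. */
static void copy_result_value(const char *token, const char *value,
                              char *out, size_t out_size)
{
    size_t token_len = strlen(token);
    int to_upper = token_len > 0 && token[token_len - 1] == '<';
    size_t i;

    if (out_size == 0)
        return;
    for (i = 0; value[i] != '\0' && i + 1 < out_size; i++)
        out[i] = to_upper ? (char)toupper((unsigned char)value[i])
                          : value[i];
    out[i] = '\0';
}
```

Applied to the preceding example, the tokens c, c<, and c with the input values x, y, and z produce x, Y, and z.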

Programming Considerations for Grammars

Do not write ambiguous grammars, such as:

Examples of Grammar Definitions

  1. Grammar: cc++

    Accepts any character string between 2 and 4 letters. If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.

    Possible input strings: ABCD or ABC or AB and so on.
    
  2. Grammar: ccd+

    Accepts 2 letters followed by 1 or 2 decimal digits. If the IPRSE_NOMIXED_CASE option is used both letters must be uppercase.

    Possible input strings: XY23 or XY2 and so on.
    
  3. Grammar: cc*

    Accepts any character string with at least 2 letters. If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.

    Possible input strings: AB, or ABCD, or ABCDEFGHILLLO
    
  4. Grammar: cc.dd

    Accepts the following input strings:

    1. Two letters followed by 2 decimal digits:
                  AA.12
                  XY.66
      
    2. A period followed by 2 decimal digits:
                    .45
      
    3. Two letters followed by a period:
                  AB.
      

      If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.

  5. Grammar: (cc)*

    Accepts 2 letters or any number of multiples of 2 letters with periods between each 2 letters.

    Input string: XX.XX.AB.CD or LA or DO.RE.MI and so on.
    

    If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.

  6. Grammar: www

    Accepts 3 characters, including letters, decimal digits, underscores (_), and asterisks (*). If the IPRSE_NOMIXED_CASE option is used all of the letters must be in uppercase. An asterisk can match one or more w characters.

    Possible input strings:
    *X or ABC or *1B
    * or ** or ***
    A* or *B* or 1*2 or 0** and so on.
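
The first three grammar definitions above can be checked with a small stand-alone matcher. This sketch is not the IPRSE implementation, and the function names are hypothetical. It handles only the c and d character types, and it assumes, as the examples suggest, that a plus sign (+) allows one optional character of the preceding type and that an asterisk (*) allows an optional string of the preceding type. Letters must be uppercase, as with the IPRSE_NOMIXED_CASE option.

```c
#include <ctype.h>

/* Does ch belong to type 'c' (letter) or 'd' (decimal digit)? */
static int char_has_type(char type, char ch)
{
    switch (type) {
    case 'c': return isupper((unsigned char)ch) != 0;
    case 'd': return isdigit((unsigned char)ch) != 0;
    default:  return 0;     /* other types are outside this sketch */
    }
}

/* prev is the type that '+' or '*' extends ('\0' if there is none). */
static int match_from(const char *pat, const char *in, char prev)
{
    if (*pat == '\0')
        return *in == '\0';

    if (*pat == '+') {
        /* Optional single character of the preceding type. */
        if (match_from(pat + 1, in, prev))
            return 1;
        return char_has_type(prev, *in) &&
               match_from(pat + 1, in + 1, prev);
    }
    if (*pat == '*') {
        /* Optional string of characters of the preceding type. */
        if (match_from(pat + 1, in, prev))
            return 1;
        return char_has_type(prev, *in) &&
               match_from(pat, in + 1, prev);
    }
    /* A character type must match exactly one input character. */
    return char_has_type(*pat, *in) &&
           match_from(pat + 1, in + 1, *pat);
}

/* Returns 1 if the whole input matches the whole pattern. */
static int matches_types(const char *pattern, const char *input)
{
    return match_from(pattern, input, '\0');
}
```

For example, matches_types("ccd+", "XY23") succeeds, while matches_types("cc++", "ABCDE") fails because the pattern allows at most 4 letters. The matcher backtracks so that an optional character is tried both as present and as absent.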