gtpc2mjy | C/C++ Language Support User's Guide |
This section lists the syntax rules for defining a grammar to IPRSE.
The following characters have special syntax definitions for grammars:
At the beginning of the grammar you can specify the following parser options:
IPRSE_NOSTRICT
IPRSE_STRICT
IPRSE_NOMIXED_CASE
IPRSE_MIXED_CASE
Each option must be enclosed in angle brackets (<>), with no spaces between the angle brackets and the option, and all options must precede any other grammar tokens. For example, the following grammar specifies the IPRSE_NOSTRICT and IPRSE_MIXED_CASE options:
const char grammar[] = "<IPRSE_NOSTRICT><IPRSE_MIXED_CASE>" "ZXXXX NUMbers-d* LETters-c*";
The <tpfparse.h> header file defines the following symbolic names for grammar parser options:
#define IPRSE_STRICT_GRAMMAR       "<IPRSE_STRICT>"
#define IPRSE_NOSTRICT_GRAMMAR     "<IPRSE_NOSTRICT>"
#define IPRSE_MIXED_CASE_GRAMMAR   "<IPRSE_MIXED_CASE>"
#define IPRSE_NOMIXED_CASE_GRAMMAR "<IPRSE_NOMIXED_CASE>"
Thus, the preceding grammar example can also be coded as follows:
const char grammar[] = IPRSE_NOSTRICT_GRAMMAR IPRSE_MIXED_CASE_GRAMMAR "ZXXXX NUMbers-d* LETters-c*";
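The two spellings are equivalent because C joins adjacent string literals at compile time. The following is a minimal sketch of that equivalence. It repeats two of the symbolic names locally (guarded with #ifndef) so that it compiles without <tpfparse.h>; the array names and the grammars_are_identical helper are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* Repeated from the definitions quoted in the text so the sketch compiles
   on its own; with <tpfparse.h> available these guards do nothing. */
#ifndef IPRSE_NOSTRICT_GRAMMAR
#define IPRSE_NOSTRICT_GRAMMAR   "<IPRSE_NOSTRICT>"
#endif
#ifndef IPRSE_MIXED_CASE_GRAMMAR
#define IPRSE_MIXED_CASE_GRAMMAR "<IPRSE_MIXED_CASE>"
#endif

/* Grammar written with the options spelled out. */
static const char grammar_literal[] =
    "<IPRSE_NOSTRICT><IPRSE_MIXED_CASE>" "ZXXXX NUMbers-d* LETters-c*";

/* Same grammar written with the symbolic names. */
static const char grammar_symbolic[] =
    IPRSE_NOSTRICT_GRAMMAR IPRSE_MIXED_CASE_GRAMMAR "ZXXXX NUMbers-d* LETters-c*";

/* Hypothetical helper: returns 1 when both spellings yield the same string. */
int grammars_are_identical(void)
{
    return strcmp(grammar_literal, grammar_symbolic) == 0;
}
```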
Parser options specified at the beginning of a grammar override any conflicting options specified in the IPRSE_parse function options parameter. For example, in the following code fragment the IPRSE_MIXED_CASE option specified in the grammar (the second parameter) overrides the IPRSE_NOMIXED_CASE option specified in the options parameter (the fourth parameter):
int count = IPRSE_parse("a b c",                          /* input string */
                        "<IPRSE_MIXED_CASE> A B C",       /* grammar */
                        &result,
                        IPRSE_ALLOC | IPRSE_NOMIXED_CASE, /* options */
                        error_header);
All grammar options must be specified at the beginning of the grammar; there are no default grammar options. If a grammar specifies two conflicting options (<IPRSE_NOSTRICT> and <IPRSE_STRICT>, or <IPRSE_NOMIXED_CASE> and <IPRSE_MIXED_CASE>), the last option specified overrides any previous ones.
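The override rules can be pictured as a left-to-right scan in which each option simply overwrites the current setting. The following sketch is an illustration only and is not how IPRSE is implemented: the option_state structure and the apply_grammar_options function are hypothetical names, and the sketch assumes that the options are written back to back and that scanning stops at the first character that does not start an option.

```c
#include <string.h>

/* Hypothetical state for this sketch; not part of <tpfparse.h>. */
struct option_state {
    int strict;      /* 1 = STRICT, 0 = NOSTRICT */
    int mixed_case;  /* 1 = MIXED_CASE, 0 = NOMIXED_CASE */
};

/* Applies the leading <...> options of a grammar to *state, which the
   caller preloads from the options parameter. Later options overwrite
   earlier ones. Returns a pointer to the first character after the
   options. */
const char *apply_grammar_options(const char *grammar,
                                  struct option_state *state)
{
    static const struct { const char *text; int is_case; int value; } table[] = {
        { "<IPRSE_NOSTRICT>",     0, 0 },
        { "<IPRSE_STRICT>",       0, 1 },
        { "<IPRSE_NOMIXED_CASE>", 1, 0 },
        { "<IPRSE_MIXED_CASE>",   1, 1 },
    };
    const size_t n = sizeof table / sizeof table[0];

    for (;;) {
        size_t i;
        for (i = 0; i < n; i++) {
            size_t len = strlen(table[i].text);
            if (strncmp(grammar, table[i].text, len) == 0) {
                if (table[i].is_case)
                    state->mixed_case = table[i].value;
                else
                    state->strict = table[i].value;
                grammar += len;
                break;
            }
        }
        if (i == n)
            return grammar;  /* no further option at this position */
    }
}
```

Preloading the state with the equivalent of IPRSE_NOMIXED_CASE and then scanning "<IPRSE_MIXED_CASE> A B C" leaves mixed_case set, which mirrors the override described for the options parameter.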
In the grammar, there can be two types of parameters: positional and keyword. Any positional parameters must come before the keywords.
Input strings must either match characters or match character types, depending on the type of parameter in the grammar. In the grammar, characters are letters (A-Z) or digits (0-9), and character types are a, c, d, u, w, x, *, and +, which are described later in this section. The sections on positional parameters and keyword parameters explain how matching characters and character types are used by the grammar for the parameter.
The following are general rules for parameters in the grammar:
Example:
Grammar: [C,A]
Input string: C,A or nothing
Example:
Grammar: {A|B}
Input string: must be either A or B, but not both
A period (.) can be used to delimit parameters that use character types.
The parser deletes multiple delimiters between parameters. The syntax does not include a null positional parameter.
Examples:
A B,C     contains 3 parameters: A, B, and C
A , B     contains 2 parameters: A and B
A ,, B    contains 2 parameters: A and B
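The effect of deleting multiple delimiters can be sketched with a small counting routine. This is an illustration only: count_params is a hypothetical name, and the sketch assumes that blanks and commas are the only parameter delimiters, which is what the three examples above use.

```c
/* Counts parameters in an input string, treating any run of blanks and
   commas as a single delimiter, so no null parameter is ever produced.
   Assumption: blank and comma are the only delimiters. */
int count_params(const char *input)
{
    int count = 0;
    int in_param = 0;  /* 1 while scanning inside a parameter */

    for (; *input != '\0'; input++) {
        if (*input == ' ' || *input == ',') {
            in_param = 0;
        } else if (!in_param) {
            in_param = 1;
            count++;
        }
    }
    return count;
}
```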
In the grammar, leading uppercase letters indicate that they must be matched in the input string, and trailing lowercase letters indicate that they are optional for the input string. If you are creating a grammar for a parameter with matching characters, always begin it with an uppercase letter. (In the input string, all letters must be entered in uppercase unless the IPRSE_MIXED_CASE option is specified.)
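The uppercase and lowercase convention amounts to an abbreviation rule, sketched below. The function name matches_abbreviation is hypothetical, and the sketch assumes that the input must be a prefix of the grammar word (so for Abc the accepted inputs are A, AB, and ABC) and that the IPRSE_MIXED_CASE option is not in effect.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 when input is an acceptable abbreviation of the grammar word:
   it must contain every leading uppercase character of the grammar and
   may stop anywhere within the trailing lowercase characters.
   Assumption: input is compared in uppercase only (no mixed case). */
int matches_abbreviation(const char *grammar, const char *input)
{
    size_t required = 0;
    size_t i;

    /* Count the mandatory leading characters (everything before the
       first lowercase letter). */
    while (grammar[required] != '\0' &&
           !islower((unsigned char)grammar[required]))
        required++;

    for (i = 0; input[i] != '\0'; i++) {
        if (grammar[i] == '\0')
            return 0;  /* input is longer than the grammar word */
        if (toupper((unsigned char)grammar[i]) != (unsigned char)input[i])
            return 0;  /* character mismatch */
    }
    return i >= required;
}
```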
Unlike w (see below), u treats asterisks (*) the same as any other character; each asterisk in the input string must match one u in the grammar.
An asterisk (*) in the input string behaves as a wildcard character. In the input string, one asterisk can match any number of w character types in the grammar. If w is coded in a grammar, it cannot be followed by a different character type.
An example of parameters in a list follows:
Grammar: ccccccc.ddddd
Input string: WARNING.12225
An example of a wildcard list follows:
Grammar: (ccccc)*
Input string: PARMA.PARMB.PARMC
The components of a list can be arbitrarily long when the + or * matching character is coded in a list grammar. You can restrict the total length of a list to a maximum length by coding a colon (:) followed by the maximum length at the end of the grammar for the list parameter. For example, the following grammar:
a*.a*:20
accepts an input string consisting of two alphanumeric strings separated by a period. Each alphanumeric string may be of any length as long as the total length of the input list does not exceed 20 characters.
Similarly, the following grammar:
(a*)*:42
accepts an input string consisting of any number of alphanumeric strings separated by periods, as long as the total length of the input list does not exceed 42 characters.
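The length restriction can be sketched as two small helpers. Both function names are hypothetical, and the sketch assumes that the periods between list items count toward the total length; the text above does not say so explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the limit coded after the final ':' in a list grammar, or 0
   when the grammar has no length restriction. */
size_t list_max_length(const char *grammar)
{
    const char *colon = strrchr(grammar, ':');

    if (colon == NULL)
        return 0;
    return (size_t)strtoul(colon + 1, NULL, 10);
}

/* Returns 1 when the input list fits within the grammar's limit.
   Assumption: separating periods are included in the total length. */
int list_length_ok(const char *grammar, const char *input)
{
    size_t limit = list_max_length(grammar);

    return limit == 0 || strlen(input) <= limit;
}
```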
A positional parameter is a parameter that must be entered in a specific position of the input string syntax and before the keyword parameters.
The following are examples of positional parameters:
1) Grammar: ZDSMG ACTION SDA
   Input string: ZDSMG ACTION SDA
2) Grammar: P1 cc.dd (xx)*
   Input string: P1 AB.01 F0.E1.D2
A positional parameter can:
Examples:
1) Grammar: ABC
   Input string must be: ABC
2) Grammar: Abc
   Input string can be: A or AB or ABC
Examples:
1) Grammar: acd*
   Input string can be: AX1 or 0Y12 or ZZ123 and so on
2) Grammar: d+++
   Input string can be one to four digits: 1 or 21 or 345 or 5555 and so on
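The two examples above can be reproduced with a short recursive matcher. This is a sketch under stated assumptions, not the IPRSE algorithm: the names type_ok, match_from, and matches_types are hypothetical; only the a, c, d, and x character types are handled; a is taken to mean an uppercase letter or digit, c an uppercase letter, d a decimal digit, and x a hexadecimal digit; + is taken to mean one optional further character of the preceding type, and * any number of further characters of the preceding type.

```c
#include <ctype.h>

/* Returns nonzero when ch belongs to the given character type.
   Assumption: letters are uppercase only (no mixed case). */
static int type_ok(char type, char ch)
{
    unsigned char c = (unsigned char)ch;

    switch (type) {
    case 'a': return isdigit(c) || isupper(c);
    case 'c': return isupper(c) != 0;
    case 'd': return isdigit(c) != 0;
    case 'x': return isdigit(c) || (c >= 'A' && c <= 'F');
    default:  return 0;
    }
}

/* Matches the rest of the grammar g against the rest of the input s.
   prev is the most recent character type, which + and * repeat. */
static int match_from(const char *g, char prev, const char *s)
{
    if (*g == '\0')
        return *s == '\0';
    if (*g == '+') {
        /* Either skip the optional character or consume exactly one. */
        if (match_from(g + 1, prev, s))
            return 1;
        return *s != '\0' && type_ok(prev, *s) &&
               match_from(g + 1, prev, s + 1);
    }
    if (*g == '*') {
        /* Either stop repeating or consume one more and stay on '*'. */
        if (match_from(g + 1, prev, s))
            return 1;
        return *s != '\0' && type_ok(prev, *s) &&
               match_from(g, prev, s + 1);
    }
    return *s != '\0' && type_ok(*g, *s) && match_from(g + 1, *g, s + 1);
}

/* Returns 1 when the whole input matches the whole grammar token. */
int matches_types(const char *grammar, const char *input)
{
    return match_from(grammar, '\0', input);
}
```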
Example:
Grammar: acd.xw
Input string can be:
   For the whole list: 0A1.F*
   For the first item: AB9.
   For the second item: .0A
Example:
Grammar: (cd)*
Input string: A1, or F2.C3, or D4.E5.Z2, or any combination of cd repeated up to 20 times. Each item in the input string must be delimited from the following item by a period (.).
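A wildcard list such as (cd)* can be checked item by item, as the following sketch shows. It is an illustration only: matches_wildcard_list, type_ok, and MAX_LIST_ITEMS are hypothetical names; the item grammar is assumed to contain only the fixed-length character types c and d; letters are assumed to be uppercase; and the limit of 20 items is taken from the example above.

```c
#include <ctype.h>
#include <string.h>

#define MAX_LIST_ITEMS 20  /* repetition limit stated in the example */

/* Returns nonzero when ch belongs to the given character type.
   Assumption: only c (uppercase letter) and d (digit) are supported. */
static int type_ok(char type, char ch)
{
    unsigned char c = (unsigned char)ch;

    switch (type) {
    case 'c': return isupper(c) != 0;
    case 'd': return isdigit(c) != 0;
    default:  return 0;
    }
}

/* item_types is the text inside the parentheses, for example "cd" for
   the grammar (cd)*. Returns 1 when input is one or more matching
   items separated by single periods. */
int matches_wildcard_list(const char *item_types, const char *input)
{
    size_t item_len = strlen(item_types);
    int items = 0;

    if (item_len == 0)
        return 0;
    for (;;) {
        size_t i;

        /* type_ok rejects the terminating null, so this never reads
           past the end of the input. */
        for (i = 0; i < item_len; i++)
            if (!type_ok(item_types[i], input[i]))
                return 0;
        items++;
        input += item_len;
        if (*input == '\0')
            return items <= MAX_LIST_ITEMS;
        if (*input != '.')
            return 0;
        input++;  /* skip the period before the next item */
    }
}
```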
A keyword parameter is a parameter whose value is determined by having a value assigned to the keyword name.
Keyword parameters must be entered after all positional parameters (if there are any). The keyword parameters themselves can be entered in any order. The following is an example of keyword and positional parameters:
ZSIPC ALTER INTERVAL TIME-xx PRIM=ccc ALTERN-xx
There are two forms of keywords: regular and self-defining. They differ in their grammar form and in their range of possible values. Regular keywords can take many values, whereas self-defining keywords have only two possible values.
Regular keywords are keywords that can have a range of values.
Keyword name=keyword value
or
Keyword name-keyword value
The hyphen (-) and the equal sign (=) are keyword delimiters. No blanks are allowed between the keyword name, the delimiter (- or =), and the keyword value.
The keyword name can use only matching characters. The keyword value can use only character types and lists (but not wildcard lists).
The following are examples of valid keyword parameters:
KEY-aaaa
KEY=aaaaaa
The following are examples of keyword parameters that are not valid:
KEY - xxxxxxx    blank between KEY and hyphen; and blank between hyphen and xxxxxx
KEY -xxxxxx      blank between KEY and hyphen
KEY= xxxxxx      blank between equal sign and xxxxxx
KEY--xxxxxx      multiple occurrence of hyphen
A regular keyword parameter can:
The keyword name must match specific characters.
Example:
Grammar is: Key=cd.x
Input string for keyword name portion only: K, KE, or KEY
The keyword value can use character types.
Examples:
1) Grammar is: Key=cd
   Keyword value: cd
   Input string for cd can be: A1 or M1 or B5 and so on
2) Grammar is: KEY=d+++
   Keyword value: d+++
   Input string for d+++ can be 1 to 4 digits: 1 or 21 or 345 or 5555 and so on
Only the keyword value can be a list.
Example:
Grammar is: KEYword=acd.xw
Keyword value is a list: acd.xw
Input string can be:
   For the whole list: 0A1.F*
   For the first item: AB9.
   For the second item: .0A
Self-defining keywords are keywords that return one of only two values: Y when the keyword is entered by itself, or N when the keyword is entered with the NO prefix.
Self-defining keywords have the following form:
(NO)Keyword
Use matching characters for self-defining keywords.
Example:
1) Grammar is: (NO)WAlk
   Input can be:
      NOWALK - gives a value of N.
      WALK - gives a value of Y.
      NOWA - gives a value of N.
      and so on.
2) Grammar: (NO)ERASE
   Possible input strings:
      ERASE - the value assigned is Y.
      NOERASE - the value assigned is N.
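The mapping from a self-defining keyword to its Y or N value can be sketched as follows. The function names are hypothetical, the return value 0 for "no match" is a convention of this sketch only, and the abbreviation rule assumes that the input must be a prefix of the keyword name containing at least all of its leading uppercase letters.

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 when input is an acceptable abbreviation of the grammar word
   (all leading uppercase characters present, trailing lowercase optional).
   Assumption: uppercase input only. */
static int matches_abbreviation(const char *grammar, const char *input)
{
    size_t required = 0;
    size_t i;

    while (grammar[required] != '\0' &&
           !islower((unsigned char)grammar[required]))
        required++;
    for (i = 0; input[i] != '\0'; i++) {
        if (grammar[i] == '\0')
            return 0;
        if (toupper((unsigned char)grammar[i]) != (unsigned char)input[i])
            return 0;
    }
    return i >= required;
}

/* Returns 'Y' when input names the keyword, 'N' when input is NO followed
   by the keyword, and 0 when input does not match the grammar at all. */
char self_defining_value(const char *grammar, const char *input)
{
    const char *name;

    if (strncmp(grammar, "(NO)", 4) != 0)
        return 0;  /* not a self-defining keyword grammar */
    name = grammar + 4;

    if (matches_abbreviation(name, input))
        return 'Y';
    if (strncmp(input, "NO", 2) == 0 && matches_abbreviation(name, input + 2))
        return 'N';
    return 0;
}
```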
When the IPRSE_parse function is called with the IPRSE_MIXED_CASE option, the parser accepts input values in lower case. The values returned in the parser results can be translated to upper case by adding a less-than sign (<) at the end of each grammar token for which the matching value should be translated; otherwise the values returned in the parser results are the same as in the input string. In the following example, assume that the IPRSE_MIXED_CASE option is in effect.
Grammar: c c< c
Input: x y z
Result values: x Y z
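The effect of the less-than sign on a returned value can be sketched as a copy routine that translates to uppercase only when the grammar token ends with <. The function name copy_result_value and its buffer interface are hypothetical; the real parser returns its results through its own structures.

```c
#include <ctype.h>
#include <string.h>

/* Copies input into out (capacity out_size, including the terminating
   null). The copy is translated to uppercase only when grammar_token
   ends with '<'. Returns 0 on success or -1 when out is too small. */
int copy_result_value(const char *grammar_token, const char *input,
                      char *out, size_t out_size)
{
    size_t glen = strlen(grammar_token);
    size_t ilen = strlen(input);
    int to_upper = glen > 0 && grammar_token[glen - 1] == '<';
    size_t i;

    if (ilen + 1 > out_size)
        return -1;
    for (i = 0; i < ilen; i++)
        out[i] = to_upper ? (char)toupper((unsigned char)input[i])
                          : input[i];
    out[ilen] = '\0';
    return 0;
}
```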
Do not write ambiguous grammars, such as:
Accepts any character string between 2 and 4 letters. If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.
Possible input strings: ABCD or ABC or AB and so on.
Accepts 2 letters followed by 1 or 2 decimal digits. If the IPRSE_NOMIXED_CASE option is used both letters must be uppercase.
Possible input strings: XY23 or XY2 and so on.
Accepts any character string with at least 2 letters. If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.
Possible input strings: AB or ABCD or ABCDEFGHILLLO
Accepts the following input strings:
AA.12 XY.66
.45
AB.
If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.
Accepts 2 letters or any number of multiples of 2 letters with periods between each 2 letters.
Input string: XX.XX.AB.CD or LA or DO.RE.MI and so on.
If the IPRSE_NOMIXED_CASE option is used the input string must contain all uppercase letters.
Grammar: www
Accepts 3 characters, including letters, decimal digits, underscores (_), and asterisks (*). If the IPRSE_NOMIXED_CASE option is used, all of the letters must be in uppercase. An asterisk can match one or more w characters.
Input strings:
*X or ABC or *1B
* or ** or ***
A* or *B* or 1*2 or 0** and so on.
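The input strings listed for www are consistent with a simple counting rule, sketched below. This is an inference from the examples and not a description of the parser: matches_w_pattern is a hypothetical name, each asterisk is assumed to stand for one or more w positions, every other input character is assumed to fill exactly one position, and letters are assumed to be uppercase.

```c
#include <ctype.h>
#include <stddef.h>

/* w_count is the number of w character types in the grammar (3 for www).
   Returns 1 when input can be laid over exactly w_count positions, with
   each '*' covering one or more positions and every other character
   (uppercase letter, digit, or underscore) covering exactly one. */
int matches_w_pattern(size_t w_count, const char *input)
{
    size_t plain = 0;  /* characters that fill exactly one position */
    size_t stars = 0;  /* asterisks, each filling at least one position */

    for (; *input != '\0'; input++) {
        unsigned char c = (unsigned char)*input;

        if (c == '*')
            stars++;
        else if (isupper(c) || isdigit(c) || c == '_')
            plain++;
        else
            return 0;  /* character not allowed for type w */
    }
    if (stars == 0)
        return plain == w_count;       /* no wildcard: exact length */
    return plain + stars <= w_count;   /* asterisks absorb the remainder */
}
```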