If you want to develop a DLF converter for an application whose logging data model isn't adequately represented by one of the existing DLF schemas, you'll need to develop a new schema.
If you are familiar with SQL, a DLF schema is similar to a table schema description. A DLF file can be seen as a table, where each log record is represented by a table row. Each log record in the same DLF schema shares the same fields.
In this chapter, we will create a new schema for logging FTP sessions. That DLF schema could serve as the basis for an improved DLF converter for log files generated by Microsoft Internet Information Server™. Lire currently has a DLF converter for these log files, but the current ftp DLF schema is modelled after the xferlog log file, which only represents file transfers, whereas the log generated by Microsoft Internet Information Server™ contains more detailed information on the FTP session.
Here is an example of such a log file:
    #Software: Microsoft Internet Information Server 4.0
    #Version: 1.0
    #Date: 2001-11-29 00:01:32
    #Fields: time c-ip cs-method cs-uri-stem sc-status
    00:01:32 10.0.0.1 [56]created spacedat/091001092951LGW_Data.zip 226
    00:01:32 10.0.0.1 [56]created spacedat/html/bx01g01.gif 226
    00:01:32 10.0.0.1 [56]created spacedat/html/catlogo.gif 226
    00:01:32 10.0.0.1 [56]QUIT - 226
    00:03:32 10.0.0.1 [58]USER badm 331
    00:03:32 10.0.0.1 [58]PASS - 230
As you can see, this log file contains more information than the simple upload/download represented in the standard FTP schema. It includes a session identifier, the command executed, and the result code of the action. Our new schema should be able to represent these things.
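To make the extra information concrete, here is a minimal Python sketch that splits the data lines of the sample log into their parts. The layout is taken from the #Fields directive of the sample; the names LINE_RE and parse_line are hypothetical and not part of Lire, and the sketch does not attempt to translate values such as "created" into FTP protocol command names.

```python
import re

# Hypothetical helper, not part of Lire. The layout follows the
# "#Fields: time c-ip cs-method cs-uri-stem sc-status" directive.
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) "            # time
    r"(?P<ip>\S+) "                             # c-ip
    r"\[(?P<session>\d+)\](?P<command>\S+) "    # cs-method, e.g. "[56]created"
    r"(?P<argument>\S+) "                       # cs-uri-stem, "-" when absent
    r"(?P<status>\d{3})$"                       # sc-status
)


def parse_line(line):
    """Return a dict for a data line, or None for directives and blank lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    match = LINE_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    if record["argument"] == "-":
        record["argument"] = None
    return record


SAMPLE = """\
#Fields: time c-ip cs-method cs-uri-stem sc-status
00:01:32 10.0.0.1 [56]created spacedat/html/catlogo.gif 226
00:01:32 10.0.0.1 [56]QUIT - 226
00:03:32 10.0.0.1 [58]USER badm 331
"""

records = [r for r in map(parse_line, SAMPLE.splitlines()) if r is not None]
for r in records:
    print(r["session"], r["command"], r["argument"], r["status"])
```

Each resulting dictionary carries the session identifier, the command, its argument and the result code, which are the pieces the new schema has to hold.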
To create a DLF schema, you have to create an XML file named after your schema identifier: ftpproto.xml. The schema identifier should be made of alphanumeric characters and is case sensitive. Your schema identifier shouldn't contain hyphens (-) or underscore characters (_). (The hyphen is used for a special purpose.)
All DLF schemas start and end the same way:
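The naming rule can be checked mechanically. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes that "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
SCHEMA_ID_RE = re.compile(r"[A-Za-z0-9]+")


def is_valid_schema_id(schema_id):
    """True when the identifier has no hyphen, underscore or other symbol."""
    return SCHEMA_ID_RE.fullmatch(schema_id) is not None


def schema_filename(schema_id):
    """Return the file name a schema with this identifier is stored under."""
    if not is_valid_schema_id(schema_id):
        raise ValueError("invalid schema identifier: %r" % schema_id)
    return schema_id + ".xml"


print(schema_filename("ftpproto"))
print(is_valid_schema_id("ftp-proto"), is_valid_schema_id("ftp_proto"))
```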
    <?xml version="1.0" encoding="ascii"?>
    <!DOCTYPE lire:dlf-schema PUBLIC
      "-//LogReport.ORG//DTD Lire DLF Schema Markup Language V1.1//EN"
      "http://www.logreport.org/LDSML/1.1/ldsml.dtd">
    <lire:dlf-schema xmlns:lire="http://www.logreport.org/LDSML/"
                     superservice="ftpproto"
                     timestamp="time"
    >

    <!-- Other elements will go here -->

    </lire:dlf-schema>
The first lines contain the usual XML and DOCTYPE declarations that you'll find in many XML documents. The real content starts at the lire:dlf-schema element. What matters for your schema are the values of the superservice and timestamp attributes. The first contains your schema identifier; it is called “superservice” for historical reasons. The second should contain the name of the field that orders the records by event time. (See the section called “The Field Types” for more information.)
The last line in the above excerpt would be the last thing in the file and closes the lire:dlf-schema element.
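As a quick check that the skeleton is well-formed and carries the two attributes, it can be read with Python's standard xml.etree.ElementTree module. This is only a sketch: the XML and DOCTYPE declarations are left out of the sample string to keep it short, and the variable names are ours.

```python
import xml.etree.ElementTree as ET

LIRE_NS = "http://www.logreport.org/LDSML/"

# The XML and DOCTYPE declarations are omitted here for brevity.
SKELETON = """\
<lire:dlf-schema xmlns:lire="http://www.logreport.org/LDSML/"
                 superservice="ftpproto"
                 timestamp="time">
  <!-- Other elements will go here -->
</lire:dlf-schema>
"""

root = ET.fromstring(SKELETON)

# ElementTree expands the lire: prefix to "{namespace}localname".
root_tag = root.tag

# Unprefixed attributes are looked up by their plain name.
schema_id = root.get("superservice")
timestamp_field = root.get("timestamp")

print(root_tag)
print(schema_id, timestamp_field)
```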
The next things that go into the schema file are the schema's title and description. Both are intended for developers to read and should convey the scope of the schema:
    <!-- Starting lire:dlf-schema element was omitted -->

    <lire:title>DLF Schema for FTP Protocol</lire:title>
    <lire:description>
      <para>This DLF schema should be used for FTP servers that have
        detailed information on the FTP connection in their log files.
      </para>

      <para>Each record represents a command done by the client during
        the FTP session.
      </para>
    </lire:description>
The content of the lire:description element is DocBook markup. If you don't know DocBook, you just need to know that paragraphs are delimited using para elements.
The only remaining things in the schema definitions are the field specifications. Here is the definition of the first one:
    <lire:field name="time" type="timestamp" label="Timestamp">
      <lire:description>
        <para>This field contains the timestamp at which the command
          was issued.
        </para>
      </lire:description>
    </lire:field>
As you can see, the fields are defined using the lire:field element which has three attributes:
name
    This attribute contains the name of the field. The name should contain only alphanumeric characters and underscores.
type
    This attribute contains the type of the field. The available types are described shortly.
label
    This attribute contains the column label used by default in your reports for data coming from this field. The label should be short but descriptive.
The field's description is held in the lire:description element which contains DocBook markup. The field's description should be descriptive enough so that someone implementing a DLF converter for this schema knows what goes where.
The main types available for fields are:
timestamp
    This type should be used for fields whose value indicates a particular point in time. All timestamp values are represented in the usual UNIX convention: the number of seconds since January 1st, 1970.
    Each DLF schema must contain at least one field of this type, and its name should appear in the timestamp attribute of lire:dlf-schema.
hostname
    This type should be used for fields which contain a hostname or an IP address.
    It is important to mark such fields, because it will eventually be possible to resolve IP addresses to hostnames automatically.
bool
    Type for boolean values.
number
    Type for numeric values.
    You shouldn't use this type when the values are limited in number and form an enumeration, like result codes. Use the string type for those.
    You should only use the number type for values which you'll want to report in classes rather than as individual values.
bytes
    This type should be used for numeric values which are quantities in bytes. The more specific typing is useful for display purposes.
duration
    This type should be used for numeric values which are quantities of time. The more specific typing is useful for display purposes.
string
    This is the type to use for everything else.
If you read the specifications, you'll find other types which are used. These additional types don't bring anything over the basic ones defined above and you shouldn't use them.
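Since timestamp fields hold seconds since the epoch, a converter for the sample log would have to combine the day from the #Date directive with the time of day on each line. The Python sketch below shows one way to do that. It assumes the times in the log are in UTC, which the sample does not state, and the function name to_epoch is hypothetical.

```python
from datetime import datetime, timezone


def to_epoch(date_str, time_str):
    """Combine a '#Date:' day and a per-line time into UNIX seconds.

    Assumption: the log times are expressed in UTC.
    """
    moment = datetime.strptime(date_str + " " + time_str, "%Y-%m-%d %H:%M:%S")
    return int(moment.replace(tzinfo=timezone.utc).timestamp())


epoch = to_epoch("2001-11-29", "00:01:32")
print(epoch)

# Converting back gives the original date and time again.
round_trip = datetime.fromtimestamp(epoch, tz=timezone.utc)
print(round_trip.strftime("%Y-%m-%d %H:%M:%S"))
```

If the server logs in local time instead, the offset has to be applied before storing the value.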
In addition to the time field defined above, here are the remaining field definitions which make up our complete ftpproto schema:
    <lire:field name="sessid" type="string" label="Session">
      <lire:description>
        <para>This field should contain an identifier that can be used
          to relate the commands done in the same FTP session. This
          identifier can be reused, but shouldn't be while the FTP
          session isn't closed.
        </para>
      </lire:description>
    </lire:field>

    <lire:field name="command" type="string" label="Command">
      <lire:description>
        <para>This field contains the FTP command executed. The FTP
          protocol command names (STOR, RETR, APPE, USER, etc.) should
          be used.
        </para>
      </lire:description>
    </lire:field>

    <lire:field name="result" type="string" label="Result">
      <lire:description>
        <para>This field should contain the FTP result code after
          executing the command.
        </para>
      </lire:description>
    </lire:field>

    <lire:field name="cmd_args" type="string" label="Argument">
      <lire:description>
        <para>This field should contain the parameters to the FTP
          command.
        </para>
      </lire:description>
    </lire:field>

    <lire:field name="size" type="bytes" label="Bytes Transferred">
      <lire:description>
        <para>When the command involves a transfer, as for the RETR or
          STOR command, this field should contain the number of bytes
          transferred.
        </para>
      </lire:description>
    </lire:field>

    <lire:field name="elapsed" type="duration" label="Elapsed">
      <lire:description>
        <para>This field contains the number of seconds executing the
          command took.
        </para>
      </lire:description>
    </lire:field>
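Before installing a schema, it can be useful to check it against the rules given in this chapter: field names use only alphanumeric characters and underscores, and the timestamp attribute names a field of type timestamp. The Python sketch below does this with the standard library. It is not a Lire tool; check_schema is a hypothetical name, and the sample abbreviates the fields to empty elements (no title or descriptions), which may not be valid against the real DTD.

```python
import re
import xml.etree.ElementTree as ET

NS = {"lire": "http://www.logreport.org/LDSML/"}
FIELD_NAME_RE = re.compile(r"[A-Za-z0-9_]+")

# Abbreviated copy of the ftpproto schema: title and descriptions omitted.
SCHEMA = """\
<lire:dlf-schema xmlns:lire="http://www.logreport.org/LDSML/"
                 superservice="ftpproto" timestamp="time">
  <lire:field name="time" type="timestamp" label="Timestamp"/>
  <lire:field name="sessid" type="string" label="Session"/>
  <lire:field name="command" type="string" label="Command"/>
  <lire:field name="result" type="string" label="Result"/>
  <lire:field name="cmd_args" type="string" label="Argument"/>
  <lire:field name="size" type="bytes" label="Bytes Transferred"/>
  <lire:field name="elapsed" type="duration" label="Elapsed"/>
</lire:dlf-schema>
"""


def check_schema(xml_text):
    """Return a list of problems; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    fields = {
        f.get("name", ""): f.get("type")
        for f in root.findall("lire:field", NS)
    }
    problems = []
    for name in fields:
        if FIELD_NAME_RE.fullmatch(name) is None:
            problems.append("bad field name: %r" % name)
    timestamp_name = root.get("timestamp")
    if fields.get(timestamp_name) != "timestamp":
        problems.append("timestamp attribute %r does not name a timestamp field"
                        % timestamp_name)
    return problems


print(check_schema(SCHEMA))
```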
Making the new schema available to the Lire framework is easy: just copy the file to one of the directories listed in the lr_schemas_path configuration variable. By default, this variable contains the directories datadir/lire/schemas and HOME/.lire/schemas. Like all other configuration variables, its value can be changed using the lire tool.
Since we want our schema to be available for other users as well, we will install it in the system directory:
    # install -m 644 ftpproto.xml /usr/local/share/lire/schemas
(In this case, Lire was installed under /usr/local.)