To extract data from a data source, you need to know the structure of the data, create an extract, transform, and load (ETL) catalog that reproduces this structure in IBM® Cognos® Data Manager, and then set up the processes required to transform the data into the star schema or metadata format required by the data mart or IBM Cognos Framework Manager.
To simplify this process, IBM Rational® Insight provides sample catalogs for extracting data from the data services, loading the data into the operational data store in the IBM Rational Insight data warehouse, and building the conformed data marts. This section is an overview of these sample catalogs.
An IBM Cognos Data Manager catalog is a central repository that describes how information is extracted, transformed, and loaded from external sources. Broadly, a Data Manager catalog describes the following (a simplified illustration of these components follows the list):
- The normalization and mapping of external data into a relational schema
- The transformation of data from the operational data stores and its mapping and insertion into the fact and dimension tables of the data mart
- The job streams required to organize the sequence of builds
- The metadata that describe the target data mart or warehouse
- The common reusable components (dimensions, connections, and user-defined
functions)
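A Data Manager catalog is created and maintained in the Data Manager user interface rather than in code. Purely as an illustration of how these components relate to one another, the following Python sketch models connections, builds, and job streams as simple data structures; every class, connection, and job name in it is hypothetical and does not come from the sample catalogs.

```python
# Illustrative only: a hypothetical, simplified model of the kinds of
# components that a Data Manager catalog groups together. The real
# catalog is defined in the Data Manager user interface, not in Python.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Connection:
    name: str          # reusable component: an alias for a data service
    url: str

@dataclass
class Build:
    name: str          # a single extract-transform-load step
    source: Connection
    target_table: str

@dataclass
class JobStream:
    name: str                          # organizes builds into a sequence
    steps: List[Build] = field(default_factory=list)

@dataclass
class Catalog:
    connections: List[Connection]      # common reusable components
    builds: List[Build]                # fact and dimension builds
    job_streams: List[JobStream]       # the overall execution order

# A toy catalog with one source, one build, and one job stream.
src = Connection("QM_DataService", "https://example.com/qm/rest")
load_req = Build("Load_Requirements", src, "REQUIREMENT")
catalog = Catalog([src], [load_req], [JobStream("Nightly_ETL", [load_req])])
```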
The Rational Insight sample catalogs facilitate the extraction, transformation, and loading of data from the following products:
- IBM Rational ClearQuest®
- IBM Rational ClearCase®
- IBM Rational RequisitePro®
- IBM Rational Quality Manager
- IBM Rational Team Concert™
- IBM Rational Test Manager
- Microsoft® Project
These sample catalogs contain predefined job streams and builds
that map data extracted from these sources into corresponding tables,
and transform these tables into facts and dimensions. Every sample
catalog contains predefined libraries, builds, and metadata.
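As a rough illustration of what transforming tables into facts and dimensions means, the following Python sketch splits flat operational records into a small dimension table and a fact table. The record layout and column names are invented for the example and are not taken from the sample catalogs.

```python
# Illustrative only: splitting flat operational rows into a dimension
# table and a fact table. Column names are made up for the example.
requests = [
    {"id": 1, "project": "Payroll", "state": "Open"},
    {"id": 2, "project": "Payroll", "state": "Closed"},
    {"id": 3, "project": "Billing", "state": "Open"},
]

# Dimension: one surrogate key per distinct project name.
project_dim = {name: key for key, name in
               enumerate(sorted({r["project"] for r in requests}), start=1)}

# Fact: one row per request, referencing the dimension by surrogate key.
request_fact = [{"request_id": r["id"],
                 "project_key": project_dim[r["project"]],
                 "state": r["state"]} for r in requests]

print(project_dim)   # {'Billing': 1, 'Payroll': 2}
print(request_fact)
```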
Example
Consider the sample ETL build that is used with RequisitePro to extract requirements management data. The build loads data from the source as follows (a simplified sketch of this mapping follows the list):
- All types of requirements into the REQUIREMENT table
of the operational data store
- Requirement type information into the REQUIREMENT_CLASSIFICATION table
- Requirement revision data into the REQUIREMENT_REVISION_HISTORY table
- Requirement parent-child relationships into the REQUIREMENT_REQUIREMENT_LOOKUP table
- Requirement traceability data into the REQUIREMENT_HIERARCHY table
- Project data into the PROJECT table
- User data into the RESOURCE table
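The following Python sketch illustrates, in very simplified form, how extracted requirement records might be routed to some of the operational data store tables listed above. The field names and the routing function are assumptions made for the example; the actual mapping is defined by the RequisitePro build in the sample catalog.

```python
# Illustrative only: routing one extracted requirement to the operational
# data store tables named above. Field names are assumed for the example;
# the remaining tables (PROJECT, RESOURCE, and so on) follow the same pattern.
def route_requirement(req, ods):
    """Append rows derived from one source requirement to the ODS tables."""
    ods["REQUIREMENT"].append(
        {"req_id": req["id"], "name": req["name"], "type": req["type"]})
    ods["REQUIREMENT_CLASSIFICATION"].append(
        {"req_id": req["id"], "classification": req["type"]})
    for rev in req.get("revisions", []):
        ods["REQUIREMENT_REVISION_HISTORY"].append(
            {"req_id": req["id"], "revision": rev})
    if req.get("parent_id") is not None:
        ods["REQUIREMENT_REQUIREMENT_LOOKUP"].append(
            {"parent_id": req["parent_id"], "child_id": req["id"]})

ods = {"REQUIREMENT": [], "REQUIREMENT_CLASSIFICATION": [],
       "REQUIREMENT_REVISION_HISTORY": [], "REQUIREMENT_REQUIREMENT_LOOKUP": []}
route_requirement({"id": 101, "name": "Login page", "type": "Functional",
                   "revisions": ["1.0", "1.1"], "parent_id": None}, ods)
```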
The sample ETL model for IBM Rational RequisitePro also contains predefined ETL job streams that organize jobs into predefined categories. A job stream is defined for each external data source, and every job stream can be scheduled independently of the others.
The OOTBE2E job is the top-level job that defines the overall flow. It runs the jobs that load data from the different external operational data sources into the corresponding tables, and then runs the job that builds the data mart in the data warehouse. The Build_Star job populates the data mart. It does the following (a simplified sketch of this sequence follows the list):
- Starts the Dimensions job to populate the dimension
tables
- Starts the Facts job to populate the fact tables
- Starts the HistoricalLookups job to populate the historical lookup tables
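The ordering is the key point: the dimension tables are populated before the fact tables, which is the usual star-schema sequence so that fact rows can resolve their dimension references. The following Python sketch mimics that sequence with hypothetical job and function names; the real jobs are defined and scheduled in Data Manager.

```python
# Illustrative only: mimicking the ordering of the top-level job stream.
# Job and function names here are hypothetical stand-ins.
def run_job(name):
    print(f"running {name}")

def ootbe2e():
    # 1. Load data from each external source into the operational data store.
    for source_job in ("ClearQuest_ETL", "RequisitePro_ETL", "QualityManager_ETL"):
        run_job(source_job)
    # 2. Build the data mart: dimensions first, then facts, then lookups.
    build_star()

def build_star():
    run_job("Dimensions")         # populate the dimension tables
    run_job("Facts")              # populate the fact tables
    run_job("HistoricalLookups")  # populate the historical lookup tables

ootbe2e()
```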
You can customize these builds and job streams to suit your specific business situation and your implementation of these products.