Work Product Descriptor (Artifact): Data Model
This artifact describes logical entities, the attributes that identify and describe those entities, and the relationships between them.
Purpose

It serves the following purposes:

  • Promotes understanding and communication between stakeholders and the database modeler
  • Provides an implementation-free set of requirements as input to physical database design
  • Validates the data stores and flows in a functional model against the analysis of data storage and processing needs
Relationships
Fulfilled Slots
Roles
Responsible: Modified By:
Input To
Mandatory:
  • None
Optional: External:
  • None
Output From
Description
Main Description

The figure below illustrates how a data model relates to the overall logical data design, which is obtained by progressing the data model through successive states: entity relationship model, conceptual data model, and logical data model.

ERD - CDM - LDM transitions

Although the figure above shows the development of a logical data design as a cascading series of steps, the delineation between these steps is not that clear-cut.  In reality, a logical data design expands, contracts, and interplays across these data model states throughout the development.

Entity Relationship Model (ERM)

An ERM level of a logical data design presents a high-level view of the significant information of interest to the business and establishes the foundation from which data design will progress.  It depicts the significant business notions and concepts associated with an application as entities, may elaborate on the important business characteristics of those notions and concepts through the attributes listed for each entity, and shows the relationships between them in the form of data relationships.

Conceptual Data Model (CDM)

A CDM level of a logical data design depicts a high-level statement of the main entities needed to support an application, along with known (but not necessarily complete) listings of the attributes associated with the entities in the data design.  It expands the information conveyed in the ERM level of a logical data model and is usually discovered via top-down analysis.  Development of a logical data design to a CDM level is closely integrated with the development of an application's process models.

Logical Data Model (LDM)

An LDM is an implementation-independent data model and generally represents the final deliverable in the logical data design efforts of a project.  Using the CDM as input, it reflects the dynamic nature of the entities in the logical data design and optimizes the entities in the data model so that each strongly and uniquely represents a business notion.

 

Brief Outline

This artifact typically includes the following constructs:

  • Entities
  • Attributes
    • Primary keys that can be used to uniquely identify entity occurrences
    • Foreign keys
    • Alternate keys
    • Data types
    • Valid values (domains)
  • Relationships
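
As a rough illustration of how the constructs listed above fit together, the following Python sketch records an entity with its attributes, keys, data types, and domains, and checks occurrences against them. The class names (Attribute, Entity), the helper duplicate_keys, and the Customer example are assumptions made for illustration only; they are not part of this artifact or of any modeling tool.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: type                        # logical data type, e.g. str or int
    valid_values: frozenset = frozenset()  # domain; empty means unrestricted


@dataclass
class Entity:
    name: str
    attributes: list
    primary_key: tuple                     # attribute names identifying an occurrence
    alternate_keys: list = field(default_factory=list)
    foreign_keys: dict = field(default_factory=dict)  # attribute name -> referenced entity

    def check(self, occurrence: dict) -> list:
        """Return a list of problems found in one entity occurrence."""
        problems = []
        for attr in self.attributes:
            value = occurrence.get(attr.name)
            if value is None:
                if attr.name in self.primary_key:
                    problems.append(f"{attr.name}: primary key attribute is missing")
                continue
            if not isinstance(value, attr.data_type):
                problems.append(f"{attr.name}: expected {attr.data_type.__name__}")
            elif attr.valid_values and value not in attr.valid_values:
                problems.append(f"{attr.name}: {value!r} is outside the domain")
        return problems


def duplicate_keys(entity: Entity, occurrences: list) -> set:
    """Return primary key values that appear more than once."""
    seen, duplicates = set(), set()
    for occ in occurrences:
        key = tuple(occ.get(name) for name in entity.primary_key)
        (duplicates if key in seen else seen).add(key)
    return duplicates


# Hypothetical entity used only for illustration.
customer = Entity(
    name="Customer",
    attributes=[
        Attribute("customer_id", int),
        Attribute("email", str),
        Attribute("status", str, frozenset({"active", "closed"})),
    ],
    primary_key=("customer_id",),
    alternate_keys=[("email",)],
)
```

Calling customer.check(...) on a candidate occurrence returns an empty list when the occurrence respects the data types and domains, and duplicate_keys reports primary key values that fail to identify an occurrence uniquely.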
Notation

There are three commonly used logical data modeling notations. While they all reflect common data modeling concepts, each has its own graphical syntax briefly described below.  For a more detailed description of these notations, see the references.

Information Engineering (IE)

  • Entities are represented by a box with the entity name placed inside.

                                                           IE Entity        

  • Attributes are handled differently according to the variation of IE used.  They can be described outside of the model in a separate document, or placed inside the entity box, with identifying attributes above a horizontal line, and non-identifying attributes below it.

                                                          IE Entity with Attributes

  • Relationships are described by a line between two entities. Solid lines indicate an identifying relationship, while dashed lines indicate a non-identifying relationship. Terminating symbols on each end describe optionality and cardinality rules.

                 IE Relationships                  IE Multiplicity
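
The relationship rules described above (identifying versus non-identifying, plus optionality and cardinality on each end) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: RelationshipEnd, Relationship, and the Customer, Order, and OrderLine examples are hypothetical names chosen for illustration, and the model reduces cardinality to "one" versus "many".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipEnd:
    entity: str
    optional: bool   # True: zero related occurrences allowed; False: at least one
    many: bool       # True: no upper bound; False: at most one

    def allows(self, count: int) -> bool:
        """Check whether `count` related occurrences satisfy this end."""
        if count == 0:
            return self.optional
        return self.many or count == 1


@dataclass(frozen=True)
class Relationship:
    name: str
    parent: RelationshipEnd
    child: RelationshipEnd
    identifying: bool  # True: the parent's key is part of the child's identifier

    def line_style(self) -> str:
        """Solid for identifying, dashed for non-identifying relationships."""
        return "solid" if self.identifying else "dashed"


# Hypothetical examples: a customer may place many orders (non-identifying),
# and an order must contain at least one order line (identifying).
places = Relationship(
    name="places",
    parent=RelationshipEnd("Customer", optional=False, many=False),
    child=RelationshipEnd("Order", optional=True, many=True),
    identifying=False,
)
contains = Relationship(
    name="contains",
    parent=RelationshipEnd("Order", optional=False, many=False),
    child=RelationshipEnd("OrderLine", optional=False, many=True),
    identifying=True,
)
```

Here contains.child.allows(0) is False because an order line end that is mandatory rejects an order with no lines, while places.child.allows(0) is True because a customer without orders is permitted.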

Integration Definition for Information Modeling (IDEF1X)

  • Entities are represented with square or round-cornered boxes. Round corners indicate dependence for identification on another entity. Each entity is assigned a label which is placed above the box.

                                                  IDEF1X Entity

  • To include attributes, entity boxes are divided with identifying attributes appearing above the line, and non-identifying attributes below it.

                                                                IDEF1X Attributes

  • Relationships between entities (connection relationships) are described with lines and terminating symbols. A solid line indicates an identifying foreign key. Dashed lines indicate that it is non-identifying. There are a number of terminating symbols that are combined to describe optionality and cardinality. These symbols cannot be interpreted independently of each other. It is the combination that describes the rules.

                      IDEF1X Relationships       IDEF1X Multiplicity

Unified Modeling Language (UML)

  • Entities are represented by class boxes divided into three compartments. The top compartment contains the name of the class (entity), the middle compartment contains its attributes, and when present, the bottom compartment contains behavioral descriptions.

                                                                 UML Class with Attributes

  • Relationships between entities are described with UML associations.

                 UML Relationships   UML Multiplicity

Selected Representation

Rational Data Architect is used for developing the logical data model. Information Engineering (IE) notation is its selected notation.

Properties
Optional
Planned
Illustrations
Examples
Key Considerations
  • Focus on staying within the boundaries of the solution under development.
  • Any existing data models should be used as a starting point.  These models may be broad, enterprise models or more specific application data models. If an existing model is used as a starting point, it should not be simply accepted "as is." Instead, some time should be invested to ensure that it is of sufficient quality and that it provides an accurate representation of business data requirements for the solution.
  • Draw from standard data model constructs when shaping the logical data model. They provide a good framework from which the data details of the solution can be initially understood and further refined.
  • Database refactoring can be difficult, so it is important to balance how much future design is included in your data model versus what is left for later refactoring.
  • Later instances of the model are typically normalized to third normal form.
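
To make the normalization point concrete, the sketch below takes a small, hypothetical set of order rows in which customer_city depends on customer_id (a non-key attribute) and splits it into two entities, which is the usual step toward third normal form. The helper names violates_dependency and project, and all of the sample data, are assumptions introduced for illustration.

```python
def violates_dependency(rows, determinant, dependent):
    """Return True if two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = row[determinant]
        if key in seen and seen[key] != row[dependent]:
            return True
        seen.setdefault(key, row[dependent])
    return False


def project(rows, columns):
    """Project rows onto `columns`, dropping duplicate results."""
    unique = {tuple(row[c] for c in columns) for row in rows}
    return [dict(zip(columns, values)) for values in sorted(unique)]


# Hypothetical unnormalized data: customer_city depends on customer_id,
# so its dependency on the key order_id is transitive.
orders = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Oslo"},
    {"order_id": 2, "customer_id": 10, "customer_city": "Oslo"},
    {"order_id": 3, "customer_id": 11, "customer_city": "Lima"},
]

# Move the transitively dependent attribute into its own entity.
order_entity = project(orders, ["order_id", "customer_id"])
customer_entity = project(orders, ["customer_id", "customer_city"])

# Joining the two entities on customer_id recovers the original rows.
city_by_customer = {c["customer_id"]: c["customer_city"] for c in customer_entity}
rejoined = [
    {**o, "customer_city": city_by_customer[o["customer_id"]]} for o in order_entity
]
```

After the split, each customer's city is stored once, so an update can no longer leave two orders for the same customer showing different cities.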
Tailoring
Impact of not having

Without this artifact, stored business information is captured solely by a functional model, which describes where data stores are created and placed to serve the data needs of the functional processes. There is no unified view of all data, and data normalization is not possible. The physical database design can only be developed from a functional model, leaving data ambiguities and redundancies that result from the lack of normalization.  This leads to an inefficient physical database design that could be missing critical data or could introduce inadvertent data duplication and inconsistency.
Reasons for not needing

Some reasons for not needing:

  • The solution does not require a database.
  • The database is simple enough that no data models are needed.
  • The database is simple enough that a physical database design can be based on the identified set of persistent classes and their associations.
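
For the last case, where a physical design is derived directly from persistent classes and their associations, the following Python sketch shows one possible shape of that mapping. It is only an assumption-laden illustration: the helper create_table_sql, the SQL_TYPES mapping, and the Customer and Invoice classes are hypothetical, and SQLite (via the standard sqlite3 module) stands in for whatever database the solution actually targets.

```python
import sqlite3
from dataclasses import dataclass, fields

# Assumed mapping from Python annotations to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


def create_table_sql(cls, primary_key, references=None):
    """Build a CREATE TABLE statement from a dataclass (hypothetical helper)."""
    references = references or {}
    columns = []
    for f in fields(cls):
        column = f"{f.name} {SQL_TYPES[f.type]}"
        if f.name == primary_key:
            column += " PRIMARY KEY"
        if f.name in references:
            table, target = references[f.name]
            column += f" REFERENCES {table}({target})"
        columns.append(column)
    return f"CREATE TABLE {cls.__name__} ({', '.join(columns)})"


@dataclass
class Customer:
    customer_id: int
    name: str


@dataclass
class Invoice:
    invoice_id: int
    customer_id: int  # association to Customer
    total: float


statements = [
    create_table_sql(Customer, "customer_id"),
    create_table_sql(Invoice, "invoice_id", {"customer_id": ("Customer", "customer_id")}),
]

# Run the generated statements against an in-memory database to confirm they parse.
connection = sqlite3.connect(":memory:")
for statement in statements:
    connection.execute(statement)
```

A mapping this direct works only while the class structure stays simple; once redundancy or ambiguous ownership of data appears, the considerations that motivate a separate data model apply again.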
Representation Options

Informal data models can be created using whiteboards or drawing tools. However, if the model needs to be maintained, a data modeling tool is advised.

When selecting which data modeling notation should be used, consider the following pros and cons:

  • Information Engineering (IE) notation is a widely accepted standard for logical data models. Simple and concise, it is very readable by even non-technical stakeholders. Some variations of this notation describe attributes in a separate document rather than in the model itself.
  • Integration Definition for Information Modeling (IDEF1X) is a U.S. federal standard notation (published as FIPS PUB 184) originally developed for physical data modeling. It is very complex, and can result in models that are difficult to review with stakeholders.
  • The Unified Modeling Language (UML) does not have explicit data modeling constructs as part of its notation. Class diagrams are frequently used for logical data modeling.
More Information
Checklists
Guidelines
Supporting Materials
Estimation Considerations