bdfd1m0cDatabase Administration

Business Application

Assume that you want to keep passenger and flight information on the database.

To do this, you could have a single table containing rows holding all the necessary information resulting from your data analysis. A table of this kind is shown in Table 1.

Table 1. Flight Table (Nonnormalized Form)

Date Flight number Start Destination Aircraft type Seat number Seat class Passenger number Passenger name Passenger address Passenger facts
Da1 Fl1 St1 De1 At1 Se1
Se2
Se3
Se4
Cl2
Cl2
Cl2
Cl3
Pn1
Pn2
Pn3
Pn4
Na1
Na2
Na3
Na4
Ad1
Ad2
Ad3
Ad4
Ft1
Ft2
 
 
Da1 Fl2 St2 De2 At1 Se5 Cl2 Pn3 Na3 Ad3  

This table is not in normalized form and has several undesirable features. For example:

First Normal Form

To get a table into first normal form, remove multiple occurrences of attribute values from the same row by creating a new row for each value. This ensures that every attribute in the table has only one value for each row.

Table 2 is in first normal form and contains the same data as the nonnormalized table.

Table 2. Flight Table (Nonnormalized Form)

Date Flight number Start Destination Aircraft type Seat number Seat class Passenger number Passenger name Passenger address Passenger facts
Da1 Fl1 St1 De1 At1 Se1
Se2
Se3
Se4
Cl2
Cl2
Cl2
Cl3
Pn1
Pn2
Pn3
Pn4
Na1
Na2
Na3
Na4
Ad1
Ad2
Ad3
Ad4
Ft1
Ft2
 
 
Da1 Fl2 St2 De2 At1 Se5 Cl2 Pn3 Na3 Ad3  

Second Normal Form

To get a table into second normal form, remove all attributes that are not dependent, either directly or indirectly, on the value of the primary key and put them into other tables.

In Table 2, the attribute destination directly depends on the primary key. Given the value of the primary key, especially the date and flight number, you can see that the aircraft will be flying to a particular destination. Without the primary key value, it would not make sense to speak of a flight destination at all.

In contrast, the seat class attribute is only indirectly dependent on the primary key. It really depends on the aircraft type and seat number attributes. Aircraft type is not part of the primary key, though it is itself dependent on the primary key. Because seat class is dependent on aircraft type, it is said to be indirectly dependent on the primary key.

Removing Independent Attributes

Looking back to the flight table in first normal form (Table 2), you can see that it contains some attributes that are not dependent, either directly or indirectly, on the primary key. The primary key in this table is the combination of the date, flight number, and seat number attributes.

Even if the flight table's primary key had no value at all, the following attributes would still be meaningful:

Because they are not dependent on the primary key, you can remove these attributes from the flight table and put them into a separate table of their own.

Table 3 shows an example of such a table. Note that the primary key for this table is passenger number. This has been included in the new table because its value is always unique. For example, there may be more than one passenger named Smith, or more than one vegetarian passenger, but there can never be more than one passenger number Pn1.

Because it is unique, passenger number becomes the primary key of the new table. It also remains in the flight table because it is dependent on the primary key.

Table 3 is already in second normal form because every attribute in the table is either directly or indirectly dependent on the primary key.

Table 3. Passenger Table

Passenger number Passenger name Passenger address Passenger facts
Pn1 Na1 Ad1 Ft1
Pn2 Na2 Ad2 Ft2
Pn3 Na3 Ad3  
Pn4 Na4 Ad4  

Now that you have removed the attributes that are independent on the primary key and have put them in the passenger table (Table 3), the flight table is in second normal form, as you can see in Table 4. The primary key is still the combination of the date, flight number, and seat number attributes that are shown in bold type.

Table 4. Flight Table (Second Normal Form)

Date Flight number Start Destination Aircraft type Seat number Seat class Passenger number
Da1 Fl1 St1 De1 At1 Se1 Cl2 Pn1
Da1 Fl1 St1 De1 At1 Se2 Cl2 Pn2
Da1 Fl1 St1 De1 At1 Se3 Cl2 Pn3
Da1 Fl1 St1 De1 At1 Se4 Cl3 Pn4
Da1 Fl2 St2 De2 At1 Se5 Cl2 Pn3

Third Normal Form

The flight table (Table 4) has been improved by being put into second normal form, but it still contains some indirect dependencies, and also some duplicate data.

For example, the seat class attribute depends on aircraft type and seat number. It does not depend directly on the other two attributes of the primary key. Because of this, you can remove seat class and record it in a separate table.

The new table will also include the aircraft type and seat number attributes. Together, they form the primary key of the new table, which is shown in Table 5.

Table 5. Aircraft Table

Aircraft type Seat number Seat class
At1 Se1 Cl2
At1 Se2 Cl2
At1 Se3 Cl2
At1 Se4 Cl3
At1 Se5 Cl2

The primary key here is the combination of aircraft type and seat number.

The indirect dependency has now been removed from the flight table, but with the table in second normal form, there is still some duplication, as you can see in Table 6.

Table 6. Flight Table (Second Normal Form)

Date Flight number Start Destination Aircraft type Seat number Passenger number
Da1 Fl1 St1 De1 At1 Se1 Pn1
Da1 Fl1 St1 De1 At1 Se2 Pn2
Da1 Fl1 St1 De1 At1 Se3 Pn3
Da1 Fl1 St1 De1 At1 Se4 Pn4
Da1 Fl2 St2 De2 At1 Se5 Pn3

Because the values for seat number and passenger number must be unique for each row, the table must contain a separate row for each of these values. To accommodate this, the other values in the table must be duplicated. For example, flight number Fl1 is repeated four times when only once would be enough.

Because seat number is already held in the aircraft table (Table 5) and passenger number is held in the passenger table (Table 3), and there is no indirect dependency through either of these attributes, you can remove them both from the flight table.

Because you need a table to record the passengers who are on any particular flight, you could place these removed attributes in a seat table. This table would show the seats booked on the flight and the passenger number for the person booked on each seat. Table 7 is an example of this kind of table:

Table 7. Seat Table

Date Flight number Seat number Passenger number
Da1 Fl1 Se1 Pn1
Da1 Fl1 Se2 Pn2
Da1 Fl1 Se3 Pn3
Da1 Fl1 Se4 Pn4
Da1 Fl2 Se5 Pn3

The primary key in this table is the combination of date, flight number, and seat number.

The flight table is now in third normal form. Each of its attributes is directly dependent on the primary key and there is no unnecessary duplication of data in the table.

As you can see in Table 8, the value for date is the same in both rows of the table. This duplication is necessary because there will certainly be more than one flight each day.

Table 8. Flight Table (Third Normal Form)

Date Flight number Start Destination Aircraft type
Da1 Fl1 St1 De1 At1
Da1 Fl2 St2 De2 At1

The passenger number attribute has been removed from the flight table along with seat number and seat class because passenger number was dependent on seat number and, without the seat number, no passenger number can be assigned.

The primary key in the flight table is now the combination of the date and flight number attributes.

Resulting Tables

You now have four complete and related tables (Table 9-Table 12) in third normal form:


Table 9. Flight Table (Third Normal Form)

Date Flight number Start Destination Aircraft type
Da1 Fl1 St1 De1 At1
Da1 Fl2 St2 De2 At1

Table 10. Passenger Table

Passenger number Passenger name Passenger address Passenger facts
Pn1 Na1 Ad1 Ft1
Pn2 Na2 Ad2 Ft2
Pn3 Na3 Ad3  
Pn4 Na4 Ad4  

Table 11. Aircraft Table

Aircraft type Seat number Seat class
At1 Se1 Cl2
At1 Se2 Cl2
At1 Se3 Cl2
At1 Se4 Cl3
At1 Se5 Cl2

Table 12. Seat Table

Date Flight number Seat number Passenger number
Da1 Fl1 Se1 Pn1
Da1 Fl1 Se2 Pn2
Da1 Fl1 Se3 Pn3
Da1 Fl1 Se4 Pn4
Da1 Fl2 Se5 Pn3

Duplicating Data across Tables

Looking back at the four normalized tables (Table 9-Table 12), you may notice that there is some duplication of data across them. For example, the passenger number attribute is held in both the passenger table and the seat table. Seat number is held in the seat table and the aircraft table, and there are further duplications as well.

This duplication occurs in normalized tables because they are generally being developed to be used in a relational database. Relational systems use these common attributes as links between one table and the next. The links provide paths around the database so that every table can access the data held in every other table. This means that, apart from the linking attributes, no data needs to be duplicated in the database.

In the example hierarchical database discussed in this publication, links of this kind are not needed because the tables are joined by pointers. Pointers provide a means of linking one table or file with another. Use the DBIDX and DBDIX TPFDF macros to maintain pointers.