First Normal Form (1NF)
Details about employees are the major cause of the table complexity and inelegance (see previous figure). These columns allow accommodating up to four employees. And why only four? Why not five or six or more? We cannot rely on one example report. Even if we have one of the requirements in formal specification that maximum number of employees per project must be four, this will not be optimal solution, because during the time business rules and practice can be changed.
So one problem with our repeating group of employees is that we have to set an arbitrary maximum number of repetitions, large enough to accommodate the grates number that might ever occur in practice.
Consider also the problem of updating data within the repeating group. Suppose we wanted to delete the second employee for a particular project. Would we shuffle the third and fourth employees back into slots two and three, or would our application have to deal with intermediate gaps?
First Normal Form deals with problems related to the repeating groups of data. Process of transforming from UNF to 1NF is:
- Identifying repeating groups of data.
- Removing repeating groups to a new table.
- Determining the primary key of the new table.
Identifying Repeating Groups
Repeating groups is a set of columns repeated a number of times. This repeating groups result with data model that is inflexible, complex and with poor data re-usability.
In previous figure, employee data represent such a repeating group and table can be represent as in figure shown below, where we can use curly braces to indicate what columns represent repeating group.
Removing Repeating Groups
This step is to split original table into multiple tables (one for the basic data and one for each repeating group) as follows:
- Remove each separate set of repeating group columns to a new table (one new table for each set) so that each occurrence of the group becomes a row in its new table.
- Include the key of the original table in each new table, to serve as a cross-reference (foreign key).
- If the sequence of occurrences withing a repeating group has business significance, introduce a
Sequence
column to the corresponding new table. - Name each new table.
Next figure shows the two new tables that result from applying these rules to the Project
table.
Determining the Primary Key of the New Table
In order to find the key for the new table, we must find the minimum combination of columns needed to uniquely identify one row. The key from the original unnormalized table always becomes part of the key of the new table. That way we create a compound key.
In this example, combination of Project Code
and Employee No
is unique for each row in the newly created table, so this is primary key as it is shown in next figure.
Problems With Tables in First Normal Form
If we look at table Project_Employee
we can see that every row that represents one project will contain the facts about employee name and his department. Furthermore, if one employee is assigned to multiple projects, these facts are repeated. This can lead us to number of problems - if we want to change for example employee department, we must scan trough the whole Project_Employee
table and update all rows which contain that particular employee, or what about if the project is canceled. Normally we delete the row from Project
table and all associated rows in table Project_Employee
. But if we have some employee that has been assigned to only that one project, then we would lose information about him. These kind of problems we can solve by transforming our tables in Second and Third Normal Form.