| Lesson 10 | Achieving (2NF) Second Normal Form |
| Objective | Describe the process for achieving 2NF. |
The preceding lessons established what Second Normal Form requires: a table in 2NF must be in First Normal Form and must contain no partial dependencies - every non-key attribute must depend on the entire primary key, not just part of it. This lesson describes the practical process for achieving that result.
Putting a table into Second Normal Form involves five steps. The process draws on both the functional dependency analysis covered in the preceding lessons and on knowledge of the business objects in the ER diagram:
(A, B) and you find
A → C, that notation immediately shows that C depends on only part of the
key.
The five-step process above implements two general requirements that define what 2NF achieves in structural terms:
The following example illustrates the decomposition process. The original table stores student
enrollment data with a composite primary key of (StudentID, CourseID):
| StudentID | CourseID | StudentName | CourseName | Instructor |
|---|---|---|---|---|
| 101 | CS101 | Alice | Intro to DB | Dr. Smith |
| 102 | CS101 | Bob | Intro to DB | Dr. Smith |
The primary key is the composite (StudentID, CourseID). Examining the functional
dependencies reveals a partial dependency:
CourseID → CourseName, Instructor - CourseName and Instructor depend
only on CourseID, not on the full composite key. This violates 2NF.
Applying Step 5, the table is decomposed by moving CourseName and
Instructor - along with the part of the key they depend on (CourseID) -
into a new Course table:
| StudentCourse | |
|---|---|
| StudentID | CourseID |
| 101 | CS101 |
| 102 | CS101 |
| Course | ||
|---|---|---|
| CourseID | CourseName | Instructor |
| CS101 | Intro to DB | Dr. Smith |
After decomposition, every non-key column in each table depends on the entire primary key of
that table. In StudentCourse, the composite key (StudentID, CourseID) is the full
key and there are no non-key columns. In Course, CourseID is the primary key and
both CourseName and Instructor depend on it fully. Neither table
contains a partial dependency. Both are in 2NF.
The key rule: A table is in 2NF if it is in 1NF and there are no partial dependencies on any candidate key.
A natural question arises at this point in the normalization process: which should drive database design - functional dependencies or entities? This is genuinely a chicken-and-egg problem, and the answer is that it does not matter as long as the result is consistent.
Normalization appears to share little with the entity relationship model. However, a relational design produced by normalization and an ER design transformed into relational tables produce almost identical results. The two approaches reinforce each other because they are analyzing the same underlying reality from different angles.
In the normalization approach, the designer starts with a real-world situation to be modeled, lists the data items that are candidates to become column names in relational tables, and documents the rules that govern how these items relate to one another. The objective is to represent all of these data items as attributes of tables that satisfy the constraints defined by normal forms[1]. Each normal form definition is more restrictive than the previous, and each eliminates a specific class of anomaly.
What is most important is consistency between the ER diagram and the functional dependencies identified in the relations. It makes no difference whether you begin by looking for functional dependencies during normalization or by identifying entities during data modeling. Both paths arrive at the same well-structured schema.
In most cases, database design is an iterative process - as described in the context of the database life cycle: create an initial design, check it, modify it, and check it again. Functional dependencies and entities can be compared at any stage of this process. The test is simple: are the functional dependencies identified during normalization consistent with the entities created during data modeling? When the answer is yes, the design is on solid ground.
In formal terms, a table T in a database schema with functional dependency set F is in Second Normal Form when the following condition holds for every functional dependency in F:
X --> A implied by F that lies in T
where A is a single attribute that is not in X and is nonprime (not part of any candidate key), X is not a proper subset of any key K of T. In plain terms: no non-prime attribute may depend on a proper subset of any candidate key.
A database schema is in 2NF when all tables it contains satisfy this condition.
Before moving on, complete the following exercise to reinforce your understanding of the process
for achieving Second Normal Form:
Second Normal Form -
Exercise
The next lesson describes the limitations of second normal form.