Normal Forms   «Prev  Next»

Lesson 10 Achieving (2NF) Second Normal Form
Objective Describe the process for achieving 2NF.

Achieving Second Normal Form

The preceding lessons established what Second Normal Form requires: a table in 2NF must be in First Normal Form and must contain no partial dependencies - every non-key attribute must depend on the entire primary key, not just part of it. This lesson describes the practical process for achieving that result.

The Process for Achieving 2NF

Putting a table into Second Normal Form involves five steps. The process draws on both the functional dependency analysis covered in the preceding lessons and on knowledge of the business objects in the ER diagram:

  1. Verify the table is in 1NF. Before evaluating for 2NF, confirm that the table satisfies First Normal Form: every column contains atomic values, there are no repeating groups, and every row is uniquely identified by a primary key. A table that is not yet in 1NF cannot be meaningfully analyzed for partial dependencies.
  2. Identify the functional dependencies in the table. Determine which attributes depend on which other attributes. Use the techniques from the preceding lesson: analyze business rules, examine sample data, evaluate attribute semantics, and identify candidate keys. Document every dependency you find.
  3. Represent those functional dependencies using determinant → attributes notation. Writing the dependencies explicitly in arrow notation makes partial dependencies visible. If the primary key is composite (A, B) and you find A → C, that notation immediately shows that C depends on only part of the key.
  4. Identify fields that do not belong in the table. Use your knowledge of the business objects in the ER diagram to identify any fields that are not either:
    1. inherent characteristics of the entity represented by the table, or
    2. foreign keys used to link the table to other tables.
    A field that fails both tests is almost certainly not functionally dependent on the full primary key. It belongs to a different entity and should be in a different table.
  5. Separate fields into new tables linked by foreign keys. Move any field that is not an inherent characteristic or foreign key of the current table's entity into a new table. The new table's primary key is the part of the original composite key that determined the moved fields. Add a foreign key in the original table that references the new table's primary key. Repeat this step until no partial dependencies remain in any table.

General Requirements for Second Normal Form

The five-step process above implements two general requirements that define what 2NF achieves in structural terms:

  1. Remove subsets of data that apply to multiple rows of a table and place them in separate tables. Each such subset represents data about a different entity - data that is being redundantly stored because it was mixed into the wrong table.
  2. Create relationships between the new tables and the original table through foreign keys. The foreign key preserves the connection between the entities while eliminating the redundancy that existed when their data was combined.

Worked Example: StudentCourse Table

The following example illustrates the decomposition process. The original table stores student enrollment data with a composite primary key of (StudentID, CourseID):

StudentID CourseID StudentName CourseName Instructor
101 CS101 Alice Intro to DB Dr. Smith
102 CS101 Bob Intro to DB Dr. Smith

The primary key is the composite (StudentID, CourseID). Examining the functional dependencies reveals a partial dependency:

  • CourseID → CourseName, Instructor - CourseName and Instructor depend only on CourseID, not on the full composite key. This violates 2NF.

Applying Step 5, the table is decomposed by moving CourseName and Instructor - along with the part of the key they depend on (CourseID) - into a new Course table:

StudentCourse
StudentID CourseID
101 CS101
102 CS101

Course
CourseID CourseName Instructor
CS101 Intro to DB Dr. Smith

After decomposition, every non-key column in each table depends on the entire primary key of that table. In StudentCourse, the composite key (StudentID, CourseID) is the full key and there are no non-key columns. In Course, CourseID is the primary key and both CourseName and Instructor depend on it fully. Neither table contains a partial dependency. Both are in 2NF.

The key rule: A table is in 2NF if it is in 1NF and there are no partial dependencies on any candidate key.

Functional Dependencies versus Entities

A natural question arises at this point in the normalization process: which should drive database design - functional dependencies or entities? This is genuinely a chicken-and-egg problem, and the answer is that it does not matter as long as the result is consistent.

Normalization appears to share little with the entity relationship model. However, a relational design produced by normalization and an ER design transformed into relational tables produce almost identical results. The two approaches reinforce each other because they are analyzing the same underlying reality from different angles.

In the normalization approach, the designer starts with a real-world situation to be modeled, lists the data items that are candidates to become column names in relational tables, and documents the rules that govern how these items relate to one another. The objective is to represent all of these data items as attributes of tables that satisfy the constraints defined by normal forms[1]. Each normal form definition is more restrictive than the previous, and each eliminates a specific class of anomaly.

What is most important is consistency between the ER diagram and the functional dependencies identified in the relations. It makes no difference whether you begin by looking for functional dependencies during normalization or by identifying entities during data modeling. Both paths arrive at the same well-structured schema.

In most cases, database design is an iterative process - as described in the context of the database life cycle: create an initial design, check it, modify it, and check it again. Functional dependencies and entities can be compared at any stage of this process. The test is simple: are the functional dependencies identified during normalization consistent with the entities created during data modeling? When the answer is yes, the design is on solid ground.

Formal Definition of Second Normal Form

In formal terms, a table T in a database schema with functional dependency set F is in Second Normal Form when the following condition holds for every functional dependency in F:


X --> A  implied by F that lies in T

where A is a single attribute that is not in X and is nonprime (not part of any candidate key), X is not a proper subset of any key K of T. In plain terms: no non-prime attribute may depend on a proper subset of any candidate key.

A database schema is in 2NF when all tables it contains satisfy this condition.

Second Normal Form - Exercise

Before moving on, complete the following exercise to reinforce your understanding of the process for achieving Second Normal Form:
Second Normal Form - Exercise

The next lesson describes the limitations of second normal form.

[1] Normal form: A defined standard structure for relational databases in which a relation may not be nested within another relation. Each successive normal form is more restrictive than the previous and eliminates a specific class of data anomaly.

SEMrush Software 11 SEMrush Banner 11