Third Normal Form   «Prev  Next»

Lesson 9 Denormalization
Objective Explain the Rationale behind Denormalizing Tables

Rationale behind Denormalizing Tables

Some database theoreticians argue that you should normalize database tables all the way to Fifth Normal Form, though doing so requires extensive theoretical knowledge and practice.
Other authorities argue that you should only normalize to 3NF and, where possible, live with any anomalies that might crop up.
If the anomalies greatly hurt database performance or cause consistent data loss, then you should normalize the tables to normal forms higher than 3NF. Once you have normalized a table to a given normal form, you can change the table to a lower normal form. This process is denormalization[1]. You may want to do this for two reasons:
  1. To improve database performance
  2. To reduce unnecessary complexity

When to Denormalize

As an example, consider the following table:
EmployeeID GenreID GenreManager
101 Sci_Fi Rawdaw
104 YA-13 Thomason
105 YA-13 Thomason
106 Sci_Fi Rawdah
108 Fantasy Klein
108 Sci_Fi Rawdah

You could break this table into the following two tables:
GenreManagement (GenreManager, GenreID) 
GenreEmployees (EmployeeID, GenreManager) 

The problem is that whenever you want to list all of the employees on a genre team with the team name and their manager's name in the same record, you will need to recombine the two tables. While it is unlikely that combining these particular tables would bog down a system, if the tables were large enough database performance might be affected. Suppose you can live with the anomalies presented by the original table. After all, if the last member of a genre team leaves then the team, for all practical purposes, no longer exists. Whenever the first employee joins the team, you can re-enter the manager's information.

When Not to Denormalize

The important thing to remember when denormalizing a table is that denormalization is a speed tweak, meaning that it avoids combining tables for commonly used searches and queries. The problem is that an adjustment might improve performance for one set of operations but hurt performance for others. If you just wanted to create a list of all GenreManagers and the corresponding GenreIDs, you could do so quite easily:
GenreManagement (GenreManager, GenreID) 
If you did so in the denormalized version of the table:
GenreTeam (EmployeeID, GenreManager, GenreID)

then the RDBMS would need to search every record in the table to ensure it didn't miss a GenreManager and GenreID.
Since there is more than one employee on most GenreTeams, the operation would be slower than if the tables had been normalized. The next lesson wraps up this module.

[1]denormalization: To move a table to a lower normal form (e.g., from 3NF to 2NF).