Relational Databases  «Prev  Next»
Lesson 5 Normalization
Objective Gain an overview of normalization.

What is Database Normalization?

Given any pool of entities and attributes, you can group them into relations in many ways. The process of normalization creates relations that avoid most of the problems that arise from bad relational design. There are at least two ways to approach normalization.
  1. The first is to work from an ER diagram. If the diagram is drawn correctly, then there are some simple rules you can use to translate it into relations that will avoid most relational design problems. The drawback to this approach is that it can be difficult to determine whether your design is correct.
  2. The second approach is to use the theoretical concepts behind good design to create your relations. This is a bit more difficult than working from an ER diagram, but it often results in a better design.

In practice, you may find it useful to use a combination of both approaches. First, create an ER diagram and use it to design your relations. Then, check those relations against the theoretical rules for good design and make any changes necessary to meet the rules.

Normalization Rules
When you create new databases, you will often hear phrases such as
  1. This database really needs to be normalized
  2. We need to split up these tables.
  3. This table needs to be dissected into component parts.
These are referring to a methodology that makes using relational tables much easier and more flexible. At the same time, this approach is most efficient since the process eliminates duplications in the information you are storing.
This fact is key to why a normalized database structure will be used, in most of the work you do with SQL.
When you normalize a table, you take out the duplicate information and place it in a related table.
Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process:
  1. Eliminating redundant data (for example, redundant data occurs when storing the duplicate data in more than one table) and
  2. ensuring that data dependencies make sense (only storing related data in a table).
Both of these are important objectives as they reduce the amount of space a database consumes and ensure that data is stored logically. Consider the example in the next lesson.