| Lesson 5 || Normalization |
| Objective || Gain an overview of normalization. |
What is Database Normalization?
Given any pool of entities and attributes, you can group them into relations in many ways.
The process of normalization creates relations that avoid most of the problems that arise from bad relational design. There are at least two ways to approach normalization.
- The first is to work from an ER diagram. If the diagram is drawn correctly, then there are some simple rules you can use to translate it into relations that will avoid most relational design problems. The drawback to this approach is that it can be difficult to determine whether your design is correct.
- The second approach is to use the theoretical concepts behind good design to create your relations. This is a bit more difficult than working from an ER diagram, but it often results in a better design.
In practice, you may find it useful to use a combination of both approaches. First, create an ER diagram and use it to design your relations. Then, check those relations against the theoretical rules for good design and make any changes necessary to meet the rules.
Relational Database Design
Relational database design takes a statement of requirements and produces a database definition to address those requirements.
The definition consists of a collection of constraint definitions. The goal of logical database design is to produce a design that is independent of all considerations having to do with either
- physical implementation or
- specific applications.
The latter objective being desirable for the reason that it is generally not the case that all uses to which the database will be put are known at design time. The production and format of a precise and complete requirements statement usually takes the form of a collection of
and entity relationship model
presented using an agreed notation.
Business rules are expressed in an intuitive and somewhat informal style thought to be good enough for the purpose at hand.
The fact is, that even when the requirements are 100% clear there are usually some design choices to be made. In other words, there can be several significantly different designs to implement any given requirement statement.
What common kinds of alternative might the designer encounter and in each case what considerations should guide the designer in arriving at the preferred choice?
When you create new databases, you will often hear phrases such as
- Question: Does this database really need to be normalized?
- Do we need to split up these tables?
- Does this table need to be dissected into component parts?
These are referring to a methodology that makes using relational tables much easier and more flexible.
At the same time, this approach is most efficient since the process eliminates duplications in the information you are storing.
This fact is key as to why a normalized database structure will be used, in most of the work you do with SQL.
When you normalize a table, you take out the duplicate information and place it in a related table. Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process:
- Eliminating redundant data (for example, redundant data occurs when storing the duplicate data in more than one table) and
- ensuring that data dependencies make sense (only storing related data in a table).
Both of these are important objectives as they reduce the amount of space a database consumes and ensure that data is stored logically.
Consider the example in the next lesson.