RelationalDBDesign RelationalDBDesign



Normal Forms   «Prev  Next»
Lesson 3 Why we need Normalization
Objective Explain the purpose of normalization.

Objective of Database Normalization

Relational databases are built to store and access data efficiently and the design of a database has a huge impact on its performance.
The purpose of normalization is to maximize the efficiency of a database.

Consider a database where all the data is stored in a single table.

ORDER
CustID CustName Address Phone OrderDate OrderNo ItemNo Title Price Shipped
101 C.Frye Portland 230.1251 01/11/2017 101 105 MX 14.95 YES
101 C.Frye Portland 230.1251 01/11/2017 102 108 CX 15.95 YES
102 D. Loge DC 555.1234 01/12/2017 103 105 MX 14.95 YES
102 D. Loge DC 555.1234 01/13/2017 104 109 DX 13.95 YES

This table tracks information relating to
  1. a merchant’s orders, and
  2. contains all of a customer's information and
  3. all of the item's information in every record.
Objective: Reduce data redundancy

Data Modeling Technique

Normalization is a data modeling technique, the goal of which is to organize data elements in such a way that they are stored in one place and one place only (with the exception of foreign keys, which are shared). Data sets, or entities (in relational modeling vocabulary), are business concepts, and data elements, or attributes, are the business data. Every data element must belong to one and only one data set (with the exception of shared data values, called foreign keys), and every data set must own at least one data element. The test to make sure you have done this correctly is often referred to as the process of normalization. If you have been diligent about creating atomic[1] data elements, then this process will be much simpler.
Do not try to normalize a model before you have taken it to the greatest level of detail possible, having called out all its data elements separately and having understood each one well. Seven normal forms are widely accepted in the modeling community, sequenced in order of increasing data organization discipline.

This type of design has several drawbacks:
  1. More than one entity is represented in the table. In this case, the table represents Customers, Orders, and OrderItems.
  2. The table contains redundant data. Every time a customer places an order, their contact information (Name, Address, and Phone) is needlessly repeated.
  3. Any query requires the RDBMS to search the entire table. To find every unique customer in the database, one would need to search every line. There is no location set aside to store data about Customers and only Customers.
  4. Updating information, such as customer names, requires the RDBMS to change multiple rows. If D. Loge moved from Washington to Seattle, the RDBMS would need to update every row containing his information.
  5. Deleting a customer’s only order also removes all contact information for that customer. If D. Loge cancelled the only two orders he placed, no record of him as a potential customer would remain in the database.

Database Normal Forms Explained

Tables that conform to the rules of normalization are said to be in one of several normal forms[2], depending on which rules the table adheres to. In this module and the next we will concentrate on the first, second, and third normal forms. There is a logical progression from one normal form to the next, so it is possible to resolve basic design flaws before tackling potentially more difficult problems. This means that a table in a normal form is automatically in all lower normal forms. That is, a table in second normal form is automatically in first normal form. The next lesson describes first normal form and provides instructions for producing a table in first normal form.

Normalization Process - Quiz

Before you move on to the next lesson, click the quiz link below to reinforce your understanding of normalization.
Normalization Process - Quiz

[1] atomic transaction: An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs. A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client.
[2]normal forms: (Entered in the database under normal form.) A set of theoretical rules for the design of database tables.