Normal Forms   «Prev  Next»

Lesson 3 Why we need Normalization
Objective Explain the purpose of normalization.

Database Normalization Objective

Relational databases are built to store and access data efficiently and the design of a database has a huge impact on its performance.
The purpose of normalization is to maximize the efficiency of a database.

Consider a database where all the data is stored in a single table.
ORDER
CustID CustName Address Phone OrderDate OrderNo ItemNo Title Price Shipped
101 C.Frye Portland 230.1251 01/11/2017 101 105 MX 14.95 YES
101 C.Frye Portland 230.1251 01/11/2017 102 108 CX 15.95 YES
102 D. Loge DC 555.1234 01/12/2017 103 105 MX 14.95 YES
102 D. Loge DC 555.1234 01/13/2017 104 109 DX 13.95 YES

  1. a merchant’s orders, and
  2. contains all of a customer's information and
  3. all of the item's information in every record.

Drawbacks of this Design

This type of design has several drawbacks:
  1. More than one entity is represented in the table. In this case, the table represents Customers, Orders, and OrderItems.
  2. The table contains redundant data. Every time a customer places an order, their contact information (Name, Address, and Phone) is needlessly repeated.
  3. Any query requires the RDBMS to search the entire table. To find every unique customer in the database, one would need to search every line. There is no location set aside to store data about Customers and only Customers.
  4. Updating information, such as customer names, requires the RDBMS to change multiple rows. If D. Loge moved from Washington to Seattle, the RDBMS would need to update every row containing his information.
  5. Deleting a customer's only order also removes all contact information for that customer. If D. Loge cancelled the only two orders he placed, no record of him as a potential customer would remain in the database.

Relational database models and the Process of normalization provide reliable and efficient data structures

Relational database models and the process of normalization contribute to creating reliable and efficient data structures by organizing data into structured and interconnected tables that minimize redundancy, ensure data integrity, and optimize query performance. Here's how each plays a role:
  1. Relational Database Models:
    The relational model organizes data into tables (also known as relations), where each table represents a specific entity (e.g., customers, products, orders). The model adheres to several principles that promote data reliability and efficiency:
    • Structured Organization: Data is stored in clearly defined rows and columns, making it easy to manage and retrieve. Each row represents a unique instance of the entity, and each column represents an attribute of that entity.
    • Relationships between Entities: Tables can be linked through primary keys (unique identifiers) and foreign keys (references to primary keys in other tables). This allows for easy navigation and retrieval of related data across tables without duplicating information.
    • Data Integrity and Consistency: Enforcing constraints like primary keys, foreign keys, and unique constraints ensures that data remains accurate, consistent, and reliable across the database. For instance, a foreign key constraint prevents orphan records and maintains valid relationships between tables.
  2. Process of Normalization: Normalization is a step-by-step process that restructures database tables to eliminate redundancy, ensure data integrity, and improve efficiency. The process involves breaking large, complex tables into smaller, well-organized ones. Key steps in normalization include:
    • First Normal Form (1NF): Ensures that the table has atomic (indivisible) values, and there are no repeating groups or arrays. This ensures that each column contains only one value per row, simplifying data access and manipulation.
    • Second Normal Form (2NF): Eliminates partial dependencies by ensuring that all non-key attributes depend on the entire primary key, rather than a part of it. This reduces redundancy and ensures that data relevant to an entity is stored together.
    • Third Normal Form (3NF): Eliminates transitive dependencies, ensuring that non-key attributes are only dependent on the primary key. This prevents anomalies when updating, inserting, or deleting data and ensures data is stored in the correct place.

    Through normalization, you:
    • Reduce Redundancy: Redundant or duplicate data is eliminated, saving storage space and improving performance by minimizing the need for updates across multiple locations.
    • Prevent Data Anomalies: By organizing data properly, normalization prevents update, insert, and delete anomalies, which can lead to inconsistent or corrupt data.

Improve Query Efficiency: Well-structured, normalized tables allow queries to be more efficient, as the database engine can quickly find and join the appropriate data without unnecessary duplication or complexity.
How They Provide Reliable and Efficient Data Structures
  1. Data Integrity: Both relational models and normalization ensure that data is accurate, consistent, and adheres to business rules (through constraints and key relationships). This makes the database reliable in terms of data quality.
  2. Avoidance of Redundancy: By structuring data according to normalization principles, you minimize unnecessary duplication, which can lead to inconsistencies and data integrity problems. This keeps the database efficient and easy to maintain.
  3. Query Optimization: Normalization leads to smaller, well-structured tables that are easier to index and optimize for querying, reducing the time required to retrieve and manipulate data.
  4. Maintainability: A normalized database is easier to update and maintain over time since changes to the data (e.g., adding, updating, or deleting records) need only occur in one place, reducing the risk of errors and inconsistencies.

Conclusion Relational database models, combined with the process of normalization, create a reliable database by ensuring data integrity and accuracy. They provide efficiency by eliminating redundancy, optimizing queries, and making the database easier to maintain, all of which contribute to a robust and high-performance data structure.
  • Normalization Data Modeling Techniques: Normalization is a data modeling technique, the goal of which is to organize data elements in such a way that they are stored in one place and one place only (with the exception of foreign keys, which are shared). Data sets, or entities in relational modeling are business concepts consisting of attributes. Every data element must belong to one and only one data set (with the exception of foreign keys), and every data set must own at least one data element. The test to make sure you have done this correctly is often referred to as the process of normalization. If you have been diligent about creating atomic[1] data elements, then this process will be much simpler. Do not try to normalize a model before you have taken it to the greatest level of detail possible, having called out all its data elements separately and having understood each one well. Seven normal forms are widely accepted in the modeling community, sequenced in order of increasing data organization discipline.

Database Normal Forms Explained

Normal forms (NF): A Normal Form of a relation defines the type of modification anomalies it eliminates. There are
  1. First normal form (1NF),
  2. Second normal form (2NF),
  3. Third normal form (3NF),
  4. Boyce-Codd normal form (BCNF),
  5. Fourth normal form (4NF),
  6. Domain/Key normal form (DK/NF)
  7. and Fifth normal form (5NF).


Tables that conform to the rules of normalization are said to be in one of several normal forms[2], depending on which rules the table adheres to. In this module and the next we will concentrate on the first, second, and third normal forms. There is a logical progression from one normal form to the next, so it is possible to resolve basic design flaws before tackling potentially more difficult problems. This means that a table in a normal form is automatically in all lower normal forms. That is, a table in second normal form is automatically in first normal form. The next lesson describes first normal form and provides instructions for producing a table in first normal form.

Normalization Process - Quiz

Before you move on to the next lesson, click the quiz link below to reinforce your understanding of normalization.
Normalization Process - Quiz

Database Analysis for Design
[1]atomic transaction: An atomic transaction is an indivisible and irreducible series of database operations such that either all occur, or nothing occurs. A guarantee of atomicity prevents updates to the database occurring only partially, which can cause greater problems than rejecting the whole series outright. As a consequence, the transaction cannot be observed to be in progress by another database client.
[2]normal forms: (Entered in the database under normal form.) A set of theoretical rules for the design of database tables.

SEMrush Software 3 SEMrush Banner 3