Database Normalization
Normalization is a process used in database design to organize data into tables and to minimize data redundancy. It involves breaking a large table down into smaller, more manageable tables that are linked by shared key attributes. The result is a set of well-structured tables, each containing a specific set of related data.
The normalization process proceeds through a series of stages whose criteria are called normal forms. The most commonly used are first normal form (1NF), second normal form (2NF), and third normal form (3NF). Higher normal forms represent stricter levels of normalization, with each level building on the previous one.
During normalization, each table is designed to represent a single entity or concept, such as a customer, an order, or a product. A table that mixes several entities is divided into multiple tables, each containing only the information related to the entity it represents. The result is tables that are more focused and easier to manage.
For example, consider a table that stores customer information (name, address, phone number) alongside order information (order number and date). If several order numbers are crammed into a single cell, the table violates first normal form; even with one order per row, the customer's name, address, and phone number must be repeated on every order row, which is redundant and error-prone. To normalize this table, it would be divided into two tables, one for customers and one for orders, with each order row referencing its customer by a key. Each table would then contain only the relevant attributes, resulting in more focused and manageable tables.
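A minimal sketch of that decomposition using Python's built-in sqlite3 module follows; the table and column names (customers, orders, customer_id, and so on) are illustrative assumptions, not a prescribed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Customer details live in one place; each order row references its
# customer by key instead of repeating the name, address, and phone.
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT,
    phone       TEXT
);

CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
""")
```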
The process of normalization can be complex and time-consuming, particularly for large databases with many interrelated tables. However, its benefits include reduced data redundancy, increased data integrity, and easier maintenance of the database over time.
In conclusion, normalization creates tables by breaking down a larger table into smaller, more focused tables that contain only information related to a specific entity or concept. The result is a well-structured database with reduced data redundancy and increased data integrity.
Database normalization is the process of organizing the columns and tables of a relational database to minimize data redundancy.
Normalization involves decomposing a table into smaller, less redundant tables without losing information, then defining foreign keys in the old table that reference the primary keys of the new ones. The objective is to isolate data so that additions, deletions, and modifications of an attribute can be made in just one table and then propagated through the rest of the database via the defined foreign keys.
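The sketch below, again using sqlite3 with hypothetical names, shows this single-point-of-update property: the customer's address is stored in exactly one row, so one UPDATE is reflected in every order at join time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    order_date  TEXT
);
INSERT INTO customers VALUES (1, 'Ada', '1 Old Street');
INSERT INTO orders VALUES (101, 1, '2024-01-05'), (102, 1, '2024-02-11');
""")

# The address is stored exactly once, so a single UPDATE suffices;
# every order sees the change through the join.
conn.execute("UPDATE customers SET address = '2 New Road' WHERE customer_id = 1")

for row in conn.execute("""
    SELECT o.order_id, c.name, c.address
    FROM orders AS o JOIN customers AS c USING (customer_id)
"""):
    print(row)  # both orders now report '2 New Road'
```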
Edgar F. Codd introduced the concept of normalization and what we now know as first normal form (1NF) in 1970.
Codd went on to define second normal form (2NF) and third normal form (3NF) in 1971, and Codd and Raymond F. Boyce defined Boyce–Codd normal form (BCNF) in 1974.
A relational database table is often described as "normalized" if it meets third normal form. Most third normal form tables are free of insertion, update, and deletion anomalies.
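As an illustration of the insertion anomaly that normalization removes, consider the hypothetical schemas below: in the combined table the order number is part of the primary key, so a customer who has placed no orders cannot be recorded at all, while the normalized design has no such restriction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Combined table: the order number is part of the primary key, so a
# customer with no orders cannot be stored without inventing a
# placeholder order -- an insertion anomaly.
conn.execute("""
CREATE TABLE customer_orders (
    customer_name TEXT    NOT NULL,
    order_id      INTEGER NOT NULL,
    order_date    TEXT,
    PRIMARY KEY (customer_name, order_id)
)
""")

# Normalized design: customers exist independently of orders, so the
# same insert is unremarkable.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Grace')")  # no order required
```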
The relational model separates logical design from physical design. DBMS performance is a matter of physical design choices such as:
- adding indexes,
- materializing views,
- allocating larger buffers.
It is not a matter of changing the logical design.
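For instance, a slow lookup can often be fixed with an index alone, leaving the normalized schema untouched. A brief sqlite3 sketch with hypothetical names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    order_date  TEXT
)
""")

# Physical tuning: the index speeds up lookups by customer_id while the
# logical design (tables, columns, keys) is unchanged.
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")

# EXPLAIN QUERY PLAN confirms SQLite will use the index for this query.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 42"
):
    print(row)
```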