Normal Forms   «Prev  Next»

Lesson 2First Step in Normalization
ObjectiveDefine normalization and recognize anomalies that it prevents.

Database Normalization Defined

Normalization is a disciplined method for structuring tables so that each fact is stored once, updates are consistent, and queries remain reliable. You preserve the information captured during conceptual modeling (entities, attributes, relationships) while minimizing redundant storage and update anomalies.

  • Design approach: Works with both top-down (start from the ERD) and bottom-up (start from attributes). In practice, teams blend both.
  • Goal: Information preservation with minimal redundancy, leading to simpler maintenance and predictable behavior.

Insertion, Update, and Deletion Anomalies

Tables that combine multiple subjects or repeat groups typically suffer from:

Visual Walkthrough of the Process

All data in one wide table with repeated attributes
1) A novice design often places many subjects into a single, wide table.
Breaking a wide table into multiple tables
2) First decomposition splits repeated groups into separate rows/tables.
Further decomposition guided by business rules
3) Apply business rules and normal form tests to separate subjects cleanly.
A set of focused tables reflecting business objects
4) Result: focused tables that each model one business object, linked by keys.

Core Definitions

Worked Example: UNF → 1NF → 2NF → 3NF

Scenario: Orders, customers, and products are jammed into one table with repeating line items.

-- UNF: single row contains multiple line items (repeating groups)
OrderUNF(OrderID, OrderDate, CustomerID, CustomerName,
         Item1_ProductID, Item1_ProductName, Item1_Qty, Item1_Price,
         Item2_ProductID, Item2_ProductName, Item2_Qty, Item2_Price, ...)

-- 1NF: eliminate repeating groups (one line per row)
OrderLine1NF(OrderID, OrderDate, CustomerID, CustomerName,
             ProductID, ProductName, Qty, Price)

-- FDs (illustrative)
-- OrderID → OrderDate, CustomerID
-- CustomerID → CustomerName
-- ProductID → ProductName
-- (OrderID, ProductID) → Qty, Price

-- 2NF: if key is (OrderID, ProductID), remove partial dependencies
Order(OrderID, OrderDate, CustomerID)
Customer(CustomerID, CustomerName)
Product(ProductID, ProductName)
OrderLine(OrderID, ProductID, Qty, Price)

-- 3NF: remove transitive dependencies between non-key attributes
-- If ProductID → ListPrice and OrderLine.Price stores the actual sale price,
-- keep ListPrice in Product and retain OrderLine.Price for history (business rule).

Normalization complements ER modeling: if each ER entity represents a single business object, the mapped tables tend to satisfy 1NF by design. FDs discovered during normalization often refine attribute placement and keys in the logical model.

Quick Checklist


Practice Notes

SEMrush Software 2 SEMrush Banner 2