| Lesson 2 | First Step in Normalization |
| Objective | Define normalization and recognize anomalies that it prevents. |
Normalization is a disciplined method for structuring tables so that each fact is stored once, updates are consistent, and queries remain reliable. You preserve the information captured during conceptual modeling (entities, attributes, relationships) while minimizing redundant storage and update anomalies.
Tables that combine multiple subjects or repeat groups typically suffer from:
X determine attribute Y (written X → Y).Scenario: Orders, customers, and products are jammed into one table with repeating line items.
-- UNF: single row contains multiple line items (repeating groups)
OrderUNF(OrderID, OrderDate, CustomerID, CustomerName,
Item1_ProductID, Item1_ProductName, Item1_Qty, Item1_Price,
Item2_ProductID, Item2_ProductName, Item2_Qty, Item2_Price, ...)
-- 1NF: eliminate repeating groups (one line per row)
OrderLine1NF(OrderID, OrderDate, CustomerID, CustomerName,
ProductID, ProductName, Qty, Price)
-- FDs (illustrative)
-- OrderID → OrderDate, CustomerID
-- CustomerID → CustomerName
-- ProductID → ProductName
-- (OrderID, ProductID) → Qty, Price
-- 2NF: if key is (OrderID, ProductID), remove partial dependencies
Order(OrderID, OrderDate, CustomerID)
Customer(CustomerID, CustomerName)
Product(ProductID, ProductName)
OrderLine(OrderID, ProductID, Qty, Price)
-- 3NF: remove transitive dependencies between non-key attributes
-- If ProductID → ListPrice and OrderLine.Price stores the actual sale price,
-- keep ListPrice in Product and retain OrderLine.Price for history (business rule).
Normalization complements ER modeling: if each ER entity represents a single business object, the mapped tables tend to satisfy 1NF by design. FDs discovered during normalization often refine attribute placement and keys in the logical model.