| Lesson 4 | First normal form |
| Objective | Explain the requirements for first normal form. |
What 1NF means: Each row–column intersection holds a single, indivisible value; there are no repeating groups. A stable key uniquely identifies each row. These rules keep data consistent and make queries predictable.
Why you care: Designs that violate 1NF are hard to search, update, and validate. Values get crammed into one cell (lists, CSVs), or columns get cloned to simulate “more of the same thing” (Author1, Author2, …). Normalization fixes this without losing information.
| Book | ||||||||
| BookID | ISBN | Title | Author1 | Author2 | Date | Pages | Publisher | City |
| 1 | 978-0321765723 | The Lord of the Rings | J.R.R. Tolkien | Glenn Gould | 1954 | 1216 | Allen Unwin | London |
| 2 | 978-0743273565 | The Hitchhiker's Guide to the Galaxy | Douglas Adams | Byron Janis | 1979 | 224 | Pan Books | London |
Thing1, Thing2, … to store multiple values).Identify a primary key so each row is unique. Atomicity + no repeating groups + a clear key give you 1NF.
A repeating group occurs when a single row can hold multiple values for the same attribute (e.g., multiple authors, multiple phone numbers). In ER terms, it corresponds to a multivalued attribute. The fix is to represent each value as a separate row in a related table.
When lists live in cells, you lose referential integrity and searchability. Consider the employee table below with children’s names and birthdates stored as comma-separated lists.
| EmpNum | First Name | Last Name | Child Names | Child Birthdates |
| 1001 | Jane | Doe | Mary, Sam | 1/9/02, 5/15/04 |
| 1002 | John | Doe | Lisa, David | 1/9/00, 5/15/01 |
| 1003 | Jane | Smith | John, Pat, Lee, Mary | 10/5/04, 10/12/00, 6/6/2006, 8/21/04 |
| 1004 | John | Smith | Michael | 7/4/06 |
| 1005 | Jane | Jones | Edward, Martha | 10/21/05, 10/15/99 |
Solution: eliminate repeating groups by moving them to a related table keyed by the parent entity.
The schema below replaces Author1/Author2 with a proper relationship. It preserves all facts, supports any number of authors, and allows ordering.
-- Entities
CREATE TABLE Book(
BookID INT PRIMARY KEY,
ISBN VARCHAR(20) UNIQUE,
Title VARCHAR(255),
PublishedYear INT,
Pages INT,
Publisher VARCHAR(120),
City VARCHAR(120)
);
CREATE TABLE Author(
AuthorID INT PRIMARY KEY,
AuthorName VARCHAR(200)
);
-- Relationship (1 book : many authors), preserves author order
CREATE TABLE BookAuthor(
BookID INT NOT NULL,
AuthorID INT NOT NULL,
AuthorOrder INT NOT NULL,
PRIMARY KEY(BookID, AuthorID),
FOREIGN KEY(BookID) REFERENCES Book(BookID),
FOREIGN KEY(AuthorID) REFERENCES Author(AuthorID)
);
-- Example: querying authors in the intended order
SELECT b.Title, a.AuthorName
FROM Book b
JOIN BookAuthor ba ON ba.BookID = b.BookID
JOIN Author a ON a.AuthorID = ba.AuthorID
ORDER BY b.BookID, ba.AuthorOrder;
Replace comma-separated values with one row per child:
CREATE TABLE Employee(
EmpNum INT PRIMARY KEY,
FirstName VARCHAR(80),
LastName VARCHAR(80)
);
CREATE TABLE EmployeeChild(
EmpNum INT NOT NULL,
ChildName VARCHAR(120) NOT NULL,
BirthDate DATE NOT NULL,
PRIMARY KEY(EmpNum, ChildName),
FOREIGN KEY(EmpNum) REFERENCES Employee(EmpNum)
);
-- Now filtering is simple and indexable:
SELECT e.EmpNum, e.FirstName, e.LastName, c.ChildName, c.BirthDate
FROM Employee e
JOIN EmployeeChild c ON c.EmpNum = e.EmpNum
WHERE c.BirthDate < DATE '2005-01-01';
With 1NF satisfied, review 2NF (remove partial dependencies on composite keys) and 3NF (remove transitive dependencies among non-keys). This progression eliminates common anomalies while preserving information.