Lesson 6	Limitations of First Normal Form (1NF)
Objective	Describe the limitations of first normal form and explain why databases in 1NF alone are still prone to redundancy and anomalies.

Limitations of First Normal Form (1NF)

First Normal Form (1NF) is the foundational level of database normalization. It establishes two structural requirements that every relation in a well-designed database must satisfy:

Every column contains only atomic (indivisible) values - a single value per cell, not a list, array, or compound structure.
There are no repeating groups within a single row - no Phone1, Phone2, Phone3 columns, no comma-separated lists of values, no multi-valued attributes embedded in a single field.

These requirements eliminate the most obvious structural problems that arise when relational data is stored in flat, spreadsheet-like structures. A table that violates 1NF cannot be reliably queried, indexed, or joined in standard SQL. Achieving 1NF is therefore a prerequisite for everything that follows in the normalization process.

However, 1NF is only a prerequisite - it is not a destination. 1NF says nothing about redundancy, functional dependencies, or data anomalies. A table can satisfy every requirement of 1NF and still contain serious design problems that will cause data integrity failures in production. Understanding what 1NF does not guarantee is as important as understanding what it does.

What 1NF Does Not Address

The 1NF requirements concern the shape of the data - atomicity and the absence of repeating groups. They say nothing about the meaning of the data or the relationships between columns. Specifically, 1NF does not address:

Data redundancy - the same information stored in multiple rows of the same table.
Functional dependencies - whether non-key columns depend on the full primary key or only part of it.
Transitive dependencies - whether non-key columns depend on other non-key columns rather than directly on the primary key.
Data anomalies - the insertion, update, and deletion problems that arise when the above issues are present.

A table can be perfectly 1NF-compliant and still suffer from all four of these problems simultaneously. The normalization levels that follow 1NF - Second Normal Form (2NF) and Third Normal Form (3NF) - were designed specifically to address these remaining issues.

An Order Table That Satisfies 1NF but Is Problematic

Consider a flat order table of the kind commonly found in legacy systems and early database designs:


Order (
    CustID, CustName, CustAddress, CustPhone,
    OrderNo, OrderDate,
    ItemNo, ItemTitle, ItemPrice, Qty, LineTotal, ShippedVia, ShippedDate
)

This schema has no repeating groups and every cell holds a single value - it satisfies 1NF. Yet it mixes data about three distinct real-world entities in a single table: customers, orders, and items. A customer can place many orders, and each order can contain many items, so the same customer data appears on every row for every item on every order that customer has ever placed.

The table exhibits all four classic normalization problems:

Problem	Description	Consequence
Insertion Anomaly	Cannot add a new customer without placing an order. Cannot add a new inventory item until someone orders it.	Legitimate data cannot be entered until unrelated data exists.
Update Anomaly	Changing a customer's phone number requires updating every row for that customer - one row per item on every order they have placed.	If any row is missed, the database contains contradictory phone numbers for the same customer.
Deletion Anomaly	Cancelling an order deletes the only record of the customer if it was their only order, and deletes item details if no other order contains that item.	Accidental and irreversible loss of master data when operational data is removed.
Redundancy	Customer name, address, and phone are stored on every row for every item on every order. Item title and price are stored on every row that includes that item.	Wasted storage, slower queries, and increased risk of inconsistency.

Each of these problems is a direct consequence of mixing data about multiple entities in a single table. The 1NF requirements did nothing to prevent this - they only ensured that the columns are atomic and that there are no repeating groups within a row.

Partial Dependencies and Composite Keys

The correct primary key for the order-line data in this flat table is composite: (OrderNo, ItemNo). Each row represents one item on one order, so the combination of order number and item number uniquely identifies each row.

Examining the functional dependencies in this table reveals the core problem. A functional dependency exists when the value of one column or set of columns determines the value of another column. In the flat Order table:

OrderNo → OrderDate, CustID, CustName, CustAddress - these columns depend only on the order part of the composite key, not on the item part.
ItemNo → ItemTitle, ItemPrice - these columns depend only on the item part of the composite key, not on the order part.
CustID → CustName, CustAddress, CustPhone - these columns depend on CustID, which is itself a non-key column.

The first two examples are partial dependencies: a non-prime attribute (a column that is not part of the primary key) depends on only part of a composite key rather than the full key. The third example is a transitive dependency: a non-prime attribute depends on another non-prime attribute rather than directly on the primary key.

1NF does nothing to eliminate either type of dependency. Partial dependencies are the target of Second Normal Form; transitive dependencies are the target of Third Normal Form. A table in 1NF with a composite key and partial dependencies is not in 2NF. A table in 2NF with transitive dependencies is not in 3NF.

Why Partial Dependencies Cause Anomalies

The connection between partial dependencies and data anomalies is direct. Because CustName and CustAddress depend on OrderNo rather than on the full key (OrderNo, ItemNo), the same customer information is stored once per item on each order. A customer who places a single order for five items has their name and address stored five times. A customer who places ten orders with five items each has their name and address stored fifty times.

This redundancy is the root cause of the update anomaly: changing the customer's address requires finding and updating all fifty rows. It is the root cause of the deletion anomaly: deleting the last item row on the last order deletes the customer's information entirely. And it is the root cause of the insertion anomaly: there is no way to record a customer's information until they have placed at least one order.

How Higher Normal Forms Resolve These Problems

The solution is decomposition: separating the flat table into multiple tables, each describing exactly one real-world entity. Each table has its own primary key and contains only columns that depend on that key. The tables are then linked through foreign keys.

Applying 2NF and 3NF to the flat Order table produces four tables:


Customer (CustID PK, CustName, CustAddress, CustPhone)

Item (ItemNo PK, Title, CurrentPrice, ...)

Order (OrderNo PK, OrderDate, CustID FK REFERENCES Customer)

OrderLine (
    OrderNo    REFERENCES Order,
    ItemNo     REFERENCES Item,
    Qty,
    PriceAtSale,    -- historical price at time of sale
    ShippedVia,
    ShippedDate,
    PRIMARY KEY (OrderNo, ItemNo)
)

This decomposition eliminates every anomaly that the flat table suffered from:

Customer data stored once. The Customer table contains one row per customer. Updating a phone number requires changing exactly one row. There is no risk of inconsistency.
Item master data stored once. The Item table contains one row per product. Pricing changes are made in one place and are reflected in all future queries.
Historical prices preserved. The PriceAtSale column in OrderLine records the price at the time of the transaction. Changing the current price in the Item table does not alter the historical record of what was charged on past orders.
Independent insertion. Customers can be added to the Customer table before they place any orders. Items can be added to the Item table before anyone orders them.
Safe deletion. Cancelling an order removes rows from the Order and OrderLine tables. The Customer and Item records are unaffected - they exist independently of any specific transaction.
Referential integrity enforced. Foreign key constraints guarantee that every OrderLine row references a valid Order, and every Order references a valid Customer. The RDBMS enforces these constraints automatically.

When 1NF Alone May Be Acceptable

In a small number of specialized contexts, stopping at 1NF may be defensible:

Logging and audit tables that record events as they occur and are never updated. Since no updates are performed, update anomalies cannot arise. Since rows are only appended, deletion of one row does not affect the logical integrity of the others.
Data warehouse fact tables that are loaded from normalized source systems, are queried but not updated, and are optimized for read performance rather than write correctness.
Temporary staging tables used during ETL processes. These tables hold data only long enough to transform and load it into a normalized target schema.

In all three cases the justification is that the anomalies that normalization prevents do not arise because the tables are not subject to the operations - insert, update, delete - that trigger those anomalies. In virtually all transactional OLTP systems, however, stopping at 1NF is poor design that will cause data integrity problems in production.

Diagnostic Checklist

Use the following checklist to assess whether a table needs further normalization:

No repeating groups or multi-valued attributes in any column - 1NF is satisfied.
Data about more than one real-world entity is mixed in a single table - decomposition is needed.
The primary key is composite and some non-key columns depend on only part of it - the table is not in 2NF.
A non-key column determines the value of another non-key column - the table is not in 3NF.
Update, delete, or insert operations require touching many rows or risk losing data - the table needs further normalization regardless of which normal form it currently satisfies.

Summary

First Normal Form is the essential starting point for database normalization, but it is only the first step. It guarantees that every cell contains a single, atomic value and that there are no repeating groups within a row. It does not guarantee that the table is free of redundancy, partial dependencies, transitive dependencies, or data anomalies.

A 1NF-compliant table that mixes data about multiple entities - customers, orders, and items in a single flat structure - will exhibit insertion anomalies, update anomalies, deletion anomalies, and data redundancy. These problems are not structural violations of 1NF; they are logical violations of higher normal forms.

Moving to Second Normal Form eliminates partial dependencies by ensuring that every non-key column depends on the full primary key. Moving to Third Normal Form eliminates transitive dependencies by ensuring that every non-key column depends directly on the primary key rather than on another non-key column. Both steps are almost always necessary for a robust, maintainable relational database design.

The next lesson formally defines Second Normal Form and shows exactly how to eliminate partial dependencies.