| Lesson 8 | Boyce-Codd Normal Form (BCNF) |
| Objective | Understand and apply Boyce-Codd Normal Form to eliminate anomalies caused by non-superkey determinants in functional dependencies. |
Boyce-Codd Normal Form (BCNF), sometimes informally called “3.5NF,” is a stricter version of Third Normal Form (3NF). While 3NF removes most redundancy by eliminating partial and transitive dependencies, BCNF goes one step further: it requires that the determinant of every non-trivial functional dependency be a superkey. This prevents certain subtle anomalies that can still occur in 3NF schemas.
Let a relation R have a set of functional dependencies (FDs). R is in Boyce-Codd Normal Form if, for every non-trivial functional dependency X → Y that holds in R, X is a superkey of R.
A dependency X → Y is non-trivial if Y is not a subset of X.
In other words, in BCNF you are not allowed to have an attribute (or set of attributes) determining something else unless that determinant uniquely identifies each row in the table.
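The definition above can be sketched in Python, assuming FDs are given as pairs of attribute sets. The helper names (`closure`, `is_superkey`, `bcnf_violations`) and the relation R(A, B, C) with A → B and B → C are hypothetical, chosen only for illustration. The sketch relies on the standard attribute-closure test: X is a superkey when the closure of X covers every attribute of R. It inspects only the FDs passed in, which is enough when testing a whole relation against its own FD set.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left-hand side, we gain the right-hand side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, relation, fds):
    """X is a superkey when its closure covers every attribute of the relation."""
    return closure(attrs, fds) >= set(relation)


def bcnf_violations(relation, fds):
    """List the given non-trivial FDs whose left-hand side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not is_superkey(lhs, relation, fds)
    ]


# Hypothetical relation R(A, B, C) with A -> B and B -> C (illustration only).
R = {"A", "B", "C"}
FDS = [
    (frozenset({"A"}), frozenset({"B"})),
    (frozenset({"B"}), frozenset({"C"})),
]
violations = bcnf_violations(R, FDS)
```

In this sketch only B → C is reported: the closure of A is {A, B, C}, so A is a superkey, while the closure of B is {B, C}, which does not cover A.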
A relation in BCNF is always in 3NF, but the reverse is not guaranteed. The key difference lies in how they treat functional dependencies whose left-hand side is not a superkey:
- 3NF: for every non-trivial FD X → A, either
  - X is a superkey, or
  - A is a prime attribute (part of some candidate key).
- BCNF: for every non-trivial FD X → Y, X must be a superkey, with no exceptions.

As a result, some relations can satisfy 3NF while still violating BCNF. These cases usually appear when a non-superkey attribute determines another attribute, even if that dependent attribute participates in a candidate key.
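The difference between the two tests can be sketched as a small checker. This is an illustrative sketch, not a production tool: the function names are hypothetical, candidate keys are found by brute force over attribute subsets (fine for small relations only), and the FDs used are the ones this lesson assumes for its Student/Instructor/Subject relation.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal superkeys, smallest subsets first."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            attrs = set(combo)
            is_super = closure(attrs, fds) >= set(relation)
            # Skip supersets of a key already found: they are not minimal.
            if is_super and not any(key <= attrs for key in keys):
                keys.append(frozenset(attrs))
    return keys


def normal_form_report(relation, fds):
    """Return (in_3nf, in_bcnf) for the given relation and FDs."""
    prime = set().union(*candidate_keys(relation, fds))
    in_3nf = in_bcnf = True
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FD, ignored by both normal forms
        if closure(lhs, fds) >= set(relation):
            continue  # left-hand side is a superkey: fine for both
        in_bcnf = False  # BCNF allows no exceptions
        if not (rhs - lhs) <= prime:
            in_3nf = False  # 3NF tolerates the FD only if the RHS is prime
    return in_3nf, in_bcnf


RELATION = {"Student", "Instructor", "Subject"}
FDS = [
    (frozenset({"Student", "Subject"}), frozenset({"Instructor"})),
    (frozenset({"Instructor"}), frozenset({"Subject"})),
]
report = normal_form_report(RELATION, FDS)  # (in_3nf, in_bcnf)
```

For these FDs the sketch finds two candidate keys, {Student, Subject} and {Student, Instructor}, so every attribute is prime; the report comes out as 3NF satisfied and BCNF violated.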
Consider a relation that records which instructor teaches which student a given subject:
| Student | Instructor | Subject |
|---|---|---|
| Alice | Dr. Smith | Math |
| Bob | Dr. Smith | Math |
| Alice | Dr. Jones | Physics |
Assume the following:
- {Student, Subject} uniquely identifies each row.
- Instructor → Subject (each instructor teaches exactly one subject).

Under these assumptions, the relation is in 3NF:

- {Student, Subject} → Instructor has a superkey on the left-hand side.
- In Instructor → Subject, the right-hand side (Subject) is a prime attribute because it is part of a candidate key.

However, the relation is not in BCNF, because:
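- Instructor is not a superkey, yet it determines Subject.

This leads to anomalies:

- Update anomaly: the fact that Dr. Smith teaches Math is repeated in every row for one of Dr. Smith's students, so changing it means updating all of those rows consistently.
- Insertion anomaly: an instructor's subject cannot be recorded until at least one student is assigned to that instructor.
- Deletion anomaly: deleting the only row for Dr. Jones (Alice's Physics row) also loses the fact that Dr. Jones teaches Physics.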
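To make the anomalies concrete, here is a small sketch that works directly on the three sample rows. The helper `instructor_subjects` and the replacement subject "Statistics" are hypothetical, introduced only for illustration.

```python
# The sample rows from the table above: (Student, Instructor, Subject).
rows = [
    ("Alice", "Dr. Smith", "Math"),
    ("Bob", "Dr. Smith", "Math"),
    ("Alice", "Dr. Jones", "Physics"),
]


def instructor_subjects(rows):
    """Map each instructor to the set of subjects recorded for them."""
    seen = {}
    for _student, instructor, subject in rows:
        seen.setdefault(instructor, set()).add(subject)
    return seen


# Update anomaly: change Dr. Smith's subject in only one of the two rows.
# "Statistics" is a made-up value for illustration.
updated = [("Alice", "Dr. Smith", "Statistics")] + rows[1:]
conflicts = {
    instructor: subjects
    for instructor, subjects in instructor_subjects(updated).items()
    if len(subjects) > 1  # Instructor -> Subject no longer holds
}

# Deletion anomaly: removing Alice's Physics row also removes the only
# record of which subject Dr. Jones teaches.
remaining = [row for row in rows if row != ("Alice", "Dr. Jones", "Physics")]
lost = set(instructor_subjects(rows)) - set(instructor_subjects(remaining))
```

After the partial update, `conflicts` shows Dr. Smith recorded with two different subjects, and after the deletion, `lost` contains Dr. Jones, whose subject is no longer stored anywhere.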
To bring the design into BCNF, decompose the original relation based on the non-superkey dependency Instructor → Subject:
1. InstructorSubject – captures which subject each instructor teaches:
| Instructor | Subject |
|---|---|
| Dr. Smith | Math |
| Dr. Jones | Physics |
Primary key: Instructor

2. StudentInstructor – associates students with instructors:
| Student | Instructor |
|---|---|
| Alice | Dr. Smith |
| Bob | Dr. Smith |
| Alice | Dr. Jones |
Primary key: {Student, Instructor}

Foreign key: Instructor references InstructorSubject(Instructor).

After decomposition, each instructor's subject is stored exactly once, in InstructorSubject.

| Property | Third Normal Form (3NF) | Boyce-Codd Normal Form (BCNF) |
|---|---|---|
| Definition | For every non-trivial FD X → A, either X is a superkey, or A is a prime attribute. | For every non-trivial FD X → Y, X must be a superkey. |
| Redundancy | May leave some redundancy when non-superkey determinants determine prime attributes. | Eliminates redundancy caused by all non-superkey determinants. |
| Decomposition | Always has a lossless, dependency-preserving decomposition. | Always has a lossless decomposition, but may require sacrificing dependency preservation. |
| Strength | Weaker condition; every BCNF schema is in 3NF. | Stronger condition; some 3NF schemas are not in BCNF. |
| Proposed by | Edgar F. Codd. | Raymond F. Boyce and Edgar F. Codd. |
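The decomposition into InstructorSubject and StudentInstructor, and the trade-off noted in the Decomposition row, can be sketched with Python's built-in `sqlite3` module. The column types, the in-memory database, and the extra instructor "Dr. Lee" are assumptions made for illustration; the lesson itself does not prescribe a particular DBMS or DDL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE InstructorSubject (
    Instructor TEXT PRIMARY KEY,          -- Instructor -> Subject is now a key constraint
    Subject    TEXT NOT NULL
);
CREATE TABLE StudentInstructor (
    Student    TEXT NOT NULL,
    Instructor TEXT NOT NULL REFERENCES InstructorSubject(Instructor),
    PRIMARY KEY (Student, Instructor)
);
""")
conn.executemany(
    "INSERT INTO InstructorSubject VALUES (?, ?)",
    [("Dr. Smith", "Math"), ("Dr. Jones", "Physics")],
)
conn.executemany(
    "INSERT INTO StudentInstructor VALUES (?, ?)",
    [("Alice", "Dr. Smith"), ("Bob", "Dr. Smith"), ("Alice", "Dr. Jones")],
)

# Lossless join: joining the two tables reproduces the original rows.
joined = conn.execute("""
    SELECT si.Student, si.Instructor, ins.Subject
    FROM StudentInstructor AS si
    JOIN InstructorSubject AS ins ON ins.Instructor = si.Instructor
""").fetchall()

# Dependency preservation is lost: {Student, Subject} -> Instructor is no
# longer enforced by either table. "Dr. Lee" is a hypothetical second Math
# instructor; both inserts succeed without any constraint error.
conn.execute("INSERT INTO InstructorSubject VALUES ('Dr. Lee', 'Math')")
conn.execute("INSERT INTO StudentInstructor VALUES ('Alice', 'Dr. Lee')")
alice_math_instructors = conn.execute("""
    SELECT COUNT(*)
    FROM StudentInstructor AS si
    JOIN InstructorSubject AS ins ON ins.Instructor = si.Instructor
    WHERE si.Student = 'Alice' AND ins.Subject = 'Math'
""").fetchone()[0]
```

In this sketch the join returns exactly the three original rows, while the final count shows Alice ending up with two Math instructors. If that rule matters, it has to be enforced outside the two key constraints, for example in application logic.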
BCNF is particularly useful when you discover that certain attributes (such as instructor, department, or room) determine other attributes but are not themselves keys. If you leave these dependencies in place, you risk subtle anomalies and duplicated facts spread across multiple rows. Decomposing into BCNF stores each such fact in exactly one place, which removes the update, insertion, and deletion anomalies that these dependencies cause.
BCNF focuses on functional dependencies. Beyond BCNF, Fourth Normal Form (4NF) tackles multi-valued dependencies, and Fifth Normal Form (5NF) addresses join dependencies. In most operational systems, designing to 3NF or BCNF is sufficient. Higher normal forms are typically reserved for specialized designs or advanced academic treatment.
As a design guideline: aim for at least 3NF, consider BCNF when you detect non-superkey determinants, and then selectively denormalize later if performance tuning requires it.