In the first course in this series, Database Design, you
- identified business objects in your environment,
- represented those objects as entities, and
- created an Entity-Relationship diagram (ER diagram) that displayed every entity in the database as well as the relationships among those entities.
By following the process outlined in that course, you created a sound ER diagram that could be translated into efficient database tables.
Even if you follow the ER diagram development procedure from the first course, Database Design, you may still create tables that have problems.
Some of the challenges you may face during data modeling[1] are:
- Deleting data results in the inadvertent deletion of other data
- Updating table data takes too long or is done incompletely
- Tables contain redundant data
- Searching for specific data takes too long
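The first three problems in this list can be seen in a small example. The sketch below is only an illustration: the `orders` table, its column names, and its rows are hypothetical and do not come from the course. It models a single unnormalized table in which every row mixes facts about a customer with facts about an order.

```python
# Hypothetical unnormalized table: each row stores customer facts
# (name, city) together with order facts (order_id, product).
orders = [
    {"order_id": 1, "customer": "Ann", "customer_city": "Austin", "product": "Lamp"},
    {"order_id": 2, "customer": "Ann", "customer_city": "Austin", "product": "Desk"},
    {"order_id": 3, "customer": "Bob", "customer_city": "Boston", "product": "Chair"},
]

# Redundant data: Ann's city is stored once for every order she places.
ann_rows = [row for row in orders if row["customer"] == "Ann"]
redundant_copies = len(ann_rows)

# Incomplete update: changing the city in only one of Ann's rows
# leaves the table with two conflicting cities for the same customer.
updated = [dict(row) for row in orders]
updated[0]["customer_city"] = "Dallas"
ann_cities = {row["customer_city"] for row in updated if row["customer"] == "Ann"}

# Inadvertent deletion: removing Bob's only order also removes
# the only record of Bob and the city he lives in.
after_delete = [row for row in orders if row["order_id"] != 3]
bob_still_known = any(row["customer"] == "Bob" for row in after_delete)

print(redundant_copies)    # number of rows repeating Ann's city
print(sorted(ann_cities))  # conflicting cities after the partial update
print(bob_still_known)     # whether any fact about Bob survives
```

In this sketch, each problem arises because one table describes two business objects (customers and orders) at once.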
Fortunately, there is a way to check your tables to ensure they are designed properly.
Normalization is a way to break large tables into smaller, more efficient tables without losing any information (in technical terms, a lossless decomposition).
Because RDBMSs are built on a solid foundation of set theory, there is a well-defined set of rules you can follow to normalize database tables.
The rest of this module (and the next) describes those rules.
The Slide Show below illustrates how the process breaks large tables into smaller, more efficient ones.
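To make the idea of a lossless decomposition concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`order_flat`, `customer`, `customer_order`), the columns, and the sample rows are all assumptions made for illustration. The sketch splits one wide table into two smaller ones and then checks that joining them reproduces the original rows exactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical wide table: customer facts are repeated on every order row.
cur.execute(
    "CREATE TABLE order_flat ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "customer_city TEXT, product TEXT)"
)
cur.executemany(
    "INSERT INTO order_flat VALUES (?, ?, ?, ?)",
    [
        (1, 10, "Austin", "Lamp"),
        (2, 10, "Austin", "Desk"),
        (3, 20, "Boston", "Chair"),
    ],
)

# Decompose: customer facts go in one table, order facts in another.
cur.execute(
    "CREATE TABLE customer AS "
    "SELECT DISTINCT customer_id, customer_city FROM order_flat"
)
cur.execute(
    "CREATE TABLE customer_order AS "
    "SELECT order_id, customer_id, product FROM order_flat"
)

original = cur.execute(
    "SELECT order_id, customer_id, customer_city, product "
    "FROM order_flat ORDER BY order_id"
).fetchall()

# Rejoin the two smaller tables on the shared customer_id column.
rejoined = cur.execute(
    "SELECT o.order_id, o.customer_id, c.customer_city, o.product "
    "FROM customer_order AS o "
    "JOIN customer AS c ON c.customer_id = o.customer_id "
    "ORDER BY o.order_id"
).fetchall()

is_lossless = original == rejoined
customer_rows = cur.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
conn.close()

print(is_lossless)    # True when the join reproduces the original rows
print(customer_rows)  # each customer's city is now stored only once
```

The comparison only confirms the result for this sample data. The decomposition is lossless in general only under the assumption that each `customer_id` determines exactly one `customer_city`.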
Normalization works hand in hand with the ERD development process you studied in the first course in this series.
If you concentrate on having every entity in your ERD represent a single business object, you will go a long way toward normalizing your database's tables.
Remember, entities in the ERD become tables in the database. If every entity represents only a single business object, then every table will represent only a single business object. The next lesson explains the purpose of normalization.