Database Analysis   «Prev  Next»

Lesson 2 Table joins
Objective Explain how table joins are formed.

Joining Tables using SQL

When designing a database you must ensure that every table represents a single entity. The other side of that design goal is that, while you will occasionally need to look up data about a single entity (such as a CD or Distributor), you will often need to combine records from more than one table to derive the information you need. Designers create tables with data about more than one entity in order to include frequently needed information in the same table.
Joins are operations that link table records based on data in common fields and enable you to meet your design goal of representing a single entity in each table and still produce meaningful information based on data from more than one table.
As an example, consider the ER diagram for the course project, which has separate entities representing Customers, CDs, Distributors, Orders, and order Line Items.

ERD Diagram consisting of CUSTOMER, ORDER, LINE ITEM, DISTRIBUTOR, and CD table.

Entity Diagram

Every entity in the project ER diagram has a field in common with at least one other entity. For example, the CD and Distributor entities have the DistID field in common. You can use these common fields to create joins between tables based on those entities. Remember that in the Stories on CD database, every CD has exactly one Distributor. Consider the sample records from the truncated CD and Distributor tables in the following illustration.

Create a join between the CD and DISTRIBUTOR table

You can create a join between records in these two tables whenever the records have the same value. The new table created by the join is called a virtual table because it exists only in the computer's memory and is not part of the database's permanent structure. This would look like the one illustrated below.

Create a VirtualTable from the CD and DISTRIBUTOR table.

Primary Key Field

It is very important that the field used to create the join be a primary-key field in one of the joined tables.
Because the DistID field is the primary-key field of the Distributor table, every DistID value will point to a single distributor. If a value in the DistID field occurred in records for more than one distributor, the results would contain more than one distributor for each CD, which is invalid. The next lesson describes the different types of joins.