| Lesson 13 | Concatenated Primary Keys |
| Objective | Define concatenated primary keys and explain when to use them. |
A concatenated primary key, also called a composite primary key, is a primary key made up of two or more columns in the same table. Instead of relying on a single column to uniquely identify each row, the database uses the combination of values across several columns as the key.
Recall that a primary key must uniquely identify every row and must never contain NULL values. In
many tables a single column - OrderNo, CustomerID, or a system-generated
integer - is sufficient. However, there are important cases where no single column can guarantee
uniqueness. In those situations, a concatenated primary key is the appropriate design choice.
Consider a Line Item table that records the individual items on each customer order for Stories on CD:
| Line Item | ||||||
| OrderNo | CDNo | Quantity | SellingPrice | Discount | LineCost | Shipped |
In the Line Item table, OrderNo identifies the customer order and CDNo
identifies the product being sold. Neither column alone can uniquely identify a row:
OrderNo value
appears on several rows - one for each item on that order.CDNo value appears on rows for different orders.
Because both columns contain duplicate values, neither column alone can serve as the primary key.
The key must treat OrderNo and CDNo as a pair.
A concatenated primary key uses existing columns in the table to form a unique identifier. In the
Line Item table, each row is uniquely identified by the combination of OrderNo and
CDNo. As long as the same CD does not appear more than once on the same order - which
would mean the customer ordered the same CD twice in a single transaction - the pair
(OrderNo, CDNo) is unique for every row.
The same principle applies in any intersection table that resolves a many-to-many relationship.
A StudentCourse table that records which students are enrolled in which courses
has the same structure: neither StudentID nor CourseID alone is unique,
but the combination of the two is.
The Line Item table definition below uses a concatenated primary key declared with the
CONSTRAINT syntax:
CREATE TABLE LineItem (
OrderNo INTEGER NOT NULL,
CDNo INTEGER NOT NULL,
Quantity INTEGER NOT NULL,
SellingPrice NUMERIC(8,2) NOT NULL,
Discount NUMERIC(5,2),
LineCost NUMERIC(8,2),
Shipped CHAR(1) NOT NULL,
CONSTRAINT pk_lineitem PRIMARY KEY (OrderNo, CDNo)
);
The PRIMARY KEY (OrderNo, CDNo) constraint instructs the RDBMS to enforce two rules:
(OrderNo, CDNo) must be unique across all rows in the
table.OrderNo nor CDNo may contain a NULL
value.These are the same requirements as a single-column primary key - uniqueness and non-nullability - applied across the combination of columns rather than a single column.
The order items table in Figure 2-1 below provides the original course example of a concatenated primary key. Because there is more than one item on a single order, and because the same item can appear on more than one order, order numbers repeat and item numbers repeat. Neither column alone can serve as the primary key.
The combination of an order number and an item number, however, is unique. We can therefore concatenate the two columns to form the table's primary key. It would also be possible to concatenate all three columns (order number, item number, and quantity) and still produce a unique primary key. But the quantity column is not necessary to ensure uniqueness - including it adds complexity to foreign key relationships without benefit. The minimum columns needed is the correct design.
This example establishes two additional properties of a well-designed primary key:
Four rules govern the design of a concatenated primary key:
OrderNo and CDNo are sufficient. Adding Quantity to the
key would still produce unique rows in most cases, but it is unnecessary, complicates foreign
key references in other tables, and introduces semantic ambiguity.
In many modern database designs a surrogate key - an auto-incrementing integer or a GUID generated
by the system - serves as the primary key for most tables. Even in those designs, concatenated
keys remain common in intersection tables that resolve many-to-many relationships. A
StudentCourse table with (StudentID, CourseID) as its primary key, or an
OrderLine table with (OrderNo, CDNo), are typical examples.
The StudentCourse diagram below illustrates the concatenated primary key concept in the context of a many-to-many relationship between students and courses:
STUDENT_ID repeats because one
student can enroll in multiple courses. COURSE_ID repeats because multiple
students can enroll in the same course.(STUDENT_ID, COURSE_ID)
appears only once - a student is enrolled in a specific course exactly once.STUDENT_ID and COURSE_ID belong to the primary
key together.The practical rules covered in this lesson rest on the formal language of relational theory. In that theory, a table is called a relation[2], a column is called an attribute, and a row is called a tuple.
The term tuple reflects that a row is an ordered collection of attribute values. A relation with two attributes contains pairs of values (2-tuples), a relation with three attributes contains triples (3-tuples), and so on. Mathematicians use the general term n-tuple for these ordered sets.
In pure relational theory, relations impose no inherent ordering on rows or columns. The order in
which rows are stored or retrieved, and the order in which columns are defined, are both considered
semantically irrelevant. In practice, database applications return rows in a predictable order
controlled by an ORDER BY clause, and present columns in the order in which they were
defined in the table.
The key point for this lesson: a concatenated primary key is still a primary key on a relation. It uniquely identifies tuples in that relation just as a single-column primary key does. The relational model places no restriction on the number of attributes that may participate in the primary key.
Use a concatenated primary key when:
OrderLine(OrderNo, CDNo) or StudentCourse(StudentID, CourseID).As you design each table, verify that the key - single-column or concatenated - enforces the business rule that each row represents one and only one entity instance, and that the database can enforce that rule without relying on application code.
The next lesson describes all-key relations[1].
Before moving on, click the Quiz link below to check your understanding of key columns and primary
keys:
Primary Key - Quiz