Relational Constructs   «Prev  Next»

Lesson 13 Concatenated Primary Keys
Objective Define concatenated primary keys and explain when to use them.

Concatenated Primary Keys in a Relational Database

A concatenated primary key, also called a composite primary key, is a primary key made up of two or more columns in the same table. Instead of relying on a single column to uniquely identify each row, the database uses the combination of values across several columns as the key.

Recall that a primary key must uniquely identify every row and must never contain NULL values. In many tables a single column - OrderNo, CustomerID, or a system-generated integer - is sufficient. However, there are important cases where no single column can guarantee uniqueness. In those situations, a concatenated primary key is the appropriate design choice.

When a Single Column Is Not Enough

Consider a Line Item table that records the individual items on each customer order for Stories on CD:

Line Item
OrderNo CDNo Quantity SellingPrice Discount LineCost Shipped

In the Line Item table, OrderNo identifies the customer order and CDNo identifies the product being sold. Neither column alone can uniquely identify a row:

Because both columns contain duplicate values, neither column alone can serve as the primary key. The key must treat OrderNo and CDNo as a pair.


Defining a Concatenated Primary Key

A concatenated primary key uses existing columns in the table to form a unique identifier. In the Line Item table, each row is uniquely identified by the combination of OrderNo and CDNo. As long as the same CD does not appear more than once on the same order - which would mean the customer ordered the same CD twice in a single transaction - the pair (OrderNo, CDNo) is unique for every row.

The same principle applies in any intersection table that resolves a many-to-many relationship. A StudentCourse table that records which students are enrolled in which courses has the same structure: neither StudentID nor CourseID alone is unique, but the combination of the two is.

Creating a Concatenated Primary Key in SQL

The Line Item table definition below uses a concatenated primary key declared with the CONSTRAINT syntax:

CREATE TABLE LineItem (
    OrderNo      INTEGER        NOT NULL,
    CDNo         INTEGER        NOT NULL,
    Quantity     INTEGER        NOT NULL,
    SellingPrice NUMERIC(8,2)   NOT NULL,
    Discount     NUMERIC(5,2),
    LineCost     NUMERIC(8,2),
    Shipped      CHAR(1)        NOT NULL,
    CONSTRAINT pk_lineitem PRIMARY KEY (OrderNo, CDNo)
);

The PRIMARY KEY (OrderNo, CDNo) constraint instructs the RDBMS to enforce two rules:

  1. The pair (OrderNo, CDNo) must be unique across all rows in the table.
  2. Neither OrderNo nor CDNo may contain a NULL value.

These are the same requirements as a single-column primary key - uniqueness and non-nullability - applied across the combination of columns rather than a single column.

The Order Items Example: Combining Two Columns

The order items table in Figure 2-1 below provides the original course example of a concatenated primary key. Because there is more than one item on a single order, and because the same item can appear on more than one order, order numbers repeat and item numbers repeat. Neither column alone can serve as the primary key.

The combination of an order number and an item number, however, is unique. We can therefore concatenate the two columns to form the table's primary key. It would also be possible to concatenate all three columns (order number, item number, and quantity) and still produce a unique primary key. But the quantity column is not necessary to ensure uniqueness - including it adds complexity to foreign key relationships without benefit. The minimum columns needed is the correct design.

Figure 2-1: Order items table showing OrderNo and CDNo as individual columns with
duplicate values, and the combination of both columns forming a unique concatenated primary key
Figure 2-1: Combining 2 columns is required to provide a unique identifier to act as a primary key. Neither OrderNo nor CDNo is unique alone, but the pair (OrderNo, CDNo) uniquely identifies each line item row.

This example establishes two additional properties of a well-designed primary key:

  1. A concatenated primary key should be made up of the smallest number of columns necessary to ensure uniqueness. Do not add columns that are not required.
  2. Whenever possible, the columns used in a concatenated primary key should be meaningless identifiers - values that carry no business meaning that could change over time.

Design Rules for Concatenated Primary Keys

Four rules govern the design of a concatenated primary key:

  1. Use the smallest number of columns that ensure uniqueness. Only include columns that are necessary to make the key unique. In the Line Item example, OrderNo and CDNo are sufficient. Adding Quantity to the key would still produce unique rows in most cases, but it is unnecessary, complicates foreign key references in other tables, and introduces semantic ambiguity.
  2. Prefer stable, meaningless identifiers. The columns in a primary key - concatenated or not - should be stable over time and should not encode business meaning that might change. Names, addresses, and phone numbers are meaningful data that users update in the normal course of business. They are poor choices for primary key columns.
  3. Avoid data that end users edit in normal business operations. If a column's value is expected to change - a product description, a customer name, a status code - that column should not participate in the primary key. Changing a primary key value requires cascading updates to every foreign key that references it.
  4. Ensure the combination is unique and never NULL. The combined values across all key columns must uniquely identify each row, and none of the key columns may contain NULL. The RDBMS enforces both rules through the PRIMARY KEY constraint.

In many modern database designs a surrogate key - an auto-incrementing integer or a GUID generated by the system - serves as the primary key for most tables. Even in those designs, concatenated keys remain common in intersection tables that resolve many-to-many relationships. A StudentCourse table with (StudentID, CourseID) as its primary key, or an OrderLine table with (OrderNo, CDNo), are typical examples.

The StudentCourse Intersection Table

The StudentCourse diagram below illustrates the concatenated primary key concept in the context of a many-to-many relationship between students and courses:

StudentCourse intersection table showing STUDENT_ID and COURSE_ID each with duplicate
values individually, but the pair (STUDENT_ID, COURSE_ID) forming a unique concatenated primary key
with a key symbol spanning both columns
The StudentCourse table demonstrates why a concatenated primary key is required.
  1. No single column is unique. STUDENT_ID repeats because one student can enroll in multiple courses. COURSE_ID repeats because multiple students can enroll in the same course.
  2. The combination is unique. Each pair (STUDENT_ID, COURSE_ID) appears only once - a student is enrolled in a specific course exactly once.
  3. Both columns form the primary key. The key symbol spans both columns, indicating that both STUDENT_ID and COURSE_ID belong to the primary key together.

Relations, Attributes, and Tuples

The practical rules covered in this lesson rest on the formal language of relational theory. In that theory, a table is called a relation[2], a column is called an attribute, and a row is called a tuple.

The term tuple reflects that a row is an ordered collection of attribute values. A relation with two attributes contains pairs of values (2-tuples), a relation with three attributes contains triples (3-tuples), and so on. Mathematicians use the general term n-tuple for these ordered sets.

In pure relational theory, relations impose no inherent ordering on rows or columns. The order in which rows are stored or retrieved, and the order in which columns are defined, are both considered semantically irrelevant. In practice, database applications return rows in a predictable order controlled by an ORDER BY clause, and present columns in the order in which they were defined in the table.

The key point for this lesson: a concatenated primary key is still a primary key on a relation. It uniquely identifies tuples in that relation just as a single-column primary key does. The relational model places no restriction on the number of attributes that may participate in the primary key.

Summary: When to Use a Concatenated Primary Key

Use a concatenated primary key when:

  • No single column can uniquely identify each row.
  • The combination of several existing columns is unique and will remain stable over time.
  • You want to avoid embedding meaningful, user-editable data in the primary key.
  • You are modeling an intersection table in a many-to-many relationship, such as OrderLine(OrderNo, CDNo) or StudentCourse(StudentID, CourseID).

As you design each table, verify that the key - single-column or concatenated - enforces the business rule that each row represents one and only one entity instance, and that the database can enforce that rule without relying on application code.

The next lesson describes all-key relations[1].

Primary Key - Quiz

Before moving on, click the Quiz link below to check your understanding of key columns and primary keys:
Primary Key - Quiz

[1] All-key relation: A relation in which every field is a member of the relation's primary key.
[2] Relation: Another word for table. Relations represent entities in database design.

SEMrush Software 13 SEMrush Banner 13