| Lesson 15 | Database Foreign keys |
| Objective | Define foreign keys for tables in a relational database |
A foreign key is a column or group of columns in one table that references a key in another table. In most introductory database designs, the referenced key is the primary key of the parent table. The table that contains the foreign key is called the child table, and the table being referenced is called the parent table.
The purpose of a foreign key is to define and enforce a relationship between tables. It tells the relational database management system that values in the child table must correspond to valid values in the parent table. This rule is part of referential integrity, which prevents child rows from pointing to parent rows that do not exist.
A foreign key is not merely a column that happens to have the same name as a column in another table. Matching column names can make a design easier to read, but the declared constraint is what creates the formal relationship. The database designer defines the relationship, and the RDBMS enforces it through the foreign key constraint.
The CUSTOMER and ORDER tables provide a simple example of a foreign key relationship. The CUSTOMER table stores one row for each customer. The ORDER table stores one row for each order. Since one customer can place many orders, the relationship between CUSTOMER and ORDER is one-to-many.
In this design, CUSTOMER.CustID is the primary key of the CUSTOMER table. It uniquely identifies each customer. In the ORDER table, OrderNo is the primary key because it uniquely identifies each order. The ORDER table also contains CustID, but there it has a different role: it is a foreign key that points back to the customer who placed the order.
CUSTOMER(CustID, CustLast, CustFirst, CustStreet, CustApt, CustCity, CustState, CustZip, CustPhone)
ORDER(OrderNo, CustID, OrderDate)
The same column name, CustID, appears in both tables, but the meaning depends on the table. In CUSTOMER, CustID identifies a customer. In ORDER, CustID identifies which customer placed the order.
A foreign key relationship has two sides. The parent table contains the referenced key. The child table contains the foreign key. In the CUSTOMER and ORDER example, CUSTOMER is the parent table because it contains the primary key CustID. ORDER is the child table because it contains a foreign key named CustID.
The rule can be stated this way: every non-null value in ORDER.CustID must match an existing value in CUSTOMER.CustID. If the database contains a customer with CustID 101, then an order can reference CustID 101. If there is no customer with CustID 9999, then the database should reject an order that attempts to use CustID 9999.
CUSTOMER
CustID = 101
ORDER
OrderNo = 5001
CustID = 101
This order belongs to customer 101. The ORDER row does not repeat the customer's last name, first name, address, city, state, ZIP code, or phone number. It stores the customer identifier, and the database can join ORDER to CUSTOMER when the full customer information is needed.
Referential integrity means that relationships between tables remain logically valid. A foreign key constraint prevents orphan rows. An orphan row is a child row that references a parent row that does not exist.
For example, suppose an order is inserted with this value:
CustID = 9999
If customer 9999 does not exist in the CUSTOMER table, the ORDER row would be invalid. The order would claim to belong to a customer that the database does not know about. A foreign key constraint prevents that inconsistency.
Referential integrity also affects delete and update operations. If a customer already has orders, the database must decide what happens when someone attempts to delete that customer or change the customer's key value. That behavior should be controlled by explicit business rules.
The following SQL example creates a CUSTOMER table and a CustomerOrder table. The table name CustomerOrder is used instead of ORDER because ORDER is a reserved keyword in SQL.
CREATE TABLE Customer (
CustID INTEGER PRIMARY KEY,
CustLast VARCHAR(50) NOT NULL,
CustFirst VARCHAR(50) NOT NULL,
CustStreet VARCHAR(100),
CustApt VARCHAR(20),
CustCity VARCHAR(60),
CustState CHAR(2),
CustZip VARCHAR(10),
CustPhone VARCHAR(20)
);
CREATE TABLE CustomerOrder (
OrderNo INTEGER PRIMARY KEY,
CustID INTEGER NOT NULL,
OrderDate DATE NOT NULL,
CONSTRAINT fk_customer_order_customer
FOREIGN KEY (CustID)
REFERENCES Customer(CustID)
);
The primary key of Customer is CustID. The primary key of CustomerOrder is OrderNo. The foreign key constraint says that CustomerOrder.CustID must reference a valid value in Customer.CustID.
The constraint name fk_customer_order_customer is optional in some database systems, but naming constraints is a good design practice. A named constraint is easier to identify when the database reports an error or when a DBA needs to modify the schema.
In a one-to-many relationship, one row in the parent table can be associated with many rows in the child table. A customer can place many orders, but each order in this simplified design belongs to one customer.
The foreign key usually appears on the many side of the relationship. ORDER is the many side because many order rows may reference the same customer row. CUSTOMER is the one side because each CustID identifies one customer.
This does not mean the customer data is copied into every order. Only the foreign key value is stored in the ORDER table. The rest of the customer data remains in the CUSTOMER table. This reduces redundancy and keeps customer facts in one place.
A foreign key can be mandatory or optional. A mandatory foreign key requires a value. An optional foreign key allows NULL.
In the CUSTOMER and ORDER example, CustID in the order table should usually be mandatory:
CustID INTEGER NOT NULL
This means every order must be assigned to a customer. For most order-processing systems, an order without a customer would be incomplete or invalid.
Optional relationships are also common. For example, an order might have an optional SalesRepID if a sales representative can be assigned later:
SalesRepID INTEGER NULL
Whether a foreign key is nullable depends on the business rule. The database design should reflect whether the relationship is required or optional.
When a parent row is referenced by child rows, the database needs a rule for what happens when the parent row is deleted or its key value is updated. These rules are called referential actions.
Common referential actions include:
For a customer-order database, deleting a customer who has order history should usually be restricted unless the organization has a deliberate archive, anonymization, or cascade policy. In many business systems, preserving order history is more important than physically deleting the customer row.
Foreign keys and joins are related, but they are not the same thing. A foreign key is a schema constraint. A join is a query operation.
A foreign key protects the validity of the relationship. A join uses matching values to combine rows from related tables.
SELECT
c.CustID,
c.CustLast,
c.CustFirst,
o.OrderNo,
o.OrderDate
FROM Customer c
JOIN CustomerOrder o
ON o.CustID = c.CustID;
This query returns customer and order data together. The join condition uses the relationship between Customer.CustID and CustomerOrder.CustID. The foreign key constraint helps ensure that each order points to a valid customer.
A foreign key column and the referenced primary key column do not have to use the same name. Matching names are common because they make diagrams and SQL easier to read, but names alone do not define the relationship.
The following example uses OrderedByID as the foreign key column:
CREATE TABLE CustomerOrder (
OrderNo INTEGER PRIMARY KEY,
OrderedByID INTEGER NOT NULL,
OrderDate DATE NOT NULL,
FOREIGN KEY (OrderedByID)
REFERENCES Customer(CustID)
);
OrderedByID is still a foreign key because it references Customer.CustID. The declared constraint establishes the relationship.
A foreign key can reference more than one column. This is called a composite foreign key. Composite foreign keys are used when the referenced key is made from multiple columns.
For example, if an order-line table is identified by the combination of OrderNumber and ItemNumber, another table could reference both columns together:
FOREIGN KEY (OrderNumber, ItemNumber)
REFERENCES OrderLine(OrderNumber, ItemNumber)
The important rule is that the set of foreign key columns in the child table must match the referenced key structure in the parent table. This connects foreign keys to earlier lessons about concatenated primary keys and all-key relations.
A foreign key is a column or group of columns in one table that references a key in another table. It defines a relationship between a child table and a parent table. In the CUSTOMER and ORDER example, ORDER.CustID is a foreign key that references CUSTOMER.CustID.
Foreign keys enforce referential integrity by preventing invalid references. They help ensure that an order cannot be assigned to a customer that does not exist. They also make relationships between tables explicit in the database schema.
The most important idea is that a foreign key is a declared database constraint, not merely a matching column name. The database designer defines the relationship, and the relational database management system enforces it.