| Lesson 16 | Referential integrity |
| Objective | Define the Concept of Referential Integrity for Database Tables |
Referential integrity is the database rule that keeps relationships between tables logically valid. In a relational database, one table often stores a key value that refers to a row in another table. Referential integrity ensures that this reference points to a row that actually exists.
In practical terms, referential integrity means that every non-null foreign key value in a child table must match an existing key value in the parent table. The parent table contains the referenced key. The child table contains the foreign key. This rule prevents records from becoming disconnected from the data they are supposed to reference.
The CUSTOMER and ORDER tables provide a clear example. A customer can place many orders, but each order should belong to a valid customer. The ORDER table stores a CustID value so the database knows which customer placed the order. Referential integrity makes sure that the CustID stored in ORDER refers to a real customer row in CUSTOMER.
A referential integrity rule is built from a parent-child relationship. The parent table contains the key that is being referenced. The child table contains the foreign key that points to the parent table.
In the CUSTOMER and ORDER example, CUSTOMER is the parent table because CustID is the primary key that identifies each customer. ORDER is the child table because it contains a CustID column that refers back to CUSTOMER.
CUSTOMER(CustID, CustLast, CustFirst, CustStreet, CustApt, CustCity, CustState, CustZip, CustPhone)
CUSTOMER_ORDER(OrderNo, CustID, OrderDate)
The relationship can be summarized this way:
For a row in the ORDER table to make sense, there must be an associated row in the CUSTOMER table. An order should not point to a customer that does not exist. If the database allowed that situation, the order row would be an orphan record.
The following example is valid because customer 101 exists:
CUSTOMER
CustID = 101
CUSTOMER_ORDER
OrderNo = 5001
CustID = 101
The following example should be rejected if there is no customer with CustID 9999:
CUSTOMER_ORDER
OrderNo = 5002
CustID = 9999
Referential integrity prevents this invalid row from being inserted. The database system checks the child-table value against the parent-table key before allowing the change.
Referential integrity ensures the integrity of references between tables as defined by primary key and foreign key constraints. A primary key identifies each row in the parent table. A foreign key stores a value in the child table that must match a referenced key value in the parent table.
This rule is important because relational databases do not store relationships as physical pointers in the same way that some older file systems or navigational database systems did. The relationship is logical. It is represented by matching values and enforced by constraints.
A foreign key constraint is metadata defined in the database schema. It tells the RDBMS which child-table column references which parent-table column. Once the constraint exists, the database engine can reject inserts, updates, or deletes that would break the relationship.
Together, these constraints establish and enforce referential integrity between related tables.
The following SQL example defines a Customer table and a CustomerOrder table. The name CustomerOrder is used instead of ORDER because ORDER is a reserved SQL keyword.
CREATE TABLE Customer (
CustID INTEGER PRIMARY KEY,
CustLast VARCHAR(50) NOT NULL,
CustFirst VARCHAR(50) NOT NULL,
CustStreet VARCHAR(100),
CustApt VARCHAR(20),
CustCity VARCHAR(60),
CustState CHAR(2),
CustZip VARCHAR(10),
CustPhone VARCHAR(20)
);
CREATE TABLE CustomerOrder (
OrderNo INTEGER PRIMARY KEY,
CustID INTEGER NOT NULL,
OrderDate DATE NOT NULL,
CONSTRAINT fk_customer_order_customer
FOREIGN KEY (CustID)
REFERENCES Customer(CustID)
);
The constraint named fk_customer_order_customer enforces the rule that every CustomerOrder.CustID value must reference an existing Customer.CustID value. If a user or application attempts to create an order for a nonexistent customer, the database should reject the operation.
Referential integrity is easiest to understand by looking at insert operations. First, insert a customer:
INSERT INTO Customer (CustID, CustLast, CustFirst)
VALUES (101, 'Rivera', 'Elena');
Now an order can reference that customer:
INSERT INTO CustomerOrder (OrderNo, CustID, OrderDate)
VALUES (5001, 101, DATE '2026-05-19');
This insert is valid because customer 101 already exists. The child row points to a valid parent row.
By contrast, the following insert should fail if customer 9999 does not exist:
INSERT INTO CustomerOrder (OrderNo, CustID, OrderDate)
VALUES (5002, 9999, DATE '2026-05-19');
Without referential integrity, this invalid order could be stored. With referential integrity, the database rejects it and protects the consistency of the relationship.
An orphan record is a child-table row that references a parent-table row that does not exist. In the CUSTOMER and ORDER example, an orphan order would be an order whose CustID does not match any CustID in CUSTOMER.
Orphan records are dangerous because they make the database inconsistent. A report might show an order with no valid customer. A join between customers and orders might fail to return the orphan order. An application might be unable to display the customer information for the order because the referenced customer row is missing.
Referential integrity prevents orphan records by requiring valid references at the database level. This is stronger than relying only on application code because multiple applications, scripts, imports, and users may interact with the same database.
Referential integrity also controls what happens when a parent row is deleted or when a parent key value is updated. The database must protect the child rows that depend on that parent row.
Common referential actions include:
In a customer-order database, RESTRICT or NO ACTION is often safer than CASCADE. A business usually wants to preserve order history. Deleting a customer should not automatically delete past orders unless the organization has a deliberate archival, anonymization, or cascade policy.
Referential integrity applies to non-null foreign key values. A relationship can be mandatory or optional depending on whether the foreign key allows NULL.
A mandatory relationship requires a value:
CustID INTEGER NOT NULL
This means every order must reference a customer.
An optional relationship allows NULL:
SalesRepID INTEGER NULL
A NULL foreign key means the child row is not currently assigned to a parent row for that relationship. If the value is NULL, there is no parent reference to check. If the value is not NULL, it must match a valid parent key.
Referential integrity is related to joins, but it is not the same thing as a join. Referential integrity is a constraint in the schema. A join is a query operation.
The foreign key constraint keeps the relationship valid. The join uses matching values to retrieve related rows.
SELECT
c.CustID,
c.CustLast,
c.CustFirst,
o.OrderNo,
o.OrderDate
FROM Customer c
JOIN CustomerOrder o
ON o.CustID = c.CustID;
This query combines customer and order data. The join condition uses the matching CustID values. The foreign key constraint helps ensure that the rows being joined represent valid relationships.
Referential integrity connects directly to the previous lessons on primary keys and foreign keys. A primary key identifies rows in the parent table. A foreign key stores references in the child table. Referential integrity enforces the validity of those references.
The same principle applies when a key contains more than one column. If a child table references a composite primary key, then the foreign key must include the corresponding set of columns. This connects referential integrity to concatenated primary keys and all-key relations.
The essential design rule is simple: do not allow a child row to reference a parent row that does not exist.
Referential integrity is the rule that every non-null foreign key value in a child table must reference an existing key value in the parent table. It protects the logical consistency of relationships between database tables.
In the CUSTOMER and ORDER example, ORDER.CustID must match a valid CUSTOMER.CustID value. If an order attempts to reference a nonexistent customer, the database should reject the row. This prevents orphan records.
Referential integrity is enforced through foreign key constraints. It affects inserts, updates, and deletes, and it should be designed according to business rules. It is one of the most important mechanisms relational databases use to maintain accurate, consistent, and trustworthy data.