Relational Constructs   «Prev  Next»

Lesson 16 Referential integrity
Objective Define the Concept of Referential Integrity for Database Tables

Concept of Referential Integrity

Referential integrity is the database rule that keeps relationships between tables logically valid. In a relational database, one table often stores a key value that refers to a row in another table. Referential integrity ensures that this reference points to a row that actually exists.

In practical terms, referential integrity means that every non-null foreign key value in a child table must match an existing key value in the parent table. The parent table contains the referenced key. The child table contains the foreign key. This rule prevents records from becoming disconnected from the data they are supposed to reference.

The CUSTOMER and ORDER tables provide a clear example. A customer can place many orders, but each order should belong to a valid customer. The ORDER table stores a CustID value so the database knows which customer placed the order. Referential integrity makes sure that the CustID stored in ORDER refers to a real customer row in CUSTOMER.

CUSTOMER table contains primary key CustID and ORDER table contains foreign key CustID
Referential integrity requires ORDER.CustID to reference an existing CUSTOMER.CustID value.

Parent and Child Tables

A referential integrity rule is built from a parent-child relationship. The parent table contains the key that is being referenced. The child table contains the foreign key that points to the parent table.

In the CUSTOMER and ORDER example, CUSTOMER is the parent table because CustID is the primary key that identifies each customer. ORDER is the child table because it contains a CustID column that refers back to CUSTOMER.

CUSTOMER(CustID, CustLast, CustFirst, CustStreet, CustApt, CustCity, CustState, CustZip, CustPhone)

CUSTOMER_ORDER(OrderNo, CustID, OrderDate)

The relationship can be summarized this way:

Associated Records

For a row in the ORDER table to make sense, there must be an associated row in the CUSTOMER table. An order should not point to a customer that does not exist. If the database allowed that situation, the order row would be an orphan record.

The following example is valid because customer 101 exists:

CUSTOMER
CustID = 101

CUSTOMER_ORDER
OrderNo = 5001
CustID  = 101

The following example should be rejected if there is no customer with CustID 9999:

CUSTOMER_ORDER
OrderNo = 5002
CustID  = 9999

Referential integrity prevents this invalid row from being inserted. The database system checks the child-table value against the parent-table key before allowing the change.

What is Referential Integrity?

Referential integrity ensures the integrity of references between tables as defined by primary key and foreign key constraints. A primary key identifies each row in the parent table. A foreign key stores a value in the child table that must match a referenced key value in the parent table.

This rule is important because relational databases do not store relationships as physical pointers in the same way that some older file systems or navigational database systems did. The relationship is logical. It is represented by matching values and enforced by constraints.

A foreign key constraint is metadata defined in the database schema. It tells the RDBMS which child-table column references which parent-table column. Once the constraint exists, the database engine can reject inserts, updates, or deletes that would break the relationship.

  1. Primary key constraint: The primary key constraint ensures that each key value in the parent table is unique and not null. This gives the child table a reliable target to reference.
  2. Foreign key constraint: The foreign key constraint ensures that each non-null value in the child table matches a valid referenced value in the parent table.

Together, these constraints establish and enforce referential integrity between related tables.

SQL Example: Enforcing Referential Integrity

The following SQL example defines a Customer table and a CustomerOrder table. The name CustomerOrder is used instead of ORDER because ORDER is a reserved SQL keyword.

CREATE TABLE Customer (
    CustID      INTEGER PRIMARY KEY,
    CustLast    VARCHAR(50) NOT NULL,
    CustFirst   VARCHAR(50) NOT NULL,
    CustStreet  VARCHAR(100),
    CustApt     VARCHAR(20),
    CustCity    VARCHAR(60),
    CustState   CHAR(2),
    CustZip     VARCHAR(10),
    CustPhone   VARCHAR(20)
);

CREATE TABLE CustomerOrder (
    OrderNo    INTEGER PRIMARY KEY,
    CustID     INTEGER NOT NULL,
    OrderDate  DATE NOT NULL,

    CONSTRAINT fk_customer_order_customer
        FOREIGN KEY (CustID)
        REFERENCES Customer(CustID)
);

The constraint named fk_customer_order_customer enforces the rule that every CustomerOrder.CustID value must reference an existing Customer.CustID value. If a user or application attempts to create an order for a nonexistent customer, the database should reject the operation.

Valid and Invalid Inserts

Referential integrity is easiest to understand by looking at insert operations. First, insert a customer:

INSERT INTO Customer (CustID, CustLast, CustFirst)
VALUES (101, 'Rivera', 'Elena');

Now an order can reference that customer:

INSERT INTO CustomerOrder (OrderNo, CustID, OrderDate)
VALUES (5001, 101, DATE '2026-05-19');

This insert is valid because customer 101 already exists. The child row points to a valid parent row.

By contrast, the following insert should fail if customer 9999 does not exist:

INSERT INTO CustomerOrder (OrderNo, CustID, OrderDate)
VALUES (5002, 9999, DATE '2026-05-19');

Without referential integrity, this invalid order could be stored. With referential integrity, the database rejects it and protects the consistency of the relationship.

Preventing Orphan Records

An orphan record is a child-table row that references a parent-table row that does not exist. In the CUSTOMER and ORDER example, an orphan order would be an order whose CustID does not match any CustID in CUSTOMER.

Orphan records are dangerous because they make the database inconsistent. A report might show an order with no valid customer. A join between customers and orders might fail to return the orphan order. An application might be unable to display the customer information for the order because the referenced customer row is missing.

Referential integrity prevents orphan records by requiring valid references at the database level. This is stronger than relying only on application code because multiple applications, scripts, imports, and users may interact with the same database.

Database Design for Mere Mortals

Referential Actions: Delete and Update Behavior

Referential integrity also controls what happens when a parent row is deleted or when a parent key value is updated. The database must protect the child rows that depend on that parent row.

Common referential actions include:

  • RESTRICT or NO ACTION: Prevent the parent row from being deleted or changed while child rows still reference it.
  • CASCADE: Automatically delete or update related child rows when the parent row is deleted or updated.
  • SET NULL: Set the child foreign key value to NULL when the parent row is deleted or changed, if NULL is allowed.
  • SET DEFAULT: Set the child foreign key to a default value, if the database supports this action and the default value is valid.

In a customer-order database, RESTRICT or NO ACTION is often safer than CASCADE. A business usually wants to preserve order history. Deleting a customer should not automatically delete past orders unless the organization has a deliberate archival, anonymization, or cascade policy.

Optional and Mandatory Relationships

Referential integrity applies to non-null foreign key values. A relationship can be mandatory or optional depending on whether the foreign key allows NULL.

A mandatory relationship requires a value:

CustID INTEGER NOT NULL

This means every order must reference a customer.

An optional relationship allows NULL:

SalesRepID INTEGER NULL

A NULL foreign key means the child row is not currently assigned to a parent row for that relationship. If the value is NULL, there is no parent reference to check. If the value is not NULL, it must match a valid parent key.

Referential Integrity and Joins

Referential integrity is related to joins, but it is not the same thing as a join. Referential integrity is a constraint in the schema. A join is a query operation.

The foreign key constraint keeps the relationship valid. The join uses matching values to retrieve related rows.

SELECT
    c.CustID,
    c.CustLast,
    c.CustFirst,
    o.OrderNo,
    o.OrderDate
FROM Customer c
JOIN CustomerOrder o
    ON o.CustID = c.CustID;

This query combines customer and order data. The join condition uses the matching CustID values. The foreign key constraint helps ensure that the rows being joined represent valid relationships.

Relationship to Primary Keys and Foreign Keys

Referential integrity connects directly to the previous lessons on primary keys and foreign keys. A primary key identifies rows in the parent table. A foreign key stores references in the child table. Referential integrity enforces the validity of those references.

The same principle applies when a key contains more than one column. If a child table references a composite primary key, then the foreign key must include the corresponding set of columns. This connects referential integrity to concatenated primary keys and all-key relations.

The essential design rule is simple: do not allow a child row to reference a parent row that does not exist.

Summary

Referential integrity is the rule that every non-null foreign key value in a child table must reference an existing key value in the parent table. It protects the logical consistency of relationships between database tables.

In the CUSTOMER and ORDER example, ORDER.CustID must match a valid CUSTOMER.CustID value. If an order attempts to reference a nonexistent customer, the database should reject the row. This prevents orphan records.

Referential integrity is enforced through foreign key constraints. It affects inserts, updates, and deletes, and it should be designed according to business rules. It is one of the most important mechanisms relational databases use to maintain accurate, consistent, and trustworthy data.

[1] referential integrity: A database satisfies the referential integrity rule if and only if for every tuple containing a reference there exists a referent.

SEMrush Software