| Lesson 3 | Relational Theory |
| Objective | Entity relationship diagram as relational notation. |
For some students relational notation will seem quite natural and for others it will take some getting used to. This course will represent entities using both graphical and textual styles, so if you prefer diagrams to relational notation you will always have a visual reference alongside the text. The goal is that you learn to read relational notation fluently. Doing so will allow you to study beyond this course and take full advantage of resources that use relational notation to present database design material.
Relational databases play a critical role in many important computer applications. As is the case whenever large amounts of money are at stake, people have spent enormous time and effort building, studying, and refining relational database technology. Database researchers typically approach relational databases from one of three points of view, and understanding these perspectives helps explain why the field has developed so many overlapping and sometimes contradictory terms.
In the preceding course Relational Database Design, you created an Entity Relationship diagram to track the products offered by and orders placed with Stories on CD, Inc. - a fictional company that sells books on CD principally by mail order. In this module you will use a modified version of that ER diagram[1] to derive the database objects used to represent CDs, distributors, and orders in the database.
When you begin to design the structure of your database, you represent entities as relations. A relation to store data about items stocked by Stories on CD, Inc. might be expressed in relational notation as follows:
CD (CDNo, CDTitle, DistID, RetailPrice, AgeGroup, Description)
In this notation, CDNo is the primary key (underlined) and DistID is a foreign
key referencing the Distributor table (underlined with a dotted line in formal notation). All other
attributes follow the primary key in the listing.
The same entities can be documented using Mermaid.js for visual ERD representation, which renders directly into responsive SVG diagrams and allows version control alongside the schema definition:
erDiagram
CD {
int CDNo "PK"
string CDTitle
int DistID "FK"
decimal RetailPrice
string AgeGroup
string Description
}
DISTRIBUTOR {
int DistID "PK"
string DistName
string Contact
string Address
}
CD ||--o{ DISTRIBUTOR : "distributed by"
When transitioning from conceptual models to physical schemas, providing explicit DDL is practical. The SQL DDL equivalent for the CD entity is:
CREATE TABLE CD (
CDNo INT PRIMARY KEY,
CDTitle VARCHAR(255) NOT NULL,
DistID INT,
RetailPrice DECIMAL(10, 2),
AgeGroup VARCHAR(50),
Description TEXT,
FOREIGN KEY (DistID) REFERENCES DISTRIBUTOR(DistID)
);
Relational notation is a precise, standardized way to represent the structure of data during the data modeling process. Its purpose is to describe entities (tables), their attributes (columns), and relationships (associations) in a format that is compact, unambiguous, and ready for implementation.
The rules for writing relational notation are straightforward:
TableName (PK_attribute, attribute2, FK_attribute)Relational notation serves five purposes in the data modeling process:
Customer(CustID, Name, Address)
Order(OrderID, OrderDate, CustID)
Here CustID is the primary key in Customer, and CustID in
Order is a foreign key referencing Customer.
CustID as a
foreign key in the Order relation expresses a 1:M relationship - one customer can place many orders.
M:N relationships require an intersection table, which also appears in the relational notation.
| Function | Description |
|---|---|
| Structure Definition | Lists tables and their attributes in a compact textual format |
| Key Identification | Marks primary and foreign keys explicitly |
| Relationship Mapping | Shows how tables relate through foreign key placement |
| Bridge to SQL | Provides the direct foundation for CREATE TABLE statements |
| Validation Aid | Helps normalize and verify model correctness before implementation |
Note the structural similarities between the CD relation shown above and the updated StreamingAsset entity in the project ER diagram. The StreamingAsset entity reflects a modernized version of the original domain, adapted for digital streaming platforms:
In relational notation this entity would be expressed as:
StreamingAsset (TrackID, Title, PlatformID, StreamPrice, AudienceRating, GenreTags, Duration, ReleaseDate, Description)
The structure mirrors the CD relation: a primary key leads the listing, a foreign key references another table (Platform), and descriptive attributes follow. This consistency of form across different domains is one of relational notation's primary strengths.
The difference between relational data modeling and conceptual modeling lies primarily in their level of abstraction, purpose, and the artifacts they produce. Understanding the distinction is important because the two activities happen at different stages of the DBLC and serve different audiences.
Conceptual modeling is a high-level, technology-agnostic process that defines what the system should contain. It focuses on understanding and representing business requirements and real-world entities without concern for how those entities will be implemented in a specific DBMS.
The purpose of conceptual modeling is to capture business concepts and rules, define entities, attributes, and relationships at a semantic level, and act as a communication bridge between business stakeholders and technical teams. The primary artifact is the Entity-Relationship Diagram (ERD) at a conceptual level, sometimes supplemented by UML class diagrams. Conceptual models specify no primary keys, foreign keys, or data types - they are platform-independent and emphasize business meaning over implementation detail.
Conceptual modeling is the important first phase in designing a successful database application. A database application encompasses both the database itself and the associated programs that implement queries and updates. A banking application, for example, includes programs for deposits and withdrawals with user-friendly GUI forms for end users. The design and testing of those application programs has traditionally been considered part of software engineering, but database design methodologies and software engineering methodologies are strongly intertwined in most modern design tools.
This module concentrates on database structures and constraints during conceptual design, using the Entity-Relationship model - a widely used high-level conceptual data model whose concepts are employed by most database design tools. The ER model provides data-structuring concepts, constraints, and ER diagram notation for designing conceptual schemas.
Relational data modeling is a logical and technical activity that transforms the conceptual model into a format ready for implementation in a relational database system. It is a collection of processes and techniques that records the inventory, shape, size, contents, and rules of data elements within a business process scope. Despite the name, a database is not merely a collection of data - it is a collection of facts, or more precisely, true propositions about the domain it models.
The purpose of relational data modeling is to define how data will be stored using the relational model, focusing on tables, columns, keys, and constraints, and to normalize data for efficiency, integrity, and non-redundancy. The artifacts produced include relational schema diagrams, SQL DDL scripts, and fully defined normalized schemas with primary keys, foreign keys, data types, and constraints. Relational data modeling is platform-aware - the choices made at this stage can be mapped directly to SQL for a specific target DBMS.
Key elements that relational data modeling specifies and conceptual modeling does not:
| Feature | Conceptual Modeling | Relational Data Modeling |
|---|---|---|
| Abstraction Level | High | Medium to low |
| Focus | Business concepts | Database structure |
| Artifacts | ERDs, entity and relationship specs | Tables, keys, normalized schema, DDL |
| Users | Business analysts, data architects | Database designers, developers |
| Keys and Constraints | Not specified | Fully defined (PK, FK, CHECK, etc.) |
| Technology Specific | No | Yes (RDBMS-oriented) |
| Normalization | Not applicable | Highly relevant |
A concrete example illustrates the transition between the two levels:
Conceptual: "A Customer places Orders. Each Order contains one or more Products."
Relational:
CREATE TABLE Customer (
CustomerID INT PRIMARY KEY,
Name VARCHAR(100)
);
CREATE TABLE Orders (
OrderID INT PRIMARY KEY,
CustomerID INT,
FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID)
);
The relational notation examples above focus on the primary key and attribute listing. Other elements of relational constructs - the data domain of each field, the primary key of each relation, and any foreign keys - will each be covered in detail later in this module. Translating an ER diagram into relational notation[2] is a standard step in the logical design stage of the database life cycle.
Once relational notations have been created from the entities in your ER diagram, you can analyze them with an eye toward three goals:
It is possible to perform the analysis in this module and the next using entities as they appear in the ER diagram, but representing entities in relational notation makes the task easier in three important ways:
Being able to read and write relational notation is a valuable and transferable skill. Relations are expressed in a shorthand called relational notation - a textual interpretation of the ER diagram that makes the structure of the database explicit, compact, and implementation-ready. The next lesson outlines the characteristics of tables, which are the objects that store database data.