Relational Constructs   «Prev  Next»

Lesson 3 Relational Theory
Objective Entity relationship diagram as relational notation.

Representing Entities Using Relational Notation

For some students relational notation will seem quite natural and for others it will take some getting used to. This course will represent entities using both graphical and textual styles, so if you prefer diagrams to relational notation you will always have a visual reference alongside the text. The goal is that you learn to read relational notation fluently. Doing so will allow you to study beyond this course and take full advantage of resources that use relational notation to present database design material.

Relational databases play a critical role in many important computer applications. As is the case whenever large amounts of money are at stake, people have spent enormous time and effort building, studying, and refining relational database technology. Database researchers typically approach relational databases from one of three points of view, and understanding these perspectives helps explain why the field has developed so many overlapping and sometimes contradictory terms.

Three Groups of Database Researchers

  1. The Theoretical Group approaches relational databases from a database-theoretical standpoint. These researchers think in terms of provability, mathematical set theory, and propositional logic. Their vocabulary reflects this foundation - they use terms such as relational algebra, Cartesian product, and tuple relational calculus. This approach is intellectually rigorous and can be intimidating to practitioners, but it provides the formal foundation that makes relational database theory coherent and provable. These researchers focus on logical design and idealized database principles.
  2. The Practical Group approaches the topic from a less formal, more intuitive standpoint. Their terminology is less mathematically precise but more accessible to working developers. They use terms you have likely encountered before: table, row, and column. These practitioners focus on physical database design and pay close attention to the concrete implementation details involved in actually building a database and getting optimal performance from it.
  3. The File and Disk Group thinks in terms of flat files and the underlying disk structures used to hold data. Though this perspective is less common today, the terms it contributed - file, record, and field - made their way into database nomenclature and have remained. Many who still use these terms are programmers and developers who approach the database from a consumer perspective: "How do I get my data?" These differing viewpoints have produced several potentially confusing ways to describe the same relational concepts, particularly because different groups sometimes use the same term with different meanings. The term relation itself is used differently across these three groups.

Relational Constraints and Analysis

In the preceding course Relational Database Design, you created an Entity Relationship diagram to track the products offered by and orders placed with Stories on CD, Inc. - a fictional company that sells books on CD principally by mail order. In this module you will use a modified version of that ER diagram[1] to derive the database objects used to represent CDs, distributors, and orders in the database.

When you begin to design the structure of your database, you represent entities as relations. A relation to store data about items stocked by Stories on CD, Inc. might be expressed in relational notation as follows:

CD (CDNo, CDTitle, DistID, RetailPrice, AgeGroup, Description)

In this notation, CDNo is the primary key (underlined) and DistID is a foreign key referencing the Distributor table (underlined with a dotted line in formal notation). All other attributes follow the primary key in the listing.

The same entities can be documented using Mermaid.js for visual ERD representation, which renders directly into responsive SVG diagrams and allows version control alongside the schema definition:


erDiagram
    CD {
        int CDNo "PK"
        string CDTitle
        int DistID "FK"
        decimal RetailPrice
        string AgeGroup
        string Description
    }

    DISTRIBUTOR {
        int DistID "PK"
        string DistName
        string Contact
        string Address
    }

    CD ||--o{ DISTRIBUTOR : "distributed by"

When transitioning from conceptual models to physical schemas, providing explicit DDL is practical. The SQL DDL equivalent for the CD entity is:


CREATE TABLE CD (
    CDNo        INT PRIMARY KEY,
    CDTitle     VARCHAR(255) NOT NULL,
    DistID      INT,
    RetailPrice DECIMAL(10, 2),
    AgeGroup    VARCHAR(50),
    Description TEXT,
    FOREIGN KEY (DistID) REFERENCES DISTRIBUTOR(DistID)
);

Relational Notation Explained

Relational notation is a precise, standardized way to represent the structure of data during the data modeling process. Its purpose is to describe entities (tables), their attributes (columns), and relationships (associations) in a format that is compact, unambiguous, and ready for implementation.

The rules for writing relational notation are straightforward:

  • Begin with the relation (table) name.
  • List the primary key attribute(s) first, underlined.
  • List all remaining attributes after the primary key.
  • Identify any foreign key attributes with a dotted underline.
  • Format: TableName (PK_attribute, attribute2, FK_attribute)

Relational notation serves five purposes in the data modeling process:

  1. Represents entity structure clearly. The notation specifies entity names (which become tables), attributes (which become columns), primary keys, and foreign keys in a single compact line. For example:

    Customer(CustID, Name, Address)
    Order(OrderID, OrderDate, CustID)
        
    Here CustID is the primary key in Customer, and CustID in Order is a foreign key referencing Customer.
  2. Documents relationships. The notation makes relationship cardinalities visible through foreign key placement. In the Customer/Order example above, the presence of CustID as a foreign key in the Order relation expresses a 1:M relationship - one customer can place many orders. M:N relationships require an intersection table, which also appears in the relational notation.
  3. Supports logical to physical design. Relational notation bridges the gap between the conceptual ERD and the physical SQL schema. DBAs and developers can generate CREATE TABLE scripts and define constraints directly from the notation with minimal interpretation.
  4. Simplifies communication. The notation provides a concise, standardized language that data modelers, database designers, developers, and business analysts can all read and use to review a schema without needing a graphical tool.
  5. Enables validation and normalization. Relational notation makes it easier to spot redundancies, verify that a schema satisfies normal form requirements (1NF, 2NF, 3NF), and confirm that data integrity constraints are correctly expressed.
Function Description
Structure Definition Lists tables and their attributes in a compact textual format
Key Identification Marks primary and foreign keys explicitly
Relationship Mapping Shows how tables relate through foreign key placement
Bridge to SQL Provides the direct foundation for CREATE TABLE statements
Validation Aid Helps normalize and verify model correctness before implementation

The StreamingAsset Entity

Note the structural similarities between the CD relation shown above and the updated StreamingAsset entity in the project ER diagram. The StreamingAsset entity reflects a modernized version of the original domain, adapted for digital streaming platforms:

StreamingAsset entity diagram showing TrackID as primary key with attributes
Title, PlatformID, StreamPrice, AudienceRating, GenreTags, Duration, ReleaseDate, Description
StreamingAsset entity: TrackID (PK), Title, PlatformID (FK), StreamPrice, AudienceRating, GenreTags, Duration, ReleaseDate, Description.

In relational notation this entity would be expressed as:

StreamingAsset (TrackID, Title, PlatformID, StreamPrice, AudienceRating, GenreTags, Duration, ReleaseDate, Description)

The structure mirrors the CD relation: a primary key leads the listing, a foreign key references another table (Platform), and descriptive attributes follow. This consistency of form across different domains is one of relational notation's primary strengths.

Relational Data Modeling vs. Conceptual Modeling

The difference between relational data modeling and conceptual modeling lies primarily in their level of abstraction, purpose, and the artifacts they produce. Understanding the distinction is important because the two activities happen at different stages of the DBLC and serve different audiences.

Conceptual Modeling

Conceptual modeling is a high-level, technology-agnostic process that defines what the system should contain. It focuses on understanding and representing business requirements and real-world entities without concern for how those entities will be implemented in a specific DBMS.

The purpose of conceptual modeling is to capture business concepts and rules, define entities, attributes, and relationships at a semantic level, and act as a communication bridge between business stakeholders and technical teams. The primary artifact is the Entity-Relationship Diagram (ERD) at a conceptual level, sometimes supplemented by UML class diagrams. Conceptual models specify no primary keys, foreign keys, or data types - they are platform-independent and emphasize business meaning over implementation detail.

Conceptual modeling is the important first phase in designing a successful database application. A database application encompasses both the database itself and the associated programs that implement queries and updates. A banking application, for example, includes programs for deposits and withdrawals with user-friendly GUI forms for end users. The design and testing of those application programs has traditionally been considered part of software engineering, but database design methodologies and software engineering methodologies are strongly intertwined in most modern design tools.

This module concentrates on database structures and constraints during conceptual design, using the Entity-Relationship model - a widely used high-level conceptual data model whose concepts are employed by most database design tools. The ER model provides data-structuring concepts, constraints, and ER diagram notation for designing conceptual schemas.

Relational Data Modeling

Relational data modeling is a logical and technical activity that transforms the conceptual model into a format ready for implementation in a relational database system. It is a collection of processes and techniques that records the inventory, shape, size, contents, and rules of data elements within a business process scope. Despite the name, a database is not merely a collection of data - it is a collection of facts, or more precisely, true propositions about the domain it models.

The purpose of relational data modeling is to define how data will be stored using the relational model, focusing on tables, columns, keys, and constraints, and to normalize data for efficiency, integrity, and non-redundancy. The artifacts produced include relational schema diagrams, SQL DDL scripts, and fully defined normalized schemas with primary keys, foreign keys, data types, and constraints. Relational data modeling is platform-aware - the choices made at this stage can be mapped directly to SQL for a specific target DBMS.

Key elements that relational data modeling specifies and conceptual modeling does not:

  1. Domain - the type of data values permitted for each attribute.
  2. Primary key - the field or combination of fields that uniquely identifies each record in a table.
  3. Foreign key - the field or combination of fields used to link tables; a corresponding primary key field exists in the referenced table.
  4. Data redundancy - duplication of data in a database, which normalization is designed to eliminate.

Comparison

Feature Conceptual Modeling Relational Data Modeling
Abstraction Level High Medium to low
Focus Business concepts Database structure
Artifacts ERDs, entity and relationship specs Tables, keys, normalized schema, DDL
Users Business analysts, data architects Database designers, developers
Keys and Constraints Not specified Fully defined (PK, FK, CHECK, etc.)
Technology Specific No Yes (RDBMS-oriented)
Normalization Not applicable Highly relevant

A concrete example illustrates the transition between the two levels:

Conceptual: "A Customer places Orders. Each Order contains one or more Products."

Relational:


CREATE TABLE Customer (
    CustomerID INT PRIMARY KEY,
    Name       VARCHAR(100)
);

CREATE TABLE Orders (
    OrderID    INT PRIMARY KEY,
    CustomerID INT,
    FOREIGN KEY (CustomerID) REFERENCES Customer(CustomerID)
);

Other Elements of Relational Constructs

The relational notation examples above focus on the primary key and attribute listing. Other elements of relational constructs - the data domain of each field, the primary key of each relation, and any foreign keys - will each be covered in detail later in this module. Translating an ER diagram into relational notation[2] is a standard step in the logical design stage of the database life cycle.

Once relational notations have been created from the entities in your ER diagram, you can analyze them with an eye toward three goals:

  1. Ensuring that information is delivered in a consistent manner across all relations.
  2. Eliminating data redundancy - duplicate data stored in multiple places.
  3. Preserving existing data when deleting unwanted data from the database.

It is possible to perform the analysis in this module and the next using entities as they appear in the ER diagram, but representing entities in relational notation makes the task easier in three important ways:

  1. A text format is far more compact than a graphical representation, making it practical to work with many relations simultaneously on a single page.
  2. Concepts covered later in this course, such as functional dependencies and transitive dependencies, are customarily written and are easier to explain using relational notation.
  3. If you continue studying database design beyond this course, you will almost certainly encounter relational notation in textbooks, academic papers, and professional documentation.

Being able to read and write relational notation is a valuable and transferable skill. Relations are expressed in a shorthand called relational notation - a textual interpretation of the ER diagram that makes the structure of the database explicit, compact, and implementation-ready. The next lesson outlines the characteristics of tables, which are the objects that store database data.

Glossary

[1] ER diagram: A diagram used during the design phase of database development to illustrate the organization of and relationships between data during database design.
[2] Relational notation: Relations expressed in a shorthand that is a textual interpretation of the ER diagram.

SEMrush Software 3 SEMrush Banner 3