Lesson 5	Linking relational database tables
Objective	Link a primary key to a foreign key in a relational database system

Linking Relational Database Tables using Primary and Foreign Keys

The power of relational databases is often attributed to their ability to link tables, a feature that enhances data management and retrieval capabilities. This linking of tables is achieved through the use of keys and relationships between data entities. Here's an explanation in a tentative, technical writing style:
Basis of Relational Databases

Relational Model: At the heart of relational databases is the relational model, which organizes data into tables (or relations). Each table is a collection of rows (records) and columns (attributes).
Data Linking: The real strength of this model lies in its ability to link these tables through relationships, typically using keys (primary keys and foreign keys).

Advantages of Table Linking

Data Integrity and Consistency: Linking tables helps maintain data integrity. When tables are interconnected, changes made in one table can be reflected in related tables, ensuring consistency across the database.
Elimination of Data Redundancy: By separating data into linked tables, redundant storage of data can be significantly reduced. For instance, rather than repeating customer information in every order record, a separate customer table can be linked to the orders table.
Flexibility in Data Retrieval: The ability to join tables through relationships allows for flexible and powerful data retrieval. Users can query multiple tables simultaneously, combining data in various ways to meet specific requirements.
Enhanced Data Security: Linking tables also supports better data security measures. Access to sensitive data can be controlled more effectively by managing permissions at the table level.
Scalability and Efficiency: Relational databases are scalable. As the amount of data grows, these databases can manage increasing volumes efficiently, partly due to the effective organization of data into linked tables.

Technical Aspects

Primary and Foreign Keys: The primary key of a table uniquely identifies each record in that table. A foreign key in a table points to a primary key in another table, establishing a relationship between the two tables.
Normalization: This process involves organizing data in the database to reduce redundancy and improve data integrity. Normalization often results in multiple, related tables.

Practical Implications

Complex Queries: The ability to link tables allows for the execution of complex queries that can combine data from various parts of the database.
Data Analytics and Reporting: Linked tables are essential for comprehensive data analytics and reporting, as they allow for the aggregation and comparison of data from different sources.

Challenges and Considerations

Design Complexity: While linking tables offers many benefits, it also adds complexity to database design and requires careful planning to ensure optimal performance.
Maintenance and Performance: Proper maintenance is needed to manage the relationships between tables, and performance can be impacted if these relationships are not efficiently designed.

In summary, the ability to link tables in a relational database significantly enhances its functionality, allowing for more efficient data management, integrity, and retrieval. This interconnectedness forms the basis for sophisticated data operations, which are crucial in various applications, from simple record-keeping to complex data analysis. However, it also introduces complexities that require careful design and ongoing maintenance.

Linking Database Tables

Two tables are linked when they have a matching field, that is, a field in each table containing similar data. For example, suppose you want to link a Suppliers table with another table called Products. Both tables must have at least one field in common. If you insert the primary key field of the Suppliers table (for example, SuppID) into the Products table, the two tables will have a matching field which is the SuppID. The tables are now linked, and you can access their data together. There are two special keys in a relational database. As we saw earlier, a primary key uniquely identifies a record in a table and another type of key, a foreign key establishes a link between tables. The graphic below illustrates both key types.

Diagram displaying the primary key and foreign key of tables.

SuppID is the primary key for the table **Suppliers**

SuppID is the foreign key for the table **Products**.

Primary key of the Suppliers Table

The primary key of the Suppliers tables is SuppID and the primary key of the Products table is ProdId. When the SuppID is inserted into the Products table, it becomes a foreign key in the Products table because it is used to link the Suppliers and Products tables. This is related to the concept of mandatory and optional relationships in database design. This means, that for every foreign key^[1] in a database, there must be a corresponding primary key. You will learn about the importance of keys in establishing data integrity^[2] within a relational database.

Foreign Key

The one area of difficulty with using brief column names is the occasional appearance of a foreign key in a table in which another column has the same name that the foreign key column has in its parent table. One possible long-term solution is to allow the use of the full foreign key name, including the table name of its parent table, as a column name in the local table (such as BOOKSHELF.Title as a column name). The practical need to solve the same-name column problem requires one of the following actions:

Invent a name that incorporates the source table of the foreign key in its name without using the dot (using an underscore, for instance).
Invent a name that incorporates an abbreviation of the source table of the foreign key in its name.
Invent a name different from its name in its source table.
Change the name of the conflicting column.

None of these is particularly attractive, but if you come across the same-name dilemma, you will need to take one of these actions.

Accessing Data

A user can now ask the database to retrieve data that spans both tables. A typical supplier-product query might be:
What's the name of the company that supplies us with product X? The graphic below illustrates the two linked tables in which the data that answers this question resides, and the results of the query.

What is the name of the company that supplies us with the Beta tool?
SuppID is the primary key in the Suppliers table and a foreign key in the Products table.

Linking a Primary Key from one Database Table to a foreign key in a Separate Table

Linking a primary key from one database table to a foreign key in a separate table is a fundamental aspect of database design, ensuring data integrity, consistency, and the establishment of relationships between tables. This process, referred to as creating a foreign key constraint, involves a series of steps that vary depending on the database management system (DBMS) in use. However, the general principles remain consistent across most systems. Here are the general steps to link a primary key from one table to a foreign key in a separate table:

Identify the tables and the relationship: First, determine the two tables you wish to connect and the nature of the relationship between them. The relationship is typically one-to-many or one-to-one, with one table containing a primary key (parent table) and the other table containing a foreign key (child table) referencing the primary key.
Ensure primary key constraints: Verify that the parent table has a primary key constraint on the column(s) you intend to reference. A primary key constraint enforces uniqueness and non-null values, ensuring that every row in the table can be uniquely identified.
Create the foreign key column: In the child table, create a column or set of columns that will store the foreign key values. The foreign key column(s) must have the same data type(s) as the primary key column(s) in the parent table, and they should ideally have a descriptive name to indicate the relationship.
Define the foreign key constraint: In your database management system, use the appropriate SQL command or graphical user interface to create a foreign key constraint between the primary key column(s) in the parent table and the foreign key column(s) in the child table. This process will vary depending on your specific DBMS.
For example, using SQL syntax in a system like MySQL or PostgreSQL, you would create a foreign key constraint as follows:
```
ALTER TABLE child_table_name
ADD CONSTRAINT constraint_name
FOREIGN KEY (foreign_key_column_name)
REFERENCES parent_table_name (primary_key_column_name);
```
Replace child_table_name, constraint_name, foreign_key_column_name, parent_table_name, and primary_key_column_name with the appropriate names for your specific tables and columns.
Enforce referential integrity: The foreign key constraint enforces referential integrity, meaning that any value entered in the foreign key column(s) of the child table must have a corresponding value in the primary key column(s) of the parent table. This ensures data consistency and helps prevent orphaned records.

Data that is organized in a useful way is called information. The data returned from the supplier-product query is an example of organized data (information). You will learn a great deal more about linking tables in later modules. It is important to remember that matching fields between tables serve to link tables. The next lesson introduces you to Structured Query Language (SQL), the programming language used to create relational database tables.

[1]foreign key: A field (or combination of fields) used to link tables; a corresponding primary key field occurs in the same database.

[2]data integrity: A term used to describe the quality (in terms of accuracy, consistency, and validity) of data in a database, in the sense that values required to enforce data relationships actually exist. Problems with data integrity occur when a value in one table that is supposed to relate to a value in another cannot because the second value either has been deleted or was never entered.