Create equi-join

Describe the theory with respect to creating an equi-join when implementing joins with database tables.

Here's a description of the theory behind creating an EQUI-join in databases, along with the steps involved:
What is an EQUI-join?

An EQUI-join is a type of database join that combines rows from two or more tables based on the equality of values in specified columns.
The common columns used for the join are known as the "join columns" or "join keys".
The result of an EQUI-join is a new table that includes only rows where there's a corresponding match for the join columns in both original tables.

Theory and Steps

Identifying Join Columns:
- Carefully examine the two tables you intend to join.
- Look for columns that share common data types and hold related information that would logically connect the tables together. Common examples include:
  - CustomerID in a Customers table and CustomerID in an Orders table.
  - ProductID in a Products table and ProductID in a Sales table.
Join Condition (WHERE Clause):
- The join condition uses the equality operator (=) to express how the tables should be connected.
- For example: `WHERE Customers.CustomerID = Orders.CustomerID`
Cartesian Product (Conceptual): Conceptually, the database engine begins by creating a Cartesian product. This means it takes every row in the first table and combines it with every row in the second table.
Filtering Based on Condition:
- The database then examines the join condition you specified.
- It eliminates rows from the Cartesian product that don't meet the equality requirement in the join condition.
Resulting Table: The remaining rows, where the join columns have matching values, comprise the final result of the EQUI-join.

Example: Consider two tables:

Customers:
- CustomerID
- Name
- City
Orders:
- OrderID
- CustomerID
- OrderDate

The following SQL query would perform an EQUI-join:

SELECT Customers.Name, Customers.City, Orders.OrderID 
FROM Customers
JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Important Considerations:

Data Type Consistency: Ensure that the data types of the columns you're joining are the same or compatible.
Indexing: Indexed join columns can significantly increase the speed of EQUI-joins, especially for large tables.

Customer Table and Order table

The image below displays how to create an equi-join using records from a Customer table and Order table.
Here are the records created as a result of the equi-join.

Equi-Joins

In its most common form, a join forms new rows when data in the two source tables match. Because we are looking for rows with equal values, this type of join is known as an equi-join (or a natural equi-join ). As an example, consider the two tables in Figure 5.4.1 . Notice that the ID number column is the primary key of the customers table and that the same column is a foreign key in the orders table. The ID number column in orders therefore serves to relate orders to the customers to which they belong.

Figure 5.4.1: Two tables with a primary and foreign key relationship.

Assume that you want to see the names of the customers who placed each order. To do so, you must join the two tables, creating combined rows wherever there is a matching ID number. In database terminology, we are joining the two tables over ID number. The result table can be found in Figure 5.4.2. An equi-join can begin with either source table and the result should be the same regardless of the direction in which the join is performed. The join compares each row in one source table with the rows in the second. For each row in the first that matches data in the second source table in the column or columns over which the join is being performed, a new row is placed in the result table.

Ad Database Analysis for Design

Equijoins

An equijoin is a join with a join condition containing an equality operator. An equijoin combines rows that have equivalent values for the specified columns. Depending on the internal algorithm the optimizer chooses to execute the join, the total size of the columns in the equijoin condition in a single table may be limited to the size of a data block minus some overhead. The size of a data block is specified by the initialization parameter DB_BLOCK_SIZE.