SQL Joins   «Prev  Next»

Lesson 2 Examine tables from the PUBS database
ObjectiveSee how two tables are related by examining the PUBS database.

Join using Where Clause

When you combine results from two or more tables into a single results set, you join the information. Joins are on the strange side. Whereas other SQL items are called out specifically with a statement or keyword, you don't use the keyword JOIN to bring the results together. Instead, you use the WHERE clause. Consider the following tables, located in the PUBS database provided with your database engine or as part of this course:

Difference between Join Operation of two tables and a Subquery

In the realm of SQL development, the choice between using a join operation and a subquery can significantly impact the structure and performance of your queries. A join operation is used to combine rows from two or more tables based on a related column between them. There are various types of joins :
  1. INNER JOIN,
  2. LEFT JOIN,
  3. RIGHT JOIN, and
  4. FULL JOIN

each serving a specific purpose. The primary advantage of a join is its efficiency in combining large datasets. Joins are generally faster and less complex than subqueries when dealing with big data. They allow for a more readable and maintainable SQL script, especially when you're linking multiple tables. However, they can sometimes result in larger result sets, particularly with OUTER JOINs, which can include every row from each table, potentially causing performance issues.
On the other hand, a subquery is a query nested inside another query. It can be used in various parts of the main query, including the SELECT, FROM, and WHERE clauses. Subqueries are beneficial when you need to perform an operation on a set of data that you then need to use in another query. They are particularly useful for performing aggregations or when the intermediate result set is small. However, subqueries can sometimes lead to slower performance, especially if the subquery is executed once for every row processed by the main query. This is known as a correlated subquery and can be particularly resource-intensive. In summary, while joins are typically used for straightforward merging of tables and are generally faster for large datasets, subqueries offer more flexibility in certain scenarios, particularly with complex filters and aggregations. The choice between the two often depends on the specific requirements and constraints of the SQL operation you are performing.


Publishers Table
pub_idpub_namecitystatecountry
0736New Moon BooksBostonMAUSA
0877Binnet & HardleyWashingtonDCUSA
1389Algodata InfosystemsBerkeleyCAUSA
1622Five Lakes PublishingChicagoILUSA
1756Ramona PublishersDallasTXUSA
9901GGG&GMünchen
Germany
9952Scootney BooksNew YorkNYUSA
9999Lucerne PublishingParis
France

Titles Table (a portion of the table is shown below)
title_idpub_idpricepubdate
BU1032138919.99Jun 12 1991 12:00AM
BU1111138911.95Jun 9 1991 12:00AM

You can see just by looking at the tables that the books in the titles table (that are shown here) are published by Algodata Infosystems. How can you see this? By matching pub_id from the titles table with the corresponding pub_id of the publishers table.

Qualifying Column Names

Recall from Tables, Columns, and Rows in a previous module, that column names must be unique within a table but can be reused in other tables. The tables authors and publishers in the sample database both contain a column named city, for example. To identify an otherwise-ambiguous column uniquely in a query that involves multiple tables, use its qualified name. A qualified name is a table name followed by a dot and the name of the column in the table. Because tables must have different names within a database, a qualified name identifies a single column uniquely within the entire database.

To qualify a column name:

Type: (table.column) column is a column name, and table is name of the table that contains column
  1. You can mix qualified and unqualified names within the same statement.
  2. Qualified names are not required if there's no chance of ambiguity, that is, if the query's tables have no column names in common. To improve system performance, however, qualify all columns in a query with joins.
  3. Another good reason to use qualified names is to ensure that changes to a table’s structure do nott introduce ambiguities. If someone adds the column zip to the table publishers, any unqualified references to zip in a query that selects from the tables authors (which already contains a column zip) and publishers would be ambiguous
Listing 7.2 Here, the qualified names resolve otherwise-ambiguous references to the column city in the tables authors and publishers. See the SQL below.

SELECT au_id, authors.city
FROM authors
INNER JOIN publishers
ON authors.city = publishers.city;

au_id    city
-----      -------------
A03      San Francisco
A04      San Francisco
A05      New York