Lesson 6 | Advanced group operators: DISTINCT, UNIQUE, HAVING |
Objective | Correctly place group functions DISTINCT, UNIQUE, or HAVING clauses |
Group Operators DISTINCT, UNIQUE, HAVING
SQL "GROUP BY DISTINCT" used as Operator to manipulate data in SQL
The SQL "GROUP BY DISTINCT" operator is used to group and aggregate data in a SQL query. It is often used in combination with other aggregate functions, such as COUNT, SUM, AVG, MAX, and MIN.
Here's an example of how the "GROUP BY DISTINCT" operator can be used in a SQL query:
Suppose we have a table named "orders" with the following columns:
- order_id
- customer_id
- product_id
- order_date
- order_quantity
To find the total quantity of orders for each product, we could use the following SQL query:
SELECT product_id, SUM(order_quantity)
FROM orders
GROUP BY product_id
This query uses the "GROUP BY" clause to group the data by product_id, and the "SUM" function to calculate the total order quantity for each product. The "SELECT" statement returns two columns: the product_id and the total order quantity.
If we only want to group the data by distinct product IDs, we can use the "GROUP BY DISTINCT" operator instead of "GROUP BY":
SELECT DISTINCT product_id, SUM(order_quantity)
FROM orders
GROUP BY product_id
This query produces the same result as the previous query, but only groups the data by distinct product IDs. The "DISTINCT" keyword ensures that each product ID is only counted once in the grouping, even if it appears multiple times in the table.
The SQL "GROUP BY DISTINCT" operator is used to group and aggregate data in a SQL query, and can be used in combination with other aggregate functions to manipulate and summarize data in a table.
DISTINCT or UNIQUE
The
DISTINCT
or
UNIQUE
clause is a group function that does not require you to use the
GROUP BY
clause. Add this to a query and you eliminate any duplicate rows returned from the query.
You can use either
DISTINCT
or
UNIQUE
, both have exactly the same meaning.
For example, in the query below, you are looking for a list of
PRODUCT_ID
values and you are not interested in listing the same
PRODUCT_ID
more than once, even if it appears in more than one sale:
SELECT DISTINCT PRODUCT_ID FROM SALE_ITEM
You can also use
DISTINCT
in combination with other grouping functions. For example, if you want to count the number of rows found in the query above, you could use this query:
SELECT COUNT(DISTINCT PRODUCT_ID) FROM SALE_ITEM
HAVING
The HAVING
clause is an optional addition to a query that uses grouping functions. It is added to the WHERE
clause immediately after the GROUP BY
clause, so that you can filter your query results based on the group function's data.
For example, let us say you want to add a unique index on the LASTNAME
column of the CUSTOMER
table, but the index cannot be created because duplicates exist somewhere in the table's data.
How can you find the duplicates? The following query uses the HAVING
clause to quickly identify the duplicate rows.
SELECT LASTNAME, COUNT(LASTNAME)
FROM CUSTOMER
GROUP BY LASTNAME
HAVING COUNT(LASTNAME) > 1
Another use might be to find customers that have spent more than $100 at the pet store:
SELECT FIRSTNAME, LASTNAME, SUM(TOTAL_SALE_AMOUNT)
FROM CUSTOMER C, CUSTOMER_SALE CS
WHERE C.CUST_ID = CS.CUST_ID
GROUP BY FIRSTNAME, LASTNAME
HAVING SUM(TOTAL_SALE_AMOUNT) > 100
In the next lesson, you will learn how to use pseudocolumns to your advantage.
Advanced Group Operators - Exercise