Lesson 6	Advanced group operators: DISTINCT, UNIQUE, HAVING
Objective	Correctly place group functions DISTINCT, UNIQUE, or HAVING clauses

Advanced Group Operators: DISTINCT, UNIQUE, HAVING

Question: What are the benefits of using the Advanced group operators 1) DISTINCT, 2) UNIQUE, 3) HAVING by an Oracle DBA?

DISTINCT:
- Eliminates duplicate rows: Filters out rows with identical values for all columns in the specified grouping.
- Reduces data redundancy: Results in a more concise and meaningful dataset, making it easier to analyze and interpret.
- Enhances data integrity: Prevents duplicate data from skewing data analysis or causing errors in downstream processes.
- Improves performance: Can reduce the amount of data processed during aggregation operations, resulting in faster query execution.
UNIQUE:
- Selects unique rows: Ensures that only one row is returned for each unique combination of values in the specified grouping columns.
- Preserves unique data: Retains the integrity of the data by ensuring that no duplicates exist.
- Useful for deduplication: Removes duplicate rows that may arise from data integration or transformation processes.
- Supports data modeling: Can be used to define unique constraints on tables, ensuring data consistency and uniqueness
HAVING:
- Filters aggregated groups: Applies a filter condition to groups of data after aggregation operations have been performed.
- Selects groups based on summary information: Enables DBAs to extract specific groups of data that meet certain criteria.
- Facilitates data analysis: Allows for more granular analysis by grouping and filtering data based on calculated values.
- Identifies trends and patterns: Helps identify groups of data that exhibit specific behaviors or patterns.

SQL Guide

SQL "GROUP BY DISTINCT" used as Operator to manipulate data in SQL

The SQL "GROUP BY DISTINCT" operator is used to group and aggregate data in a SQL query. It is often used in combination with other aggregate functions, such as COUNT, SUM, AVG, MAX, and MIN. Here's an example of how the "GROUP BY DISTINCT" operator can be used in a SQL query: Suppose we have a table named "orders" with the following columns:

order_id
customer_id
product_id
order_date
order_quantity

To find the total quantity of orders for each product, we could use the following SQL query:

SELECT product_id, SUM(order_quantity)
FROM orders
GROUP BY product_id

This query uses the "GROUP BY" clause to group the data by product_id, and the "SUM" function to calculate the total order quantity for each product. The "SELECT" statement returns two columns: the product_id and the total order quantity. If we only want to group the data by distinct product IDs, we can use the "GROUP BY DISTINCT" operator instead of "GROUP BY":

SELECT DISTINCT product_id, SUM(order_quantity)
FROM orders
GROUP BY product_id

This query produces the same result as the previous query, but only groups the data by distinct product IDs. The "DISTINCT" keyword ensures that each product ID is only counted once in the grouping, even if it appears multiple times in the table. The SQL "GROUP BY DISTINCT" operator is used to group and aggregate data in a SQL query, and can be used in combination with other aggregate functions to manipulate and summarize data in a table.

Oracle Database SQL

DISTINCT or UNIQUE

The DISTINCT or UNIQUE clause is a group function that does not require you to use the GROUP BY clause. Add this to a query and you eliminate any duplicate rows returned from the query. You can use either DISTINCT or UNIQUE, both have exactly the same meaning. For example, in the query below, you are looking for a list of PRODUCT_ID values and you are not interested in listing the same PRODUCT_ID more than once, even if it appears in more than one sale:

SELECT DISTINCT PRODUCT_ID 
FROM SALE_ITEM

You can also use DISTINCT in combination with other grouping functions. For example, if you want to count the number of rows found in the query above, you could use this query:

SELECT COUNT(DISTINCT PRODUCT_ID) FROM SALE_ITEM

HAVING clause

The HAVING clause is an optional addition to a query that uses grouping functions. It is added to the WHERE clause immediately after the GROUP BY clause, so that you can filter your query results based on the group function's data.
For example, let us say you want to add a unique index on the LASTNAME column of the CUSTOMER table, but the index cannot be created because duplicates exist somewhere in the table's data. How can you find the duplicates? The following query uses the HAVING clause to quickly identify the duplicate rows.

SELECT LASTNAME, COUNT(LASTNAME)
FROM CUSTOMER
GROUP BY LASTNAME
HAVING COUNT(LASTNAME) > 1

Another use might be to find customers that have spent more than $100 at the pet store:

SELECT FIRSTNAME, LASTNAME, SUM(TOTAL_SALE_AMOUNT)
FROM CUSTOMER C, CUSTOMER_SALE CS
WHERE C.CUST_ID = CS.CUST_ID
GROUP BY FIRSTNAME, LASTNAME
HAVING SUM(TOTAL_SALE_AMOUNT) > 100

In the next lesson, you will learn how to use pseudocolumns to your advantage.

Advanced Group Operators - Exercise

Click the Exercise link below to place group functions into queries.
Advanced Group Operators - Exercise