Select Statement  «Prev  Next»
Lesson 9 The DISTINCT keyword
ObjectiveUnderstand how to use the DISTINCT keyword

DISTINCT Keyword in SQL

With the GROUP BY statement, you saw that you can limit results and remove duplicates in grouping situations.
What do you do if you want to remove duplicates but do not need to group the results?
That is where the DISTINCT keyword comes in.
DISTINCT is one of the easiest, most straightforward, most underused keywords in SQL.
It is so simple and powerful that you will probably use it often in your queries. Here is an example of the DISTINCT keyword:

SELECT DISTINCT Email 
FROM MailingList

This statement will pull all the unique email values from the MailingList table and return them in the results set. These resultswill be sorted based on email, and you will have no duplicates returned. You can see how this works in the image below:

Removing duplicate email addresses from a table using the distinct keyword

To use the DISTINCT keyword, you simply indicate the column or columns that you want to be unique. In this case, we indicated that the email value should be unique, but you can indicate columns from any table.

Theory behind distinct

If you want to know all the unique values in a record use the DISTINCT keyword. The DISTINCT keyword is added to the SELECT statement's column listing, directly after the SELECT keyword. For example, if someone asks you which cities members come from, you could try a query similar to the following:
SELECT City FROM MemberDetails;

Executing the query gives you the following results:

City
Townsville
Orange Town
New Town
Orange Town
Orange Town
Big City
Windy Village
In the above result set, Orange Town is listed three times because there are three members from that city. But if you simply want a list of the unique places that members live in, you could use the DISTINCT keyword:
SELECT DISTINCT City FROM MemberDetails;

Executing the modified query gives these results:
City
Big City
New Town
Orange Town
Townsville
Windy Village
After the second query is execued, Orange Town is mentioned just once. The DISTINCT keyword works on all columns in combination and all the columns listed in the SELECT statement must be unique. If you change the previous query to include MemberId (which is unique for every single row) and rerun the query,
you end up with all the rows:
SELECT DISTINCT City, MemberId FROM MemberDetails;

The results are as follows:
City          MemberId
Big City          8
New Town          4
Orange Town       5
Orange Town       6
Orange Town       7
Townsville        1
Windy Village     9

Orange Town appears three times because MemberId is unique on each row. Furthermore, using the DISTINCT keyword where one of the columns is always unique is pointless.