SQL Select Grouping Information - The DISTINCT Keyword

The text below explains the correct answers to the quiz.
 
1. Given the following columns from a table named MyTable:
MyLastNameMyFirstNameMyAddressMyCityMyStateMyZipCode
Which SQL statement sorts all rows from MyTable by MyLastName, including duplicates?
  A.
SELECT * FROM MyTable 
GROUP BY MyLastName
              
  B.
SELECT * FROM MyTable 
ORDER BY MyLastName ASC
              
  C.
SELECT * FROM MyTable 
ORDER BY MyLastName
              
  D.
SELECT DISTINCT * FROM MyTable 
ORDER BY MyLastName
              
  The correct answer is C.
The ORDER BY MyLastName clause sorts all rows in MyTable by the MyLastName column, retaining duplicates as required. Option A incorrectly uses GROUP BY, which is for aggregation, not sorting. Option B uses ORDER BY ... ASC, which is valid but redundant since ORDER BY defaults to ascending order. Option D uses DISTINCT, which removes duplicates, contrary to the question’s requirement.
2. Which SQL statement selects only the unique states from MyTable?
  A.
SELECT DISTINCT MyState FROM MyTable
              
  B.
SELECT MyState FROM MyTable ORDER BY MyState
              
  C.
SELECT MyState FROM MyTable GROUP BY MyState
              
  D.
SELECT * FROM MyTable WHERE MyState IS NOT NULL
              
  The correct answer is A.
The SELECT DISTINCT MyState statement retrieves only unique values of the MyState column, efficiently eliminating duplicates. Option C, using GROUP BY MyState, also returns unique states but is typically used for aggregation, making it less common for this purpose. Option B uses ORDER BY, which sorts but does not remove duplicates. Option D selects all rows where MyState is not null, including duplicates, which does not meet the requirement.

3. Why is the statement
SELECT * FROM MyTable GROUP BY MyState
invalid?
  A. All non-aggregated columns in the SELECT clause must appear in the GROUP BY clause.
  B. The SELECT * syntax is not allowed in any SQL statement.
  C. A WHERE clause is required when using GROUP BY.
  D. The GROUP BY clause cannot reference the MyState column.
  The correct answer is A.
In SQL, when using GROUP BY, all non-aggregated columns in the SELECT clause must be included in the GROUP BY clause. The statement SELECT * FROM MyTable GROUP BY MyState is invalid because SELECT * includes all columns (e.g., MyLastName, MyFirstName), but only MyState is in the GROUP BY. Option B is incorrect because SELECT * is valid in many contexts. Option C is wrong as a WHERE clause is not required. Option D is incorrect because MyState can be used in GROUP BY.

4. To query a sales table and retrieve the total sales value for each district, excluding detail line items, which SQL clause would you use?
  A. A sub-SELECT to limit the results by district
  B. A GROUP BY to summarize the results
  C. A SELECT statement with the DISTINCT keyword to filter detail lines
  D. An ORDER BY clause to group and summarize results
  The correct answer is B.
A GROUP BY clause, combined with an aggregate function like SUM, is used to summarize sales values by district, excluding individual line items. For example, SELECT district, SUM(sales_value) FROM sales_table GROUP BY district. Option A (subquery) is unnecessary for this task. Option C (DISTINCT) does not summarize data. Option D (ORDER BY) sorts results but does not group or summarize them.

5. To display grades from a Grades table for students with a specific grade level from a StudentInfo table, which SQL clause or statement would you use?
  A. A JOIN clause to combine the two tables
  B. A subquery to filter grades based on StudentInfo
  C. A GROUP BY clause to summarize grades
  D. An ORDER BY clause to sort the results
  The correct answer is A.
A JOIN clause combines the Grades and StudentInfo tables based on a common column (e.g., student ID), allowing filtering for a specific grade level using a WHERE clause. For example, SELECT g.grade FROM Grades g JOIN StudentInfo s ON g.student_id = s.student_id WHERE s.grade_level = 'value'. Option B (subquery) is a valid alternative but less common. Option C (GROUP BY) is for aggregation, not joining. Option D (ORDER BY) sorts results but does not combine tables.

6. Which SQL clause retrieves a list of unique customers from a sales table, excluding order details and duplicates?
  A. A sub-SELECT to select the unique rows for the query
  B. A GROUP BY to summarize the customer order information, then pull the order information
  C. A SELECT statement with the DISTINCT keyword to filter the table's duplicate results
  D. An ORDER BY clause to sort and filter the results
  The correct answer is C.
The DISTINCT keyword in a SELECT statement (e.g., SELECT DISTINCT customer_id FROM sales_table) retrieves unique customers, excluding duplicates and order details. Option A (subquery) is unnecessary. Option B (GROUP BY) is valid but less straightforward for this purpose. Option D (ORDER BY) sorts but does not eliminate duplicates.

7. Which statement about the = operator in SQL is not true?
  A. It compares two values for equality in a WHERE clause.
  B. It can be used in both simple and subquery comparisons.
  C. It requires a subquery to return exactly one row in a comparison.
  D. It cannot be used with NULL values to check for equality.
  The correct answer is C.
The = operator does not strictly require a subquery to return exactly one row; it can compare with a subquery returning one row, but other operators like IN are used for multiple rows. Option A is true, as = is used in WHERE clauses for equality. Option B is true, as = works in both simple and subquery comparisons. Option D is true, as = with NULL returns NULL, requiring IS NULL instead.