Rule Based Optimizer   «Prev  Next»

Lesson 7 Dealing with complex Boolean logic
ObjectiveRewrite complex Boolean logic with the UNION Operator

Complex Boolean Logic (UNION Operator)

The techniques presented in this lesson will improve the performance of both cost-based and rule-based SQL queries. It illustrates the need to carefully check queries to avoid full-table scans. Consider the following query for an online system:
SELECT
 student_name
FROM 
 STUDENT
WHERE
 (student_nbr = :hostvar_student 
 OR 
 :hostvar_student IS NULL)
 AND
 (social_security_nbr = :hostvar_social 
 OR 
 :hostvar_social IS NULL);

This query accepts host variables which are prefixed with a colon (:).

Host variables

Host variables are very common in online systems where the SQL must be executed to query based on the input from the end user. In this case, we query by student_nbr or by social_security_nbr, depending upon which of the values is supplied by the end user. This is very common in online systems, where a screen may allow the end user to enter either the social security number or the student number. Because of that, the query has an index on both social security number and student number.

Elegant, but slow

From a programming perspective, this SQL is very elegant. Unfortunately, this query does not run at optimum efficiency.
So what do we expect to see from the EXPLAIN PLAN? Since we have an index on both columns, we would expect to see an index range scan. Instead, there is a full-table scan.
TABLE ACCESS FULL STUDENT

So, why the full-table scan? Here we see two Boolean conditions that are joined by the AND command. Within each Boolean, we test for two conditions, joined by an OR operator. As it turns out, the Oracle optimizer cannot resolve this complex Boolean expression.
Whenever the optimizer gets confused, it invokes a full-table scan.
Note: This is a very important point about SQL tuning. The “elegance” of an SQL statement often has an adverse impact on the performance of the query. Many SQL developers are taught to write Boolean expressions in the most compact form, and not for optimal performance.
The solution
SELECT
   student_name
FROM 
   STUDENT
WHERE
   student_nbr = :hostvar_student 
   AND 
   :hostvar_student IS NOT NULL
   AND
   :hostvar_social  IS     NULL
UNION
SELECT
   student_name
FROM 
   STUDENT
WHERE
   social_security_nbr = :hostvar_social
   AND
   :hostvar_social  IS NOT NULL 
   AND 
   :hostvar_student IS NULL
UNION
SELECT
   student_name
FROM 
   STUDENT
WHERE
   student_nbr = :hostvar_student 
   AND
   social_security_nbr = :hostvar_social
   AND
   :hostvar_social  IS NOT NULL 
   AND 
   :hostvar_student IS NOT NULL
UNION
SELECT
   student_name
FROM 
   STUDENT
WHERE
   :hostvar_social  IS NULL 
   AND 
   :hostvar_student IS NULL;                             

Re-write SQL as four Separate Queries

The solution to this dilemma is to re-write the SQL as four separate queries, each with different Boolean conditions, and the use the UNION operator to re-join the result sets. As you may already know, the UNION operator simply joins together the result sets from several queries into a single result set.
In this example, we have taken a very elegant SQL statement and replaced it with a statement that is more then four times longer. However, it is worth the trouble, because we will now see that the full-table scan is gone, and we get a far faster execution plan for the query. Click the View Code button on the left.
PROJECTION
   SORT                        UNIQUE
      UNION-ALL
         FILTER
           TABLE ACCESS        BY ROWID      STUDENT
           INDEX               RANGE SCAN    STU_NBR_SOC_SEC_IDX
         FILTER
           INDEX               RANGE SCAN    SOC_SEC_STU_NBR_IDX
         TABLE ACCESS          BY ROWID      STUDENT
           INDEX               RANGE SCAN    STU_NBR_SOC_SEC_IDX
         FILTER
           TABLE ACCESS        FULL          STUDENT               

Re-writing Query

Re-writing a query in this fashion may seem like a lot of effort. However, if this is a query for an online application that is executed thousands of time each day, it will result in huge performance gains. Now let s sum up the rules that we have learned about rule-based SQL optimization.

SEMrush Software