Lesson 4 | STAR Index Queries |
Objective | Describe STAR Query Operation |
Index Queries (Describe Operation)
STAR Data Warehouse Schema
The term
STAR was coined by data warehouse researcher Dr. Ralph Kimball. Kimball described the
data warehouse table structure as a star formation, where the central
FACT table is surrounded by smaller points called
DIMENSION or
lookup tables.
The STAR Query
A STAR query is a join between a FACT table and a number of DIMENSION tables. The first Oracle data warehouses lacked an optimized execution strategy to efficiently support access for queries on a STAR data warehouse table structure.
In running STAR queries, early query optimizers often chose to process substantial amounts of data from the FACT table, multiple times.
They would first join the smallest DIMENSION table against the FACT table, and then join each of the other DIMENSION tables in turn against the intermediate table. These n-way table joins were performed very slowly.
The STAR Schema Method
To remedy this problem, Oracle developed the STAR schema method for performing large n-way joins of very large FACT tables. The SlideShow below examines the STAR query process.
- Oracle will first service the queries against the smaller DIMENSION tables, combining the result set into a Cartesian product table that is held in Oracle RAM in the shared pool.
- The primary key for this virtual table will be a composite of all of the keys for the DIMENSION tables.
- If this key matches the composite index on the FACT table, then the query will be able to process very quickly.
- Once the sum of the reference tables has been addressed, Oracle will perform a nested-loop join of the intermediate table against the FACT table.
Star Query Process
Conditions for STAR Query indexing Technique
The STAR query requires that a single concatenated index reside on the FACT table for all keys. To invoke the STAR query path, the following characteristics must be present:
- At least three tables in the join, including one large FACT table and several smaller DIMENSION tables
- A concatenated index on the FACT table with at least three columns, one for each of the table join keys
- An EXPLAIN PLAN to verify that the NESTED LOOPS operation is being used to perform the join
The speed of the star join technique is a result of reducing the physical I/O.
The STAR Query Process
In the STAR query below, the indexes are read to gather the virtual table in memory. The FACT table is not accessed until the virtual index has everything it requires to go directly to the requested rows via the composite index on the FACT table.
NESTED LOOPS
MERGE JOIN CARTESIAN
MERGE JOIN CARTESIAN
SORT JOIN
TABLE ACCESS ... DIMENSION1
SORT JOIN
TABLE ACCESS ... DIMENSION2
SORT JOIN
TABLE ACCESS ... DIMENSION3
TABLE ACCESS BY ROWID FACT_TABLE
INDEX RANGE SCAN FACT_TABLE_CONCAT_IDX
In summary, the STAR query is extremely useful in cases where large, fully populated data warehouse tables are joined together.
The next lesson describes the STAR transformation.