Question: What are the characteristics of a star schema used in a data warehouse?
A star schema is a popular database design for a data warehouse due to its simplicity and efficiency in handling analytical queries and reporting tasks. It is characterized by the following key features:
Central Fact Table: At the heart of the star schema is the fact table, a large table that stores the primary transactional data. This table includes measures, which are often numeric and additive in nature, such as sales amounts, quantities, or counts. Fact tables typically have a composite primary key made up of foreign keys referencing the associated dimension tables.
Dimension Tables: Surrounding the fact table are one or more dimension tables. These tables provide descriptive, categorical information related to the measures stored in the fact table. Dimension tables typically have a simple primary key and contain attributes such as dates, geographic information, product details, or customer demographics. Each row in a dimension table represents a unique combination of attributes and is linked to the fact table via foreign key relationships.
Denormalization: In contrast to the third normal form (3NF) typically used in OLTP databases, star schemas use denormalized data. Denormalization reduces the number of tables and joins required to answer a query, which significantly speeds up data retrieval in analytical and reporting queries.
Hierarchies: Dimension tables in a star schema often represent hierarchies. For instance, a time dimension table might include columns for day, month, quarter, and year, enabling analysis at different levels of granularity.
Data Integrity: While the star schema uses denormalized data, it still maintains referential integrity through the use of primary and foreign keys. However, it usually does not enforce domain integrity (a function of normalization) since it prioritizes query performance over update anomalies.
Query Performance: Star schemas are optimized for query performance. By reducing the number of joins and the complexity of queries, they enable faster data retrieval. This is particularly beneficial for complex and ad-hoc analytical queries common in business intelligence and reporting applications.
Bitmap Indexing: Star schemas often use bitmap indexes on the foreign key columns in the fact table. When a query joins the fact table with dimension tables, Oracle can use these bitmap indexes to speed up data retrieval significantly.
Data Warehouse Environments: Star schemas are most commonly found in data warehouse environments, which are characterized by large volumes of data and a read-intensive workload. They are less suited to transactional (OLTP) environments with frequent, small updates.
By understanding these characteristics, database administrators and data architects can effectively design, implement, and manage star schemas in data warehouse environments, optimizing data retrieval and analysis performance.
This is an example of a star schema. Notice that the DIMENSION tables are not joined to each other.
Dimension
A single set of data about an item described in a fact table, a dimension is usually a denormalized table. A dimension table holds a key value and a numerical measurement or set of related measurements about the fact table object. A measurement is usually a sum but could also be an average, a mean or a variance. A dimension can have many attributes, 50 or more is the norm, since they are denormalized structures.
This is an example of a star schema
More on Star Schema
The star schema is one of the simplest of data warehouse schemas. It is known as star because it appears like a star with points expanding from a center. Figure 6.4 represents the star schema in which the fact table is at the center and the dimension tables are the nodes of the star. Each dimension in a star schema represents a one-dimensional table only and the dimension table consists of a set of attributes.
Dimension tables comprise of relatively small numbers of records in comparison to fact tables, but each record may consist of a large number of attributes to describe the fact data. Fact tables usually consist of numeric facts and foreign keys to dimensional data.
Generally, fact tables are in (3NF) third normal form in the case of star schema while dimensional tables are in de-normalized form.
Although the star schema is one of the simplest structures, it is still extensively used nowadays and recommended by Oracle.
Figure 6.4 Graphical representation of Star schema
About Bitmap Indexes on Partitioned Tables
You can create bitmap indexes on partitioned tables but they must be local to the partitioned table, they cannot be global indexes.
A partitioned table can only have global B-tree indexes, partitioned or nonpartitioned.
Benefits of Indexes for Data Warehousing Applications
Bitmap indexes are primarily intended for data warehousing applications where users query the data rather than update it.
They are not suitable for OLTP applications with large numbers of concurrent transactions modifying the data. Indexes are more beneficial for high cardinality columns.