Analyze Logical Database Design using Real-life example
Lesson 5
Implementing Logical Database Performance Design
Objective
Describe Characteristics of a High-Performing Database
Analyze Logical Database Design Using a Real-Life Example
To illustrate how logical database design impacts performance, consider a database for managing product orders. Management wants to know the counts of "State flower" for all orders of "sunflower seeds." The output would look like this:
How Logical Database Design Impacts Performance
The logical design of a database significantly affects its performance by determining how data is organized and accessed. Key factors influencing performance include:
Data Normalization: Normalization reduces redundancy and improves data integrity but can lead to complex queries and slower performance with large datasets.
Indexing: Proper indexing speeds up data retrieval, while poor indexing can slow queries and increase storage needs.
Joins: Combining data from multiple tables via joins can be resource-intensive, especially with many tables or large datasets.
Data Types: Choosing appropriate data types optimizes performance; oversized types can slow queries and increase storage.
Query Complexity: Complex queries with multiple joins or subqueries can degrade performance. Simplifying queries improves efficiency.
Table Structure: Lean table designs with minimal columns and optimized row sizes enhance query performance and reduce storage.
The following comparison illustrates how different database designs result in different SQL queries to achieve this output.
1) This normalized database design minimizes redundancy and optimizes storage in Oracle. It includes the following tables:
CUSTOMER Table
cust_nbr
cust_name
cust_street_address
cust_city
cust_zip_code
CITY Table
cust_city
cost_of_living
city_mascot
state_name
STATE Table
state_name
state_bird
state_flower
region_name
ORDER Table
order_nbr
order_date
cust_nbr
salesperson_name
SALESPERSON Table
salesperson_name
salesperson_address
job_title
city_assigned
ITEM Table
item_nbr
item_name
inventory_amount
list_price
QUANTITY Table
order_nbr
item_nbr
quantity_sold
SQL Code (Normalized Design):
SELECT s.state_flower, COUNT(*)
FROM state s
JOIN city c ON c.state_name = s.state_name
JOIN customer cu ON cu.cust_city = c.cust_city
JOIN "order" o ON o.cust_nbr = cu.cust_nbr
JOIN quantity q ON q.order_nbr = o.order_nbr
JOIN item i ON i.item_nbr = q.item_nbr
WHERE i.item_name = 'Flower Seed'
GROUP BY s.state_flower;
This SQL query requires joining six tables to retrieve the state flower counts. The multiple joins increase complexity and can slow performance, especially with large datasets.
This denormalized design consolidates data from the CITY and STATE tables into the CUSTOMER table, reducing the need for joins. The updated table structure is:
CUSTOMER
cust_nbr
cust_name
cust_street_address
cust_city
cust_zip_code
cost_of_living
city_mascot
state_name
state_bird
state_flower
region_name
ORDER
order_nbr
order_date
cust_nbr
salesperson_name
SALESPERSON
salesperson_name
salesperson_address
job_title
city_assigned
ITEM
item_nbr
item_name
inventory_amount
list_price
QUANTITY
order_nbr
item_nbr
quantity_sold
SQL Code (Denormalized Design):
SELECT cu.state_flower, COUNT(*)
FROM customer cu
JOIN "order" o ON o.cust_nbr = cu.cust_nbr
JOIN quantity q ON q.order_nbr = o.order_nbr
JOIN item i ON i.item_nbr = q.item_nbr
WHERE i.item_name = 'Flower Seed'
GROUP BY cu.state_flower;
This query eliminates two table joins (CITY and STATE), reducing complexity and improving performance by accessing the state_flower directly from the CUSTOMER table.
Data Warehouse Table Design
In data warehouses, "wide" tables with significant redundancy are common to optimize query performance. For example, combining all data from CUSTOMER, ORDER, SALESPERSON, ITEM, and QUANTITY into a single QUANTITY table creates a wide table. While updates become cumbersome, queries are faster because no joins are needed. A single-row fetch retrieves all order details, simplifying complex queries.
Physical Design Next
Having explored logical design, the next step is to examine how physical design impacts performance. By optimizing normalization, indexing, joins, data types, query complexity, and table structure, designers can create databases that handle large datasets efficiently.