Lesson 1

Physical Design of Database

Question: Does the physical design of a database ask the question, how to store the database on a physical device?
Yes, the physical design of a database focuses on determining the most efficient way to store and manage the database on a physical device. This stage of the database design process is concerned with optimizing data storage, access, and retrieval by translating the logical data model into a physical representation that aligns with the specific characteristics and requirements of the underlying hardware and database management system (DBMS). Key aspects of the physical design process that address the question of how to store the database on a physical device include:

File organization: The physical design determines the file organization method for storing the data on the physical device. This involves choosing the most suitable storage structure (e.g., heap files, clustered files, or indexed files) based on the data access patterns, query requirements, and performance needs of the database.
Data storage formats: The physical design specifies the data storage formats and data types for each table, column, and index, taking into account factors such as storage space, data integrity, and the specific capabilities of the DBMS.
Indexing: The physical design process identifies the most appropriate indexing strategies to optimize query performance and data retrieval. This includes selecting the types of indexes (e.g., B-trees, bitmap indexes, or hash indexes) and determining which columns should be indexed based on query patterns and performance requirements.
Partitioning: The physical design may involve partitioning large tables into smaller, more manageable pieces, based on specific criteria such as date ranges, geographic regions, or other attributes. Partitioning can improve query performance, simplify maintenance tasks, and enhance data availability and scalability.
Data compression: The physical design considers data compression techniques to reduce storage space requirements and optimize I/O performance. Compression methods must be chosen carefully to balance storage efficiency and query performance.
Security and access control: The physical design addresses the implementation of security mechanisms and access control policies at the storage level, ensuring that data is protected from unauthorized access, tampering, or loss.
Backup and recovery: The physical design includes planning for data backup and recovery procedures to safeguard against data loss, system failures, or other unexpected events.
Integration with hardware and system architecture: The physical design takes into account the underlying hardware, operating system, and network infrastructure, tailoring the database storage and access mechanisms to optimize performance, reliability, and resource utilization.

The physical design of a database indeed addresses the question of how to store the database on a physical device. By carefully considering factors such as storage structures, indexing strategies, partitioning, and data formats, the physical design process aims to optimize data storage, access, and retrieval, ensuring that the database performs efficiently and effectively on the underlying hardware and within the constraints of the DBMS.

Up to now, this course has dealt with the logical design phase of the database life cycle (DBLC). You laid out the database's tables, created links between the tables, normalized the tables so they will perform with maximum efficiency, designed user views to create information from table data, and assigned permissions to users to ensure they made no unexpected changes to the database. It is time now to investigate the third phase of the DBLC, physical design: how the computer stores the database on its hard drive. You will learn how to change the physical design in ways that might improve database performance. Yes, "might." Unlike normalizing tables, which verifiably improves database performance, the techniques you will learn in this module must be tried on a specific database before their effects can be known. As you will see, changing the physical design of a database is just as likely to hurt performance as to help it.

Learning Objectives

After completing this module, you should be able to:

Describe physical design^[1] in terms of optimizing database performance.
Explain the purpose and trade-offs of creating indexes^[2]
Explain the purpose and trade-off ofclustering.^[3]
Explain the purpose and trade-offs of horizontal partitioning.^[4]
Explain the purpose and trade-offs of vertical partitioning.^[5]
Explain how to create a database with SQL
Explain how to use SQL to create database tables
Explain how to use SQL to build a database
Describe the entries that appear in thedata dictionary.^[6]
Describe the tables that are part of the data dictionary

The next lesson introduces you to physical design.

[1]Physical Design: Partitioning a table, in database lexicon, divides a single table into two or more tables to limit the amount of data the RDBMS has to retrieve at once.

[2]index: An index is an ordered list of values in a field, exactly like the index in the back of a book.

[3]clustering: Clustering brings records from two or more tables together on a hard disk to improve joins between the two tables.

[4]Horizontal partitioning: Horizontal partitioning divides a table's records into two or more tables.

[5]Vertical partitioning: To divide a table into multiple tables, placing different columns in each new table. This type of partition divides the table vertically, which means that the structure of the main table changes in the new tables which are generated.

[6]data dictionary: Also called a catalog. The data dictionary is the foundation of the database and it is where the RDBMS finds which tables are in a database, which columns are in the tables, which columns are primary or foreign keys, and what type of data to expect in those columns.