
Lesson 7: Table Partitioning
Objective: Explain the purpose of partitioning

Database Table Partitioning

The following lessons explain the advantages and disadvantages of the two partitioning methods introduced here.
Partitioning is the opposite of clustering. Clustering brings records from two or more tables together on disk to improve joins between those tables. Partitioning, by contrast, splits a single large table into two or more smaller tables so that the DBMS does not need to retrieve as much data at any one time.
Consider, for example, what happens to DistributedNetworks' order and order items tables over time. Assuming that the business is reasonably successful, those tables (especially order items) will become very large, and retrieval of data from them will begin to slow down. Retrieval of open orders would be faster if filled orders and their items could be separated from open orders and their items.
There are two ways to partition a table:
  1. Horizontal partitioning, which splits a table's rows between two or more tables with identical structures.
  2. Vertical partitioning, which splits a table's columns between two or more tables linked by the original table's primary key.

The best way to remember how each type of partitioning divides a table is to envision drawing a line across the table. A horizontal line splits the table's records into two tables with identical structures; a vertical line splits its columns into two tables that share the primary key.
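The two kinds of split can be sketched on a small in-memory table. This is only an illustration of the definitions, not how a DBMS stores data: the sample rows, the `filled` flag, and the helper names `split_horizontally` and `split_vertically` are all assumptions made up for the example.

```python
# Hypothetical sample rows; the "filled" flag is an illustrative assumption.
orders = [
    {"order_numb": 1, "customer_numb": 10, "order_date": "2024-01-05", "filled": True},
    {"order_numb": 2, "customer_numb": 11, "order_date": "2024-02-10", "filled": False},
    {"order_numb": 3, "customer_numb": 10, "order_date": "2024-03-01", "filled": False},
]


def split_horizontally(rows, belongs_in_first):
    """Divide the rows between two tables; every row keeps all of its columns."""
    first = [row for row in rows if belongs_in_first(row)]
    second = [row for row in rows if not belongs_in_first(row)]
    return first, second


def split_vertically(rows, key, first_columns):
    """Divide the columns between two tables; both tables keep the key column.

    Assumes `key` is not listed in `first_columns`.
    """
    first = [{key: row[key], **{c: row[c] for c in first_columns}} for row in rows]
    second = [{c: v for c, v in row.items() if c not in first_columns} for row in rows]
    return first, second


# Horizontal: same columns, fewer rows in each table.
filled, still_open = split_horizontally(orders, lambda row: row["filled"])

# Vertical: same rows, fewer columns in each table, linked by order_numb.
dates, rest = split_vertically(orders, "order_numb", ["order_date"])
```

After the horizontal split every row still has the same four columns; after the vertical split both tables have three rows, and `order_numb` is the only column they share.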

Horizontal Partitioning

Horizontal partitioning involves creating two or more tables with exactly the same structure and splitting rows between those tables. DistributedNetworks might use this technique to solve the problem with the order and order items tables becoming increasingly large. The database design might be modified as follows:
open_order (order_numb, customer_numb, order_date)
open_order_items (order_numb, item_numb, quantity, shipped?)
filled_order (order_numb, customer_numb, order_date)
filled_order_items (order_numb, item_numb, quantity, shipped?)

Whenever all items in an open order have shipped, an application program deletes the order's rows from the open order and open order items tables and inserts them into the filled order and filled order items tables. The open order and open order items tables remain relatively small, speeding up both retrieval and modification performance. Although retrieval from the filled order and filled order items tables will be slower, DistributedNetworks uses those tables much less frequently.
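The move from the open tables to the filled tables might look like the following minimal sketch, which uses Python's built-in sqlite3 module. The column types, the renaming of `shipped?` to `shipped` (stored as 0 or 1), and the function names `create_order_tables` and `archive_if_filled` are assumptions made for the example; the lesson does not prescribe a DBMS or an implementation.

```python
import sqlite3


def create_order_tables(conn):
    """Create the four tables from the lesson (column types are assumed)."""
    conn.executescript("""
        CREATE TABLE open_order (
            order_numb INTEGER PRIMARY KEY, customer_numb INTEGER, order_date TEXT);
        CREATE TABLE open_order_items (
            order_numb INTEGER, item_numb INTEGER, quantity INTEGER, shipped INTEGER,
            PRIMARY KEY (order_numb, item_numb));
        CREATE TABLE filled_order (
            order_numb INTEGER PRIMARY KEY, customer_numb INTEGER, order_date TEXT);
        CREATE TABLE filled_order_items (
            order_numb INTEGER, item_numb INTEGER, quantity INTEGER, shipped INTEGER,
            PRIMARY KEY (order_numb, item_numb));
    """)


def archive_if_filled(conn, order_numb):
    """Move an order to the filled tables once all of its items have shipped.

    Returns True if the order was moved, False otherwise.
    """
    total, unshipped = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(shipped = 0), 0) "
        "FROM open_order_items WHERE order_numb = ?",
        (order_numb,),
    ).fetchone()
    if total == 0 or unshipped > 0:
        return False
    # One transaction: either all four statements take effect or none do.
    with conn:
        conn.execute(
            "INSERT INTO filled_order SELECT * FROM open_order WHERE order_numb = ?",
            (order_numb,))
        conn.execute(
            "INSERT INTO filled_order_items "
            "SELECT * FROM open_order_items WHERE order_numb = ?",
            (order_numb,))
        conn.execute("DELETE FROM open_order_items WHERE order_numb = ?", (order_numb,))
        conn.execute("DELETE FROM open_order WHERE order_numb = ?", (order_numb,))
    return True


# Usage with made-up data: order 1 has two items, one not yet shipped.
conn = sqlite3.connect(":memory:")
create_order_tables(conn)
conn.execute("INSERT INTO open_order VALUES (1, 10, '2024-01-05')")
conn.executemany(
    "INSERT INTO open_order_items VALUES (?, ?, ?, ?)",
    [(1, 100, 2, 1), (1, 101, 1, 0)])
conn.commit()

first_try = archive_if_filled(conn, 1)   # not moved: one item is unshipped
conn.execute("UPDATE open_order_items SET shipped = 1 WHERE order_numb = 1")
conn.commit()
second_try = archive_if_filled(conn, 1)  # moved: every item has shipped
```

Wrapping the inserts and deletes in one transaction is a design choice of this sketch: it keeps an order from appearing in both sets of tables, or in neither, if the program fails partway through.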
The drawback to this solution occurs when DistributedNetworks needs to access all of the orders and/or order items at the same time. A query whose result table includes data from both sets of open and filled tables must actually be two queries connected by the union operator[1]. Performance of such a query will be worse than that of a query of either set of tables individually. Nonetheless, if an analysis of DistributedNetworks’ data access patterns reveals that such queries occur rarely and that most retrieval involves the open set of tables, then the horizontal partitioning is worth doing.
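A query that spans both partitions can be sketched as follows, again with Python's sqlite3 module. The sample rows, the column types, the extra `status` column, and the function name `orders_for_customer` are assumptions for illustration. The sketch uses UNION ALL rather than plain UNION, on the assumption that an order is stored in exactly one of the two tables, so there are no duplicate rows to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE open_order (
        order_numb INTEGER PRIMARY KEY, customer_numb INTEGER, order_date TEXT);
    CREATE TABLE filled_order (
        order_numb INTEGER PRIMARY KEY, customer_numb INTEGER, order_date TEXT);
    -- Made-up sample rows.
    INSERT INTO filled_order VALUES (1, 10, '2024-01-05'), (2, 11, '2024-02-10');
    INSERT INTO open_order VALUES (3, 11, '2024-03-01');
""")


def orders_for_customer(conn, customer_numb):
    """Return all of a customer's orders, open and filled, in one result."""
    return conn.execute(
        """
        SELECT order_numb, order_date, 'open' AS status
          FROM open_order WHERE customer_numb = ?
        UNION ALL
        SELECT order_numb, order_date, 'filled' AS status
          FROM filled_order WHERE customer_numb = ?
        ORDER BY order_numb
        """,
        (customer_numb, customer_numb),
    ).fetchall()


all_orders = orders_for_customer(conn, 11)
```

Both halves of the query have to be evaluated before the results are combined, which is why the lesson expects such a query to be slower than a query against either set of tables alone.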
The only way you can determine whether horizontal partitioning will increase performance is to examine the ways in which your database applications access data. If there is a group of rows that are accessed together significantly more frequently than the rest of the rows in a table, then horizontal partitioning may make sense.

Horizontal partitioning: When you draw a horizontal line, the table's records are split into two tables with identical structures.

Vertical Partitioning

Vertical partitioning involves creating two or more tables, each containing a subset of the original table's columns but all of its rows. For example, if DistributedNetworks accesses the titles and prices of its merchandise items more frequently than the other columns in the item table, the item table might be partitioned as follows:

item_titles (item_numb, title, price)
item_details (item_numb, distributor, release_date, . . .)

The benefit of this design is that the rows in the smaller item titles table will be physically closer together. The smaller table will take up fewer disk pages and thus support faster retrieval.
Queries that require data from both tables must join the tables over the item number. Like most joins, this will be a relatively slow operation. Therefore, vertical partitioning makes sense only when the access pattern across a table's columns is highly skewed. The more often a small, specific group of columns is accessed together, the more vertical partitioning will help.
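The trade-off can be seen in a minimal sqlite3 sketch. The column types, the sample rows, and the function names `price_list` and `full_item` are assumptions for illustration, and `item_details` holds only the two columns the lesson names; the elided columns are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_titles (
        item_numb INTEGER PRIMARY KEY, title TEXT, price REAL);
    CREATE TABLE item_details (
        item_numb INTEGER PRIMARY KEY REFERENCES item_titles (item_numb),
        distributor TEXT, release_date TEXT);
""")
# Made-up sample rows.
conn.executemany(
    "INSERT INTO item_titles VALUES (?, ?, ?)",
    [(100, "Sample Title A", 19.99), (101, "Sample Title B", 24.5)])
conn.executemany(
    "INSERT INTO item_details VALUES (?, ?, ?)",
    [(100, "Sample Distributor", "2023-06-01"),
     (101, "Sample Distributor", "2024-01-15")])
conn.commit()


def price_list(conn):
    """The frequent query: it touches only the small item_titles table."""
    return conn.execute(
        "SELECT item_numb, title, price FROM item_titles ORDER BY item_numb"
    ).fetchall()


def full_item(conn, item_numb):
    """The rare query: it must join the two partitions over item_numb."""
    return conn.execute(
        """
        SELECT t.item_numb, t.title, t.price, d.distributor, d.release_date
          FROM item_titles AS t
          JOIN item_details AS d ON d.item_numb = t.item_numb
         WHERE t.item_numb = ?
        """,
        (item_numb,),
    ).fetchone()


prices = price_list(conn)
item = full_item(conn, 101)
```

If most requests resemble `price_list`, the partitioning pays off; if most resemble `full_item`, the join is paid for on nearly every request and the single unpartitioned table is likely the better design.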

Drawing a vertical line splits the table's columns into two tables, which must then be linked by the table's primary key columns.
Vertical partitioning: Both tables contain the same primary key, while their other columns differ.

The next lesson introduces horizontal partitioning.
[1] Union operator: The union operation creates one table by merging the rows of two tables with the same structure.