Lesson 1

Clustering Oracle Tables and Indexes

Logical Structures on Disk

One advantage of a relational database is that users do not need to know anything about the physical structure of the database.
They can access all the data by simply using (SQL) Structured Query Language . However, underneath the logical structures manipulated by SQL, the data in your Oracle database is still stored on a disk. And, as with all data on disk, the performance of data retrieval is dependent to some extent on the physical movement of the disk head over the spinning disk. If you can minimize this disk head movement, you minimize the single largest bottleneck in data retrieval. Oracle allows you to create clustered tables and indexes, whose primary purpose is to minimize the inherent delays caused by disk head movement. This module covers all you need to know about clusters, including:

The advantages of clustering
How to create and size a cluster
How to create a cluster key
The purpose of a hash cluster
How to create a hash cluster
How to drop a cluster

Clustering Technique

Clustering is a method of storing tables that are intimately related and often joined together into the same area on disk. For example, instead of the BOOKSHELF table being in one section of the disk and the BOOKSHELF_AUTHOR table being somewhere else, their rows could be interleaved together in a single area, called a cluster. The cluster key is the column or columns by which the tables are usually joined in a query (for example, Title for the BOOKSHELF and BOOKSHELF_ AUTHOR tables). To cluster tables, you must own the tables you are going to cluster together. The following is the basic format of the create cluster command:

create cluster cluster
(column datatype [,column datatype]. . .) [other options];

The cluster name follows the table-naming conventions, and column datatype is the name and datatype you will use as the cluster key. The column name may be the same as one of the columns of a table you will put in this cluster, or it may be any other valid name. Here is an example:

create cluster BOOKandAUTHOR (Col1 VARCHAR2(100));

This creates a cluster (a space is set aside, as it would be for a table) with nothing in it. The use of Col1 for the cluster key is irrelevant; you will never use it again. However, its definition should match the primary key of the table to be added. Next, tables are created to be included in this cluster:

create table BOOKSHELF
(Title VARCHAR2(100) primary key,
Publisher VARCHAR2(20),
CategoryName VARCHAR2(20),
Rating VARCHAR2(2),
constraint CATFK foreign key (CategoryName)
references CATEGORY(CategoryName)
)
cluster BOOKandAUTHOR (Title);

Prior to inserting rows into BOOKSHELF, you must create a cluster index:

create index BOOKandAUTHORndx
on cluster BOOKandAUTHOR;

Recall that the presence of a cluster clause here precludes the use of a tablespace or storage clause. Note how this structure differs from a standard create table statement:

create table BOOKSHELF
(Title VARCHAR2(100) primary key,
Publisher VARCHAR2(20),
CategoryName VARCHAR2(20),
Rating VARCHAR2(2),
constraint CATFK foreign key (CategoryName)
references CATEGORY(CategoryName)
);

Although clusters are not appropriate for all types of data, in the right situation a cluster can provide a significant increase in performance. In the next lesson you will learn when to use clustered tables.