Lesson 6 Hash clusters
Objective Know when to use an Oracle hash cluster

When to use Oracle Hash Cluster

Oracle offers another type of cluster, called a hash cluster. A hash cluster is similar in most respects to a standard (indexed) cluster: the rows are grouped according to the value of the cluster key. Unlike a standard cluster, however, a hash cluster does not use a cluster index to locate those rows.

How is a hash cluster different?

The big difference between a hash cluster and a normal cluster is the way the data is accessed. In a regular cluster, Oracle looks up the value of the cluster key in the cluster index to find the data. In a hash cluster, Oracle applies a hashing function to the cluster key and uses the result to locate the data.
The following steps illustrate a hash cluster in action:

  1. A user issues a SELECT statement based on the value of the cluster key in the hashed cluster.
  2. Oracle applies the cluster's hash function to the cluster key value.
  3. Oracle uses the resulting hash value to go directly to the data block that holds the rows for that key.
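
The three steps above can be sketched as a toy model in Python. This is only an illustration of the idea, not Oracle's storage format: the bucket count `HASHKEYS`, the modulo hash, and the helper names `insert_row` and `select_by_key` are all assumptions made for the example.

```python
# Toy model of a hash cluster lookup; not Oracle's actual implementation.
HASHKEYS = 8  # hypothetical number of hash buckets


def hash_function(cluster_key: int) -> int:
    """Map a cluster key to a bucket number (stand-in for Oracle's hash)."""
    return cluster_key % HASHKEYS


# Each bucket plays the role of a data block holding the rows for its keys.
blocks = [[] for _ in range(HASHKEYS)]


def insert_row(cluster_key: int, row: dict) -> None:
    blocks[hash_function(cluster_key)].append((cluster_key, row))


def select_by_key(cluster_key: int) -> list:
    # Step 2: hash the key. Step 3: go straight to that block.
    block = blocks[hash_function(cluster_key)]
    # Different keys can share a block, so filter on the key itself.
    return [row for key, row in block if key == cluster_key]


insert_row(101, {"name": "Ann"})
insert_row(109, {"name": "Bob"})  # 109 % 8 == 5, the same block as 101
insert_row(102, {"name": "Cy"})

# Step 1: the query supplies a cluster key value.
result = select_by_key(101)
print(result)
```

Note that keys 101 and 109 land in the same block in this sketch, which is why the lookup still compares the key after hashing.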

Hash Table Cluster
A hash cluster stores clustered tables in the same way as an indexed cluster; the difference is that it locates rows with a hash function instead of a cluster index.

Hashing Function

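The hashing function is an algorithm that acts on the values in the cluster key and returns a number, the hash value, that identifies where the rows for that key are stored. Oracle can either use its internal hashing algorithm, or you can specify your own hashing function for a hash cluster. The cluster key does not have to be numeric when the internal function is used, but a function you specify must evaluate to a number; if the key itself is to serve as the hash value, it should be a single column of integers. A hash cluster differs from a normal cluster in that the hash value is not stored in the cluster itself; it is recomputed from the key whenever it is needed. In addition, although a cluster index is required for a regular cluster, you cannot create a cluster index on a hash cluster. Because there is no cluster index to read first, a lookup by key can often be satisfied with a single I/O, whereas an indexed cluster needs at least two: one or more for the index and one for the data.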
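
The difference in I/O can be illustrated with a simplified read counter. The sketch below is a hedged model only: `BlockStore`, `key_as_hash`, and the block names are hypothetical, and real I/O counts depend on index depth, caching, and hash collisions.

```python
# Simplified comparison of block reads; not a measurement of Oracle itself.
class BlockStore:
    """Counts block reads as a stand-in for physical I/O."""

    def __init__(self):
        self.blocks = {}
        self.reads = 0

    def write(self, block_id, content):
        self.blocks[block_id] = content

    def read(self, block_id):
        self.reads += 1
        return self.blocks[block_id]


HASHKEYS = 4  # hypothetical bucket count


def key_as_hash(key: int) -> int:
    """User-specified style: the integer key itself, reduced to a bucket."""
    return key % HASHKEYS


# Indexed cluster: an index block maps each key to its data block.
indexed = BlockStore()
indexed.write("index", {10: "data-A", 11: "data-B"})
indexed.write("data-A", ["rows for key 10"])
indexed.write("data-B", ["rows for key 11"])


def indexed_lookup(key):
    data_block_id = indexed.read("index")[key]  # read 1: the index
    return indexed.read(data_block_id)          # read 2: the data


# Hash cluster: the block address is computed, so no index read is needed.
hashed = BlockStore()
hashed.write(key_as_hash(10), ["rows for key 10"])
hashed.write(key_as_hash(11), ["rows for key 11"])


def hash_lookup(key):
    return hashed.read(key_as_hash(key))        # single read: the data


indexed_rows = indexed_lookup(10)
hashed_rows = hash_lookup(10)
print(indexed.reads, hashed.reads)
```

Both lookups return the same rows; the indexed path counts two reads and the hashed path counts one, which is the saving the paragraph above describes in its simplest form.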

When should you use a hash cluster?

A cluster stores all the data for a specific value of the cluster key in a limited number of data blocks. You may have data that is appropriate for a cluster, but whose cluster key has an uneven distribution of values, so that some key values have far more rows than others. You would then have to size the cluster to accommodate the largest sets of rows, which wastes space for the smaller sets. In this case, you might use a hash cluster to force a more even distribution of rows and still get the advantages of a cluster. The next lesson explores how to create and size a hash cluster.
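
To make the space problem described above concrete, here is a small arithmetic sketch. The row counts are invented for illustration, and space is measured in rows rather than bytes to keep the example simple.

```python
# Hypothetical row counts per cluster key value; not taken from a real table.
rows_per_key = {"A": 1000, "B": 20, "C": 15, "D": 5}

# Size every key's allocation for the largest set of rows.
slot_size = max(rows_per_key.values())
reserved = slot_size * len(rows_per_key)

# Compare the reserved space with what the rows actually need.
used = sum(rows_per_key.values())
wasted = reserved - used

print(f"reserved={reserved} used={used} wasted={wasted}")
```

With these assumed numbers, space for 4000 rows is reserved while only 1040 rows exist, so most of the allocation sits empty because one key dominates.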