One of the strengths of a relational database is that application developers and analysts work primarily with logical structures (tables, rows, columns) using SQL, without needing to manage physical disk layouts directly. However, an Oracle database still has to place blocks on storage somewhere, and performance is influenced by how efficiently Oracle can locate and read the blocks needed to satisfy a query.
In modern Oracle implementations (including Oracle 23ai), the “bottleneck” is rarely literal disk-head movement. The practical goal is still the same: reduce unnecessary I/O and improve locality so Oracle can return rows with fewer physical reads and fewer buffer gets. Clusters are one of Oracle’s physical design features that can help in a narrow set of workload patterns—primarily when multiple tables are frequently joined using the same key.
This module introduces Oracle clusters and shows when clustering is beneficial, how to build clusters correctly, and how to avoid designs that increase DML overhead or waste space.
In Oracle, a cluster is a database segment designed to store rows from one or more tables together based on a shared value called the cluster key. When rows are stored together by that key, Oracle can often retrieve related rows with fewer block reads—especially for join-heavy queries that repeatedly access the same key value.
Clustering is not a general performance “boost.” It is a physical design choice that trades flexibility for locality. You typically consider a cluster only when:
An index cluster stores rows by cluster key and uses a cluster index to locate the block(s) that hold the key value. This is the more common cluster type and is usually the better choice when you have range access patterns or mixed predicates (not only equality).
A hash cluster uses a hash function on the cluster key to compute the target block location. This can be very fast for
exact-match access (for example, WHERE key = :value) but is a poor fit for range queries and can degrade if sizing is
wrong (hash collisions and overflow blocks).
The cluster key is the column (or columns) that Oracle uses to colocate rows. It should match how your workload joins or filters data. In a parent/child model, the cluster key is often the parent key that also exists as the child foreign key.
All tables you place in a cluster must include the cluster key column(s), and the table’s CLUSTER ... (key) clause tells
Oracle which column(s) map to the cluster key definition.
The example below clusters two tables on a shared department key, so department rows and employee rows for the same department can be stored close together. This can reduce block visits for join-heavy lookups by department.
CREATE CLUSTER emp_dept_cl
(deptno NUMBER(4))
SIZE 1024
TABLESPACE users;
-- Required for an index cluster before loading data
CREATE INDEX emp_dept_cl_ix
ON CLUSTER emp_dept_cl;
CREATE TABLE dept (
deptno NUMBER(4) CONSTRAINT dept_pk PRIMARY KEY,
dname VARCHAR2(30)
) CLUSTER emp_dept_cl (deptno);
CREATE TABLE emp (
empno NUMBER(4) CONSTRAINT emp_pk PRIMARY KEY,
ename VARCHAR2(30),
deptno NUMBER(4) CONSTRAINT emp_dept_fk REFERENCES dept(deptno)
) CLUSTER emp_dept_cl (deptno);
Notes:
SIZE is the estimated space needed to store all rows that share the same cluster key value
(for example, the average “department + its employees”). Poor sizing leads to wasted space or overflow blocks.
A hash cluster is optimized for equality lookups on the key. It is typically used for lookup-style access patterns where you repeatedly query by a specific key value.
CREATE CLUSTER product_hcl
(product_id NUMBER)
HASHKEYS 1000
SIZE 1024
TABLESPACE users;
CREATE TABLE products (
product_id NUMBER CONSTRAINT products_pk PRIMARY KEY,
product_name VARCHAR2(50)
) CLUSTER product_hcl (product_id);
Practical guidance:
HASHKEYS should reflect the expected number of distinct key values.Dropping a cluster removes the cluster segment and (optionally) its tables. In practice, you typically drop the tables explicitly as part of a controlled change, then drop the cluster.
DROP CLUSTER emp_dept_cl INCLUDING TABLES CASCADE CONSTRAINTS;
Use INCLUDING TABLES only when you explicitly intend to remove the cluster tables as part of the same operation.
Otherwise, plan the migration (move tables out of the cluster) before dropping.
Clusters are specialized. In many modern systems, better outcomes come from appropriate indexing, partitioning, and query tuning. Avoid clustering when:
In the next lesson, you will evaluate workload patterns and learn how to decide whether clustering is appropriate before implementing it.