RelationalDBDesign RelationalDBDesign



Clustering Tables   «Prev 

Oracle Clustering Advantages

When is clustering bad?

Clustering does not work well for all tables. For instance, clustering is inadvisable in the following three scenarios:
  1. Frequent updates of the cluster key-Remember that data is stored in a data block based on the value of the cluster key. If a user updates that value, the actual data row will have to be moved to a different data block, which is more resource intensive than a simple in-place update of a non-clustered row.
  2. Data takes up more than one or two blocks-The big advantage of clusters comes from the ability to read a data block that contains all the rows for a particular value for the cluster key. However, the cluster is set up so that the cluster key points to the first data block for the cluster. If there are multiple data blocks for a value, and you are trying to retrieve a specific row that is in a later block, you will actually increase the number of database reads with a cluster. This is because Oracle has to read through the values for the cluster from the beginning.
  3. Frequent table scans required-In a cluster, the data is stored according to a value in the cluster key. Because you may have a cluster that includes data from more than one table, the overall size of the cluster will be the combined size of the data in all the tables in the cluster. This combination means that a table scan of a single table in the cluster will take longer, and possiblymuch longer, than scanning the data for a single table.

Database Clustering Architecture

To achieve horizontal scalability or scale-out of a database, multiple database servers are grouped together to form a cluster infrastructure. These servers are linked by a private interconnect network and work together as a single virtual server that is capable of handling large application workloads. This cluster can be easily expanded or shrunk by adding or removing servers from the cluster to adapt to the dynamics of the workload. This architecture is not limited by the maximum capacity of a single server, as the vertical scalability (scale-up) method is. There are two types of clustering architecture:
  1. Shared Nothing Architecture
  2. Shared Everything Architecture
The shared nothing architecture is built on a group of independent servers with storage attached to each server. Each server carries a portion of the database. The workloads are also divided by this group of servers so that each server carries a predefined workload. Although this architecture can distribute the workloads among multiple servers, the distribution of the workloads and data among the servers is predefined. Adding or removing a single server would require a complete redesign and redeployment of the cluster. For those applications where each node only needs to access a part of the database, with very careful partitioning of the database and workloads, this shared nothing architecture may work. If the data partition is not completely in sync with the application workload distribution on the server nodes, some nodes may need to access data stored in other nodes. In this case, database performance will suffer.
Shared nothing architecture also does not work well with a large set of database applications such as OLTP (Online transaction processing), which need to access the entire database; this architecture will require frequent data redistribution across the nodes and will not work well. Shared nothing also does not provide high availability. Since each partition is dedicated to a piece of the data and workload which is not duplicated by any other server, each server can be a single point of failure. In case of the failure of any server, the data and workload cannot be failed over to other servers in the cluster.