Characteristics of Horizontal Partitioning in Database Design
Horizontal partitioning, also called partitioning by row, is a database design technique that divides the rows of a table into separate fragments (partitions) according to a criterion defined on a partition key. When the partitions are placed on separate servers, the technique is usually called sharding. This approach aims to improve the performance, scalability, and manageability of databases, particularly in large-scale applications. The key characteristics of horizontal partitioning are as follows:
Data distribution: In horizontal partitioning, the rows of a table are divided into several partitions, each containing a subset of the data. The partitioning is typically based on a specific attribute or a range of attribute values, ensuring that each partition contains a distinct subset of the data.
Partition key: The partition key is a column or a set of columns in the table whose values determine which partition each row belongs to. It is essential to choose an appropriate partition key to ensure even data distribution and optimal performance. The partition key should have high cardinality (many distinct values) and should appear in the filters of the most common queries.
Scalability: Horizontal partitioning is an effective approach to addressing scalability issues in large-scale applications. Because the data is divided into smaller partitions that can be stored and processed separately, the load on the database system can be spread out as the data volume grows, which helps sustain overall performance.
Query performance: Partitioning can significantly improve query performance, particularly for large datasets. A query whose filter restricts the partition key can be answered from the relevant partition alone instead of scanning the entire table, an optimization commonly called partition pruning. Query optimizers use the partitioning metadata to decide which partitions an execution plan can skip. Queries that do not filter on the partition key still have to visit every partition.
Distributed systems: Horizontal partitioning is particularly beneficial in distributed database systems, where data can be spread across multiple servers or clusters. This can help to balance the workload, enhance fault tolerance, and provide better data locality for query processing.
Manageability: Partitioning can simplify database maintenance tasks, such as backups, indexing, and updates. By working with smaller partitions instead of the entire table, these tasks can be performed more efficiently, reducing the impact on overall system performance.
Flexibility: Horizontal partitioning allows for flexible partition management. Partitions can be added, removed, or modified independently without affecting other partitions or the entire table. This flexibility facilitates changes in the database schema or data distribution as the application requirements evolve.
Horizontal partitioning is a critical technique in database design that can enhance performance, scalability, and manageability. By dividing a table into smaller partitions based on a partition key, it enables efficient query execution, improved resource utilization, and easier maintenance in large-scale applications.
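The roles of the partition key and of partition pruning described above can be illustrated with a minimal in-memory sketch. It assumes hash partitioning on a hypothetical customer_id column and four partitions; the function names and the sample rows are illustrative only and do not come from any particular database product.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # assumption: a small, fixed number of partitions


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a partition-key value to a partition index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def build_partitions(rows, key_column):
    """Distribute rows into partitions according to their partition key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_for(row[key_column])].append(row)
    return partitions


def lookup(partitions, key_column, key_value):
    """Search only the one partition that can contain key_value (pruning)."""
    candidate = partitions.get(partition_for(key_value), [])
    return [row for row in candidate if row[key_column] == key_value]


# Hypothetical sample data: 20 rows keyed by customer_id.
rows = [{"customer_id": f"C{i:03d}", "total": i * 10} for i in range(20)]
partitions = build_partitions(rows, "customer_id")
matches = lookup(partitions, "customer_id", "C007")
```

Because every row is placed by hashing its key, a lookup on that key needs to examine a single partition. A search on any other column would still have to scan all of them, which is why the key should match the most common query patterns.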
You should consider partitioning a table horizontally when the table grows so large that searches and joins using the table become unacceptably slow. For example, your database might contain a table listing every order placed with Stories on CD.
One technique for reducing the size of a table accessed by a query is to split it into two or more tables with the same columns and to allocate the rows to different tables according to some criteria. In effect we are defining and implementing subtypes.
For example, although it might make sense to include historical data in the same table as the corresponding current data, it is likely that different queries access current and historical data. Placing current and historical data in different tables with the same structure will usually improve the performance of queries on current data, because those queries no longer have to read past the historical rows. You may prefer to include a copy of the current data in the historical data table to enable queries on all data to be written without the UNION operator. This is duplication rather than splitting.
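The split between current and historical data can be sketched with Python's built-in sqlite3 module. The table names orders_current and orders_history, their columns, and the sample rows are assumptions made for illustration. The sketch uses UNION ALL rather than UNION because the two tables hold disjoint rows, so no duplicate elimination is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two tables with identical columns: one for current rows, one for history.
conn.executescript("""
    CREATE TABLE orders_current (
        order_id INTEGER PRIMARY KEY, order_date TEXT, total REAL);
    CREATE TABLE orders_history (
        order_id INTEGER PRIMARY KEY, order_date TEXT, total REAL);
""")
conn.executemany(
    "INSERT INTO orders_current VALUES (?, ?, ?)",
    [(3, "2024-05-02", 19.99), (4, "2024-05-09", 5.00)],
)
conn.executemany(
    "INSERT INTO orders_history VALUES (?, ?, ?)",
    [(1, "2021-01-15", 12.50), (2, "2022-07-30", 42.00)],
)

# Everyday queries touch only the small current table.
current_rows = conn.execute(
    "SELECT order_id FROM orders_current ORDER BY order_id"
).fetchall()

# A query over all data has to recombine the two tables.
all_rows = conn.execute("""
    SELECT order_id, order_date, total FROM orders_current
    UNION ALL
    SELECT order_id, order_date, total FROM orders_history
    ORDER BY order_id
""").fetchall()
```

The second query shows the cost of splitting: any report that spans both periods must name both tables, which is the inconvenience the duplicated-copy alternative avoids.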
Partition by Date
Since Stories on CD is a successful, growing business, the table will most likely grow to an unmanageable size quite quickly.
Even though all of the order data is important, you may find that Stories on CD employees most frequently access records for orders placed over the previous three months. Rather than store all orders in a single table, you could choose to partition the table horizontally so that only those orders from a given three-month period (quarter) are stored in a single table.
Partitioning the Orders table in this manner would reduce the size of the table the RDBMS would need to search while still meeting most of the needs of the users. The next lesson discusses the disadvantages of horizontal partitioning.
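As a rough sketch of the quarterly scheme, the following Python helpers work out which quarterly table an order belongs to and which tables a date-range query must read. The naming pattern orders_YYYY_qN is a hypothetical convention chosen for this example, not something prescribed by the lesson or by any RDBMS.

```python
from datetime import date


def quarter_of(d):
    """Return the calendar quarter (1 to 4) that contains date d."""
    return (d.month - 1) // 3 + 1


def quarter_table(order_date, prefix="orders"):
    """Name of the quarterly table that stores an order placed on order_date."""
    return f"{prefix}_{order_date.year}_q{quarter_of(order_date)}"


def tables_for_range(start, end, prefix="orders"):
    """List every quarterly table a query covering start..end must read."""
    names = []
    year, quarter = start.year, quarter_of(start)
    last = (end.year, quarter_of(end))
    while (year, quarter) <= last:
        names.append(f"{prefix}_{year}_q{quarter}")
        quarter += 1
        if quarter > 4:  # roll over into the next year
            year, quarter = year + 1, 1
    return names


recent = quarter_table(date(2024, 2, 14))
span = tables_for_range(date(2023, 11, 5), date(2024, 4, 1))
```

A query about recent orders touches a single small table, while a query spanning several quarters must read, and then combine, one table per quarter.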
When placing a table in first normal form, one ends up with one large table with one primary key. Likewise, with graphical normalization, we end up with one large graph/diagram after bringing the known attributes to first normal form.