A partition is a division of a logical database or its constituent elements into distinct independent parts. Database partitioning is normally done for manageability, performance or availability[1] reasons, or for load balancing. It is popular in distributed database management systems, where each partition may be spread over multiple nodes, with users at the node performing local transactions on the partition. This increases performance for sites that have regular transactions involving certain views of data, whilst maintaining availability and security.
Current high-end relational database management systems provide for different criteria to split the database. They take a partitioning key and assign a partition based on certain criteria. Some common criteria include:
Country
is either Iceland
, Norway
, Sweden
, Finland
or Denmark
could build a partition for the Nordic countries.n
partitions, the i
th tuple in insertion order is assigned to partition (i mod n)
. This strategy enables the sequential access to a relation to be done in parallel. However, the direct access to individual tuples, based on a predicate, requires accessing the entire relation.
See also: Shard (database architecture) |
The partitioning can be done by either building separate smaller databases (each with its own tables, indices, and transaction logs), or by splitting selected elements, for example just one table.
Horizontal partitioning involves putting different rows into different tables. For example, customers with ZIP codes less than 50000 are stored in CustomersEast, while customers with ZIP codes greater than or equal to 50000 are stored in CustomersWest. The two partition tables are then CustomersEast and CustomersWest, while a view with a union might be created over both of them to provide a complete view of all customers.
Vertical partitioning involves creating tables with fewer columns and using additional tables to store the remaining columns.[1] Generally, this practice is known as normalization. However, vertical partitioning extends further, and partitions columns even when already normalized. This type of partitioning is also called "row splitting", since rows get split by their columns, and might be performed explicitly or implicitly. Distinct physical machines might be used to realize vertical partitioning: storing infrequently used or very wide columns, taking up a significant amount of memory, on a different machine, for example, is a method of vertical partitioning. A common form of vertical partitioning is to split static data from dynamic data, since the former is faster to access than the latter, particularly for a table where the dynamic data is not used as often as the static. Creating a view across the two newly created tables restores the original table with a performance penalty, but accessing the static data alone will show higher performance. A columnar database can be regarded as a database that has been vertically partitioned until each column is stored in its own table.