In computer science, a heap is a tree-based data structure that satisfies the heap property: In a max heap, for any given node C, if P is a parent node of C, then the key (the value) of P is greater than or equal to the key of C. In a min heap, the key of P is less than or equal to the key of C.^{[1]} The node at the "top" of the heap (with no parents) is called the root node.
The heap is one maximally efficient implementation of an abstract data type called a priority queue, and in fact, priority queues are often referred to as "heaps", regardless of how they may be implemented. In a heap, the highest (or lowest) priority element is always stored at the root. However, a heap is not a sorted structure; it can be regarded as being partially ordered. A heap is a useful data structure when it is necessary to repeatedly remove the object with the highest (or lowest) priority, or when insertions need to be interspersed with removals of the root node.
A common implementation of a heap is the binary heap, in which the tree is a complete^{[2]} binary tree (see figure). The heap data structure, specifically the binary heap, was introduced by J. W. J. Williams in 1964, as a data structure for the heapsort sorting algorithm.^{[3]} Heaps are also crucial in several efficient graph algorithms such as Dijkstra's algorithm. When a heap is a complete binary tree, it has the smallest possible height—a heap with N nodes and a branches for each node always has log_{a} N height.
Note that, as shown in the graphic, there is no implied ordering between siblings or cousins and no implied sequence for an in-order traversal (as there would be in, e.g., a binary search tree). The heap relation mentioned above applies only between nodes and their parents, grandparents, etc. The maximum number of children each node can have depends on the type of heap.
Heaps are typically constructed in-place in the same array where the elements are stored, with their structure being implicit in the access pattern of the operations. Heaps differ in this way from other data structures with similar or in some cases better theoretic bounds such as Radix trees in that they require no additional memory beyond that used for storing the keys.
The common operations involving heaps are:
Heaps are usually implemented with an array, as follows:
For a binary heap, in the array, the first index contains the root element. The next two indices of the array contain the root's children. The next four indices contain the four children of the root's two child nodes, and so on. Therefore, given a node at index i, its children are at indices and , and its parent is at index ⌊(i−1)/2⌋. This simple indexing scheme makes it efficient to move "up" or "down" the tree.
Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order). As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.
After an element is inserted into or deleted from a heap, the heap property may be violated, and the heap must be re-balanced by swapping elements within the array.
Although different types of heaps implement the operations differently, the most common way is as follows:
Construction of a binary (or d-ary) heap out of a given array of elements may be performed in linear time using the classic Floyd algorithm, with the worst-case number of comparisons equal to 2N − 2s_{2}(N) − e_{2}(N) (for a binary heap), where s_{2}(N) is the sum of all digits of the binary representation of N and e_{2}(N) is the exponent of 2 in the prime factorization of N.^{[7]} This is faster than a sequence of consecutive insertions into an originally empty heap, which is log-linear.^{[a]}
Here are time complexities^{[8]} of various heap data structures. Function names assume a max-heap. For the meaning of "O(f)" and "Θ(f)" see Big O notation.
Operation | find-max | delete-max | insert | increase-key | meld |
---|---|---|---|---|---|
Binary^{[8]} | Θ(1) | Θ(log n) | O(log n) | O(log n) | Θ(n) |
Leftist | Θ(1) | Θ(log n) | Θ(log n) | O(log n) | Θ(log n) |
Binomial^{[8]}^{[9]} | Θ(1) | Θ(log n) | Θ(1)^{[b]} | Θ(log n) | O(log n) |
Skew binomial^{[10]} | Θ(1) | Θ(log n) | Θ(1) | Θ(log n) | O(log n)^{[c]} |
Pairing^{[11]} | Θ(1) | O(log n)^{[b]} | Θ(1) | o(log n)^{[b]}^{[d]} | Θ(1) |
Rank-pairing^{[14]} | Θ(1) | O(log n)^{[b]} | Θ(1) | Θ(1)^{[b]} | Θ(1) |
Fibonacci^{[8]}^{[15]} | Θ(1) | O(log n)^{[b]} | Θ(1) | Θ(1)^{[b]} | Θ(1) |
Strict Fibonacci^{[16]} | Θ(1) | O(log n) | Θ(1) | Θ(1) | Θ(1) |
Brodal^{[17]}^{[e]} | Θ(1) | O(log n) | Θ(1) | Θ(1) | Θ(1) |
2–3 heap^{[19]} | Θ(1) | O(log n)^{[b]} | Θ(1)^{[b]} | Θ(1) | O(log n) |
The heap data structure has many applications.
java.util.PriorityQueue
in the Java Collections Framework. This class implements by default a min-heap; to implement a max-heap, programmer should write a custom comparator. There is no support for the replace, sift-up/sift-down, or decrease/increase-key operations.