Weak heap


A weak heap is a combination of the binary heap and binomial heap data structures for implementing priority queues. It can be stored in an array as an implicit binary tree like the former, and has the efficiency guarantees of the latter.
Weak-heapsort uses fewer comparisons than most other sorting algorithms, close to the theoretical lower limit, so it is particularly useful when comparisons are expensive, such as when comparing strings using the full Unicode collation algorithm.

Description

A weak heap is most easily understood as a heap-ordered multi-way tree stored as a binary tree using the "right-child left-sibling" convention.
In the multi-way tree, and assuming a max-heap, each parent's key is greater than or equal to all the child keys.
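Expressed as a binary tree, this translates to the following invariants:
  1. The root node has no left child.
  2. For every node, the value associated with that node is greater than or equal to the values associated with all nodes in its right subtree.
  3. The leaves of the tree have heights that are all within one of each other.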
The last condition is a consequence of the fact that an implicit binary tree is a complete binary tree.
The structure of this tree maps very neatly onto the traditional 1-based implicit binary tree arrangement, where node k has a next sibling (left child) numbered 2k and a first child (right child) numbered 2k + 1, by adding an additional root numbered 0. This root has no siblings, only a first child, which is node 1 (2·0 + 1).
This structure is very similar to that of a binomial heap, with a tree of height h being composed of a root plus trees of heights h - 1, h - 2, ..., 1. A perfect weak heap with 2^n elements is exactly isomorphic to a binomial heap of the same size, but the two algorithms handle sizes which are not a power of 2 differently: a binomial heap uses multiple perfect trees, while a weak heap uses a single imperfect tree.
Weak heaps require the ability to exchange the left and right children of a node. In an explicit representation of the tree, this is straightforward. In an implicit representation, this requires one "reverse bit" per internal node to indicate which child is considered the left child. A weak heap is thus not a strictly implicit data structure since it requires additional space. However, it is often possible to find space for this extra bit within the node structure, such as by tagging a pointer which is already present.
In the implicit binary tree, node k with reverse bit r_k has parent ⌊k/2⌋, left child 2k + r_k, and right child 2k + 1 - r_k.
Viewed as a multi-way tree, each node in a weak heap is linked to two others: a "next sibling" and a "first child". In the implicit tree, the links are fixed, so which of the two links is the sibling and which the first child is indicated by the reverse bit.
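The index arithmetic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the list `r` (one 0/1 reverse bit per node) are introduced here and are not part of any particular library.

```python
def parent(k):
    """Binary parent of node k (k >= 1)."""
    return k // 2

def left_child(k, r):
    """Left child of node k (its next sibling in the multi-way view)."""
    return 2 * k + r[k]

def right_child(k, r):
    """Right child of node k (its first child in the multi-way view)."""
    return 2 * k + 1 - r[k]

# Hypothetical reverse bits for nodes 0..3; only node 2 is flipped.
r = [0, 0, 1, 0]
print(left_child(1, r), right_child(1, r))  # 2 3
print(left_child(2, r), right_child(2, r))  # 5 4
```

Flipping `r[k]` swaps the roles of the two fixed array slots 2k and 2k + 1 without moving any elements. A child index that is not smaller than the heap size simply means the child is absent.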

Operations on weak heaps

Note that every node in a weak heap can be considered the root of a smaller weak heap by ignoring its next sibling. Nodes with no first child are automatically valid weak heaps.
A node of height h has h - 1 children: a first child of height h - 1, a second child of height h - 2, and so on to the last child of height 1. These may be found by following the first child link and then successive next sibling links.
It also has next siblings of height h - 1, h - 2, etc.
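As a rough sketch of that traversal, the following Python function lists the multi-way children of a node by taking the first-child link once and then following next-sibling links. The name `children`, the reverse-bit list `r`, and the heap size `n` are assumptions made for this example, which uses the array layout with an extra root at index 0.

```python
def children(k, r, n):
    """List the multi-way children of node k in a weak heap of n nodes."""
    result = []
    c = 2 * k + 1 - r[k]        # first child = binary right child
    while c < n:
        result.append(c)
        c = 2 * c + r[c]        # next sibling = binary left child
    return result

n = 8
r = [0] * n                     # assumed: no reverse bits set
print(children(0, r, n))        # [1, 2, 4]
print(children(1, r, n))        # [3, 6]
```

In this eight-element example the root has three children whose subtrees contain 4, 2, and 1 nodes, which matches the binomial-tree shape described earlier.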
A node's parent in the multi-way tree is called its "distinguished ancestor". To find this in the binary tree, find the node's binary parent. If the node is the right child, the parent is the distinguished ancestor. If the node is the left child, its distinguished ancestor is the same as its binary parent's. In the implicit tree, finding the binary parent is easy, but its reverse bit must be consulted to determine which type of child the node is.
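Although the distinguished ancestor may be log₂ n levels high in the tree, the average distance is 2. (The distance D is at least 1, and half of the time the search continues one level further up, so D = 1 + D/2, which gives D = 2.) Thus, even a simple iterative algorithm for finding the distinguished ancestor is sufficient.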
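A simple iterative search of this kind might look as follows in Python. The function name and the reverse-bit list `r` are assumptions for illustration; node j is a left child of its binary parent exactly when its low bit equals the parent's reverse bit.

```python
def distinguished_ancestor(j, r):
    """Return the multi-way parent of node j (j >= 1).

    Climb while j is a left child (a sibling link), then take one more
    step up from the right child (a first-child link).
    """
    while (j & 1) == r[j >> 1]:
        j >>= 1
    return j >> 1

r = [0] * 8                     # assumed: no reverse bits set
print([distinguished_ancestor(j, r) for j in range(1, 8)])
# [0, 0, 1, 0, 2, 1, 3]
```

The loop always terminates because node 1 is the right child of the root, whose reverse bit is never set.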
Like binomial heaps, the fundamental operation on weak heaps is merging two heaps of equal height h, to make a weak heap of height h + 1. This requires exactly one comparison, between the roots. Whichever root is greater is the final root. Its first child is the losing root, which retains its children. The winning root's children are installed as siblings of the losing root.
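This operation can be performed on the implicit tree structure because the heaps being merged are never arbitrary. Rather, the two heaps are formed as part of sifting a node up the multi-way tree:
  1. The first is a normal weak heap (whose next sibling link exists, but is ignored).
  2. The second is the imaginary heap formed by linking the first root's distinguished ancestor (its multi-way parent) to the first root's following siblings.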
At the beginning, the heap invariants apply everywhere except possibly between the first root and its distinguished ancestor. All other nodes are less than or equal to their distinguished ancestors.
After comparing the two roots, the merge proceeds in one of two ways:
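  1. If the distinguished ancestor is greater than or equal to the first root, nothing needs to be moved, and the result of the merge is the distinguished ancestor.
  2. If the first root is greater, the first root's binary children (its first child and next sibling) are exchanged by flipping its reverse bit, and then the first root and its distinguished ancestor are exchanged.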
The second case works because, in the multi-way tree, each node keeps its children with it. The first root is promoted up the tree because it is greater than its distinguished ancestor. Thus, it is safely greater than all of the ancestor's previous children.
The previous ancestor, however, is not a safe parent for the first root's old children, because it is less than the first root and so is not guaranteed to be greater than or equal to all of the first root's children.
By swapping the binary children, the appropriate subset of the demoted ancestor's old children are demoted with it. The demoted ancestor's new siblings are the first root's old children, promoted, which are safely less than or equal to the promoted first root.
After this operation, it is uncertain whether the invariant is maintained between the new distinguished ancestor and its distinguished ancestor, so the operation is repeated until the root is reached.
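The merge step and the repeated sift-up described above can be sketched in Python as follows. This is a minimal illustration for a max-heap stored in a list `a` with a parallel list `r` of reverse bits; the names `join` and `sift_up` are assumptions chosen for this example.

```python
def distinguished_ancestor(j, r):
    """Multi-way parent of node j (j >= 1)."""
    while (j & 1) == r[j >> 1]:
        j >>= 1
    return j >> 1

def join(a, r, i, j):
    """Merge node j with its distinguished ancestor i using one comparison.

    Returns True if nothing had to move.
    """
    if a[j] > a[i]:
        r[j] ^= 1                   # exchange j's binary children
        a[i], a[j] = a[j], a[i]     # exchange j with its ancestor
        return False
    return True

def sift_up(a, r, j):
    """Repeat the merge towards the root until the invariant holds."""
    while j > 0:
        i = distinguished_ancestor(j, r)
        if join(a, r, i, j):
            break
        j = i

a = [9, 4, 7, 10]   # valid weak heap except for the last node
r = [0, 0, 0, 0]
sift_up(a, r, 3)
print(a)            # [10, 9, 7, 4]
print(r)            # [0, 1, 0, 1]
```

Note that only the two keys and one reverse bit change per step; no subtrees are physically moved.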

Weak-heap sort

Weak heaps may be used to sort an array, in essentially the same way as a conventional heapsort. First, a weak heap is built out of all of the elements of the array, and then the root is repeatedly exchanged with the last element, which is sifted down to its proper place.
A weak heap of n elements can be formed in n - 1 merges. It can be done in various orders, but a simple bottom-up implementation works from the end of the array to the beginning, merging each node with its distinguished ancestor. Note that finding the distinguished ancestor is simplified because the reverse bits in all parents of the heaps being merged are unmodified from their initial state, and so do not need to be consulted.
As with heapsort, if the array to be sorted is larger than the CPU cache, performance is improved if subtrees are merged as soon as two of the same size become available, rather than merging all subtrees on one level before proceeding to the next.
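A sketch of the simple bottom-up construction (not the cache-friendly ordering) is shown below in Python. All function names are assumptions made for this example. Because the reverse bits of every ancestor are still zero during construction, `initial_ancestor` finds the distinguished ancestor from the index alone; the general `distinguished_ancestor` is included only to check the result.

```python
def initial_ancestor(j):
    """Distinguished ancestor of j when all reverse bits above j are 0."""
    while j % 2 == 0:           # even index = left child: keep climbing
        j //= 2
    return j // 2

def distinguished_ancestor(j, r):
    """General version that consults the reverse bits."""
    while (j & 1) == r[j >> 1]:
        j >>= 1
    return j >> 1

def build_weak_heap(a):
    """Turn list a into a weak max-heap in place; return the reverse bits."""
    n = len(a)
    r = [0] * n
    for j in range(n - 1, 0, -1):       # n - 1 merges, one comparison each
        i = initial_ancestor(j)
        if a[j] > a[i]:
            r[j] ^= 1
            a[i], a[j] = a[j], a[i]
    return r

def is_weak_heap(a, r):
    """Check that every node is <= its distinguished ancestor."""
    return all(a[distinguished_ancestor(j, r)] >= a[j]
               for j in range(1, len(a)))

a = [3, 8, 1, 9, 4, 7]
r = build_weak_heap(a)
print(a[0])                 # 9
print(is_weak_heap(a, r))   # True
```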
Sifting down in a weak heap can be done in h = ⌈log₂ n⌉ comparisons, as opposed to 2 log₂ n for a binary heap, or 1.5 log₂ n for the "bottom-up heapsort" variant. This is done by "merging up": after swapping the root with the last element of the heap, find the last (height-1) child of the root. Merge this with the root (its distinguished ancestor), resulting in a valid height-2 heap at the global root. Then go to the previous sibling (the binary parent) of the last merged node, and merge again. Repeat until the root is reached, at which point the heap invariant holds for the complete tree.
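Putting the construction and the "merging up" sift-down together gives a complete sort. The Python sketch below is one possible arrangement under the array layout with an extra root at index 0; the function names are assumptions for illustration, and it sorts in ascending order using a max-heap.

```python
def distinguished_ancestor(j, r):
    """Multi-way parent of node j (j >= 1)."""
    while (j & 1) == r[j >> 1]:
        j >>= 1
    return j >> 1

def join(a, r, i, j):
    """One-comparison merge of node j with its distinguished ancestor i."""
    if a[j] > a[i]:
        r[j] ^= 1
        a[i], a[j] = a[j], a[i]

def weak_heapsort(a):
    """Sort list a in place, ascending."""
    n = len(a)
    if n < 2:
        return
    r = [0] * n
    # Build phase: merge each node with its distinguished ancestor.
    for j in range(n - 1, 0, -1):
        join(a, r, distinguished_ancestor(j, r), j)
    # Extraction phase: move the maximum to the end, then merge up.
    for m in range(n - 1, 1, -1):
        a[0], a[m] = a[m], a[0]
        x = 1
        while 2 * x + r[x] < m:     # follow sibling links to the
            x = 2 * x + r[x]        # last child of the root
        while x > 0:
            join(a, r, 0, x)        # merge with the root
            x >>= 1                 # previous sibling = binary parent
    a[0], a[1] = a[1], a[0]

data = [5, 1, 4, 2, 3]
weak_heapsort(data)
print(data)   # [1, 2, 3, 4, 5]
```

Each pass of the inner `while x > 0` loop performs one comparison per child of the root, which is where the logarithmic bound on sift-down comparisons comes from.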

Priority queue operations

In a weak max-heap, the maximum value can be found as the value associated with the root node; similarly, in a weak min-heap, the minimum value can be found at the root.
As with binary heaps, weak heaps can support the typical operations of a priority queue data structure: insert, delete-min, delete, or decrease-key, in logarithmic time per operation.
Sifting up is done using the same process as in binary heaps. The new node is added at the leaf level, then compared with its distinguished ancestor and swapped if necessary. This is repeated until no more swaps are necessary or the root is reached.
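Insertion by sifting up can be sketched as a small Python class. The class name `WeakMaxHeap` and its methods are hypothetical, chosen for this example; only `insert` and `find_max` are shown, and the reverse bit of each new leaf is assumed to start at 0.

```python
class WeakMaxHeap:
    """Minimal weak max-heap supporting insert and find_max."""

    def __init__(self):
        self.a = []     # keys, with the root at index 0
        self.r = []     # one reverse bit per node

    def _ancestor(self, j):
        """Distinguished ancestor of node j (j >= 1)."""
        while (j & 1) == self.r[j >> 1]:
            j >>= 1
        return j >> 1

    def insert(self, key):
        """Add key as a new leaf, then sift it up."""
        self.a.append(key)
        self.r.append(0)
        j = len(self.a) - 1
        while j > 0:
            i = self._ancestor(j)
            if self.a[j] <= self.a[i]:
                break                   # invariant holds; stop
            self.r[j] ^= 1              # swap j's binary children
            self.a[i], self.a[j] = self.a[j], self.a[i]
            j = i

    def find_max(self):
        return self.a[0]

h = WeakMaxHeap()
for key in [3, 7, 5]:
    h.insert(key)
print(h.find_max())   # 7
```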
Variants of the weak heap structure allow constant amortized time insertions and decrease-keys, matching the time for Fibonacci heaps.

History and applications

Weak heaps were introduced as part of a variant heap sort algorithm that could be used to sort n items using only n log₂ n + O(n) comparisons. They were later investigated as a more generally applicable priority queue data structure.