Heaps (structure, not storage) and Priority Queues

Maybe we can add this to improve the default container set....

Continuing the discussion from Computations, and where to save resources, are hard:

I pulled out the heap-graph functionality and the priority queue type (that uses heap graphs) in a Gist. They are inspired by the functions and type from the C++ standard library.

  • HeapPriority.swift
    • Instead of assuming a max-heap, the heap operations take an enum parameter to indicate if you want a max-heap or min-heap. I think if you assume a max-heap, you can create a min-heap if you switch the arguments to your comparison predicate, but I'm not completely sure and I don't think that's a good user experience.
    • That HeapPriority type includes an extension method to compare a parent heap node to a child node to make sure the invariant is kept, or if the two nodes should be swapped. I'm not sure the method should be kept public.
  • Heap.swift
    • RandomAccessCollection.endIndexOfHeap(prioritizing: by:) is the equivalent to std::is_heap_until.
    • RandomAccessCollection.isHeap(prioritizing: by:) is the equivalent to std::is_heap.
    • There are default Comparable versions that assume comparison is <, but instead of taking a priority parameter, there are separate methods for min-heaps and max-heaps.
    • MutableCollection.partitionIntoHeap(at: prioritizing: by:) is the equivalent to push_heap.
    • MutableCollection.partitionOutHeapLead(prioritizing: by:) is the equivalent to pop_heap.
    • MutableCollection.arrangeIntoHeap(prioritizing: by:) is the equivalent to make_heap.
    • MutableCollection.updateHeapLead(using: prioritizing: by:) is a new method to improving changing the lead element instead of a pop and re-push.
    • MutableCollection.sortOutHeap(prioritizing: by:) is the equivalent to sort_heap. That method should have its documentation improved to mention that the priority parameter determines whether the result is sorted or reverse-sorted.
    • These methods have default Comparable versions, with separate variants for min-heaps and max-heaps.
  • PriorityQueue.swift
    • The SomePriorityQueue generic type is the main client of the heap-graph methods.
    • The PriorityQueue type-alias wraps the first type with the most common container type, Array. C++ can specify its equivalent's container type at compile time. This is a problem brought up here before. I don't if a (SomeReallyGeneric, SomewhatGeneric) pair of types are the way we should go to solve this issue.
    • You can initialize a SomePriorityQueue with a collection to wrap, priority direction, and a comparison predicate. The collection must be mutable and random-access. You can't add nor remove elements, but you can always: inspect the lead element (if any), get an element count, get an is-empty check, return a sorted copy of the queue, and update the lead element in-place. That last method may move the lead after the update to make it a non-lead element.
    • If the wrapped collection is also range-replaceable, you can also: push an element, push a series of elements, pop the lead element, and remove one or more lead elements, or all the elements. The element capacity can be set in advance. Such a priority queue can be initialized with a sequence of initial elements (including none), a priority direction, and a comparison predicate.
    • There's also debug-level printing.
1 Like

What's the value of allowing heaps to be implemented on top of arbitrary mutable random access collections, as opposed to simply having a canonical implementation.

4 Likes

I wanted to have minimal limitations; i.e. because I saw no reason to ban the arbitrariness. Your linked example can't even take custom predicates.

And I wanted to have analogues to the C++ functions.

The reason to ban the arbitrariness is to avoid ballooning the API surface of the collection protocols. The effect of extending the collection protocols is to require that all collections provide API surface to be heaps, even though the vast majority of collections will not be heaps. It's the same reason to provide a Dictionary type rather than having Dictionary-type methods that can use an arbitrary RandomAccessCollection & RangeReplaceableCollection to be a Dictionary.

If it's considered useful to allow any arbitrary RandomAccessCollection & MutableCollection to be a heap, then that's fine, we can do so by having some HeapWrapper data structure that is generic over a backing storage, and then use typealias Heap<T> = HeapWrapper<ArrayT>>. But I still don't know that I am convinced of the value of allowing arbitrary backing storage: does anyone meaningfully heapify anything that isn't an array?

Sure, but I didn't ask about the custom predicates because I can see clear value in having that. I only asked about supporting arbitrary mutable random access collections.

Cool, that makes sense. I think "have analogues to the C++ standard library" is a non-goal for the Swift standard library though.

2 Likes

Something like the SomePriorityQueue generic type, perhaps? Would it be better to not publicize the heap-graph methods and keep them internal? Should we make wrapping a given collection into SomePriorityQueue the only way to access this functionality?

Yes, I know that Array will most likely be the storage for a priority queue 99% of the time, but I want to leave the options open. That why I also provided the PriorityQueue type-alias. For performance sake, I can see someone (like Mac or iPhone developers) using Data as the container, so we still need to keep SomePriorityQueue around.

I would normally say yes, yeah.

1 Like

I recalled why I haven't done it already. The site where I reacquaint myself with the C++ heap functions and priority queues had a page about keeping the heap functions. The main example he showed could be done with an accessor that is a Collection of priority sub-queues. I don't know about the example in the last-minute edit; I need to research Dijkstra’s algorithm more to know if it can be done with priority queues only (or some other way that doesn't need the heap functions exposed to the public).