Working on another part of the C++ Standard Library to import.
Continuing the discussion from Computations, and where to save resources, are hard:
The set-subtraction functionality is actually part of a mini-framework. I uploaded it to a Gist.
MergedSortedSequence.swift
Sequence.whileMergingSorted(with: do: by:)
is the core method. Both the receiver and the sequence passed as the first argument must be sorted according to the predicate passed into the third argument. The second argument is an action closure. The method reads from both sequences and passes to the action closure a value with one element from each sequence if they're equivalent, or a value with one element that's the least-ordered from the current front of the receiver or second sequence. The passed value is of a custom MergedSetElementSource
generic enum
type. If you prefer a pull interface instead of a push one, the MergeSortedSequence
generic type is a lazy sequence (and collection if both sequences are collections too) that walks its input sequences and vends MergedSetElementSource
values.
Segregating by elements that are exclusive to the first sequence, second sequence, or are shared is the key to all of the functions in this framework.
Since I didn't need it for my prime sieve research, I didn't directly provide an analogue to the C++ function std::merge
, but the methods above can be used to make such a method. Just push-out/transform the source values the same order they came in, stripping out the MergedSetElementSource
wrapper. For .shared
values, output two values, the first member, then the second, if you want an interleaved merge-sort. But we usually want to have all the equivalent elements from the second sequence to follow those from the first; you have to output only the values from the first sequence and queue those from the second until the next upcoming element has a greater value. I can write that method if we go further along in the process.
The Sequence.overlapIfMergedSorted(with: by:)
method is the analogue to the C++ std::includes
function. It is more precise than a simple inclusion Bool
; a custom TwoSetOverlap
enumeration type is returned that specifies the exact relationship between the two sequences. Right now, I'm not sure we need to keep its initializer public.
There are two similar functions that came out of the above method. The Sequence.sourceCountWhileMergingSorted(with: by:)
method gives the exact counts of the first-exclusive/shared/second-exclusive elements. The Sequence.whileMergingSortedConfirmFor
method is a quasi-dual where you submit what shape of first-exclusive/shared/second-exclusive layout you want and the method confirms if the sorted merger would match. I'm not sure if either of these should go in our standard library.
All the methods that take a comparison predicate have a default Comparable
version.
SetOperations.swift
The SetOperation
enumeration type is a dual to TwoSetOverlap
. While TSO
is for reading combinations, SO
is for writing them. Those operations are called by the Sequence.mergeSortedAsSets(with: keeping: by:)
method. It has variants for overriding the returned collection, working lazily, or using Comparable
by default. The lazy-sequence version is a collection when both sorted sequences are collections. This enum
and method set combined are analogues to std::set_difference
, std::set_intersection
, std::set_symmetric_difference
, std::set_union
, and then some.
I wanted to use one enumeration type for set combinations, but SO
and TSO
have slightly different operations. And the enum
case
names are vastly different, so I don't see how I can merge them and still keep the names making sense.