extension LazyCollectionProtocol {
/**
Returns all subsequences of this collection that match the given collection, *i.e.* equivalent elements in the same order, using the given predicate as the equivalence test, lazily vended.
- Precondition: `areEquivalent` must be an equivalence relation over the elements.
- Parameter pattern: A collection to compare to this collection.
- Parameter allowingOverlap: Whether or not to include matching subsequences that start before the previous one ends.
- Parameter areEquivalent: A predicate that returns `true` if its two arguments are equivalent; otherwise, `false`.
- Returns: A lazy collection of the matching subsequences.
*/
public func allMatches<Pattern: Collection>(for pattern: Pattern, allowingOverlap: Bool, by areEquivalent: @escaping (Element, Pattern.Element) -> Bool) -> LazyMatchCollection<Elements, Pattern> {
return LazyMatchCollection(elements, pattern: pattern, allowingOverlap: allowingOverlap, by: areEquivalent)
}
}
I got a basic LazyMatchCollection
done. Like LazyFilterCollection
, it repeatedly generates startIndex
on demand, but this algorithm imposes a O(n * m) penalty instead of just O(n). I'll probably adapt it like my lazy split
collections and cache the first index to a class
type to get amortized O(1) performance. (I just tried a lazy
variable, but those are always considered mutating
, disqualifying it for properties that have to be usable in a get
mode.)
The next problem is trying to conditionally make it bi-directional. Like index(after:)
uses my firstRange(matching:)
function to get the next range, I could use lastRange(matching:)
to go backwards, but it's not that simple.
"121212124"
Let's look for "1212"
backwards. This is easy if I allow overlaps; just narrow the range to end one index before the last range's upper bound then use lastRange(matching:)
. But what happens if I ban overlaps? The last range is at 4..<8. The range before that is 2..<6, so I shouldn't use 4..<8 as my last index? Now try again, the 2..<6 range also has an overlap with 0..<4. There are no ranges before 0..<4, so that range must be in the collection. That means 2..<6 is nullified due to an overlap. So 4..<8 is included because none of the previous valid ranges overlap it, and the only ones that do overlap it are invalid.
So, to do a index(before:)
when banning overlaps, I might have to step all the way back to startIndex
. (Unless I'm missing a shortcut on how far I must go back at most.) If that's not acceptable, then I can't support backwards iteration and runtime-selectable overlap. (Bidirectional support can be conditionally added to a collection that is locked to allow overlaps.)
Should we allow BidirectionalCollection
conformance whenever possible, even if it sometimes hits hard on performance? Should a custom locked-to-always-permit-overlaps variant be made? Or do we go the other way and not support backwards traversal?