Should a LazyMatchCollection be bi-directional when its base is?

collection

(Daryle Walker) #1
extension LazyCollectionProtocol {

    /**
     Returns all subsequences of this collection that match the given collection, *i.e.* equivalent elements in the same order, using the given predicate as the equivalence test, lazily vended.

     - Precondition: `areEquivalent` must be an equivalence relation over the elements.

     - Parameter pattern: A collection to compare to this collection.
     - Parameter allowingOverlap: Whether or not to include matching subsequences that start before the previous one ends.
     - Parameter areEquivalent: A predicate that returns `true` if its two arguments are equivalent; otherwise, `false`.

     - Returns: A lazy collection of the matching subsequences.
     */
    public func allMatches<Pattern: Collection>(for pattern: Pattern, allowingOverlap: Bool, by areEquivalent: @escaping (Element, Pattern.Element) -> Bool) -> LazyMatchCollection<Elements, Pattern> {
        return LazyMatchCollection(elements, pattern: pattern, allowingOverlap: allowingOverlap, by: areEquivalent)
    }

}

I got a basic LazyMatchCollection done. Like LazyFilterCollection, it repeatedly generates startIndex on demand, but this algorithm imposes a O(n * m) penalty instead of just O(n). I'll probably adapt it like my lazy split collections and cache the first index to a class type to get amortized O(1) performance. (I just tried a lazy variable, but those are always considered mutating, disqualifying it for properties that have to be usable in a get mode.)

The next problem is trying to conditionally make it bi-directional. Like index(after:) uses my firstRange(matching:) function to get the next range, I could use lastRange(matching:) to go backwards, but it's not that simple.

"121212124"

Let's look for "1212" backwards. This is easy if I allow overlaps; just narrow the range to end one index before the last range's upper bound then use lastRange(matching:). But what happens if I ban overlaps? The last range is at 4..<8. The range before that is 2..<6, so I shouldn't use 4..<8 as my last index? Now try again, the 2..<6 range also has an overlap with 0..<4. There are no ranges before 0..<4, so that range must be in the collection. That means 2..<6 is nullified due to an overlap. So 4..<8 is included because none of the previous valid ranges overlap it, and the only ones that do overlap it are invalid.

So, to do a index(before:) when banning overlaps, I might have to step all the way back to startIndex. (Unless I'm missing a shortcut on how far I must go back at most.) If that's not acceptable, then I can't support backwards iteration and runtime-selectable overlap. (Bidirectional support can be conditionally added to a collection that is locked to allow overlaps.)

Should we allow BidirectionalCollection conformance whenever possible, even if it sometimes hits hard on performance? Should a custom locked-to-always-permit-overlaps variant be made? Or do we go the other way and not support backwards traversal?