Why are general collection returns fixed to Array?

I was about to type up a question, but thanks to the post suggestion links, it seems that I already asked it!

So, why am I still writing this? Because I'm wondering about a variant of the idea. That time was about Sequence.map, but now I'm wondering about similar methods around the Standard Library in general. The example by @Ben_Cohen in the old thread had the use of lazy, but what about algorithms that aren't lazy? If I wanted this:

public func firstMatch<Pattern: Collection>(for pattern: Pattern, by areEquivalent: (Element, Pattern.Element) throws -> Bool) rethrows -> [Element]? {
    let rawResult: CustomSequence = // Put my awesome algorithm here
    return Array(rawResult)
}

in the Standard Library someday, why shouldn't I generalize it to:

public func firstMatch<Pattern: Collection, Result: RangeReplaceableCollection>(for pattern: Pattern, storingAs: Result.Type, by areEquivalent: (Element, Pattern.Element) throws -> Bool) rethrows -> [Element]? where Result.Element == Element {
    let rawResult: CustomSequence = // Put my awesome algorithm here
    return Result(rawResult)
}

public func firstMatch<Pattern: Collection>(for pattern: Pattern, by areEquivalent: (Element, Pattern.Element) throws -> Bool) rethrows -> [Element]? {
    return try firstMatch(for: pattern, storingAs: Array.self, by: areEquivalent)
}

? It should be more efficient than always returning an Array then copying it over. Was the design of the compiler and/or Standard Library in older days insufficient to allow this?

The hypothetical implementation of your algorithm would return a Self.SubSequence—which, like lazy, has allocated almost nothing. Then any call site that needs it converted to any particular collection type would just do MyCollection(original.firstMatch(...)).

This pattern is used fairly consistently:

  • subscript(bounds:)Self.SubSequence
  • lazyLazyCollection
  • reversed()ReversedCollection
  • joined()FlattenCollection
  • zip(_:_:)Zip2Sequence
  • enumerated()EnumeratedSequence

In case it is helpful, you can also see an actual deployment of the very algorithm you mention here. The definition of the related return type is here.

Oof, I left out in my copy & paste that these methods are on Sequence, not Collection. So I have to copy out the elements to a RangeReplaceableCollection. The Standard Library techniques that need to do this always use Array, but I see no reason why (now) it must be fixed like that. Whether to keep the return container customizable is the question.

BTW, the method doesn't need to originate on Collection, because of the firstRange(matching:) I have instead, which returns a Range<Index>?, and which I can get the SubSequence from there (since collections are multi-pass). The firstRange method uses firstMatch on indices with a predicate customized with an extra layer of indirection.

I updated the related Gist to include, as the new primary overload, the Sequence.firstMatch<Pattern, Result>(for: storingAs: by:) method, which takes in a meta-type parameter for the return type.

To finally answer your question regarding Sequence.reversed(), Sequence.sorted() and others which need storage the sequence itself lacks:

Array is a logical turn‐to for such storage. I doubt there is a reason customization is not available. Likely it is just that you are the first person to (a) think of it, (b) have a use for it, (c) consider the possible performance improvement worth the extra API surface, and (d) actually come ask about it.

Pitch it if you’d like.

And I just updated the Gist again to add the Equatable variant (Sequence.firstMatch<Pattern, Result>(for: storingAs:)).