Why are general collection returns fixed to Array?

collection

(Daryle Walker) #1

I was about to type up a question, but thanks to the post suggestion links, it seems that I already asked it!

So, why am I still writing this? Because I'm wondering about a variant of the idea. That time was about Sequence.map, but now I'm wondering about similar methods around the Standard Library in general. The example by @Ben_Cohen in the old thread had the use of lazy, but what about algorithms that aren't lazy? If I wanted this:

public func firstMatch<Pattern: Collection>(for pattern: Pattern, by areEquivalent: (Element, Pattern.Element) throws -> Bool) rethrows -> [Element]? {
    let rawResult: CustomSequence = // Put my awesome algorithm here
    return Array(rawResult)
}

in the Standard Library someday, why shouldn't I generalize it to:

public func firstMatch<Pattern: Collection, Result: RangeReplaceableCollection>(for pattern: Pattern, storingAs: Result.Type, by areEquivalent: (Element, Pattern.Element) throws -> Bool) rethrows -> [Element]? where Result.Element == Element {
    let rawResult: CustomSequence = // Put my awesome algorithm here
    return Result(rawResult)
}

public func firstMatch<Pattern: Collection>(for pattern: Pattern, by areEquivalent: (Element, Pattern.Element) throws -> Bool) rethrows -> [Element]? {
    return try firstMatch(for: pattern, storingAs: Array.self, by: areEquivalent)
}

? It should be more efficient than always returning an Array then copying it over. Was the design of the compiler and/or Standard Library in older days insufficient to allow this?


A RingBuffer collection
Compiler wants to correct my closure signature to be generic?!
(Jeremy David Giesbrecht) #2

The hypothetical implementation of your algorithm would return a Self.SubSequence—which, like lazy, has allocated almost nothing. Then any call site that needs it converted to any particular collection type would just do MyCollection(original.firstMatch(...)).

This pattern is used fairly consistently:

  • subscript(bounds:)Self.SubSequence
  • lazyLazyCollection
  • reversed()ReversedCollection
  • joined()FlattenCollection
  • zip(_:_:)Zip2Sequence
  • enumerated()EnumeratedSequence

(Jeremy David Giesbrecht) #3

In case it is helpful, you can also see an actual deployment of the very algorithm you mention here. The definition of the related return type is here.


(Daryle Walker) #4

Oof, I left out in my copy & paste that these methods are on Sequence, not Collection. So I have to copy out the elements to a RangeReplaceableCollection. The Standard Library techniques that need to do this always use Array, but I see no reason why (now) it must be fixed like that. Whether to keep the return container customizable is the question.

BTW, the method doesn't need to originate on Collection, because of the firstRange(matching:) I have instead, which returns a Range<Index>?, and which I can get the SubSequence from there (since collections are multi-pass). The firstRange method uses firstMatch on indices with a predicate customized with an extra layer of indirection.


(Daryle Walker) #5

I updated the related Gist to include, as the new primary overload, the Sequence.firstMatch<Pattern, Result>(for: storingAs: by:) method, which takes in a meta-type parameter for the return type.


(Jeremy David Giesbrecht) #6

To finally answer your question regarding Sequence.reversed(), Sequence.sorted() and others which need storage the sequence itself lacks:

Array is a logical turn‐to for such storage. I doubt there is a reason customization is not available. Likely it is just that you are the first person to (a) think of it, (b) have a use for it, (c) consider the possible performance improvement worth the extra API surface, and (d) actually come ask about it.


(Jeremy David Giesbrecht) #7

Pitch it if you’d like.


(Daryle Walker) #8

And I just updated the Gist again to add the Equatable variant (Sequence.firstMatch<Pattern, Result>(for: storingAs:)).