[Pitch] Make Collection Super-Convenient and Retire Sequence

Alexandre_Lopoukhine · May 5, 2018, 1:43pm

I'm all for this unification, but please call it Sequence instead of Collection. The whole point of the thing is that it has an order, like all sequences do. The name Collection could then be reused for something more appropriate, maybe something that isn't iterable at all.

dabrahams · May 5, 2018, 2:13pm

Well, I tried, but at the moment I can't even get past step 1. Just adding this, which should be benign, causes the standard library to not compile.

palimondo · June 3, 2018, 12:33am

I am strongly in support of doing something about Sequence, but I’ve arrived at this position in a different way from Dave and I have a different set of concerns about the status quo as well as about practical impact of the proposal presented here.

It is rather long, so you might want some tasty beverage to wash this down:

dabrahams · June 10, 2018, 7:40pm

Well, I found a workaround, so the work is currently unblocked. Until SR-7605 is fixed, we will have the compiler deducing Index = DefaultIndex<Self> for any Collection without an explicit nested Index type. After it is fixed, though, the picture changes slightly: either we make Sequence a refinement of Collection that supplies DefaultIndex<Self> as a default, or Sequence = Collection and we have to migrate existing Sequence instances to include typealias Index = DefaultIndex<Self>.

mpangburn · June 10, 2018, 7:55pm

If the behavior of SR-7605 is desirable, would a default associatedtype suffice?

protocol Collection {
    // ...
    associatedtype Index: Comparable = DefaultIndex<Self>
    // ...
}

hlovatt · June 10, 2018, 8:57pm

I like @mpangburn solution, I have had a lot of problems with type inference of associated types and generally use explicit type aliases or defaults to overcome the problems.

cloutiertyler · June 10, 2018, 9:12pm

FWIW, single pass streams are very common in network programming. Asynchronous stream processing is a large part of reactive and functional reactive programming for example. There's also all sorts of interesting problems with buffering and back-pressure as well (Chris mentioned bounded queue depths in his concurrency manifesto Swift Concurrency Manifesto · GitHub for example). I would kill for better stream support in Swift. It would be awesome if they could be an extension to the async/await constructs Chris talks about in Concrete proposal for async semantics in Swift · GitHub.

As @anandabits points out though we should be careful to distinguish between pull streams (synchronous) and push streams (asynchronous).

(Sorry for being kind of off topic, I'm reading chronologically.)

dabrahams · June 10, 2018, 11:03pm

Good question. There are plenty of collections that look something like:

struct X : RandomAccessCollection {
  let startIndex = 0
  let endIndex = 3
  subscript(_ x: Int) -> Bool { return x > 2 }
}

and for these to be deduced as having Index = Int the compiler would have to prefer deducing Index from the startIndex/endindex/subscript declarations over deducing it from an explicit default. But the compiler's type deduction rules are…somewhat… underspecified , so… maybe? I think we'd need to ask @Douglas_Gregor

dabrahams · June 14, 2018, 10:02pm

Update: @nnnnnnnn found a great workaround and we are even able to declare the associated type default without breaking any tests. See [NOMERGE] Make Sequence be a typealias for Collection by dabrahams · Pull Request #17220 · apple/swift · GitHub

palimondo · June 15, 2018, 5:42am

What happens to LazySequence in the brave new world sketched in this proposal?

dabrahams · June 15, 2018, 5:43am

typealias LazySequence = LazyCollection

ersump'n.

CTMacUser · June 17, 2018, 9:09pm

I need time to write a full response, but I completely disagree with this premise. Sequence and Collection are separate concepts; neither is a special case of the other, and neither makes the other obsolete; we need both.

The only commonality is that Sequence and Collection need to be processed in for-in statements. The mistake we made here was insisting on One True Way to make something be able to be processed by for-in. So, an alternate solution is for for-in to follow two paths:

For both sequences and collections that also conform to Sequence, we do what we currently do.
For collections that are not also sequences (e.g. Set and Dictionary after we change the hierarchy), we call indices.makeIterator(). We'll insist that indices is a Sequence, even if the corresponding collection isn't. For these un-sequenced collections, Index only has to be Equatable, not also Comparable. Collections that could have multiple ways to iterate through there elements could provide several make*Iterator functions, but would have to canonize one for for-in.

(The inspiration for this came from remembering one of my fixed-sized array ideas. FSAs wouldn't conform to Sequence, but still would be processable through for-in. That breaks the rules of for-in's One True Way, modifying its rules simply by fiat. When I saw this thread, I wondered if that idea could be applied here.)

jawbroken · June 17, 2018, 10:57pm

Perhaps I need to wait for the full response, but I'm not sure what you're saying here. If there's one canonical way to iterate through the elements of something then in what way is it not a Sequence? And why wouldn't a fixed-size array be a Sequence? I remain very negative about the idea that you should be able to use something in a for-in loop but pretend it isn't a Sequence by introducing a more complex protocol hierarchy. This matter also seems somewhat unrelated to the topic of this thread, in that you could still split up Collection to stop Set and Dictionary from conforming to it, or move conformance to a view/property, in the future proposed in this thread.

Tino · June 18, 2018, 6:15am

I strongly hope this is going on, but one aspect that disturbs me a little bit with Set and Dictionary not being Collections anymore is that there won't be any common type that groups all collections / containers.
Technically, there is no reason there has to be such a supertype, but for someone coming from another language, it might be quite irritating to have collections that aren't Collections...

Dante-Broggi · June 18, 2018, 10:30am

Another thing that would benefit from having a non-Sequence based Collection-esque super-protocol is bi-(or higher)-axial-collections. That is have a protocol for things which have Elements and can vend Sequence views onto those elements by row, column, etc. Like matrices.

CTMacUser · June 22, 2018, 3:12am

For something that IS-A sequence, the relative order of the elements is part of the whole object’s state. For containers that ARE-NOT-A sequence, there is no relative order conceptually, but we have to pick something practically in order for for-in to work. If a container does have a canonical linear order, then it should probably publicly be a Sequence; my use of “canonical” in the previous post was for implementing, not for modeling.

I’ve been imagining fixed-sized arrays as structural/compound types, so they can’t conform to any protocol. Even if we add protocol support to compound types, I think there were subtleties to how my FSA interface worked that made conforming to Sequence by default a bad idea. One way I gave support for Sequence operations was to have a global Standard Library function to flatten a FSA to a Unsafe(Mutable)BufferPointer in a closure.

Multi-axial containers, like @Dante-Broggi mentioned is his/her response are an example of something that could be iterated over all elements, but not necessarily have a model-level linear order. (They may have an implementation-level linear order.) Besides user-defined types, multi-dimensional FSAs are another example of multi-axial types.