Does this type satisfy the requirements of Sequence and IteratorProtocol?

I don't have time to look into this that deeply, but I'll try to answer these. Please forgive and correct any errors.

1. I'm not sure what it means to be destructively consumed, especially once you start mixing in random sequences, for which it doesn't seem to be quite the correct term. I haven't thought about it extensively, but I would guess the more relevant distinction here is whether you get the same sequence of elements every time you iterate over the sequence (call makeIterator, then next repeatedly).

2. Sure, though I don't know the details of the lazy implementation well enough to answer that specific part.

3. a. Sure, I agree that in a generic context where you only know you have a Sequence, it only makes sense to perform a single “operation” (i.e. iterate over the Sequence once), not multiple.

3. b. Some of this is more complex because, as far as I recall, prefix returns the SubSequence, so I would guess that the SubSequence may be multipass even if the Sequence isn't, etc. That aside, it's clear that prefix, contains and the Array initialiser would all have to iterate over the sequence.


See the section Repeated Access in the documentation for Sequence.

The following is an example of a sequence that is destructively consumed by iteration:

import Foundation

struct RandomNumberGenerator : Sequence, IteratorProtocol {
    mutating func next() -> Int? {
        return Int(truncatingIfNeeded: arc4random()) & 7
        // NOTE: For demonstration purposes / increased readability, the domain
        // of this random sequence is Int values in the closed range [0, 7], ie
        // we use only the three least significant bits instead of all 64(/32).
    }
}

let firstThree = RandomNumberGenerator().prefix(3)
print(firstThree.elementsEqual(firstThree)) // Most probably false.
if firstThree.contains(7) {
    print("Contains at least one value that is 7!")
    print(Array(firstThree)) // Might be eg [2, 3, 5] // Note no 7.
    print(Array(firstThree)) // Might be eg [1, 4, 1]
} else {
    print("Contains no value that is 7!")
    print(Array(firstThree)) // Might be eg [7, 6, 6] // Note the 7.
    print(Array(firstThree)) // Might be eg [6, 6, 3]
}

And the following is too (included here just to show that randomness isn't necessary for potential confusion):

struct Ascender : Sequence, IteratorProtocol {
    private static var value = 0
    mutating func next() -> Int? {
        defer { Ascender.value += 1 }
        return Ascender.value
    }
}

let firstThree = Ascender().prefix(3)
print(firstThree.elementsEqual(firstThree))
if firstThree.contains(7) {
    print("Contains at least one value that is 7!")
    print(Array(firstThree))
    print(Array(firstThree))
} else {
    print("Contains no value that is 7!")
    print(Array(firstThree))
    print(Array(firstThree))
}

This latter example program will output:

false
Contains no value that is 7!
[5, 6, 7]
[8, 9, 10]

@nnnnnnnn, I remember reading somewhere in the Random Unification thread that you conformed distributions to Sequence in your Playground experiment. Did you do some kind of evaluation as to whether the conformance to Sequence added more value than (highly probable) confusion (among a majority of users)?

Random values are sort of the prototypical example of a single-pass sequence for me. On the one hand, they show how confusing sequence methods can be with sequences that aren't multi-pass, but on the other hand, since they never run out and are (usually) independently distributed, that's less of a problem than your other examples. (That is, it's okay if you skip a number now and then — the resulting distribution is the same.)

The NonrepeatingDistribution example (or whatever it was called) is an example where the elements are not independently distributed — with that one if you miss elements you lose the non-repeating guarantee.

If there's an effort to make Sequence multi-pass only, such that random distributions would be able to conform to IteratorProtocol, one of the big questions would be how to interact with iterators in a streamlined way. Directly using for-in? Would we want a next(10) method or something that gives us back a group of values at a time? Etc.
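
Just to make the next(10) idea concrete, here's a hypothetical sketch; the method name and signature are invented for illustration, not existing or proposed standard library API:

```swift
// Hypothetical: a batched variant of next() that pulls up to `count`
// elements from the iterator at a time.
extension IteratorProtocol {
    mutating func next(_ count: Int) -> [Element] {
        var batch: [Element] = []
        batch.reserveCapacity(count)
        for _ in 0..<count {
            guard let element = next() else { break }
            batch.append(element)
        }
        return batch
    }
}

var iterator = (1...100).makeIterator()
print(iterator.next(10)) // [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Overloading by arity keeps the existing single-element next() untouched, so this would compose with the current protocol rather than replace it.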


My gut instinct is that you should be able to use IteratorProtocol directly with for-in, in addition to being able to do so on Sequences.

If I understand it correctly, that is the raison d’être of IteratorProtocol. It is the thing that makes something usable in a for-in loop.

No, for-in loops require Sequence (and the makeIterator() method that it requires).

This looks like it’s touching on the subtle point about a protocol's semantics (not just its bag of syntax). But reading the docs for the associated type SubSequence doesn’t fully answer this for me. @nnnnnnnn, how would one translate this into prose? The = AnySequence part confuses me:

associatedtype SubSequence : Sequence = AnySequence<Self.Element> where 
  Self.SubSequence == Self.SubSequence.SubSequence

The default implementations on Sequence conform literally (returning AnySequence), but conformances on Collections conform in spirit, returning the same type as the collection you called the method on...
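
For a concrete look at the “in spirit” side, here is a small check; the printed types are what a current Swift toolchain reports for Array:

```swift
let numbers = [1, 2, 3, 4, 5]

// A Collection returns a SubSequence from the same family as the
// collection itself, rather than a type-erased AnySequence:
print(type(of: numbers.prefix(2)))    // ArraySlice<Int>
print(type(of: numbers.dropFirst(2))) // ArraySlice<Int>
print(Array(numbers.prefix(2)))       // [1, 2]
```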


Are you sure about that?
There definitely is an implicit conversion that makes anything conforming to IteratorProtocol also a Sequence, erasing the practical distinction. I’m pretty sure, though I can’t find a definitive source, that IteratorProtocol is the conformance requirement for for-in.

Definitive source:

struct I: IteratorProtocol {
    mutating func next() -> Int? {
        return 1234
    }
}

let iterator = I()
for e in iterator { // <-- Error: Type 'I' does not conform to protocol 'Sequence'
    print(e)
}

: )

The = AnySequence<Self.Element> part of that declaration is specifying the default type for SubSequence. The remaining pieces (: Sequence and where Self.SubSequence == Self.SubSequence.SubSequence) are constraints on whatever type can play the role of a concrete subsequence, default or not.
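
Here's a toy protocol (names invented for illustration) that separates the two roles of such a declaration, the constraint and the default:

```swift
protocol Container {
    // ": Equatable" is a constraint on A; "= Int" only supplies a default.
    associatedtype A: Equatable = Int
}

struct UsesDefault: Container {}                      // A falls back to the default, Int
struct OverridesDefault: Container { typealias A = String }

print(UsesDefault.A.self)       // Int
print(OverridesDefault.A.self)  // String
```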

Re sequences vs iterators—the Sequence protocol is what marks a type as being something you can use in a for-in statement, while IteratorProtocol is how the for-in statement actually works:

// this:
for x in mySequence { 
    print(x)
}

// gets essentially turned into this:
var iterator = mySequence.makeIterator()
while let x = iterator.next() {
    print(x)
}

If you have a type that's an iterator, you can just declare Sequence conformance, since a type that's both a sequence and an iterator just returns itself from makeIterator().
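
A minimal example of that pattern, along the lines of the one in the standard library documentation:

```swift
// A type that is both its own Sequence and Iterator; makeIterator()
// comes for free and simply returns self.
struct Countdown: Sequence, IteratorProtocol {
    var count: Int
    mutating func next() -> Int? {
        guard count > 0 else { return nil }
        defer { count -= 1 }
        return count
    }
}

for n in Countdown(count: 3) {
    print(n) // 3, then 2, then 1
}
```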


Ah, OK. I have apparently inverted in my head this part of the documentation:

Alternatively, if your type can act as its own iterator, implementing the requirements of the IteratorProtocol protocol and declaring conformance to both Sequence and IteratorProtocol are sufficient.


This is not the case though, unless special care has been taken to design a custom SubSequence for this purpose. And AFAICS, the only way for a SubSequence (or any other code construct, for that matter) to make a multi-pass sequence from a single-pass sequence is to store all the elements that need to be multi-pass.
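
The storing approach can be sketched as follows; the type and its names are invented for illustration, not a standard library or proposed API:

```swift
// Buffers elements of a single-pass base sequence so that later passes
// can replay them. The cost is exactly as described: every element that
// must be revisitable is stored.
final class BufferedSequence<Base: Sequence>: Sequence {
    private var base: Base.Iterator
    private var buffer: [Base.Element] = []

    init(_ sequence: Base) { base = sequence.makeIterator() }

    func makeIterator() -> AnyIterator<Base.Element> {
        var position = 0
        return AnyIterator {
            // Replay from the buffer first, then pull from the base.
            if position < self.buffer.count {
                defer { position += 1 }
                return self.buffer[position]
            }
            guard let element = self.base.next() else { return nil }
            self.buffer.append(element)
            position += 1
            return element
        }
    }
}
```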

So when reading or writing code, not only do we have to keep in mind that there can be multi- or single-pass sequences, but also that a multi- or single-pass sequence can have a multi- or single-pass SubSequence.

I think this is yet another good example of how hairy this is, and how the hairiness only becomes fully visible once we actually start to work with and think about single-pass sequences.

I share the concern voiced by @dabrahams in the following quote from @dwaite's very informative summary of the previously mentioned mega thread on this topic from 2016:

Dave Abrahams’s primary concern about the existing system is that the restriction that Sequence is single-pass is not intuitive, and that generalized functions over sequences would be tested with the commonly-available Sequences, all of which are multi-pass.

Of course, I don't know if he has changed his mind about this since then.

Sure, like I said, I'm currently agnostic on whether Sequence should be constrained to be multi-pass; I was just refuting your idea that “every such method and property should document that they are unsuitable/undefined for use on single-pass sequences”, which isn't true. If you want to change the semantic requirements of Sequence, then I suggest you start a new evolution pitch thread with the proposed changes, whether they require that all Sequences be multi-pass, that all single-pass sequences trap if makeIterator is called more than once, or something else.


My current thoughts are here for anyone who wants to read them.


I feel that it’s a good thing we have separate concepts between element generators that:

  • Have a token per element and can revisit elements via said tokens.
  • Can provide elements without adding weight for an unnecessary (for a particular application) revisiting system.

Flipping the inheritance between Collection and Sequence would weigh down the latter. I like an idea from another thread to have the two protocols become siblings; to refine another protocol Iterable, which is what for-in loops would look for instead. This also allows Collection to support non-ordered Index types (they still need to be Equatable), like for multi-dimensional containers and such.
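
A rough sketch of that sibling-protocol shape; every name here is hypothetical, and nothing like this exists in the standard library:

```swift
// for-in would look for the common parent; Sequence and Collection
// would refine it as siblings rather than one inheriting from the other.
protocol MyIterable {
    associatedtype Iterator: IteratorProtocol
    func makeIterator() -> Iterator
}

// Single-pass: may be consumed destructively by iteration.
protocol MySequence: MyIterable {}

// Multi-pass: revisitable via Equatable (not necessarily ordered)
// indices, leaving room for e.g. multi-dimensional containers.
protocol MyCollection: MyIterable {
    associatedtype Index: Equatable
    var startIndex: Index { get }
    var endIndex: Index { get }
    subscript(position: Index) -> Iterator.Element { get }
}
```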


What weight? It would have exactly zero cost to any use of a Sequence. As far as I can tell, there is no principle that supports having two distinct protocols for forward, multipass traversal.

It's a relief to see that the perceived oddities of Sequence seem to be substantial enough to bother the core team as well... but is there still a realistic option for changes that are more fundamental than tinkering with the names of single methods?
Most people on evolution seem not to care about the hierarchy of collection protocols, and it is already tough to even start a serious discussion about alternatives (but I hope that has happened now ;-).
After all, I think it's an important topic, and with each release of Swift, the already overpowering argument of "we can't, because of compatibility" becomes even stronger (and this might be one of the strongest forces that render languages ugly).

Semantically, any stream of Ts can be captured in a generator function of the form ()->T?. Generators avoid the impression given by Sequence that it can be traversed without mutation and by IteratorProtocol that it can be independently copied and stored. If the need should arise to build generic systems over single-pass sequences, generators would almost work… except that, as non-nominal types, they can't be extended. Any algorithm over streams would have to be defined as a free function, which I think is unacceptable.
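
To make that concrete, here's a sketch of the generator form and the free-function constraint it implies (names invented for illustration):

```swift
// A stream of Ints captured as a () -> Int? generator closure.
func makeCounter(limit: Int) -> () -> Int? {
    var n = 0
    return {
        guard n < limit else { return nil }
        defer { n += 1 }
        return n
    }
}

// Because () -> T? is a non-nominal type, it can't be extended, so any
// algorithm over such streams has to be a free function like this one:
func collect<T>(_ generate: () -> T?) -> [T] {
    var result: [T] = []
    while let value = generate() { result.append(value) }
    return result
}

let next = makeCounter(limit: 3)
print(collect(next)) // [0, 1, 2]
```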

I think this idea shouldn't be discarded hastily:
Is extendability of iterators actually important? I can't think of a compelling example for extensions on arbitrary iterators, and the common operations you can do with them are rather trivial (after all, it's just a single method, so there aren't many combinations). I wouldn't say that my opinion is enough to remove an ability others may need, but what if no one really needs extensions on iterators?
And of course, I'm in favor of extensions for non-nominal types, so ideally, this would only be a temporary restriction ;-)

Let's continue this discussion over here. Thanks.


Well, Sequence and Collection are really separate concepts; we just put them in the same hierarchy to support for-in for both. Connecting them in a refinement relationship means we have to adapt the more-derived protocol with the operations of its parent.

Right now, Collection is the more-derived protocol. We make it work with Sequence operations by giving it a default makeIterator() and the like that reads from the startIndex, uses index(after:) to go to the next element, then checks against endIndex to know when we're done.
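
Sketched out, that default strategy looks roughly like this; a simplified, hypothetical stand-in for the standard library's IndexingIterator:

```swift
// Walks a Collection from startIndex to endIndex using index(after:),
// which is the default strategy described above.
struct SimpleIndexingIterator<C: Collection>: IteratorProtocol {
    private let collection: C
    private var position: C.Index

    init(_ collection: C) {
        self.collection = collection
        self.position = collection.startIndex
    }

    mutating func next() -> C.Element? {
        guard position != collection.endIndex else { return nil }
        defer { position = collection.index(after: position) }
        return collection[position]
    }
}

var iterator = SimpleIndexingIterator([10, 20, 30])
while let element = iterator.next() {
    print(element) // 10, then 20, then 30
}
```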

If we flip the relationship, that means we have to add operations inherent to Collection onto Sequence instead. So we have to designate a token value (and therefore a token type first, most likely Int) for each element in the sequence, and then be able to derive an element's value from a given token. If I intentionally make a type a Sequence instead of a Collection, it means I intentionally didn't want to support backtracking, and this new hierarchy would break that. The exception would be if there were default-implementation support for Collection operations on Sequence, just as there is currently default Sequence-operation support on Collection.

But even if Collection-operation support were provided by default for Sequence, how would it be implemented other than by caching the results of computing the sequence? If I deliberately chose to base my type on Sequence instead of Collection, I probably wouldn't want to waste memory on caching by default either.


This is now a non-issue since the proposal now suggests eliminating Sequence as a distinct protocol.