Does this type satisfy the requirements of Sequence and IteratorProtocol?

gwendal.roue · April 18, 2018, 12:16pm

Given a sequence of random elements, I think that you are pushing a little far the "reasonableness" (is is English?) of your statements.

Jens · April 18, 2018, 12:17pm

Note that it is a prefix of a sequence of random elements.

The prefix is also a sequence, and its documentation says that it:

Returns a subsequence, up to the specified maximum length, containing the initial elements of the sequence.

And note that a.elementsEqual(a) not always being true also applies to eg a Set.

EDIT: No it doesn’t, i’m obviously confused :-), but at least a == b doesn’t imply a.elementsEqual(b) for Set.

gwendal.roue · April 18, 2018, 12:26pm

You use the prefix of a lazy sequence of random elements. We've left the trivial zone a long time ago, don't you think? Are the "average users" still the people we need to talk about/with?

jawbroken · April 18, 2018, 12:30pm

Can you show an example here? Or cite the documentation? From what I thought I knew about the semantics of Collection, this isn't possible for any conforming type.

Jens · April 18, 2018, 12:31pm

No lazy there, I think it’s a trivial example.

Jens · April 18, 2018, 12:37pm

I realized this and edited at the same time as you wrote this, please see the edit.

jawbroken · April 18, 2018, 12:48pm

Ah, okay. Well, elementsEqual never implies ==, because even a type that conforms completely “normally” to Sequence/Collection conformance (has a controllable/meaningful order, finite, multipass, whatever) can have additional properties that an Equatable conformance may need to consider.

Nevin · April 18, 2018, 3:04pm

The prefix does not satisfy the semantic requirements of Sequence. In particular, the prefix’s *iterator* fails to obey the rule that, after it has returned nil once, it must continue to return nil forever.

Jens · April 18, 2018, 3:09pm

Interesting!

Nevin · April 18, 2018, 3:16pm

Wait, maybe I’m wrong about that. I think the prefix might satisfy that requirement after all. Nevermind.

Tino · April 18, 2018, 3:18pm

let aa = Random.default.prefix(3)
let iter = aa.makeIterator()
for _ in 1...10 {
	print(iter.next())
}

I think it's always a fresh iterator... at least, this snippet prints 7 nil

Jens · April 19, 2018, 1:02pm

I found another old thread (started by @Lily_Ballard) that contains an interesting discussion related to this, after some posts the same random generator example as in this thread is even brought up (by @SusanCheng).

It might be interesting to note that since that discussion, first(where:) has been proposed and implemented, making it trivial to implement first:

public extension Sequence {
    /// The first element of the sequence.
    /// If the sequence is empty, the value of this property is `nil`.
    /// NOTE: Implemented as a function since properties are not expected
    /// to be mutating, and renamed from first to sequentiallyFirstElement
    /// in order to avoid confusion with the first property of Collection
    /// and to make the logic of the operation explicit via its name to
    /// reduce confusion in the context of eg a `Set`.
    public func sequentiallyFirstElement() -> Element? {
        return self.first(where: { _ in true })
    }
}

This code example, the name of the function, etc. is not to be taken as a suggestion to add something like this to Sequence, it's purpose is only to highlight some parts of the requirements of Sequence and IteratorProtocol which I find very confusing.

I might be just temporarily extra confused right now, but I'm not even sure why the above method, and eg first(where:), starts(with:), prefix(_: ) and virtually every function on Sequence shouldn't be declared as being mutating ... Can anyone explain why they shouldn't/aren't?

Jens · April 19, 2018, 4:33pm

Regarding this section from the documentation for IteratorProtocol:

Using Multiple Iterators

Whenever you use multiple iterators (or for-in loops) over a single sequence, be sure you know that the specific sequence supports repeated iteration, either because you know its concrete type or because the sequence is also constrained to the Collection protocol.

Obtain each separate iterator from separate calls to the sequence’s makeIterator() method rather than by copying. Copying an iterator is safe, but advancing one copy of an iterator by calling its next() method may invalidate other copies of that iterator. for-in loops are safe in this regard.

As far as I can tell, from the discussion above and elsewhere, the underlined (by me) part should be replaced with what it actually implies (for single-pass sequences in particular and thereby for Sequence in general) so that the first sentence reads more like this:

Whenever you use essentially any method or property* of a single sequence more than once, be sure you know that the specific sequence supports repeated iteration, either because you know its concrete type or because the sequence is also constrained to the Collection protocol.

(*) any method or property that calls the iterator's next()

And, if so, this is crucial information for not misunderstanding Sequence, and should be central in its documentation, not only in a subsection of the documentation for IteratorProtocol.

And I suppose, every such method and property should document that they are unsuitable/undefined for use on single-pass sequences, similar to how its done for those that only makes sense for finite sequences. And since I guess that this would be essentially any method or property, why not add the semantic requirement of not being single-pass to Sequence?

jawbroken · April 20, 2018, 12:53am

I don't think that I understand your reasoning here, and I'm undecided on the idea of requiring Sequence to be multipass. These methods aren't generally unsuitable for use on single-pass sequences, since you can safely call them once (with some additional issues with passing the same Sequence twice to a single method). These caveats could be better documented, though it seems obvious to me that your former quoted section (multiple iterators) implies the latter quoted section (calling methods that obviously require the use of an iterator).

The analogy I would make here is that there is probably no point to splitting up finite and infinite Sequences, since it's very hard or impossible for the language to tell at any point if you've produced one or the other and therefore help to maintain this division. Similarly, there's no language support for “you can iterate over a Sequence 0 or 1 times”, though you're free to make makeIterator() trap if you call it a second time on your custom Sequence conforming type. I would guess that sometimes trying to model this information in the type system and standard library creates a lot of complexity that doesn't necessarily pay for itself. Is this one of those cases?

Jens · April 20, 2018, 12:37pm

I'd like to try and rule out some possible misunderstandings and see what our common ground is, if you don't mind.

1

Do you agree that what is referred to by the term "single-pass sequence" as used in this thread, can be described as "a sequence that is destructively consumed by iteration", and in extreme detail "a sequence that is destructively consumed, by calling the next() method of any Iterator that has been returned by the sequence's makeIterator(), something that is done by every property and method on Sequence except for underestimatedCount"?

2

Do you agree that, in the case of single-pass sequences, what you refer to as

"methods that obviously require the use of an iterator"

is in fact

Every single property and method on Sequence except for underestimatedCount, thus including eg its (supposedly read only, non-mutating) lazy property, and to this we must also add things like Array(mySinglePassSeq)

?

3 a

Do you agree with the following ("their" being single-pass sequences):

?

3 b

And do you agree that by "a single operation" here, we mean for example …

let firstThree = mySinglePassSeq.prefix(3) // <-- … this,
if firstThree.contains(7) { // <-- this,
    let arr = Array(firstThree) // <-- and this.
    ...
}

?

jawbroken · April 20, 2018, 1:26pm

I don't have time to look into this that deeply, but I'll try to answer these. Please forgive and correct any errors.

1. I'm not sure of what it means to be destructively consumed, especially when you start mixing in random sequences for which it doesn't seem to be quite the correct term. I haven't thought about it extensively, but I would guess the more relevant distinction here would be that you get the same sequence of elements every time you iterate over the sequence (call makeIterator, then next repeatedly).

2. Sure, though I'm not sure of the details of the lazy implementation to answer that specific part.

3. a. Sure, I agreed that it only makes sense to do a single “operation” (i.e. iterate over a Sequence), but not to do multiple, in a generic context where you only know you have a Sequence.

3. b. Some of this is more complex because, as far as I recall, prefix returns the SubSequence, so I would guess that the SubSequence may be multipass even if the Sequence isn't, etc. That aside, it's clear that prefix, contains and the Array initialiser would all have to iterate over the sequence.

Jens · April 20, 2018, 2:17pm

See the section Repeated Access in the documentation for Sequence.

The following is an example of a sequence that is destructively consumed by iteration:

import Foundation

struct RandomNumberGenerator : Sequence, IteratorProtocol {
    mutating func next() -> Int? {
        return Int(truncatingIfNeeded: arc4random()) & 7
        // NOTE: For demonstration purposes / increased readability, the domain
        // of this random sequence is Int values in the closed range [0, 7], ie
        // we use only the three least significant bits instead of all 64(/32).
    }
}

let firstThree = RandomNumberGenerator().prefix(3)
print(firstThree.elementsEqual(firstThree)) // Most probably false.
if firstThree.contains(7) {
    print("Contains at least one value that is 7!")
    print(Array(firstThree)) // Might be eg [2, 3, 5] // Note no 7.
    print(Array(firstThree)) // Might be eg [1, 4, 1]
} else {
    print("Contains no value that is 7!")
    print(Array(firstThree)) // Might be eg [7, 6, 6] // Note the 7.
    print(Array(firstThree)) // Might be eg [6, 6, 3]
}

And, the following is too (including it here just to show that random isn't necessary for potential confusion):

struct Ascender : Sequence, IteratorProtocol {
    private static var value = 0
    mutating func next() -> Int? {
        defer { Ascender.value += 1 }
        return Ascender.value
    }
}

let firstThree = Ascender().prefix(3)
print(firstThree.elementsEqual(firstThree))
if firstThree.contains(7) {
    print("Contains at least one value that is 7!")
    print(Array(firstThree))
    print(Array(firstThree))
} else {
    print("Contains no value that is 7!")
    print(Array(firstThree))
    print(Array(firstThree))
}

As this latter example program will output:

false
Contains no value that is 7!
[5, 6, 7]
[8, 9, 10]

@nnnnnnnn, I remember reading somewhere in the Random Unification thread that you conformed distributions to Sequence in your Playground experiment, did you do some kind of evaluation as to whether the conformance to Sequence added more value than (highly probable) confusion (among a majority of users)?

nnnnnnnn · April 20, 2018, 4:39pm

Random values are sort of the prototypical example of a single-pass sequence for me. On the one hand, they show how confusing sequence methods can be with sequences that aren't multi-pass, but on the other hand, since they never run out and are (usually) independently distributed, that's less of a problem than your other examples. (That is, it's okay if you skip a number now and then — the resulting distribution is the same.)

The NonrepeatingDistribution example (or whatever it was called) is an example where the elements are not independently distributed — with that one if you miss elements you lose the non-repeating guarantee.

If there's an effort to make Sequence multi-pass only, such that random distributions would be able to conform to IteratorProtocol, one of the big questions would be how to interact with iterators in a streamlined way. Directly using for-in? Would we want a next(10) method or something that gives us back a group of values at a time? Etc.

Nobody1707 · April 20, 2018, 7:36pm

My gut instinct is that you should be able to use IteratorProtocol directly with for-in, in addition to being able to do so on Sequences.

palimondo · April 21, 2018, 6:13am

If I understand it correctly, that is raison d’être of the IteratorProtocol. It is the thing that makes something usable in for-in loop.