[Pitch] Remove the single-pass requirement on Sequence

It's not that there are examples that don't fit into Sequence or Collection—the OP is proposing that we require any Sequence to guarantee that it is safely multi-pass, while not requiring indices or finiteness. Dealing with the single-pass nature of sequences makes writing correct generic code more challenging than it would be without that requirement.

So if I'm understanding this correctly, OP is suggesting that we remove single-pass Sequences altogether because he thinks they're "extremely rare"? I'm very opposed to this, for reasons outlined upthread, but I'd be much more receptive to creating a new MultipassSequence or similar protocol in between Sequence and Collection that provides the guarantees that @soroush is looking for. I'm less keen on Iterator (which is now the basic requirement for for-in loops), which would promote Sequence to being multipass.

@Jon_Hull thanks for linking that thread; I'd done some searching but couldn't find any real discussion about this topic, but this is good reading.

Yup, this is a great summary of my position. But this pitch was a lot less popular than I expected, so y'all can considered it withdrawn. :blush:

I didn't offer this as an approach because, for better or worse, Sequence is the level that everyone trades in when working with these types of algorithms. If Swift added another protocol like MultipassSequence, all the current algorithms would still stay on Sequence, and people would continue to write all their new algorithms in terms of Sequence (because you want your algorithm to work on the biggest subset of iterable things, so you put it on Sequence). And then we'd be back to square one. This is similar to Nate's point here. The only way I can see to break out of that pattern is to create a bright line between destructive (non-for...in-able) iterators, and reusable sequences.

The issue I see here is library authors restricting algorithms to Sequences where an Iterator would have done because this change makes iteration more annoying.

I'm not sure I follow. The "bright line" I'm describing is basically how the world works today (minus random sequences and the other edge cases). You can't for..in over an Iterator today. You ...could write map on Iterator today, but no one does.

To put this another way: in your second post to this thread, you rewrote your algorithm to work with single-pass sequences after learning Sequence was single-pass. If Sequence were multi-pass, rewriting the function for Iterator would require a call to makeIterator() just to pass in a Sequence. Therefore, making Sequence multi-pass discourages this generalisation to single-pass sequences.

Currently, you can still write a more concise algorithm by making Collection be the parameter. The tradeoff is greater, in terms of what sequences can no longer be handled, encouraging generalisation to single-pass sequences through use of Sequence.

The status quo encourages generalisation, while your proposed change would actively discourage it.


PS:
I understand there are merits to your suggestion - it can be easy for authors to forget the possibility for single-pass sequences, given the presence of extensions which only work with multi-pass sequences. Having this option would also be nice for those of us who aren't library authors, and are okay with making functions just for our own use-cases. One could also make the argument that single-pass sequences are an edge-case.

Personally, I think encouraging generalisation is worth sacrificing a bit of convenience. I like that the most obvious solution is also the best practice one. But it is a matter of opinion.

Yes, this is exactly what I'm talking about. If you make Iterator too annoying to work with, people will stop accepting them as parameters.

I still don't follow. No one really works at the level of iterators today (except to define them for other sequences). There should be very few functions that accept iterators as parameters, right?

Iterators are great for dealing with streams of data, especially one coming from another device such as user input or bytes from a network. They're also useful for, as you mentioned, nondeterministic, nonreplicable data such as the product of a random number generator. Some languages (off the top of my head, C++) value this distinction enough to give it a name (no, I'm not saying that we should wholesale steal from what other languages are doing. I'm just providing an example of how an Iterator can be useful).

There are many algorithms that do not fully utilize the Sequence that you're proposing and are thus overqualified. From this thread, a cartesian product can be defined as:

func withAllPairs<I: Iterator, S: Sequence>(of iterator: I, and sequence: S, perform body: (I.Element, S.Element) -> Void, breakWhen breakingCondition: @autoclosure (I.Element, S.Element) -> Bool = false) {
	for first in iterator { // Strawman syntax to advance through iterator
		for second in sequence {
			if (breakingCondition(first, second)) {
				break
			} else {
				body(first, second)
			}
		}
	}
}

As for the standard library, many of Sequence's operators apply to an Iterator as well, and free functions such as zip(_:_:) should also work on Iterators.

If I'm understanding you correctly, it sounds like you want to move the Sequence algorithms down to IteratorProtocol. That's a bit different than what I'm proposing here, so it might be better to start a new thread about it?

1 Like

It looks like there is a much deeper issue at play here. If a sequence is single pass, then most of the methods on it (e.g. map, makeIterator, etc...) really should be marked as 'mutating' because they mutate the sequence by using it up. But they aren't marked that way because it would make working with the multi-pass case unbearable for value type sequences. You end up with all these inconsistencies around both Iterator and Sequence because we are very hand wavy as to what is mutating or not. Worse, as Dave pointed out in the thread I linked, using one of those mutating methods that aren't marked mutating will mutate all of the copies too when they contain a reference-type Iterator.

Tl;dr: Many single pass sequences are value types with reference semantics.

3 Likes

To put it another way: they're pointers.

Yes, exactly. This is the thing I want to fix.

1 Like

That is a very good point.

I suppose generic code written to work with conforming classes (as all good generic code should) will already account for this possibility.

Classes in generics are already a potential pitfall, being able to mutate their state without using a mutating function. You really can't ever trust mutating to have meaning, in a generic context.

This does put me back on the fence about whether we should change Sequence, though.

classes in swift are basically smart/shared pointers in C++, and methods in a class mutate the pointee, not the pointer itself. so it makes sense

Isn't this true by construction? Can there exist any Sequence type that's single-pass and also has value semantics?

Most Iterators, if conformed to Sequence, would be single-pass and value-semantic.

How can it have value semantics?

Iterating over a Sequence type (for example, using forEach) is required to be nonmutating, but iterating over a single-pass Sequence by definition changes its value. Therefore, any single-pass conforming type must have reference semantics for both of these conditions to hold. Am I mistaken?

1 Like

It just occured to me after posting that there should be no way for Sequence to provide default implementations for non-mutating functions with only the mutating next function.

It seems the default implementations make a copy of self, making such a sequence multi-pass.

1 Like

big mood