Hi everyone,
I've been waiting until the new forums were rolled out to post this, and I'm excited to finally get a chance to.
`Sequence` in the standard library has two semantic capabilities: sequences may be infinite, which means they keep generating values forever, and they may be single-pass, which means if they are iterated a second time, they may not produce the same values (or any values at all!).
I think Swift should require all sequences to be safely multi-pass, and I think the argument for this is quite strong. I will lay out that argument in this post.
True single-pass sequences in Swift are extremely rare. By my count, there are only a few situations where they can happen at all:
- Some situations (`_DropFirstSequence` in the standard library is the only one I know of) where both the `Sequence` and the `Iterator` are the same object, and that object is a reference type. This means that when the iterator is mutated, the sequence is mutated as well.
- Sequences defined with blocks, such as with the constructor functions `sequence(first:next:)` and `sequence(state:next:)`, where they also capture external elements instead of purely relying on their `first` or `state` values.
- Sequences based on network data, which I have never seen in practice.
- Sequences of random data, which generate random elements as they are iterated.
Case 1 is easy to avoid. Use structs, or make the `Iterator` and `Sequence` separate types.
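As a rough sketch of Case 1 and its fix: the type names below (`Countdown`, `CountdownSequence`) are made up for illustration and are not standard library types. The class plays both roles, so iterating it consumes it; the struct hands out a fresh iterator on every pass.

```swift
// Hypothetical single-pass type: one reference type is both the
// sequence and its own iterator, so iteration mutates the sequence.
final class Countdown: Sequence, IteratorProtocol {
    private var current: Int
    init(from start: Int) { current = start }

    func next() -> Int? {
        guard current > 0 else { return nil }
        defer { current -= 1 }
        return current
    }
}

// Hypothetical multi-pass rewrite: a value-type sequence with a
// separate iterator type. Each makeIterator() call starts over.
struct CountdownSequence: Sequence {
    let start: Int

    struct Iterator: IteratorProtocol {
        var current: Int
        mutating func next() -> Int? {
            guard current > 0 else { return nil }
            defer { current -= 1 }
            return current
        }
    }

    func makeIterator() -> Iterator {
        return Iterator(current: start)
    }
}

let classBased = Countdown(from: 3)
let classFirstPass = Array(classBased)    // [3, 2, 1]
let classSecondPass = Array(classBased)   // [] (already consumed)

let structBased = CountdownSequence(start: 3)
let structFirstPass = Array(structBased)  // [3, 2, 1]
let structSecondPass = Array(structBased) // [3, 2, 1]
```

The only structural difference is where the mutable state lives: in the shared object, or in an iterator that is created per pass.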
Case 2 is easy to avoid as well. I would argue that code that captures external values in those blocks is worse than code that does not.
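To make Case 2 concrete, here is a small sketch using `sequence(state:next:)`. The variable names are arbitrary. The first closure reaches out to a captured variable, so a second pass sees whatever the first pass left behind; the second closure keeps everything in its `state` argument, which is copied along with the sequence value.

```swift
// Captures an external variable: effectively single-pass.
var externalCounter = 0
let capturingSequence = sequence(state: 0) { (unused: inout Int) -> Int? in
    externalCounter += 1
    return externalCounter <= 3 ? externalCounter : nil
}
let capturingFirst = Array(capturingSequence)   // [1, 2, 3]
let capturingSecond = Array(capturingSequence)  // []

// Relies only on its own state: multi-pass.
let pureSequence = sequence(state: 0) { (count: inout Int) -> Int? in
    count += 1
    return count <= 3 ? count : nil
}
let pureFirst = Array(pureSequence)   // [1, 2, 3]
let pureSecond = Array(pureSequence)  // [1, 2, 3]
```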
Case 3 is ill-suited for sequences and would be better expressed using streams or reactive paradigms.
Case 4 is a small case, and random data being different each time you access it isn't particularly unexpected anyway. Swift's randomness story isn't very well-defined at the moment, and it's possible that a `RandomSequence` could be initialized with a seed that would create the exact same sequence each time if necessary.
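A seeded sequence like that might look roughly like the following. This is only a sketch: `SeededRandomSequence` is a hypothetical name, and the mixing step is a SplitMix64-style generator chosen purely for brevity, not something the standard library provides under this name. It is also not suitable for anything security-sensitive.

```swift
// Hypothetical seeded random sequence. The seed lives in the sequence,
// the evolving state lives in the iterator, so every pass replays the
// same values.
struct SeededRandomSequence: Sequence {
    let seed: UInt64

    struct Iterator: IteratorProtocol {
        var state: UInt64

        mutating func next() -> UInt64? {
            // SplitMix64-style step (an assumption made for illustration).
            state = state &+ 0x9E3779B97F4A7C15
            var z = state
            z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
            z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
            return z ^ (z >> 31)  // never nil: the sequence is infinite
        }
    }

    func makeIterator() -> Iterator {
        return Iterator(state: seed)
    }
}

let randomValues = SeededRandomSequence(seed: 42)
let firstDraw = Array(randomValues.prefix(5))
let secondDraw = Array(randomValues.prefix(5))
// firstDraw == secondDraw, because both passes start from the same seed.
```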
Single-pass sequences are so rare in Swift code that I usually create this contrived situation to test functions against single-pass sequences (h/t Nate Cook):
```swift
let singlePassSequence = sequence(first: 1, next: { $0 == 4 ? nil : $0 + 1 }).dropFirst()
for e in singlePassSequence { print(e) }
// 2
// 3
// 4
for e in singlePassSequence { print(e) }
// no output
```
While single-pass sequences themselves are rare, code that operates on sequences (for example, a function in an extension on `Sequence`) is much more common, and so the onus of handling these single-pass sequences is foisted upon all this code. This code is written by Swift consumers much more often than new sequences are, and so this change will ease the burden on them (with some minor cost to the standard library maintainers). While it can sometimes be a fun coding challenge to write some operation in a single-pass fashion, it is usually a burden and an afterthought. I'd rather Swift be optimized away from this type of "gotcha!" code.
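Here is the kind of "gotcha!" I mean, as a small sketch. The method names `naiveMean()` and `singlePassMean()` are invented for this example. The naive version reads naturally but walks the sequence twice, so it quietly returns the wrong answer for a single-pass sequence.

```swift
extension Sequence where Element == Double {
    // Iterates twice: fine for multi-pass sequences, wrong otherwise.
    func naiveMean() -> Double? {
        let count = self.reduce(0) { n, _ in n + 1 }  // first pass
        guard count > 0 else { return nil }
        let total = self.reduce(0.0, +)               // second pass
        return total / Double(count)
    }

    // Iterates once: what every extension author has to remember to do today.
    func singlePassMean() -> Double? {
        var count = 0
        var total = 0.0
        for value in self {
            count += 1
            total += value
        }
        return count > 0 ? total / Double(count) : nil
    }
}

let arrayMean = [2.0, 4.0, 6.0].naiveMean()  // 4.0

// A single-pass sequence: every copy of the AnyIterator shares the
// captured iterator variable.
var consumedSource = [2.0, 4.0, 6.0].makeIterator()
let consumable = AnyIterator { consumedSource.next() }
let wrongMean = consumable.naiveMean()       // 0.0 (second pass saw nothing)

var freshSource = [2.0, 4.0, 6.0].makeIterator()
let freshConsumable = AnyIterator { freshSource.next() }
let rightMean = freshConsumable.singlePassMean()  // 4.0
```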
Further, if the idea is to have a construct whose iteration is destructive, Swift already has that: `Iterator`. This change will brighten the line that Swift makes between `Sequence` and `Iterator`. It will mean that one is always mutating/destructive and one is always safe and always readable, instead of the blurry world we have today where one is always mutating/destructive and the other...sometimes can be?
To execute this change, we would need very few changes in Swift itself: we'd need to change `_DropFirstSequence` to be a multi-pass sequence, update the documentation a little bit to reflect the new semantics, and do an audit of the other sequences in the standard library to make sure there are no other cases of single-pass sequences. The impact on consumer code would be very minimal.
Some may ask: if we make this change, what is the value in keeping `Sequence` around at all? I think there are two main reasons. First, sequences can still be infinite, which is a useful semantic distinction commonly used in Swift code and provided for in the standard library (cf. `1...`), and second, sequences are still a really easy way to create an iterable thing without having to muck around with creating an `Index` type, `IndexDistance`, and so on.
As a cherry on top of the semantic simplicity we would gain from this change, it would also let us move the `first` property from `Collection` to `Sequence`. (Properties are never supposed to mutate their owner, and since sequences currently can be single-pass, in rare cases, `first` actually mutates the owner.)
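A sketch of what that could look like, and why it is a problem today. The property is named `firstElement` here only to avoid colliding with the existing `first` on `Collection`; it is a made-up name for illustration, not a proposed spelling.

```swift
extension Sequence {
    // Hypothetical stand-in for a `first` property on Sequence.
    var firstElement: Element? {
        var iterator = makeIterator()
        return iterator.next()
    }
}

// On a multi-pass sequence, reading the property has no visible effect.
let evenNumbers = stride(from: 2, through: 6, by: 2)
let evenFirst = evenNumbers.firstElement       // 2
let evenFirstAgain = evenNumbers.firstElement  // 2

// On a single-pass sequence, merely reading the property consumes an element.
var sharedSource = [10, 20, 30].makeIterator()
let oneShot = AnyIterator { sharedSource.next() }
let oneShotFirst = oneShot.firstElement        // 10
let oneShotFirstAgain = oneShot.firstElement   // 20
```

If every sequence were guaranteed to be multi-pass, the second half of this example could not happen.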
If there is interest in this change, I would be happy to write up a fuller proposal and open a pull request with a sample implementation.