[Pitch] Remove destructive consumption from Sequence


(Jon Hull) #1

Can’t a Sequence be potentially infinite, whereas a collection has a defined count/endIndex? Other than that, I agree with your statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators), potentially infinite, (should be subscript-able, gain most of collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current form, except extra inits from Iterator/Sequence with end conditions)
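The three-tier structure above could be sketched in Swift roughly as follows. All protocol names here are illustrative assumptions, not actual standard-library declarations; the reference-type (`AnyObject`) constraint encodes "single destructive pass":

```swift
// Illustrative sketch of the proposed split — not a real stdlib design.
protocol SinglePassIterator: AnyObject {       // reference type: one destructive pass
    associatedtype Element
    func next() -> Element?
}

protocol MultipassSequence {                   // non-destructive: vends fresh iterators
    associatedtype Iterator: SinglePassIterator
    func makeIterator() -> Iterator
}

protocol FiniteCollection: MultipassSequence { // multi-pass *and* guaranteed finite
    var count: Int { get }
}

// A trivial model: the numbers 0..<count.
final class CountingIterator: SinglePassIterator {
    private var current = 0
    private let limit: Int
    init(limit: Int) { self.limit = limit }
    func next() -> Int? {
        guard current < limit else { return nil }
        defer { current += 1 }
        return current
    }
}

struct Counting: FiniteCollection {
    let count: Int
    // Multiple passes are safe because each call vends a fresh iterator.
    func makeIterator() -> CountingIterator { CountingIterator(limit: count) }
}
```

The point of the sketch is that each tier's semantic guarantee (destructive, repeatable, finite) becomes visible in the protocol rather than living only in documentation.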

Right now we are allowed to have an infinite sequence, but calling dropLast or non-lazy map will cause an infinite loop. These cases could be made much safer by considering the potentially infinite and finite cases separately...

Thanks,
Jon

···

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

>> On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org <https://lists.swift.org/mailman/listinfo/swift-evolution>> wrote:
>>
>> <Ahem> “Iterators,” please.
>
> That makes me happy - for some reason I thought it was still GeneratorProtocol
>
>>> destructively, but such Generators would not conform to the needs of
>>> Sequence. As such, the most significant impact would be the inability
>>> to use such Generators in a for..in loop,
>>
>> Trying to evaluate this statement, it's clear we're missing lots of
>> detail here:
>>
>> * Would you remove Sequence?
>> * If so, what Protocol would embody “for...in-able?”
> No, I would just remove the allowance in the documentation and API
> design for a destructive/consuming iteration. Sequence would be the
> interface to getting access to repeatable iteration, without the need
> for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave


(Dave Abrahams) #2

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

I agree that that is currently what the documentation allows and requires.
Maybe we do need to separate finiteness from multipass-ness. There's
certainly no reason one can't make multiple passes over a portion of an
infinite sequence.

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be
for-in able)

[Note: the best way to represent “single destructive pass” today is to
constrain Iterator to be a reference type. Otherwise, it's both easy to
create an Iterator that can be used to make multiple passes (by
copying), and to create a truly single-pass Iterator that suggests it
has value semantics. These are both potentially misleading situations]
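The copying hazard described here is easy to demonstrate with today's value-type iterators:

```swift
// Copying a struct iterator silently forks the pass.
var a = [1, 2, 3].makeIterator()
var b = a                 // struct copy: `b` gets its own independent position
_ = a.next()              // advances only `a`
let fromB = b.next()      // still yields 1 — two "passes" from one iterator
```

If the iterator had been wrapping a genuinely single-pass resource (a file handle, a network stream), the copy would instead share or corrupt state while the value-type interface suggested otherwise.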

Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)

Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
   for-in-able?

2. Presumably Collection refines Sequence. Does Sequence refine
   Iterator? IMO that would create the same problematic programming
   model we have today.

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately...

The other thing I am concerned about here is that we're addressing real
use-cases with these distinctions. For example, do people commonly get
in trouble with infinite sequences today?

···

on Sun Jun 26 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

>> On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org <https://lists.swift.org/mailman/listinfo/swift-evolution>> wrote:
>>
>> <Ahem> “Iterators,” please.
>
> That makes me happy - for some reason I thought it was still GeneratorProtocol
>
>>> destructively, but such Generators would not conform to the needs of
>>> Sequence. As such, the most significant impact would be the inability
>>> to use such Generators in a for..in loop,
>>
>> Trying to evaluate this statement, it's clear we're missing lots of
>> detail here:
>>
>> * Would you remove Sequence?
>> * If so, what Protocol would embody “for...in-able?”
> No, I would just remove the allowance in the documentation and API
> design for a destructive/consuming iteration. Sequence would be the
> interface to getting access to repeatable iteration, without the need
> for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

--
-Dave


(Matthew Johnson) #3

Can’t a Sequence be potentially infinite, whereas a collection has a defined count/endIndex? Other than that, I agree with your statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators), potentially infinite, (should be subscript-able, gain most of collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling dropLast or non-lazy map will cause an infinite loop. These cases could be made much safer by considering the potentially infinite and finite cases separately…

I think this is pointing in the right general direction. It would make working with `Sequence` much more straightforward and allow us to depend on the multi-pass property that is true in practice of the most common models of `Sequence`.

But I agree that we should give much more careful consideration to finite / infinite generally and for..in specifically.

Now that I have been thinking about the finite / infinite distinction more closely, I have begun to notice a lot of code that is written generically using `Sequence` where a for..in loop is really what is required, but the “finite sequence” precondition is not explicitly stated. Interestingly, this is the case with the standard library’s eager `map` (but not with `dropLast`, which explicitly notes the precondition). I have been somewhat surprised to realize how common this “bug” is (i.e. not stating a precondition). I think we have gotten away with it thus far because the sequences most people use most of the time in practice are finite. But that doesn’t mean we should accept this as good enough - IMO it is way too easy to forget to document this precondition (and obviously easier for users to overlook than preconditions that are actually encoded in the type system, violations of which are caught at compile time).

The fact that this pattern is so pervasive is what I meant when I said for..in “naturally” requires a finite sequence.

IMO it’s better to encode preconditions in the type system when that is practical, and especially when the precondition is shared by a vast majority of code written using a particular construct (in this case a for..in loop written using the most generic for..in-able protocol).

I think the safest solution is to take the position that writing an infinite loop is relatively uncommon and is a more “advanced” technique, and thus should be done explicitly. Do people really write infinite loops often enough that the convenience of using for..in when writing infinite loops outweighs the safety benefit of preventing accidental infinite loops? I haven’t seen a compelling argument for this.

If we adopt that position then for..in would need to be built on top of a guaranteed finite construct. This would allow programmers to continue writing generic code against the most generic for..in-able construct while eliminating a precondition that is often (usually?) unstated and likely unconsidered.

If we do decide to move forward with infinite for..in loops I think we need to establish strong guidance around how to properly write generic code with these protocols. Should such code really be constrained to `Collection` rather than `Sequence` (i.e. should a potentially infinite `Sequence` have an eager map)? If this is the guidance, should it be paired with guidance that all finite sequences should conform to `Collection`? Or is it sufficient to just educate developers about this issue and expect people to document the “finite Sequence” precondition when the constraint is `Sequence` rather than `Collection`?

I hope we will give serious consideration to these questions while this topic is open for discussion.

-Matthew

···

On Jun 26, 2016, at 10:56 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

>> On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org <https://lists.swift.org/mailman/listinfo/swift-evolution>> wrote:
>>
>> <Ahem> “Iterators,” please.
>
> That makes me happy - for some reason I thought it was still GeneratorProtocol
>
>>> destructively, but such Generators would not conform to the needs of
>>> Sequence. As such, the most significant impact would be the inability
>>> to use such Generators in a for..in loop,
>>
>> Trying to evaluate this statement, it's clear we're missing lots of
>> detail here:
>>
>> * Would you remove Sequence?
>> * If so, what Protocol would embody “for...in-able?”
> No, I would just remove the allowance in the documentation and API
> design for a destructive/consuming iteration. Sequence would be the
> interface to getting access to repeatable iteration, without the need
> for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Haravikk) #4

This seems pretty reasonable to me, though I'm not sure about subscripting Sequence. You can always add a subscript yourself to sequence types where it makes sense, but otherwise I think it's best to keep it simple and leave Sequences as types that vend iterators over the same values.

Some methods of Sequence may need to be moved to iterators though, but in mutating form; things like .dropFirst() to skip elements, .first(where:) to skip to a matching element, and so on. Iterator should probably also have the .underestimatedCount property so it can give a useful value when one is known.
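The consuming operations suggested here could be sketched as mutating extensions on `IteratorProtocol`. These extensions are illustrative, not existing standard-library API; `skip(_:)` stands in for the mutating form of `.dropFirst()`:

```swift
// Sketch: consuming operations as mutating methods on an iterator.
extension IteratorProtocol {
    /// Advances past the first `n` elements in place.
    mutating func skip(_ n: Int) {
        for _ in 0..<n { _ = next() }
    }

    /// Consumes elements until one matches the predicate, returning it.
    mutating func firstMatch(where predicate: (Element) -> Bool) -> Element? {
        while let element = next() {
            if predicate(element) { return element }
        }
        return nil
    }
}

var digits = [1, 2, 3, 4, 5].makeIterator()
digits.skip(2)                                          // consumes 1 and 2
let firstEven = digits.firstMatch(where: { $0 % 2 == 0 }) // consumes 3, returns 4
```

Making these mutating on the iterator keeps the destructive nature of the operation explicit at the call site.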

But yeah, I think the basic structure of this makes sense. The potentially destructive nature of sequences doesn't seem that well known in my experience, and I often have to go back and check my code to be sure I've avoided possible destructive usage. I've never thought of trying to use iterators instead; I might try changing some methods and see how that goes.

···

On 27 Jun 2016, at 04:56, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

Can’t a Sequence be potentially infinite, whereas a collection has a defined count/endIndex? Other than that, I agree with your statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators), potentially infinite, (should be subscript-able, gain most of collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling dropLast or non-lazy map will cause an infinite loop. These cases could be made much safer by considering the potentially infinite and finite cases separately...


(Dave Abrahams) #5

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your

statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately…

I think this is pointing in the right general direction. It would
make working with `Sequence` much more straightforward and allow us to
depend on the multi-pass property that is true in practice of the most
common models of `Sequence`.

But I agree that we should give much more careful consideration to
finite / infinite generally and for..in specifically.

Now that I have been thinking about the finite / infinite distinction
more closely I have begun to notice a lot of code that is written
generically using `Sequence` where a for..in loop is really what is
required, however the “finite sequence” precondition is not explicitly
stated. Interestingly, this is the case with the standard library’s
eager `map` (but not the case with `dropLast` which explicitly notes
the precondition). I have been somewhat surprised to realize how
common this “bug” is (i.e. not stating a precondition). I think we
have gotten away with it thus far because the sequences most people
use most of the time in practice are finite. But that doesn’t mean we
should accept this as good enough - IMO it is way too easy to forget to
document this precondition (and obviously easier for users to overlook
than preconditions that are actually encoded in the type system,
violations of which are caught at compile time).

The fact that this pattern is so pervasive is what I meant when I said
for..in “naturally” requires a finite sequence.

IMO it’s better to encode preconditions in the type system when that
is practical, and especially when the precondition is shared by a vast
majority of code written using a particular construct (in this case a
for..in loop written using the most generic for..in-able protocol).

I think the safest solution is to take the position that writing an
infinite loop is relatively uncommon and is a more “advanced”
technique, and thus should be done explicitly. Do people really write
infinite loops often enough that the convenience of using for..in when
writing infinite loops outweighs the safety benefit of preventing
accidental infinite loops? I haven’t seen a compelling argument for
this.

Good questions. I'd also add: “do infinite sequences come up often
enough that accidentally looping on them forever is a problem?”

···

on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 26, 2016, at 10:56 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

If we adopt that position then for..in would need to be built on top
of a guaranteed finite construct. This would allow programmers to
continue writing generic code against the most generic for..in-able
construct while eliminating a precondition that is often (usually?)
unstated and likely unconsidered.

If we do decide to move forward with infinite for..in loops I think we
need to establish strong guidance around how to properly write generic
code with these protocols. Should such code really be constrained to
`Collection` rather than `Sequence` (i.e. should a potentially
infinite `Sequence` have an eager map)? If this is the guidance,
should it be paired with guidance that all finite sequences should
conform to `Collection`? Or is it sufficient to just educate
developers about this issue and expect people to document the “finite
Sequence” precondition when the constraint is `Sequence` rather than
`Collection`?

I hope we will give serious consideration to these questions while
this topic is open for discussion.

-Matthew

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

>> On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org <https://lists.swift.org/mailman/listinfo/swift-evolution>> wrote:
>>
>> <Ahem> “Iterators,” please.
>
> That makes me happy - for some reason I thought it was still GeneratorProtocol
>
>>> destructively, but such Generators would not conform to the needs of
>>> Sequence. As such, the most significant impact would be the inability
>>> to use such Generators in a for..in loop,
>>
>> Trying to evaluate this statement, it's clear we're missing lots of
>> detail here:
>>
>> * Would you remove Sequence?
>> * If so, what Protocol would embody “for...in-able?”
> No, I would just remove the allowance in the documentation and API
> design for a destructive/consuming iteration. Sequence would be the
> interface to getting access to repeatable iteration, without the need
> for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
-Dave


(Matthew Johnson) #6

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

I agree that that is currently what the documentation allows and requires.
Maybe we do need to separate finiteness from multipass-ness. There's
certainly no reason one can't make multiple passes over a portion of an
infinite sequence.

Exactly. It is perfectly reasonable to have an infinite, dynamically generated mathematical sequence and make multiple passes over a portion of it (for example, by zipping it with finite sequences, taking a prefix, etc).
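Both of those finite views can be written today with the standard library's `sequence(first:next:)`. Note that this relies on the generator being a value type, so each use below replays it from the start — exactly the multi-pass-in-practice behavior under discussion:

```swift
// Multiple finite views of an infinite, lazily generated sequence.
let naturals = sequence(first: 0) { $0 + 1 }           // 0, 1, 2, 3, ...

let firstFive = Array(naturals.prefix(5))              // a finite prefix
let labeled = Array(zip(naturals, ["a", "b", "c"]))    // zipping with a finite
                                                       // sequence ends the pass
```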

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

Of course that is *possible* and not hard to do but it sounds like a degenerate case to me. How often do such things arise *in practice*? In reality I suspect we tend to have sequences that are either infinite or finite not just in a strict sense, but also in a practical sense.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be
for-in able)

[Note: the best way to represent “single destructive pass” today is to
constrain Iterator to be a reference type. Otherwise, it's both easy to
create an Iterator that can be used to make multiple passes (by
copying), and to create a truly single-pass Iterator that suggests it
has value semantics. These are both potentially misleading situations]

Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)

Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
  for-in-able?

2. Presumably Collection refines Sequence. Does Sequence refine
  Iterator? IMO that would create the same problematic programming
  model we have today.

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately...

The other thing I am concerned about here is that we're addressing real
use-cases with these distinctions. For example, do people commonly get
in trouble with infinite sequences today?

I think the same question can apply to destructive consumption of sequences. I know people write code that assumes multi-pass and people also write code that assumes finiteness. Do you know of cases where people commonly get into trouble with single pass sequences today?

I don’t know of specific cases where either has caused problems in practice but both leave plenty of room for trouble. Enough that they deserve fixing IMO as they are pretty fundamental constructs that most Swift code relies upon.

-Matthew

···

On Jun 27, 2016, at 11:39 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Sun Jun 26 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution at swift.org <https://lists.swift.org/mailman/listinfo/swift-evolution>> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

--
-Dave
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Russ Bishop) #7

The more I think about it the more I think this is a good model. Iterator is already single-use. Sequence then describes a type that can vend multiple iterators and is inherently multi-pass. I would argue the protocol is making this guarantee already, since makeIterator() naturally lends itself to multiple invocations.

In my example, LazyRowSequence becomes LazyRowIterator.

Russ

···

On Jun 27, 2016, at 9:39 AM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.


(Jon Hull) #8

Comments inline

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

Agreed.

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

I agree that that is currently what the documentation allows and requires.
Maybe we do need to separate finiteness from multipass-ness. There's
certainly no reason one can't make multiple passes over a portion of an
infinite sequence.

I have a use-case for this below. Graphical manipulation using a repeatable sequence of random numbers.

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

See below… I am definitely guilty of this. That said, if we had explicit infinite sequences (with subscripts), I would use those instead of collections for these use-cases.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be
for-in able)

[Note: the best way to represent “single destructive pass” today is to
constrain Iterator to be a reference type. Otherwise, it's both easy to
create an Iterator that can be used to make multiple passes (by
copying), and to create a truly single-pass Iterator that suggests it
has value semantics. These are both potentially misleading situations]

Hmm… This is a really good point. I can see why you are thinking of making it a reference type.

If a particular iterator can safely be cloned mid-stream, then it can provide its own interface to allow that. The value type makes a promise which can’t always be kept.

Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)

Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
  for-in-able?

I would think the potentially infinite should require for-in-until (even if you explicitly set until to false to create an infinite loop), but collection would allow for-in (with optional until). That way you have to acknowledge the possibility of an infinite sequence/iterator.
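The hypothetical for-in-until can be approximated today by stating the termination condition up front, e.g. with `prefix(while:)` from SE-0045 (the syntax below is real Swift; only the framing as "until" is Jon's proposal):

```swift
// Approximating "for-in-until": the end condition is acknowledged explicitly.
let naturals = sequence(first: 0) { $0 + 1 }   // potentially infinite

var sum = 0
for n in naturals.prefix(while: { $0 < 5 }) {  // the "until" lives here
    sum += n
}
// sum is 0 + 1 + 2 + 3 + 4 == 10, and the loop provably terminates
```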

2. Presumably Collection refines Sequence. Does Sequence refine
  Iterator? IMO that would create the same problematic programming
  model we have today.

Sequence vends iterators. (a sequence is NOT a refinement of iterator, it just creates them as needed)

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.

Yes.

Right now we are allowed to have an infinite sequence, but calling

dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately...

The other thing I am concerned about here is that we're addressing real
use-cases with these distinctions. For example, do people commonly get
in trouble with infinite sequences today?

Probably not… because they avoid infinite sequences to avoid getting into trouble. I was bitten a few times early on (I assumed map was lazy) and just avoided them until recently.

I think they would be used more often if you could guarantee that their use was safe (i.e. being forced to consider the infinite possibility). I would like to have a bunch of infinite sequences that could easily be refined to collections. The ones I would use most often would be the natural numbers and random numbers. Also imagine, an infinite sequence of random colors which look fairly good together. Markov chains, die rolls… there are a lot of potential uses that become interesting once the threat of accidental infinite loop has been removed...

As a real-world example of the “effectively infinite” sequence, this weekend I created a Swift 3 collection of repeatable random values (of any type conforming to a protocol). I would have used Sequence if the subscript behavior were available on it, since I do intend it to be an infinite sequence in most cases (apologies for the lack of comments, it is not yet prepared for public consumption):
https://gist.github.com/jonhull/3655672529f8cf5b2eb248583d2cafb9

The use case here is to create a hand-drawn look for a CGPath by breaking it up into pieces and then jiggling the pieces about (using random CGVectors). I quickly realized that I needed a repeatable source of randomness (otherwise the drawing would move each time it was redrawn), and thus a multi-pass sequence.

I am a little bit nervous every time I use this, as it has the potential for an “effectively infinite” loop, but is proving useful throughout my project.
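The core idea can be sketched without the full gist: a seed fully determines the stream, so every pass replays the same values. This is an illustrative stand-in (using an xorshift64 step), not Jon's actual implementation:

```swift
// A repeatable, effectively infinite "random" sequence: multi-pass by design.
struct SeededRandoms: Sequence {
    let seed: UInt64   // must be nonzero for xorshift

    func makeIterator() -> AnyIterator<UInt64> {
        var state = seed   // fresh state per iterator: each pass starts over
        return AnyIterator {
            // xorshift64 step: deterministic pseudo-random successor
            state ^= state << 13
            state ^= state >> 7
            state ^= state << 17
            return state
        }
    }
}

let jiggle = SeededRandoms(seed: 42)
let pass1 = Array(jiggle.prefix(3))
let pass2 = Array(jiggle.prefix(3))
// pass1 == pass2, so a redrawn path "jiggles" identically every time
```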

Thanks,
Jon

···

On Jun 27, 2016, at 9:39 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sun Jun 26 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.
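The adaptor Dave alludes to can be sketched as follows (my own code, written in current Swift, not anything from the thread): any finite, multipass sequence can be presented as a Collection by using integer offsets as indices and re-iterating from the start as needed. It is inefficient in general (O(n) subscript), but it demonstrates that conformance is always *possible*.

```swift
struct CollectionAdaptor<Base: Sequence>: Collection {
    let base: Base  // precondition: multipass and finite

    var startIndex: Int { 0 }
    var endIndex: Int {
        // Counting the elements is legal only because `base` is finite.
        var n = 0
        for _ in base { n += 1 }
        return n
    }
    func index(after i: Int) -> Int { i + 1 }
    subscript(i: Int) -> Base.Element {
        // Restarting is legal only because `base` is multipass.
        var it = base.makeIterator()
        for _ in 0..<i { _ = it.next() }
        return it.next()!
    }
}

let c = CollectionAdaptor(base: stride(from: 0, to: 6, by: 2))
// Array(c) == [0, 2, 4]; c[1] == 2
```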

--
-Dave

--
-Dave


(Brent Royal-Gordon) #9

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

And similarly, operations like `first(while:)` on an infinite sequence might or might not produce an infinite sequence, depending on properties of the sequence and predicate which aren't expressed in the type system.

I think there are really three cases:

* Known finite
* Known infinite
* Unknown

The most obvious way to model this is as two subtypes:

        Possibly infinite
         /            \
    Infinite      Not infinite

A known-infinite iterator might refine unknown-infinite iterators like this:

  protocol InfiniteIteratorProtocol: IteratorProtocol {
    mutating func next() -> Element // Note that the return is non-Optional!
  }
  
  extension InfiniteIteratorProtocol {
    mutating func next() -> Self.Element? { return .some(next()) }
  }

Refining unknown-infinite collections to provide known-infinite collections is trickier. I think we would want something like this:

  // We don't use Optional because its Comparable conformance puts `.none`
  // before, not after, `.some`.
  enum InfiniteCollectionIndex<ConcreteIndex: Comparable>: Comparable {
    case element(ConcreteIndex)
    case end
    
    var concreteIndex: ConcreteIndex {
      get {
        switch self {
        case .element(let i):
          return i
        case .end:
          preconditionFailure("Attempted to access the end of an infinite collection")
        }
      }
      set {
        self = .element(newValue)
      }
    }
  }
  func < <ConcreteIndex: Comparable>(lhs: InfiniteCollectionIndex<ConcreteIndex>, rhs: InfiniteCollectionIndex<ConcreteIndex>) -> Bool {
    switch (lhs, rhs) {
    case let (.element(lhsIndex), .element(rhsIndex)):
      return lhsIndex < rhsIndex
    case (.element, .end):
      return true
    default:
      return false
    }
  }
  
  extension IndexingIterator: InfiniteIteratorProtocol where Elements: InfiniteCollection {}
    
  protocol InfiniteCollection: Collection where Index == InfiniteCollectionIndex<ConcreteIndex>, Iterator: InfiniteIteratorProtocol {
    associatedtype ConcreteIndex: Comparable
    
    var startConcreteIndex: ConcreteIndex { get }
    func index(after i: ConcreteIndex) -> ConcreteIndex
    func formIndex(after i: inout ConcreteIndex)
    
    subscript(i: ConcreteIndex) -> Iterator.Element { get }
  }
  extension InfiniteCollection {
    var startIndex: InfiniteCollectionIndex<ConcreteIndex> {
      return .element(startConcreteIndex)
    }
    var endIndex: InfiniteCollectionIndex<ConcreteIndex> {
      return .end
    }
    func index(after i: InfiniteCollectionIndex<ConcreteIndex>) -> InfiniteCollectionIndex<ConcreteIndex> {
      return .element(index(after: i.concreteIndex))
    }
    func formIndex(after i: inout InfiniteCollectionIndex<ConcreteIndex>) {
      formIndex(after: &i.concreteIndex)
    }
    subscript(i: InfiniteCollectionIndex<ConcreteIndex>) -> Iterator.Element {
      return self[i.concreteIndex]
    }
  }
  // With corresponding InfiniteBidirectionalCollection and InfiniteRandomAccessCollection types

But the problem with this is, for all its complications, it probably doesn't actually provide much additional safety. "Possibly infinite" needs to have all the operations we ought to forbid on an infinite sequence or collection. All the "infinite" type can really do is provide runtime implementations that crash immediately; the type system can't prevent it.

The alternative is to have some kind of wrapper type you put around "possibly infinite" sequences/collections to essentially assert they're finite:

  // Forbidden:
  infiniteSequence.prefix(while: { $0 < 100 }).map { … }
  // Instead, write:
  infiniteSequence.prefix(while: { $0 < 100 }).assumingFinite.map { … }

But that once again brings us to the question: How important *is* it that we prevent greedy calls on known-infinite sequences? There aren't actually any known-infinite sequences in the standard library; even the `sequence` function can be terminated by returning `nil`. Known-infinite sequences are certainly a coherent concept, but are they something we need to model? And if not, is requiring people to say `assumingFinite` on calls where, as far as the type system is concerned, there *is* a possibility of a `nil` return really worth it?
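One way Brent's hypothetical `assumingFinite` could be realized is sketched below (entirely my own code; the name and semantics are assumptions drawn from the example above): a trivial passthrough whose only job is to make the finiteness assertion visible at the call site.

```swift
extension Sequence {
    /// Asserts, by programmer fiat, that this sequence terminates,
    /// unlocking greedy algorithms such as the eager `map`.
    var assumingFinite: AnySequence<Element> {
        AnySequence(self)
    }
}

// A sequence the type system can't prove finite (next may return nil):
let evens = sequence(first: 0) { $0 < 8 ? $0 + 2 : nil }
let doubled = evens.assumingFinite.map { $0 * 2 }
// doubled == [0, 4, 8, 12, 16]
```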

(It *is* kind of a shame that we didn't keep lazy-by-default `map` and `filter`, because the lazy algorithms are usable on infinite sequences. But I understand why that wasn't possible.)

···

--
Brent Royal-Gordon
Architechies


(Anton Zhilin) #10

Dave Abrahams via swift-evolution <swift-evolution@...> writes:

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

This is a mistake that I see more often than I want to: designers of
development tools underestimate abilities of developers, and end up
restricting many of them.

My point is that we should impose minimal requirements on custom data
types. Quoting yourself:

In this case, protocols used to interface with the language
at the lowest levels may be purely about syntax.

So there should be a protocol for building into for-in loops that
requires only that the sequence should be iterable once. Currently, we
have Sequence for that. If we remove destructive consumption from
Sequence, then there should be a super-protocol SinglePassSequence or
Forinable (excuse the pun).

But even if we do this, I still don't understand the intentions. If a
Sequence is multi-pass, then we can satisfy requirements of Collection,
as already noted in the discussion. Why should we keep Sequence at all,
then? Then let's remove it and rename SinglePassSequence to Sequence???


(Matthew Johnson) #11

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately…

I think this is pointing in the right general direction. It would
make working with `Sequence` much more straightforward and allow us to
depend on the multi-pass property that is true in practice of the most
common models of `Sequence`.

But I agree that we should give much more careful consideration to
finite / infinite generally and for..in specifically.

Now that I have been thinking about the finite / infinite distinction
more closely I have begun to notice a lot of code that is written
generically using `Sequence` where a for..in loop is really what is
required, however the “finite sequence” precondition is not explicitly
stated. Interestingly, this is the case with the standard library’s
eager `map` (but not the case with `dropLast` which explicitly notes
the precondition). I have been somewhat surprised to realize how
common this “bug” is (i.e. not stating a precondition). I think we
have gotten away with it thus far because the sequences most people
use most of the time in practice are finite. But that doesn’t mean we
should accept this as good enough - IMO it is way too easy to forget to
document this precondition (and obviously easier for users to overlook
than preconditions that are actually encoded in the type system,
violations of which are caught at compile time).

The fact that this pattern is so pervasive is what I meant when I said
for..in “naturally” requires a finite sequence.

IMO it’s better to encode preconditions in the type system when that
is practical, and especially when the precondition is shared by a vast
majority of code written using a particular construct (in this case a
for..in loop written using the most generic for..in-able protocol).

I think the safest solution is to take the position that writing an
infinite loop is relatively uncommon and is a more “advanced”
technique, and thus should be done explicitly. Do people really write
infinite loops often enough that the convenience of using for..in when
writing infinite loops outweighs the safety benefit of preventing
accidental infinite loops? I haven’t seen a compelling argument for
this.

Good questions. I'd also add: “do infinite sequences come up often
enough that accidentally looping on them forever is a problem?”

That’s a good question as well. In practice the frequency of infinite sequences likely depends on the domain.

IMO this falls into the same category as “do single-pass sequences come up often enough that attempting to iterate over them twice is a problem?” To paraphrase your previous post:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(you can’t assume it is finite!) on it, so they write libraries
of functions that may iterate a Sequence to termination. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be finite. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

I definitely don’t mean to put words in your mouth here, but the logical structure of the argument appears identical to me regardless of which issue it is applied to. I am only trying to make that point.

···

On Jun 27, 2016, at 11:46 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 26, 2016, at 10:56 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

If we adopt that position then for..in would need to be built on top
of a guaranteed finite construct. This would allow programmers to
continue writing generic code against the most generic for..in-able
construct while eliminating a precondition that is often (usually?)
unstated and likely unconsidered.

If we do decide to move forward with infinite for..in loops I think we
need to establish strong guidance around how to properly write generic
code with these protocols. Should such code really be constrained to
`Collection` rather than `Sequence` (i.e. should a potentially
infinite `Sequence` have an eager map)? If this is the guidance,
should it be paired with guidance that all finite sequences should
conform to `Collection`? Or is it sufficient to just educate
developers about this issue and expect people to document the “finite
Sequence” precondition when the constraint is `Sequence` rather than
`Collection`?

I hope we will give serious consideration to these questions while
this topic is open for discussion.

-Matthew

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
-Dave


(Dave Abrahams) #12

Dave Abrahams via swift-evolution <swift-evolution@...> writes:

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

This is a mistake that I see more often than I want to: designers of
development tools underestimate abilities of developers, and end up
restricting many of them.

In what way am I underestimating the abilities of developers?

My point is that we should impose minimal requirements on custom data
types. Quoting yourself:

In this case, protocols used to interface with the language
at the lowest levels may be purely about syntax.

So there should be a protocol for building into for-in loops that
requires only that the sequence should be iterable once. Currently, we
have Sequence for that. If we remove destructive consumption from
Sequence, then there should be a super-protocol SinglePassSequence or
Forinable (excuse the pun).

Yes.

But even if we do this, I still don't understand the intentions. If a
Sequence is multi-pass, then we can satisfy requirements of
Collection, as already noted in the discussion. Why should we keep
Sequence at all, then? Then let's remove it and rename
SinglePassSequence to Sequence???

These kinds of decisions are all part of the discussion.

···

on Mon Jun 27 2016, Anton Zhilin <swift-evolution@swift.org> wrote:

--
Dave


(Matthew Johnson) #13

Comments inline

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

Agreed.

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

I agree that that is currently what the documentation allows and requires.
Maybe we do need to separate finiteness from multipass-ness. There's
certainly no reason one can't make multiple passes over a portion of an
infinite sequence.

I have a use-case for this below. Graphical manipulation using a repeatable sequence of random numbers.

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

See below… I am definitely guilty of this. That said, if we had explicit infinite sequences (with subscripts), I would use those instead of collections for these use-cases.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be
for-in able)

[Note: the best way to represent “single destructive pass” today is to
constrain Iterator to be a reference type. Otherwise, it's both easy to
create an Iterator that can be used to make multiple passes (by
copying), and to create a truly single-pass Iterator that suggests it
has value semantics. These are both potentially misleading situations]

Hmm… This is a really good point. I can see why you are thinking of making it a reference type.

If a particular iterator can safely be cloned mid-stream, then it can provide its own interface to allow that. The value type makes a promise which can’t always be kept.

Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)

Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
  for-in-able?

I would think the potentially infinite should require for-in-until (even if you explicitly set until to false to create an infinite loop), but collection would allow for-in (with optional until). That way you have to acknowledge the possibility of an infinite sequence/iterator.

This is an interesting idea. I think I like it!

2. Presumably Collection refines Sequence. Does Sequence refine
  Iterator? IMO that would create the same problematic programming
  model we have today.

Sequence vends iterators. (a sequence is NOT a refinement of iterator, it just creates them as needed)

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.

Yes.

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately...

The other thing I am concerned about here is that we're addressing real
use-cases with these distinctions. For example, do people commonly get
in trouble with infinite sequences today?

Probably not… because they avoid infinite sequences to avoid getting into trouble. I was bitten a few times early on (I assumed map was lazy) and just avoided them until recently.

I think they would be used more often if you could guarantee that their use was safe (i.e. being forced to consider the infinite possibility). I would like to have a bunch of infinite sequences that could easily be refined to collections. The ones I would use most often would be the natural numbers and random numbers. Also imagine, an infinite sequence of random colors which look fairly good together. Markov chains, die rolls… there are a lot of potential uses that become interesting once the threat of accidental infinite loop has been removed...

As a real world example of the "effectively infinite” sequence, this weekend I created a Swift 3 collection of repeatable random values (of any type conforming to a protocol). I would have used sequence if the subscript behavior was available on it, since I do intend it to be an infinite sequence in most cases (Apologies for the lack of comments, it is not yet prepared for public consumption):
https://gist.github.com/jonhull/3655672529f8cf5b2eb248583d2cafb9

The use case here is to create a hand-drawn look for a CGPath by breaking it up into pieces and then jiggling the pieces about (using random CGVectors). I quickly realized that I needed a repeatable source of randomness (otherwise the drawing would move each time it was redrawn), and thus a multi-pass sequence.

Thank you for sharing this example. It's a good one.

···

Sent from my iPad

On Jun 27, 2016, at 8:32 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

On Jun 27, 2016, at 9:39 AM, Dave Abrahams <dabrahams@apple.com> wrote:

on Sun Jun 26 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

I am a little bit nervous every time I use this, as it has the potential for an “effectively infinite” loop, but is proving useful throughout my project.

Thanks,
Jon

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

--
-Dave



(Dave Abrahams) #14

Comments inline

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

Agreed.

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

I agree that that is currently what the documentation allows and requires.
Maybe we do need to separate finiteness from multipass-ness. There's
certainly no reason one can't make multiple passes over a portion of an
infinite sequence.

I have a use-case for this below. Graphical manipulation using a repeatable sequence of random numbers.

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

See below… I am definitely guilty of this. That said, if we had
explicit infinite sequences (with subscripts), I would use those
instead of collections for these use-cases.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be
for-in able)

[Note: the best way to represent “single destructive pass” today is to
constrain Iterator to be a reference type. Otherwise, it's both easy to
create an Iterator that can be used to make multiple passes (by
copying), and to create a truly single-pass Iterator that suggests it
has value semantics. These are both potentially misleading situations]

Hmm… This is a really good point. I can see why you are thinking of making it a reference type.

If a particular iterator can safely be cloned mid-stream, then it can
provide its own interface to allow that. The value type makes a
promise which can’t always be kept.

Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)

Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
  for-in-able?

I would think the potentially infinite should require for-in-until
(even if you explicitly set until to false to create an infinite
loop), but collection would allow for-in (with optional until). That
way you have to acknowledge the possibility of an infinite
sequence/iterator.

Are you proposing a new language feature? We could also do this with

    for i in allIntegers.until(isPrime)
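That `until(_:)` adaptor could be sketched like this (my implementation; `allIntegers` is defined here only for illustration): it lazily truncates a potentially infinite sequence at the first element matching the predicate, so the loop terminates as long as such an element exists.

```swift
extension Sequence {
    /// Lazily truncates the sequence just before the first element for
    /// which `stops` returns true.
    func until(_ stops: @escaping (Element) -> Bool) -> AnySequence<Element> {
        AnySequence(self.lazy.prefix(while: { !stops($0) }))
    }
}

let allIntegers = sequence(first: 0) { $0 + 1 }  // infinite
var collected: [Int] = []
for i in allIntegers.until({ $0 * $0 > 20 }) {
    collected.append(i)  // 0, 1, 2, 3, 4 — stops before 5 (25 > 20)
}
```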

2. Presumably Collection refines Sequence. Does Sequence refine
  Iterator? IMO that would create the same problematic programming
  model we have today.

Sequence vends iterators. (a sequence is NOT a refinement of iterator,
it just creates them as needed)

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.

Yes.

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately...

The other thing I am concerned about here is that we're addressing real
use-cases with these distinctions. For example, do people commonly get
in trouble with infinite sequences today?

Probably not… because they avoid infinite sequences to avoid getting
into trouble. I was bitten a few times early on (I assumed map was
lazy) and just avoided them until recently.

I think they would be used more often if you could guarantee that
their use was safe (i.e. being forced to consider the infinite
possibility). I would like to have a bunch of infinite sequences that
could easily be refined to collections. The ones I would use most
often would be the natural numbers and random numbers. Also imagine,
an infinite sequence of random colors which look fairly good together.
Markov chains, die rolls… there are a lot of potential uses that
become interesting once the threat of accidental infinite loop has
been removed...

As a real world example of the "effectively infinite” sequence, this
weekend I created a Swift 3 collection of repeatable random values (of
any type conforming to a protocol). I would have used sequence if the
subscript behavior was available on it, since I do intend it to be an
infinite sequence in most cases (Apologies for the lack of comments,
it is not yet prepared for public consumption):
https://gist.github.com/jonhull/3655672529f8cf5b2eb248583d2cafb9

The use case here is to create a hand-drawn look for a CGPath by
breaking it up into pieces and then jiggling the pieces about (using
random CGVectors). I quickly realized that I needed a repeatable
source of randomness (otherwise the drawing would move each time it
was redrawn), and thus a multi-pass sequence.

I am a little bit nervous every time I use this, as it has the
potential for an “effectively infinite” loop, but is proving useful
throughout my project.

Thanks for the example; that helps.

···

on Mon Jun 27 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

On Jun 27, 2016, at 9:39 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sun Jun 26 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

Thanks,
Jon

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

--
-Dave

--
Dave


(Dave Abrahams) #15

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately…
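
The hazard described above can be shown with `sequence(first:next:)`: an eager `map` over an infinite sequence never returns, while the lazy variant is safe because no iteration happens until elements are demanded.

```swift
let naturals = sequence(first: 1) { $0 + 1 }     // infinite
// let squares = naturals.map { $0 * $0 }        // eager: loops forever
let lazySquares = naturals.lazy.map { $0 * $0 }  // safe: defers all work
let firstThree = Array(lazySquares.prefix(3))    // [1, 4, 9]
```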

I think this is pointing in the right general direction. It would
make working with `Sequence` much more straightforward and allow us to
depend on the multi-pass property that is true in practice of the most
common models of `Sequence`.

But I agree that we should give much more careful consideration to
finite / infinite generally and for..in specifically.

Now that I have been thinking about the finite / infinite distinction
more closely I have begun to notice a lot of code that is written
generically using `Sequence` where a for..in loop is really what is
required; however, the “finite sequence” precondition is not explicitly
stated. Interestingly, this is the case with the standard library’s
eager `map` (but not the case with `dropLast` which explicitly notes
the precondition). I have been somewhat surprised to realize how
common this “bug” is (i.e. not stating a precondition). I think we
have gotten away with it thus far because the sequences most people
use most of the time in practice are finite. But that doesn’t mean we
should accept this as good enough - IMO it is way too easy to forget to
document this precondition (and obviously easier for users to overlook
than preconditions that are actually encoded in the type system,
violations of which are caught at compile time).

The fact that this pattern is so pervasive is what I meant when I said
for..in “naturally” requires a finite sequence.

IMO it’s better to encode preconditions in the type system when that
is practical, and especially when the precondition is shared by a vast
majority of code written using a particular construct (in this case a
for..in loop written using the most generic for..in-able protocol).

I think the safest solution is to take the position that writing an
infinite loop is relatively uncommon and is a more “advanced”
technique, and thus should be done explicitly. Do people really write
infinite loops often enough that the convenience of using for..in when
writing infinite loops outweighs the safety benefit of preventing
accidental infinite loops? I haven’t seen a compelling argument for
this.

Good questions. I'd also add: “do infinite sequences come up often
enough that accidentally looping on them forever is a problem?”

That’s a good question as well. In practice the frequency of infinite
sequences likely depends on the domain.

IMO this falls into the same category as “do single pass sequences
come up often enough that attempting to iterate over them twice is a
problem?” To paraphrase your previous post:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(you can’t assume it is finite!) on it, so they write libraries
of functions that may iterate a Sequence to termination. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be finite. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

I definitely don’t mean to put words in your mouth here, but the
logical structure of the argument appears identical to me regardless
of which issue it is applied to. I am only trying to make that point.

Oh, I fully agree. We're *in* this discussion because neither
single-pass nor infinite sequences come up very often. It raises the
question of whether the conceptual framework ought to accommodate them at
all, and if they should be included, how they should be represented.

The quotation you cited above isn't arguing that single-pass sequences
are important enough to represent. It is trying to point out that the
refinement relationship between single- and multipass sequences has an
undesirable effect. I'm certain the same thing could occur for infinite
and finite sequences.

FWIW, Max pointed out to me on Friday that Scala's concept for
possibly-single-pass sequences is called “TraversableOnce.” I think
that name goes a long way to solving the problem. I'm not sure how we'd
apply the same idea to finite/infinite sequences, though.

···

on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 27, 2016, at 11:46 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 26, 2016, at 10:56 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

If we adopt that position then for..in would need to be built on top
of a guaranteed finite construct. This would allow programmers to
continue writing generic code against the most generic for..in-able
construct while eliminating a precondition that is often (usually?)
unstated and likely unconsidered.

If we do decide to move forward with infinite for..in loops I think we
need to establish strong guidance around how to properly write generic
code with these protocols. Should such code really be constrained to
`Collection` rather than `Sequence` (i.e. should a potentially
infinite `Sequence` have an eager map)? If this is the guidance,
should it be paired with guidance that all finite sequences should
conform to `Collection`? Or is it sufficient to just educate
developers about this issue and expect people to document the “finite
Sequence” precondition when the constraint is `Sequence` rather than
`Collection`?

I hope we will give serious consideration to these questions while
this topic is open for discussion.

-Matthew

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

--
-Dave

--
Dave


(Matthew Johnson) #16

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your

statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately…

I think this is pointing in the right general direction. It would
make working with `Sequence` much more straightforward and allow us to
depend on the multi-pass property that is true in practice of the most
common models of `Sequence`.

But I agree that we should give much more careful consideration to
finite / infinite generally and for..in specifically.

Now that I have been thinking about the finite / infinite distinction
more closely I have begun to notice a lot of code that is written
generically using `Sequence` where a for..in loop is really what is
required; however, the “finite sequence” precondition is not explicitly
stated. Interestingly, this is the case with the standard library’s
eager `map` (but not the case with `dropLast` which explicitly notes
the precondition). I have been somewhat surprised to realize how
common this “bug” is (i.e. not stating a precondition). I think we
have gotten away with it thus far because the sequences most people
use most of the time in practice are finite. But that doesn’t mean we
should accept this as good enough - IMO it is way too easy to forget to
document this precondition (and obviously easier for users to overlook
than preconditions that are actually encoded in the type system,
violations of which are caught at compile time).

The fact that this pattern is so pervasive is what I meant when I said
for..in “naturally” requires a finite sequence.

IMO it’s better to encode preconditions in the type system when that
is practical, and especially when the precondition is shared by a vast
majority of code written using a particular construct (in this case a
for..in loop written using the most generic for..in-able protocol).

I think the safest solution is to take the position that writing an
infinite loop is relatively uncommon and is a more “advanced”
technique, and thus should be done explicitly. Do people really write
infinite loops often enough that the convenience of using for..in when
writing infinite loops outweighs the safety benefit of preventing
accidental infinite loops? I haven’t seen a compelling argument for
this.

Good questions. I'd also add: “do infinite sequences come up often
enough that accidentally looping on them forever is a problem?”

That’s a good question as well. In practice the frequency of infinite
sequences likely depends on the domain.

IMO this falls into the same category as “do single pass sequences
come up often enough that attempting to iterate over them twice is a
problem?” To paraphrase your previous post:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(you can’t assume it is finite!) on it, so they write libraries
of functions that may iterate a Sequence to termination. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be finite. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

I definitely don’t mean to put words in your mouth here, but the
logical structure of the argument appears identical to me regardless
of which issue it is applied to. I am only trying to make that point.

Oh, I fully agree. We're *in* this discussion because neither
single-pass nor infinite sequences come up very often. It raises the
question of whether the conceptual framework ought to accommodate them at
all, and if they should be included, how they should be represented.

We’ve considered good practical examples of both. They exist and people use them. I think we’re better off having a common vocabulary for them rather than requiring everyone who uses these concepts to invent their own.

I agree that the question of how best to represent them is crucial.

The quotation you cited above isn't arguing that single-pass sequences
are important enough to represent. It is trying to point out that the
refinement relationship between single- and multipass sequences has an
undesirable effect. I'm certain the same thing could occur for infinite
and finite sequences.

I’m not sure I agree. Neither are clearly modeled in the current library so I don’t think we know what the impact would be if they were. I’m willing to bet that people would choose the correct constraints if they were available (modeled accurately in the standard library, well documented, etc).

What both of these *definitely* have in common is that they are purely semantic requirements that cannot be stated explicitly as requirements using syntax of the language. You have reminded us many times that protocols should not just be about syntax, but should also emphasize semantics. We shouldn’t shy away from introducing a protocol that doesn’t add any syntactic requirements if it is necessary to capture important semantic requirements.

FWIW, Max pointed out to me on Friday that Scala's concept for
possibly-single-pass sequences is called “TraversableOnce.” I think
that name goes a long way to solving the problem. I'm not sure how we'd
apply the same idea to finite/infinite sequences, though.

I was thinking about how best to represent these while mowing the lawn this afternoon and had an interesting thought. If we add one additional protocol and pin down some of the semantics that have been discussed in this thread we could have a pretty straightforward model that captures the various semantics using a protocol in each of 4 quadrants:

                         Single Pass                  Multi Pass

Potentially
Infinite                 IteratorProtocol  <-------   Sequence      (makeIterator())

Finite                   FiniteIteratorProtocol <--   Collection    (makeIterator())

The semantic refinement on `Sequence` is that it must now produce a new iterator every time one is requested (and they must all produce identical sequences of values as long as the sequence was not mutated in between calls to `makeIterator`).

The semantic refinement on `Collection` is that it must now be finite and its `Iterator` must conform to `FiniteIteratorProtocol` rather than `IteratorProtocol`.

`FiniteIteratorProtocol` is the new protocol which introduces the semantic requirement that calls to `next` must “eventually” produce `nil`. “Eventually” might be a bit fuzzy but should carry the intent that your program won’t hang or be killed by the OS (i.e. diverge) if you iterate until `nil` is reached.
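
A minimal sketch of this four-quadrant model (the protocol names here are hypothetical stand-ins, not stdlib declarations, and the finiteness/multipass requirements are purely semantic, so they appear only as comments):

```swift
// Single pass, potentially infinite (the role of IteratorProtocol).
protocol SinglePassIterator {
    associatedtype Element
    mutating func next() -> Element?
}

// Single pass, finite: semantic refinement only —
// next() must eventually return nil.
protocol FiniteSinglePassIterator: SinglePassIterator {}

// Multi-pass, potentially infinite (the role of Sequence):
// every makeIterator() must start a fresh, identical pass.
protocol MultipassSequence {
    associatedtype Iterator: SinglePassIterator
    func makeIterator() -> Iterator
}

// Multi-pass and finite (the role of Collection):
// its Iterator must be finite.
protocol FiniteMultipassSequence: MultipassSequence
    where Iterator: FiniteSinglePassIterator {}

// A countdown occupies the strongest quadrant: finite and multi-pass.
struct CountdownIterator: FiniteSinglePassIterator {
    var current: Int
    mutating func next() -> Int? {
        guard current > 0 else { return nil }
        defer { current -= 1 }
        return current
    }
}

struct Countdown: FiniteMultipassSequence {
    let start: Int
    func makeIterator() -> CountdownIterator {
        CountdownIterator(current: start)
    }
}

var pass = Countdown(start: 3).makeIterator()
var values: [Int] = []
while let v = pass.next() { values.append(v) }   // [3, 2, 1], terminates
```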

You mentioned in another thread that for..in wouldn’t necessarily have to use a single protocol (or refinement relationship). Under this model I would propose that we make for..in operate on top of `FiniteIteratorProtocol`, but also have the ability to get the iterator from a `Collection`. (I’m still advocating for restricting the for..in sugar to finite iteration, but the same idea would work at the “potentially infinite” level using `IteratorProtocol` and `Sequence`.) The only change from current state (aside from the finite restriction) is to allow you to directly provide an iterator to the for..in loop rather than requiring the compiler to get a new one via a call to `makeIterator`.

If we do adopt something like this model, I think a couple of guidelines are clear regardless of whether for..in works exclusively with finite concepts or also admits potentially infinite ones. All code that requires terminating iteration should be written with `Collection` or `FiniteIteratorProtocol`, not `Sequence` or `IteratorProtocol`. Following from this, all finite sequences should conform to `Collection`, not just `Sequence` so it can continue to work with all code that requires finite sequences. One advantage of restricting for..in to the finite concepts is that it encourages people to follow this guideline so that their types will be usable with for..in, finite constraints, etc.

In order to make it easy to follow the second guideline, the library should strive to make it as easy as possible to implement a forward-only `Collection` if you can vend an iterator and guarantee it is finite. Ideally it would be no more difficult than it is to implement `Sequence` today. Further customization of the implementation would be done as an optimization, not just to achieve basic capabilities.

One other topic that came up is whether the single pass concepts should be required to have reference semantics. If we go in that direction it is clear that in the previous model `IteratorProtocol` would have a class requirement which would be inherited by `FiniteIteratorProtocol`.

However, I am not convinced we should require reference semantics here. This requirement doesn’t exist today and I haven’t heard anyone say it has directly been a problem. The problem seems to be the due to the fact that `Sequence` is missing a semantic requirement everyone expects. I don’t know the rationale that was originally behind that decision and am curious. Is it because for..in works via `Sequence` and it was desirable to have for..in work with single pass sequences? The above suggested tweak to how for..in addresses that.

Further, consider an algorithm which has multiple “choices” available and would like to be able to take advantage of backtracking. This can be achieved trivially with value semantic iterators by simply making a copy at any time during iteration and returning to the copy if a dead end is reached. You could sort of mimic this with a reference semantic iterator by conforming the iterator itself to `Sequence` or `Collection` and calling `makeIterator` to “clone” it. Obviously this would only be possible with intimate knowledge of the iterator and would usually not be possible retroactively, whereas it would be trivial to take advantage of iterators that have value semantics.
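
The backtracking point can be seen with the standard library’s `IndexingIterator`, which is a value type: copying the iterator checkpoints the current position, and returning to the copy rewinds.

```swift
var it = [1, 2, 3, 4, 5].makeIterator()
let first = it.next()            // 1
let checkpoint = it              // value copy: position saved after 1
let second = it.next()           // 2 (checkpoint is unaffected)
let third = it.next()            // 3
var rewound = checkpoint         // backtrack to the saved position
let backtracked = rewound.next() // 2 again
```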

-Matthew

···

On Jun 27, 2016, at 2:41 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 27, 2016, at 11:46 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 26, 2016, at 10:56 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

If we adopt that position then for..in would need to be built on top
of a guaranteed finite construct. This would allow programmers to
continue writing generic code against the most generic for..in-able
construct while eliminating a precondition that is often (usually?)
unstated and likely unconsidered.

If we do decide to move forward with infinite for..in loops I think we
need to establish strong guidance around how to properly write generic
code with these protocols. Should such code really be constrained to
`Collection` rather than `Sequence` (i.e. should a potentially
infinite `Sequence` have an eager map)? If this is the guidance,
should it be paired with guidance that all finite sequences should
conform to `Collection`? Or is it sufficient to just educate
developers about this issue and expect people to document the “finite
Sequence” precondition when the constraint is `Sequence` rather than
`Collection`?

I hope we will give serious consideration to these questions while
this topic is open for discussion.

-Matthew

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave


--
-Dave

--
Dave


(Jon Hull) #17

I was, but I think I would be ok with this too, as long as the compiler caught the issue and suggested a fix. The key is that we are forced to deal with the possibility of an infinite sequence/iterator. Also:

for i in infSequence.maxLoops(3000) //or a better name for this idea

Thanks,
Jon
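
For what it’s worth, the `maxLoops` idea is expressible today with `prefix(_:)`, which caps iteration over a potentially infinite sequence at a fixed count:

```swift
let naturals = sequence(first: 1) { $0 + 1 }  // infinite
var count = 0
for _ in naturals.prefix(3000) { count += 1 } // terminates after 3000
```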

···

On Jun 28, 2016, at 10:51 AM, Dave Abrahams <dabrahams@apple.com> wrote:

1. Presumably these are all for-in-able. What makes something
for-in-able?

I would think the potentially infinite should require for-in-until
(even if you explicitly set until to false to create an infinite
loop), but collection would allow for-in (with optional until). That
way you have to acknowledge the possibility of an infinite
sequence/iterator.

Are you proposing a new language feature? We could also do this with

   for i in allIntegers.until(isPrime)


(Jon Hull) #18

One more question on this.

How would we handle the opt-out safety for people who are intentionally trying to create an infinite loop?

I don’t think we can make .until() safely lazy for iterators which have reference semantics and are destructive single-pass, so it has to be eager. If .until() is eager, then .until(false) will fall into an infinite loop before the for-in loop is even run.

Maybe there’s just a way to plug an iterator/sequence directly in, and then silence the warning? Or we could do for-in-until…

Thanks,
Jon

···

On Jun 28, 2016, at 10:51 AM, Dave Abrahams <dabrahams@apple.com> wrote:

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
for-in-able?

I would think the potentially infinite should require for-in-until
(even if you explicitly set until to false to create an infinite
loop), but collection would allow for-in (with optional until). That
way you have to acknowledge the possibility of an infinite
sequence/iterator.

Are you proposing a new language feature? We could also do this with

   for i in allIntegers.until(isPrime)


(Matthew Johnson) #19

Comments inline

I should be clear up-front about the main problem I'm trying to solve:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(assume you can make only a single-pass!) on it, so they write libraries
of functions that may make multiple passes over Sequences. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be multi-pass. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

Agreed.

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

I agree that that is currently what the documentation allows and requires.
Maybe we do need to separate finiteness from multipass-ness. There's
certainly no reason one can't make multiple passes over a portion of an
infinite sequence.

I have a use-case for this below. Graphical manipulation using a repeatable sequence of random numbers.

[Just to complicate things... I wonder if finiteness is really
meaningful. It's easy to create a finite sequence that's so long that
it's “effectively infinite.”]

See below… I am definitely guilty of this. That said, if we had
explicit infinite sequences (with subscripts), I would use those
instead of collections for these use-cases.
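
One concrete instance of “effectively infinite”: `0...Int.max` is a perfectly finite `Collection`, but an eager operation over it (e.g. a non-lazy `map`) would not complete in practice, while bounded access is fine.

```swift
let effectivelyInfinite = 0...Int.max   // finite, in principle
// effectivelyInfinite.map { $0 }       // would never finish in practice
let firstThree = Array(effectivelyInfinite.prefix(3))  // [0, 1, 2]
```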

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be
for-in able)

[Note: the best way to represent “single destructive pass” today is to
constrain Iterator to be a reference type. Otherwise, it's both easy to
create an Iterator that can be used to make multiple passes (by
copying), and to create a truly single-pass Iterator that suggests it
has value semantics. These are both potentially misleading situations]

Hmm… This is a really good point. I can see why you are thinking of making it a reference type.

If a particular iterator can safely be cloned mid-stream, then it can
provide its own interface to allow that. The value type makes a
promise which can’t always be kept.

Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)

Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

This is a reasonable structure, but there are important details missing.

1. Presumably these are all for-in-able. What makes something
for-in-able?

I would think the potentially infinite should require for-in-until
(even if you explicitly set until to false to create an infinite
loop), but collection would allow for-in (with optional until). That
way you have to acknowledge the possibility of an infinite
sequence/iterator.

Are you proposing a new language feature?

That was my impression. It’s an interesting idea. It wouldn’t guarantee termination but would require developers to consider termination and therefore prevent accidents, which are the concerns I am raising.

We could also do this with

   for i in allIntegers.until(isPrime)

Is `until` lazy or eager here? I imagine you’re thinking it would be lazy because making it eager would introduce unnecessary allocation. However, if it’s lazy it is an exception to the explicit laziness Swift has adopted.

Wouldn’t it be better to require explicit laziness `allIntegers.lazy.until` if for..in is going to require finite sequences and we’re going to use a library solution to support infinite sequences? It’s more verbose but more consistent with how laziness is currently handled. It also doesn’t privilege any specific operator (which isn’t necessary if we do this in the library).

If we went with a language solution I imagine it would look something like this:

for i in allIntegers until i.isPrime && i > 1000 where i.isEven {
    // all even integers < the first prime > 1000
}

IIRC the `until` clause has already been discussed as syntactic sugar for early termination. Its utility wouldn’t be limited to looping over infinite sequences, however it would be *required* when you loop over an infinite sequence.

This sugar wouldn’t have to be introduced in Swift 3. We could make for..in require finite sequences in Swift 3 and add it later if there is sufficient demand. In the meantime people could iterate infinite sequences manually and / or we could add library support via lazy operators that select a prefix (if we are willing to live with the fact that in practice termination may depend on arguments to the operator).
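
For comparison, something close to `until` exists today as `prefix(while:)`. On a plain `Sequence` it is eager but short-circuiting — it stops at the first element that fails the predicate, so it still terminates on an infinite sequence whenever the predicate eventually fails — and a fully lazy variant is available via `.lazy`:

```swift
let naturals = sequence(first: 1) { $0 + 1 }       // infinite
let bounded = naturals.prefix(while: { $0 < 5 })   // eager, short-circuits: [1, 2, 3, 4]
var collected: [Int] = []
for i in naturals.lazy.prefix(while: { $0 < 4 }) { // fully lazy variant
    collected.append(i)
}
```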

···

On Jun 28, 2016, at 12:51 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:
on Mon Jun 27 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

On Jun 27, 2016, at 9:39 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Sun Jun 26 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:

2. Presumably Collection refines Sequence. Does Sequence refine
Iterator? IMO that would create the same problematic programming
model we have today.

Sequence vends iterators. (a sequence is NOT a refinement of iterator,
it just creates them as needed)

Perhaps the language should accept types conforming to either of two
unrelated protocols (your Sequence and Iterator, as you've described
them, with no refinement relationship) for for-in.

Yes.

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately...

The other thing I am concerned about here is that we're addressing real
use-cases with these distinctions. For example, do people commonly get
in trouble with infinite sequences today?

Probably not… because they avoid infinite sequences to avoid getting
into trouble. I was bitten a few times early on (I assumed map was
lazy) and just avoided them until recently.

I think they would be used more often if you could guarantee that
their use was safe (i.e. being forced to consider the infinite
possibility). I would like to have a bunch of infinite sequences that
could easily be refined to collections. The ones I would use most
often would be the natural numbers and random numbers. Also imagine,
an infinite sequence of random colors which look fairly good together.
Markov chains, die rolls… there are a lot of potential uses that
become interesting once the threat of accidental infinite loop has
been removed...

As a real world example of the "effectively infinite” sequence, this
weekend I created a Swift 3 collection of repeatable random values (of
any type conforming to a protocol). I would have used sequence if the
subscript behavior was available on it, since I do intend it to be an
infinite sequence in most cases (Apologies for the lack of comments,
it is not yet prepared for public consumption):
https://gist.github.com/jonhull/3655672529f8cf5b2eb248583d2cafb9

The use case here is to create a hand-drawn look for a CGPath by
breaking it up into pieces and then jiggling the pieces about (using
random CGVectors). I quickly realized that I needed a repeatable
source of randomness (otherwise the drawing would move each time it
was redrawn), and thus a multi-pass sequence.
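
A minimal sketch of the repeatable-randomness idea (not the linked gist; an illustrative xorshift generator): because the seed is stored by value, every `makeIterator()` replays the identical stream, which is exactly the multi-pass property the redraw needs.

```swift
// Hypothetical repeatable pseudo-random Sequence. Multi-pass because
// each makeIterator() restarts from the stored seed.
struct SeededRandoms: Sequence {
    let seed: UInt64
    func makeIterator() -> AnyIterator<UInt64> {
        var state = seed
        return AnyIterator {
            // xorshift64: simple and deterministic
            state ^= state << 13
            state ^= state >> 7
            state ^= state << 17
            return state
        }
    }
}

let a = Array(SeededRandoms(seed: 42).prefix(5))
let b = Array(SeededRandoms(seed: 42).prefix(5))
// a == b: two passes produce the same "jiggle" values
```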

I am a little bit nervous every time I use this, as it has the
potential for an “effectively infinite” loop, but is proving useful
throughout my project.

Thanks for the example; that helps.

Thanks,
Jon

Thanks,
Jon

on Wed Jun 22 2016, David Waite <david-AT-alkaline-solutions.com> wrote:

On Jun 22, 2016, at 2:57 PM, Dave Abrahams via swift-evolution <swift-evolution@swift.org> wrote:

<Ahem> “Iterators,” please.

That makes me happy - for some reason I thought it was still GeneratorProtocol

destructively, but such Generators would not conform to the needs of
Sequence. As such, the most significant impact would be the inability
to use such Generators in a for..in loop,

Trying to evaluate this statement, it's clear we're missing lots of
detail here:

* Would you remove Sequence?
* If so, what Protocol would embody “for...in-able?”

No, I would just remove the allowance in the documentation and API
design for a destructive/consuming iteration. Sequence would be the
interface to getting access to repeatable iteration, without the need
for meeting the other requirements for Collection.

That would be wrong unless there exist substantial examples of a
multipass Sequence that *can't* meet the other requirements of
Collection without loss of efficiency. And since I can write an adaptor
that turns any multipass sequence into a Collection, I think it's
trivial to prove that no such examples exist.

--
-Dave

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution


(Dave Abrahams) #20

Can’t a Sequence be potentially infinite, whereas a collection has a
defined count/endIndex? Other than that, I agree with your statement.

Here is what I see as the appropriate structure:

Iterator: Single destructive pass, potentially infinite, (should be for-in able)
Sequence: Guaranteed non-destructive multi-pass (vends Iterators),
potentially infinite, (should be subscript-able, gain most of
collection, but lose anything that relies on it ending)
Collection: Multi-pass, guaranteed finite, (no changes from current
form, except extra inits from Iterator/Sequence with end conditions)

Right now we are allowed to have an infinite sequence, but calling
dropLast or non-lazy map will cause an infinite loop. These cases
could be made much safer by considering the potentially infinite and
finite cases separately…

I think this is pointing in the right general direction. It would
make working with `Sequence` much more straightforward and allow us to
depend on the multi-pass property that is true in practice of the most
common models of `Sequence`.

But I agree that we should give much more careful consideration to
finite / infinite generally and for..in specifically.

Now that I have been thinking about the finite / infinite distinction
more closely I have begun to notice a lot of code that is written
generically using `Sequence` where a for..in loop is really what is
required, however the “finite sequence” precondition is not explicitly
stated. Interestingly, this is the case with the standard library’s
eager `map` (but not the case with `dropLast` which explicitly notes
the precondition). I have been somewhat surprised to realize how
common this “bug” is (i.e. not stating a precondition). I think we
have gotten away with it thus far because the sequences most people
use most of the time in practice are finite. But that doesn’t mean we
should accept this as good enough - IMO it is way too easy to forget to
document this precondition (and obviously easier for users to overlook
than preconditions that are actually encoded in the type system,
violations of which are caught at compile time).

The fact that this pattern is so pervasive is what I meant when I said
for..in “naturally” requires a finite sequence.

IMO it’s better to encode preconditions in the type system when that
is practical, and especially when the precondition is shared by a vast
majority of code written using a particular construct (in this case a
for..in loop written using the most generic for..in-able protocol).

I think the safest solution is to take the position that writing an
infinite loop is relatively uncommon and is a more “advanced”
technique, and thus should be done explicitly. Do people really write
infinite loops often enough that the convenience of using for..in when
writing infinite loops outweighs the safety benefit of preventing
accidental infinite loops? I haven’t seen a compelling argument for
this.

Good questions. I'd also add: “do infinite sequences come up often
enough that accidentally looping on them forever is a problem?”

That’s a good question as well. In practice the frequency of infinite
sequences likely depends on the domain.

IMO this falls into the same category as “do single pass sequences
come up often enough that attempting to iterate over them twice is a
problem. To paraphrase your previous post:

Today, people see a beautiful, simple protocol (Sequence) to which many
things conform. They don't recognize that there's a semantic restriction
(you can’t assume it is finite!) on it, so they write libraries
of functions that may iterate a Sequence to termination. They test
their libraries with the most commonly-available Sequences, e.g. Arrays
and Ranges, which happen to be finite. Their tests pass! But their
constraints are wrong, their whole model of how to write generic code
over sequences is wrong, and some of their code is wrong.

IMO this is a problematic programming model.

I definitely don’t mean to put words in your mouth here, but the
logical structure of the argument appears identical to me regardless
of which issue it is applied to. I am only trying to make that point.

Oh, I fully agree. We're *in* this discussion because neither
single-pass nor infinite sequences come up very often. It raises the
question of whether the conceptual framework ought to accommodate them at
all, and if they should be included, how they should be represented.

We’ve considered good practical examples of both. They exist and
people use them. I think we’re better off having a common vocabulary
for them rather than requiring everyone who uses these concepts to
invent their own.

I agree that the question of how best to represent them is crucial.

The quotation you cited above isn't arguing that single-pass sequences
are important enough to represent. It is trying to point out that the
refinement relationship between single- and multipass sequences has an
undesirable effect. I'm certain the same thing could occur for infinite
and finite sequences.

I’m not sure I agree. Neither are clearly modeled in the current
library so I don’t think we know what the impact would be if they
were. I’m willing to bet that people would choose the correct
constraints if they were available (modeled accurately in the standard
library, well documented, etc).

IMO it's just a symptom of concept refinement when the available models
almost all conform to the more-refined concept.

What both of these *definitely* have in common is that they are purely
semantic requirements that cannot be stated explicitly as requirements
using syntax of the language. You have reminded us many times that
protocols should not just be about syntax, but should also emphasize
semantics. We shouldn’t shy away from introducing a protocol that
doesn’t add any syntactic requirements if it is necessary to capture
important semantic requirements.

FWIW, Max pointed out to me on Friday that Scala's concept for
possibly-single-pass sequences is called “TraversableOnce.” I think
that name goes a long way to solving the problem. I'm not sure how we'd
apply the same idea to finite/infinite sequences, though.

I was thinking about how best to represent these while mowing the
lawn this afternoon and had an interesting thought. If we add one
additional protocol and pin down some of the semantics that have been
discussed in this thread we could have a pretty straightforward model
that captures the various semantics using a protocol in each of 4
quadrants:

                         Single Pass                      Multi Pass

 Potentially             IteratorProtocol  <------------  Sequence
 Infinite                               makeIterator()

 Finite                  FiniteIteratorProtocol  <------  Collection
                                        makeIterator()

The semantic refinement on `Sequence` is that it must now produce a
new iterator every time one is requested (and they must all produce
identical sequences of values as long as the sequence was not mutated
in between calls to `makeIterator`).

The semantic refinement on `Collection` is that it must now be finite
and its `Iterator` must conform to `FiniteIteratorProtocol` rather
than `IteratorProtocol`.

`FiniteIteratorProtocol` is the new protocol which introduces the
semantic requirement that calls to `next` must “eventually” produce
`nil`. “Eventually” might be a bit fuzzy but should carry the intent
that your program won’t hang or be killed by the OS (i.e. diverge) if
you iterate until `nil` is reached.
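A sketch of the hypothetical `FiniteIteratorProtocol`, with an example conformance. Nothing here exists in the standard library, and the extra requirement is purely semantic:

```swift
// The hypothetical protocol: it adds no syntactic requirements, only
// the semantic guarantee that next() eventually returns nil.
protocol FiniteIteratorProtocol: IteratorProtocol {}

// An example conformance: counts down from `count`, then terminates.
struct CountdownIterator: FiniteIteratorProtocol {
    var count: Int
    mutating func next() -> Int? {
        guard count > 0 else { return nil }
        defer { count -= 1 }
        return count
    }
}

var iterator = CountdownIterator(count: 3)
var values: [Int] = []
while let value = iterator.next() { values.append(value) }
print(values)  // [3, 2, 1]
```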

I talked to Dmitri and Max about all this last night and I think we have
a simpler answer, which we'll post shortly. Please let us know if you
think it is missing something.

You mentioned in another thread that for..in wouldn’t necessarily have
to use a single protocol (or refinement relationship). Under this
model I would propose that we make for..in operate on top of
`FiniteIteratorProtocol`, but also have the ability to get the
iterator from a `Collection`. (I’m still advocating for restricting
the for..in sugar to finite iteration, but the same idea would work at
the “potentially infinite” level using `IteratorProtocol` and
`Sequence`.) The only change from the current state (aside from the
finite restriction) is to allow you to directly provide an iterator to
the for..in loop rather than requiring the compiler to get a new one
via a call to `makeIterator`.

If we do adopt something like this model, I think a couple of
guidelines are clear regardless of whether for..in works exclusively
with finite concepts or also admits potentially infinite ones. All
code that requires terminating iteration should be written with
`Collection` or `FiniteIteratorProtocol`, not `Sequence` or
`IteratorProtocol`. Following from this, all finite sequences should
conform to `Collection`, not just `Sequence` so it can continue to
work with all code that requires finite sequences. One advantage of
restricting for..in to the finite concepts is that it encourages
people to follow this guideline so that their types will be usable
with for..in, finite constraints, etc.

In order to make it easy to follow the second guideline, the library
should strive to make it as easy as possible to implement forward only
`Collection` if you can vend an iterator and guarantee it is finite.
Ideally it would be no more difficult than it is to implement
`Sequence` today. Further customization of the implementation would
be done as an optimization, not just to achieve basic capabilities.
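For reference, the adaptor Dave alludes to elsewhere in the thread — wrapping any multipass `Sequence` as a forward-only `Collection` — can be sketched roughly as follows, with illustrative names and written against today's protocols:

```swift
// Wraps any multipass Sequence as a forward-only Collection. An index
// carries the iterator state at its position; `offset` orders indices
// (Int.max marks endIndex).
struct SequenceCollection<Base: Sequence>: Collection {
    let base: Base

    struct Index: Comparable {
        var offset: Int
        var iterator: Base.Iterator?
        var element: Base.Element?
        static func == (l: Index, r: Index) -> Bool { l.offset == r.offset }
        static func < (l: Index, r: Index) -> Bool { l.offset < r.offset }
    }

    var startIndex: Index {
        var it = base.makeIterator()
        guard let first = it.next() else { return endIndex }
        return Index(offset: 0, iterator: it, element: first)
    }
    var endIndex: Index { Index(offset: Int.max, iterator: nil, element: nil) }

    func index(after i: Index) -> Index {
        var it = i.iterator!
        guard let value = it.next() else { return endIndex }
        return Index(offset: i.offset + 1, iterator: it, element: value)
    }

    subscript(i: Index) -> Base.Element { i.element! }
}

let wrapped = SequenceCollection(base: stride(from: 1, through: 3, by: 1))
print(Array(wrapped))  // [1, 2, 3]
print(wrapped.count)   // 3
```

Note that indices here are heavyweight (each carries a copy of the iterator state), which is why efficiency, rather than possibility, is the interesting question about such an adaptor.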

One other topic that came up is whether the single pass concepts
should be required to have reference semantics. If we go in that
direction it is clear that in the previous model `IteratorProtocol`
would have a class requirement which would be inherited by
`FiniteIteratorProtocol`.

However, I am not convinced we should require reference semantics
here. This requirement doesn’t exist today and I haven’t heard anyone
say it has directly been a problem. The problem seems to be due to
the fact that `Sequence` is missing a semantic requirement everyone
expects. I don’t know the rationale that was originally behind that
decision and am curious. Is it because for..in works via `Sequence`
and it was desirable to have for..in work with single pass sequences?
The above suggested tweak to how for..in addresses that.

Further, consider an algorithm which has multiple “choices” available
and would like to be able to take advantage of backtracking. This can
be achieved trivially with value semantic iterators by simply making a
copy at any time during iteration and returning to the copy if a dead
end is reached. You could sort of mimic this with a reference
semantic iterator by conforming the iterator itself to `Sequence` or
`Collection` and calling `makeIterator` to “clone” it. Obviously this
would only be possible with intimate knowledge of the iterator and
would usually not be possible retroactively, whereas it would be
trivial to take advantage of iterators that have value semantics.
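Concretely, with `Array`'s value-semantic iterator the checkpoint/backtrack pattern looks like this (a toy sketch):

```swift
// Copying a value-semantic iterator checkpoints the position; restoring
// the copy backtracks. Array's iterator is a struct, so this just works.
var it = ["a", "b", "c", "d"].makeIterator()
_ = it.next()            // consume "a"
let checkpoint = it      // value copy = free checkpoint
_ = it.next()            // try "b"...
_ = it.next()            // ...then "c": suppose this path is a dead end
it = checkpoint          // backtrack
let resumed = it.next()  // "b" again
print(resumed!)  // b
```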

Seems to me that in either case you need intimate knowledge of the
iterator. You either need to know that it has value semantics, or that
it conforms to Sequence, right?

···

on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 27, 2016, at 2:41 PM, Dave Abrahams <dabrahams@apple.com> wrote:
on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 27, 2016, at 11:46 AM, Dave Abrahams <dabrahams@apple.com> wrote:
on Mon Jun 27 2016, Matthew Johnson <matthew-AT-anandabits.com> wrote:

On Jun 26, 2016, at 10:56 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

-Matthew

If we adopt that position then for..in would need to be built on top
of a guaranteed finite construct. This would allow programmers to
continue writing generic code agains the most generic for..in-able
construct while eliminating a precondition that is often (usually?)
unstated and likely unconsidered.

If we do decide to move forward with infinite for..in loops I think we
need to establish strong guidance around how to properly write generic
code with these protocols. Should such code really be constrained to
`Collection` rather than `Sequence` (i.e. should a potentially
infinite `Sequence` have an eager map)? If this is the guidance,
should it be paired with guidance that all finite sequences should
conform to `Collection`? Or is it sufficient to just educate
developers about this issue and expect people to document the “finite
Sequence” precondition when the constraint is `Sequence` rather than
`Collection`?

I hope we will give serious consideration to these questions while
this topic is open for discussion.

-Matthew

Thanks,
Jon


--
Dave