I have also been thinking about this problem for the last week or so
(as well as the finite/infinite bit). I don’t really have anything
detailed that is ready to share (and it sounds like you are headed in
a different direction now). I still wanted to share the gist of my
thoughts, in case they help spark ideas in others…
Hi Jonathan,
Thanks for your thoughts...
My thought was to follow the first rejected approach: removing
sequence and letting the Iterator protocol model single-pass.
Iterators would be reference types.
I followed a similar path, and my version also has a pretty large
duplication of API between Iterator and Collection… the difference
though, is I think I have a way to avoid most external duplication of
API.
Basically, I added back in a super-minimal protocol to fill the
structural gap left by Sequence. I call it “IteratorProvider” and it
only has a single function which vends an iterator. Collection
adheres to this, and Iterator adheres to it by returning itself. All
of the other methods from Sequence remain on Iterator. Thus anyone
with API that only needs a single pass would take a IteratorProvider
and then work on the iterator it provides.
That leaves us back where we are now: people will see that
IteratorProvider is a simple, universal protocol for both single-and
multi-pass sequences, write algorithm libraries that depend on
multi-pass-ness, and test them with the most prevalent examples, which
happen to be multipass.
The big difference is that Collection and Iterator are still separate
protocols,
That's currently the case.
iterator is a reference type,
That's slightly different, but we have reasons for not wanting to make
that requirement.
and most of the methods from sequence are now on iterator.
I think this makes more sense semantically than the current model (or
renaming sequence). I also really think it is important to have
iterators be reference types (anything else is really a lie)
Once they're reference types, you might as well just make them conform
to the same protocol as collections for algorithms, because none of the
“mutating/nonmutating” distinctions will be honored anyway. Furthermore
it implies efficiency costs we're not prepared to pay. The main problem
is that the right way to model single-pass traversal is with move-only
types, which can't be copied... and that's not something we can express
in Swift today.
The rejected “consumedIn” idea also gave me an idea of how to reduce
the internal API repetition, if desired.
Have a fileprivate method on Iterator (I will call it “consumedIn”
here, but it is private, so call it whatever) that wraps the Iterator
in a collection. The multi-pass-ness of that secret collection is a
lie, but it is fileprivate so it should never get into the wild where
someone can find that out. Then you would just define map() etc… on an
extension of Iterator and have them forward to “self.consumedIn.map”,
etc…. It does still have duplication of definitions, but the
implementations would be in a single spot.
Yeah, we thought about that, too. That doesn't really help the public
API, though. That's the problem we're trying to solve. And if we don't
expose the thing publicly, any user wishing to write her own algorithm
that works on both Iterator and Collection has to write it herself.
Another option, if the subterfuge of a secret collection is
undesirable, would be to make “consumedIn” be public and have it
create an array-like collection. The default implementation would
actually make an eager copy, but specialized cases could work with the
created collection to avoid copying the iterator’s contents where
possible. Then you would remove all of the eager methods from
Iterator and just use collection’s version.
Ultimately, there are lots of interesting ideas for addressing this
space, but none of them lead to an answer that we we're willing to bet
solves the design problems and is implementable in time for Swift
3... except for the renaming.
···
on Thu Jul 14 2016, Jonathan Hull <jhull-AT-gbis.com> wrote:
Food for thought…
Thanks,
Jon
Hi,
I'd like to continue the discussion of the issue raised by David Waite
inhttp://thread.gmane.org/gmane.comp.lang.swift.evolution/21295/: <inhttp://thread.gmane.org/gmane.comp.lang.swift.evolution/21295/:>
> My main motivation for proposing this is the potential for
developer confusion. As stated during one of the previous threads on
the naming of map, flatMap, filter, etc. methods on Sequence,
Sequence has a naming requirement not typical of the rest of the
Swift standard library in that many methods on Sequence may or may
not be destructive. As such, naming methods for any extensions on
Sequence is challenging as the names need to not imply immutability.
I'd like to focus on a particular point: methods on Sequence can
consume elements, but the APIs are not markedmutating.
Dave Abrahams, Max Moiseev, and I have discussed this issue and we
agree this problem is severe and worth solving, we also think that the
likely solutions would be source-breaking, so it is important that we
discuss it now.
We have discussed a few options.
- Rejected option: remove Sequence, let IteratorProtocol model
single-pass data streams
- Rejected option: use a syntactic marker, like sequence.consumedIn.map {}
- Rejected option: mutating APIs on Sequence, non-mutating APIs on Collection
Proposed: rename Sequence to IterableOnce or TraversableOnce. We think
that Sequence does not convey the single-pass restriction clearly. The
term "sequence" has been used in math (as in "integer sequence"), and
since the math domain does not have mutation, "sequence" can be
understood to mean "multi-pass", since you can traverse a sequence of
integers an arbitrary number of times.
We think that only the last option is viable in the Swift language as
it exists now, without creating an undue burden for API vendors and
users.
For more details about rejection options, please see the full writeup:
fix-confusion-about-consuming-algorithms-on-sequence.md · GitHub
Dmitri
--
Dave