[…]
* A Swift `Sequence` is, to put it simplistically, a thing that can be
iterated over in a `for...in` loop. If it would make you happy, for the
rest of the discussion, let's suppose we called the protocol `ForLoopable`
instead of `Sequence`.
ForLoopable is so ugly. Since we’re just iterating over the elements,
how about, oh, say, `Iterable`? Hey, that looks familiar.
I'm not trying to bikeshed the name of `Sequence`. I'm picking an
intentionally unwieldy name for the purposes of discussing the semantics of
this particular protocol. The point is that the underlying issue has
nothing to do with the name; it can be `Iterable` or it can be `SpongeBob`
for all I care.
I’m not trying to bikeshed the name either, The underlying issue is that
(what is currently) Sequence actually encompasses two separate
functionalities, and those functionalities need to be separated with their
separate semantic requirements documented. “Sequence: Iterable,”
“OrderedSequence: Sequence,” “SpongeBob: ForLoopable,” the names are 100%
irrelevant at this point; what’s important is that one is not necessarily
ordered and the other guarantees an order.
What are the "two separate functionalities”?
Iteration, with convenience methods that don’t imply or rely on an order
that may not be there; and convenience methods applicable to sequences that
do have an intrinsic order.
Sets, as a mathematical concept, have no intrinsic order. However,
instances of `Set`, which can be iterated over, *do* have at least one
order which can be said to be intrinsic in the following sense: as long as
iteration is possible, no API design can prevent that order from being
observed and associated with the instance. Put another way, if you can use
an instance of a type in a for...in loop, intrinsic to that functionality
is a publicly visible order.
All the extension methods on Sequence are ways of spelling things that you
can write in a few lines of code using a `for...in` loop; they're in the
stdlib to allow a more functional style which some prefer. If you accept
that a type should support iteration with a `for...in` loop, then what is
your basis for claiming that these extension methods are "separate
functionalities”?
Just because you *can* express something in code doesn’t mean you should,
or that it’s correct. It is objectively false to say a Set has a first or
last object, because the objects therein have no order. You can take a
random object from the set and call it “first”, but that doesn’t make that
a correct definition of Set.first. A Set has no order, a specific iteration
has an “order” only in the sense that all and only the objects in the set
have to come out one at a time, but that doesn’t mean the Set itself has an
order, specifically a first or last object.
Since Set conforms to Collection, it is guaranteed that if one element of
an instance of Set comes out first one time, it'll come out first every
time from that instance. If it helps, think of Swift's Set as modeling
(imperfectly, as all models must) both a mathematical set and a multi-pass
sequence, just as Swift's Int models both an integer and a sequence of bits.
You’re a fan of the principal of least surprise. Tell me, which would be
less surprising: Set.dropFirst() actually drops a random element, or Set
doesn’t have a dropFirst()? And if you think dropFirst() removing an
element at random is not surprising, please explain why.
I think Set.dropFirst removing the first element that I observe on
iteration is the least surprising answer, because Swift tells me that the
stdlib Set models a set but that it is also a sequence.
[…]
* If a type `T` conforms to `ForLoopable` and an instance `t` of that
type has at least one element, then *something* has to be the first element
in a `for element in t { ... }` loop. Put another way, every instance of a
type that conforms to `ForLoopable` must have at least one publicly
observable order (although, intriguingly, I'm not sure it has to be a
repeatable one). It is possible, therefore, to have a semantic answer to
the question of which element is `first` or (if finite) `last`; one can
also `drop(while:)`, etc., and perform lexicographical comparisons.
As a side effect of Swift being a procedural language each iteration
happens to occur in some order, yes, but that order is meaningless and
reflects nothing about the Set itself. In fact, I’d say that *`first`,
`last`, etc. are not even defined on the original Set per se, only on the
specific order that a particular iteration resulted in*. And that order
is not necessarily predictable, nor necessarily stable, as you yourself
said.
Consider an Iterable that gives a different order every time it’s
iterated.
Should calling `.first` or `last` give a different object every time?
That’s absurd.
Should an object lexicographically compare not equal to itself? Even
more absurd.
What's your basis for saying that such behavior is absurd? It is
explicitly permitted for instances of types conforming to `SpongeBob` to be
single-pass and/or infinite. For a single-pass `SpongeBob`, `first` will
certainly return a different value every time it is invoked.
Is `first` mutating? No. Should it be? No! `first` and `last` are a peek
at the state of the object.
You're right, `first` should not be mutating; that's actually an important
design consideration, as Ole pointed out, and it's not actually available
on `Sequence` for that reason. However, `first { _ in true }` is available
and is potentially mutating, as are all methods on Sequence by design.
Is `elementsEqual` (or *shudder* lexicographicallyEqual) reflexive? IMO it
clearly should be. Especially with the “lexicographically” part—from
everything I can find, a lexicographical ordering is by definition
reflexive. Do you have a citation for the idea that lexicographical
equality can legitimately be non-reflexive?
Clearly (tautologically?), such a function should be reflexive for any
argument ordered with respect to itself. However, if there is no
lexicographical comparison possible, then a thing cannot compare
lexicographically equal to anything, even itself.
And that’s PRECISELY why lexicographicallyEqual does not make sense to
apply to unordered sets. There is no lexicographical comparison possible,
so why do you keep insisting they should have a method that falsely claims
to lexicographically compare them?
I agree! It doesn't make sense if no comparison is possible! But Swift
tells me that a `Set` is a `Sequence`!
A random number generator fulfills all the semantic requirements of
conforming to `SpongeBob`, and in fact I do just that in NumericAnnex
<https://github.com/xwu/NumericAnnex/blob/master/Sources/PRNG.swift#L53>\.
`first` gives a different value every time, and a randomly generated
`SpongeBob` would unsurprisingly compare lexicographically not equal to
itself.
> IMO that’s a bug in the implementation; lexicographical equality is
reflexive, period.
> Presumably the `elementsEqual` method contains something along these
lines (take from SequenceAlgorithms.swift.gyb):
var iter1 = self.makeIterator()
var iter2 = other.makeIterator()
> By creating two iterators, you’re mutating while iterating. Turns out
there’s even a warning against this in Sequence.swift:
/// Using Multiple Iterators
/// ========================
///
/// Whenever you use multiple iterators (or `for`-`in` loops) over a single
/// sequence, be sure you know that the specific sequence supports repeated
/// iteration, either because you know its concrete type or because the
/// sequence is also constrained to the `Collection` protocol.
///
/// Obtain each separate iterator from separate calls to the sequence's
/// `makeIterator()` method rather than by copying. Copying an iterator is
/// safe, but advancing one copy of an iterator by calling its `next()`
method
/// may invalidate other copies of that iterator. `for`-`in` loops are
safe in
/// this regard.
*> The default implementation of elementsEqual is therefore unsafe* because
it has the potential for using an invalidated iterator.
You are misunderstanding the warning in the second paragraph here. The
implementation (not a default implementation, unless I'm mistaken, as it
cannot be overridden)
makes each iterator using separate calls to `makeIterator()`, just as the
documentation tells you to do. Calling next() on one iterator does not
invalidate the other iterator, because the second is not a copy of the
first.
Indeed, I misread that comment.
That said, is there a well-defined behavior when iterating a one-shot
sequence with two iterators?
(It *is* a default implementation, btw)
Not sure about that; I don't see a protocol requirement, in which case it
can only be shadowed in a concrete type but it can't be overridden. How to
accommodate single-pass sequences is an interesting question. Off the top
of my head, an iterator would have to be or wrap a reference type.
You are, however, right that calling `rng.elementsEqual(rng)` is not
advised. It isn't unsafe; it's just not useful. That said, calling
`array.elementsEqual(array)` is equally safe and equally useless, and the
uselessness of such a reflexive comparison is neither here nor there.
Funny how you complain about my code being useless and yet you insist
below, "If it's not providing you with utility, then don't use it.”
Regardless, you’re wrong to dismiss this case. `foo.elementsEqual(foo)` on
its own makes little sense, sure, but you could easily find yourself in a
method comparing two iterators you obtained from elsewhere, and
occasionally they happen to be the identical object. Allowing an iterator
to return `false` for .elementsEqual on itself is unexpected and dangerous.
You will always have to account for this possibility, because Swift's
`Equatable` explicitly allows "special values" to be not equal to
themselves. This is, at least in part, in order to accommodate the IEEE
decree that NaN != NaN:
let x = [Double.nan]
x.elementsEqual(x) // false
Changing this behavior is way beyond the scope of this thread (and has been
the topic of hours (actually, weeks and weeks) of fun on this list
previously).
On the other hand, if I have a collection of objects that I want iterated
in a particular order, I can use a container that iterates in a specific,
known, well-defined way, and use that to construct the sequence of
objects. That’s clearly an Iterable collection, but the guarantee is
stronger than that. Since it iterates objects in a specific sequence, the
logical way to express that would be `Sequence: Iterable`. Again, we’ve
seen that before.
Now, since a Sequence is guaranteed to iterate the same every time,
suddenly our `first`, `last`, `drop*`, etc. methods have a meaning inherent
to the collection itself, rather than a specific iteration.
What you call a "Sequence" here would have to be multi-pass, finite, and
ordered.
> Ordered, yes, but it’s only admittedly poor wording that suggests
multi-pass, and I don’t think anything there suggests finite.
If a Sequence is "guaranteed to iterate the same every time," then surely
it must be multi-pass; what's the alternative?
Not sure if you just missed the very next sentence or are actively
ignoring it just to be argumentative. I already acknowledged that that
phrase didn’t convey the meaning I intended, and a Sequence is not and
should not be required to be multi-pass.
I entirely misunderstood your next sentence as asserting that being
multi-pass makes the iteration order well-defined.
It would be better to say that the iteration order is well-defined. That
will almost always mean documented, and usually predictable though
obviously e.g. RNGs and iterating in random order will not be predictable
by design.
Wouldn't it then suffice to document, say, that a set's iteration order is
the insertion order?
That's actually more semantically constrained than what Swift calls a
`Collection` (which requires conforming types to be multi-pass and(?)
finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
conforming single-pass, infinite, and/or unordered types.
I think you’re talking about Sequence here, I’ve lost track of your
nonsense by now. Yes, the current Swift protocol named Sequence allows
unordered types. You seem to keep asserting that but not actually
addressing my argument, which is *that allowing Sequences to be
unordered with the current API is undesired and actively harmful, and
should* *therefore** be changed*.
What is harmful about it?
Well, for one, the issue you raised this thread about—two sets that are
`==` could return either true or false for `elementsEqual`, depending on
how they arrived at their current state. That’s not acceptable, and the
problem isn’t with the name of the method.
Apple documentation calls this one of the "order-dependent" methods. It is
surely acceptable for a type that conforms to an order-dependent protocol
to have methods that are order-dependent; they do, however, have to be
clearly order-dependent to avoid confusion on unordered types.
Then there are all the methods that imply a specific order of iteration.
If the “sequence” is unordered, who knows what you’ll get? It is incredibly
easy for an engineer to write a method that implicitly relies on a passed
sequence being intrinsically ordered and another engineer to pass it an
unordered “sequence.” The first engineer could neglect to document the
dependency, or even not notice it; or the second engineer could just fail
to read the documentation thoroughly enough. There is currently no way for
the compiler to enforce passing only an object that is (or at least claims
to be) intrinsically ordered.
It is also incredibly easy for such an engineer to use `for...in` instead
to accomplish the same task, generic over ordered and unordered sequences
whatever you name such distinguished protocols. I think your beef really
still boils down to Set being compatible with `for...in` at all, as Jon
acknowledges.
As long as it is possible to iterate over a `SpongeBob`, it is meaningful
to ask what element is first obtained upon iteration or to drop the first
element obtained upon iteration.
And as long as iteration is not required to be repeatable (and it isn't),
it is perfectly acceptable for these algorithms to return a different
result every time.
It is “meaningful” in the sense that it can technically be programmed.
The actual results are meaningless beyond returning or dropping a random*
element.
*: Don’t nitpick the word “random”, you know exactly what I mean. It’s
just shorter and no less clear than “technically more-or-less deterministic
but not documented, not generally predictable, and probably but not
necessarily consistent from one call to the next."
I fail to see the issue here. If it's not providing you with utility, then
don't use it.
I have no problem with functions I don’t use provided they are
well-defined and reasonably accurately named. Functions requiring an order
on unordered collections don’t pass that bar.
As I said, you're welcome to tackle the protocol hierarchy, but I really
doubt it's within the realm of realistic endpoints for Swift 5. I'm just
trying to propose a narrowly targeted pragmatic solution to one specific
limited harm that might be deliverable by the next point release. As a
great man once said, Swift is a pragmatic language.
Since Collections do guarantee multi-pass iteration, Brent's example of
`set.dropFirst().reduce(set.first!, ...)` provides just one instance
where an unordered Collection can profitably make use of `first`.
Permitting generic algorithms that can operate on either arrays or sets,
for example, is the desired effect of having such a protocol; a generic
algorithm that takes a Collection can ask for the first element, and in the
case of an unordered Collection, an arbitrary element will do just fine.
The generic algorithms should be on a protocol that specifies everything
they require. If one can work on anything you can iterate over, put it on
Iterable. If another requires the objects to be ordered, put it on
Sequence. Need to express that an algorithm requires a multi-pass sequence?
Make a MultiPassSequence protocol and put the algorithm on an extension
containing that requirement. Use protocols to express requirements, as they
were designed for. Don’t just tack a bunch of methods onto a protocol that
isn’t sufficient to describe their requirements and say, “oh, by the way,
only use this method if your implementation meets these conditions…"
The benefits are likely outweighed by the costs of such an approach taken
to completion, because there are many axes to differentiate. The protocol
hierarchy for collections is already daunting, leading to monstrosities
such as `MutableRangeReplaceableRandomAccessSlice`. It stretches the bounds
of sensibility to have a
`LazyUnorderedInfiniteMultiPassMutableRangeReplaceableRandomAccessSlice`.
The Swift stdlib deliberately eschews modeling everything in protocol
hierarchies with the highest level of granularity. There's some fudging,
deliberately, to find a happy medium between obtuse and approachable,
between too many/too specialized and not enough. For example, I pushed for
protocols such as `Field` and `Ring` at the top of the numeric hierarchy,
which might allow complex number types to slot into the hierarchy more
sensibly, for example. But we have a compromise protocol `Numeric` which
doesn't quite have the same guarantees but is much more approachable.
Notice that we also don't break down numeric protocols into `Addable`,
`Subtractable`, etc.; we also have that fudge factor built into
`Equatable`, as I mentioned.
`first` is the first object in the Sequence. It doesn’t matter how the
sequence came to be in that order; it doesn’t matter whether or not the
sequence has already been iterated or how many times. `first` is the first
object that is, was, and always will be presented by the Sequence’s
Iterator. (Until the collection is mutated, obviously).
*To summarize,*
A Set has no intrinsic order. You can iterate over it, and a specific
iteration of a set has an order, but that order is not tied to the Set
itself beyond including all and only the items therein. Therefore, the Set
itself has no intrinsic `first`, `last`, lexicographical comparison, etc.;
only its iterations do, and they are not themselves Sets.
A Sequence does have an intrinsic order. The order of iteration reflects
the order inherent to the Sequence. Therefore, a Sequence has a `first`,
`last`, lexicographical comparison, etc.
Just in case it’s not obvious, `Set` here is pretty much interchangeable
with any other unordered iterable.
What you want to call a "Sequence" is what Swift calls a `Collection`,
with additional restrictions. What you want to be called "Iterable" is what
Swift calls `Sequence` (or now, `SpongeBob`). Clearly, shuffling names will
not make these protocols support any more functionality, so that can be put
aside.
No, no, no! What I want to call “Iterable” is specified below. It is
about HALF of what’s currently in Sequence—the half that has to do with
iterating, whence the name.
What I want to call Sequence is precisely what Swift now calls
Sequence—the methods that are in Iterable by virtue of adopting Iterable,
PLUS some methods that only make sense on iterable groups of objects where
the iteration order is well-defined.
Now, with that out of the way, why do you think that only `Collection`
types should have `first` and `last`? These helper properties and methods
are simply convenient ways to spell things that can be done with
`for...in`--the whole point of supplying them is to allow people to work
with these types in a more functional style.
Apparently “collection" was a bad choice of word. What I clearly meant
was not the current Swift Collection protocol, but rather an unordered
assemblage of objects. UnorderedCollection, perhaps, or if that’s still
going to cause issues, try UnorderedAssemblage. What `first` and `last`
really mean in an UnorderedAssemblage is give me some object from the
assembled objects, I don’t care which one. For which it’s much more clear
to have an `anyObject()` as on NSSet; as another user has pointed out,
`assemblage.makeIterator().next()` works just as well. (I just checked,
and that’s actually how `first` is implemented. But it’s on Collection,
which is guaranteed to be multipass,)
Again, the point of having a protocol-based design is to allow useful
_generic_ algorithms to be written; that `first` and `last` would be
equivalent to an arbitrary element in the case that a collection is
unordered is not at all an argument against these types conforming to
`Collection`; if anything, it's an argument for it.
If a protocol demands the first object, you should give it the first
object. If you don’t have a first object, maybe you shouldn’t conform to
the protocol. If the protocol really just needs any old object, call it
`anyObject`.
Sure, but we *do* have a first element; it just happens to be the first
that is obtainable on iteration. That you could make a good case for any
other element to be first doesn't mean that this one isn't a perfectly
cromulent first.
Just as `Sequence.underestimatedCount` is equivalent to `Collection.count`
···
On Sun, Oct 15, 2017 at 2:29 AM, Kevin Nattinger <swift@nattinger.net> wrote:
On Oct 14, 2017, at 7:54 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sat, Oct 14, 2017 at 6:17 PM, Kevin Nattinger <swift@nattinger.net> > wrote:
for types that conform to `Collection`, or the instance
`BinaryInteger.bitWidth` is equivalent to a static `bitWidth` for types
that conform to `FixedWidthInteger`.
I don’t see how those are relevant, they all mean what they claim to,
unlike Set.first/dropFirst/etc.
public protocol Iterable {
associatedtype Iterator: IteratorProtocol
func map<T>(...) -> [T] // Iterable where .Iterator.Element == T
func filter(...) -> [Iterator.Element] // Iterable where
.Iterator.Element == Self.Iterator.Element
func forEach(...)
func makeIterator() -> Iterator
var underestimatedCount: Int { get }
}
public protocol Sequence: Iterable { // Maybe OrderedSequence just
to make the well-defined-order requirement explicit
associatedtype SubSequence
func dropFirst(...) -> SubSequence // Sequence where
.Iterator.Element == Self.Iterator.Element
func dropLast(...) -> SubSequence // " "
func drop(while...) -> SubSequence // " "
func prefix(...) -> SubSequence // " "
func prefix(while...) -> SubSequence // " "
func suffix(...) -> SubSequence // " "
func split(...where...) -> [SubSequence] // Iterable where
.Iterator.Element == (Sequence where .Iterator.Element ==
Self.Iterator.Element)
}
And just to be explicit,
struct Set: Iterable {…}
struct Dictionary: Iterable {…}
struct Array: Sequence {…}
etc.
Hopefully at some point:
struct OrderedSet: Sequence {…}