[Draft] Rename Sequence.elementsEqual

Jon_Hull · October 16, 2017, 2:36am

Ok, just to summarize our options so far (in order of effort & theoretical effectiveness):

1) Do nothing - This is the easiest thing to do, but it won’t fix any problems

2) Rename the elementsEqual method: (Again, fairly easy to do, and will hopefully reduce confusion, but doesn’t fix the underlying issue)
  • lexicographicallyEquals
  • elementOrderingEqual (this is my favorite of the names)
  • sequentiallyEquals
  • orderedEqual
  • pairwiseEqual
  • iterativelyEquals
  I might be missing some...

3) Change the implementation of Set so that it will return the same order whenever it has the same elements. This solves the issue with set, but not with unordered sequences in general (e.g. comparing a set with the keys of a dictionary). Still on balance, this is my current favorite.

4) Add an UnorderedX (name to be bikeshed) protocol that defines a sequence/collection as unordered, which provides a method which will return a sequence/collection of defined order. Generic algorithms can check for conformance to the UnorderedX protocol, and provide different implementations as needed (mostly by operating on the provided sequence/collection). The advantage is that generic algorithms can be made which take whether the sequence is ordered/unordered into account, and we don’t lose any speed. The downside is that you have to remember to check if it is ordered or not, and failing to do so may result in generic algorithms which have the current issues. We could guarantee that the standard library would be correct though.

5) Add an UnorderedX protocol that defines a sequence/collection as unordered, takes the unordered elements and uses them to provide a partial implementation to Sequence/Collection where the elements have a defined order. The advantage to this is that you can now build correct generic algorithms that depend on a stable/defined ordering that will “just work" (e.g. elementsEqual will work when comparing a set to the keys of a dictionary). The disadvantage is that it will be a bit slower for things that don’t care about ordering (e.g. many things involving for-in) unless you specifically call a method that says you don’t care about the order.

6) Rework the Sequence protocols to account for Ordered and Unordered (e.g. replacing ‘first’ with ‘any’ and then adding ‘first’ back in only for ordered sequences). Reworking the protocol hierarchy would be the most permanent and theoretically “correct” fix, but has the potential to be massively source-breaking.

Thanks,
Jon

xwu · October 16, 2017, 4:58am

That ordering can be arbitrary, but it shouldn’t leak internal

representation such that the method used to create identical things affects
the outcome of generic methods because of differences in internal
representation.

It would be better to say that the iteration order is well-defined.
That will almost always mean documented, and usually predictable though
obviously e.g. RNGs and iterating in random order will not be predictable
by design.

That's actually more semantically constrained than what Swift calls a
`Collection` (which requires conforming types to be multi-pass and(?)
finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your
nonsense by now. Yes, the current Swift protocol named Sequence allows
unordered types. You seem to keep asserting that but not actually
addressing my argument, which is *that allowing Sequences to be
unordered with the current API is undesired and actively harmful, and
should* *therefore** be changed*.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered
sequences are leaking internal representation (In your example, this is
causing people to be surprised when two sets with identical elements are
generating different sequences/orderings based on how they were created).
You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration
involves retrieving one element after another; if you're allowed to do that
with Set, then the elements of a Set are observably ordered in some way.
Since it's not an OrderedSet--i.e., order doesn't matter--then the only
sensible conclusion is that the order of elements obtained in a for...in
loop must be arbitrary. If you think this is harmful, then you must believe
that one should be prohibited from iterating over an instance of Set.
Otherwise, Set is inescapably a Sequence by the Swift definition of
Sequence. All extension methods on Sequence like drop(while:) are really
just conveniences for common things that you can do with iterated access;
to my mind, they're essentially just alternative ways of spelling various
for...in loops.

I think an argument could be made that you shouldn’t be able to iterate
over a set without first defining an ordering on it (even if that ordering
is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able”
protocol which defines things which can be turned into a sequence when
combined with some sort of ordering. One possible ordering could be the
internal representation (At least in that case we are calling it out
specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)”
I would definitely be less surprised when it returns false even though setA
== setB.

Well, that's a totally different direction, then; you're arguing that
`Set` and `Dictionary` should not conform to `Sequence` altogether. That's
fine (it's also a direction that some of us explored off-list a while ago),
but at this point in Swift's evolution, realistically, it's not within the
realm of possible changes.

I am actually suggesting something slightly different. Basically, Set
and Dictionary’s conformance to Collection would have a different
implementation. They would conform to another protocol declaring that they
are unordered. That protocol would fill in part of the conformance to
sequence/collection using a default ordering, which is mostly arbitrary,
but guaranteed to produce the same ordering for the same list of elements
(even across collection types). This would be safer, but a tiny bit slower
than what we have now (We could also potentially develop a way for
collections like set to amortize the cost). For those who need to recover
speed, the new protocol would also define a property which quickly returns
a sequence/iterator using the internal ordering (I arbitrarily called it
.arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were
suggesting--Set and Dictionary would each have a member like `collection`,
which would expose the underlying data as a `SetCollection` or
`DictionaryCollection` that in turn would conform to `Collection`;
meanwhile, Set and Dictionary themselves would not offer methods such as
`prefix`, or indexing by subscript, which are not compatible with being
unordered. For those who want a particular ordering, there'd be something
like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) ->
{Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However,
I'm unsure of where these guarantees provide benefit to justify the
performance cost. Certainly not for `first` or `dropFirst(_:)`, which still
yields an arbitrary result which doesn't make sense for something
_unordered_. We *could* have an underscored customization point named
something like `_customOrderingPass` that is only invoked from
`elementsEqual` or other such methods to pre-rearrange the internal
ordering of unordered collections in some deterministic way before
comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between
speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover
speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It
trades one arbitrary iteration order for another on a type that tries to
model an unordered collection.

We could do something like you suggest. I don’t think the method would
need to be underscored… the ordering pass could just be a method on the
protocol which defines it as unordered. Then we could provide a special
conformance for things where order really matters based on adherence to
that protocol. That might be an acceptable tradeoff. It would give us
speed at the cost of having the correct implementation being less ergonomic
and more error prone (you have to remember to check that it is unordered
and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic
algorithms based on expecting an order from unordered things, but at least
it would be possible for them check and handle it correctly. I think I
could get behind that tradeoff/compromise, given where we are in the swift
process and Swift's obsession with speed (though I still slightly prefer
the safer default). At least the standard library would handle all the
things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be
made correct by such a scheme?

···

On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

xwu · October 16, 2017, 5:22am

As I wrote above, Set cannot use a different iteration order for each
iterator because it conforms to `Collection`, thereby guaranteeing a
multi-pass sequence.

···

On Sun, Oct 15, 2017 at 11:59 PM, Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 16.10.2017 um 00:46 schrieb Xiaodi Wu <xiaodi.wu@gmail.com>:

On Sun, Oct 15, 2017 at 2:39 PM, Kevin Nattinger <swift@nattinger.net> > wrote:

[…]
Swift's Sequence protocol does not require the order of iteration to
"convey any meaning"; it doesn't even require it to be deterministic.

And that’s EXACTLY why none of the functions on Sequence should rely on
the order conveying meaning. `ElementsEqual` (for example) DOES rely on
the order of iteration conveying a meaning not required by the protocol,
and renaming it `lexicographicallyEquals` does not change that fact. Either
Sequence needs to require a meaningful order or `elementsEqual` should be
moved to a protocol that does.

What's your basis for saying that `elementsEqual` requires orders of
iteration that "convey a meaning"? It merely answers the question of
whether iterating over `a` is substitutable for iterating over `b`, a
question applicable to instances of any type which offers iterated access.

As Set has no intrinsic order of elements just imagine it was implemented
to have each iterator created on it to iterate in a random order. This
would satisfy the Sequence protocol (or rather its Iterable part hich
should be split off) and clearly show that the order is meaningless for
Sets.

trs · October 16, 2017, 4:57am

[…]
Sets, as a mathematical concept, have no intrinsic order. However, instances of `Set`, which can be iterated over, *do* have at least one order which can be said to be intrinsic in the following sense: as long as iteration is possible, no API design can prevent that order from being observed and associated with the instance. Put another way, if you can use an instance of a type in a for...in loop, intrinsic to that functionality is a publicly visible order.

You keep saying this, I keep saying it’s only a technical “order” that is an implementation detail

You keep saying it's an implementation detail, which it emphatically is *not*. It's a *public guarantee* by the protocol that you can use an instance of `Set` in a `for...in` loop, thereby exposing a publicly visible order. An implementation detail is something

Being able to use a Set in a for...in loop does *not* make it ordered! The purpose is is just being able to do something with each element. That a for...loop works sequentially is just a side effect. Just imagine we had parallelized for...in loops.

that could go away with an alternative implementation. By contrast, no implementation that permits an instance of `Set` being iterated over in a `for...in` loop can avoid exposing at least one publicly visible order, because it's not a matter of implementation. Put another way, by allowing iteration, a type necessarily exposes at least one publicly visible order and thereby models a sequence in at least one way.

Wrong. I could easily implement Set or another type to iterate in a random order over its elements, so each iteration would expose a different (meaningless) order.
This demonstrates that being able to be used in a for...in loop is about doing somthing with each element and not about element order. The latter is an additional property which should be expressed in an additional protocol like Kevin suggested.

-Thorsten

···

Am 16.10.2017 um 00:41 schrieb Xiaodi Wu via swift-evolution <swift-evolution@swift.org>:
On Sun, Oct 15, 2017 at 2:32 PM, Kevin Nattinger <swift@nattinger.net> wrote:

and can’t be relied upon for anything and so we shouldn’t provide methods that rely on it. I think this part of the discussion has reached the “agree to disagree” stage.

[…]
You’re a fan of the principal of least surprise. Tell me, which would be less surprising: Set.dropFirst() actually drops a random element, or Set doesn’t have a dropFirst()? And if you think dropFirst() removing an element at random is not surprising, please explain why.

I think Set.dropFirst removing the first element that I observe on iteration is the least surprising answer, because Swift tells me that the stdlib Set models a set but that it is also a sequence.

Your logic is backwards. You’re saying it’s “least surprising” because that’s how it’s currently implemented, not that it should be implemented that way because it’s least surprising.

No, I'm saying it's least surprising because any type that supports iterated access thereby exposes an order--not as an implementation detail but as a matter of public API--and in the absence of any other order, "first" must refer to that order so exposed.

[…]

And that’s PRECISELY why lexicographicallyEqual does not make sense to apply to unordered sets. There is no lexicographical comparison possible, so why do you keep insisting they should have a method that falsely claims to lexicographically compare them?

I agree! It doesn't make sense if no comparison is possible! But Swift tells me that a `Set` is a `Sequence`!

You keep making the circular argument that a Set should do things because it currently does them. If you want to argue against changing things, argue that things shouldn’t be changed, not that the current implementation is correct because it is the current implementation.

No, I'm arguing that `Set`, by supporting iterated access, is not wrong to conform to a protocol called `Sequence` because it does have an intrinsic and publicly observable order, which is not an accident of a particular implementation but is inherent to any type that supports iterated access. Now, whether it's the *best choice* to conform `Set` to `Sequence` and offer order-dependent functions is a matter of taste, but it isn't *wrong*.
[…]
You will always have to account for this possibility, because Swift's `Equatable` explicitly allows "special values" to be not equal to themselves. This is, at least in part, in order to accommodate the IEEE decree that NaN != NaN:
let x = [Double.nan]
x.elementsEqual(x) // false
NaN is special, one-shot and unordered sequences are not. Unless you think that all unordered and single-pass sequences should compare false for `elementsEqual`, this is irrelevant for any sequence that doesn’t contain NaN and well-defined (false) for any that does.
Certainly, not all single-pass sequences should compare false to themselves, but some should: for instance, an infinite single-pass stream of all 1's should compare true to itself, but an infinite single-pass stream of alternating 1's and 0's should compare false to itself. If you write generic code that calls `elementsEqual`, it is pervasively incorrect to test for identity by assuming that elementsEqual will return true on reflexive comparison. NaN is only one of many reasons why such code would blow up.

Changing this behavior is way beyond the scope of this thread (and has been the topic of hours (actually, weeks and weeks) of fun on this list previously).

Yes, I’ve seen the discussion on NaN and Comparable. It’s not the same discussion.

[…]

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

Wouldn't it then suffice to document, say, that a set's iteration order is the insertion order?

Now this actually gave me pause. I guess it does match what I said, but I still take issue with the fact that two Sets could compare `==` but not `elementsEqual`. I think that defining iteration order as insertion order adds a piece of publicly documented state that goes beyond what a Set really is. What you describe is really an OrderedSet, just without the random-access manipulation.

a) There is no semantic requirement on the part of `==` to be equivalent to an elementwise comparison when it is defined on a collection; in fact, one could imagine that some exotic sequence might legitimately define equality in a way that has nothing to do with elementwise comparison. Put another way, `==` returning `true` does not imply `elementsEqual` returning `true`, and `elementsEqual` returning `true` does not imply `==` returning `true`. This applies equally to ordered collections and is independent of the question of how to model unordered collections.

b) You keep writing that some Foo is really some Bar, but those are really just names. What would be the harm if Swift's `Set` indeed simply models an ordered set without random-access manipulation?

I’ll have to mull this over to see if I can come up with a coherent and (more) specific requirement for what makes an Iterable a Sequence, since clearly “documented” isn’t enough. Perhaps something along the lines that any two Sequences that compare equal must iterate the same.

[…]
Apple documentation calls this one of the "order-dependent" methods. It is surely acceptable for a type that conforms to an order-dependent protocol to have methods that are order-dependent; they do, however, have to be clearly order-dependent to avoid confusion on unordered types.

I’m not clear on what you’re trying to get across here. It seems you’re saying unordered types shouldn’t have order-dependent methods, which is exactly what I’ve been arguing.

No, I'm saying, essentially, that there are no truly unordered types in Swift; `Set` and `Dictionary` lead double lives modeling unordered collections on the one hand and ordered collections on the other. The order-dependent methods can continue to exist; they just need to be clearly named so that users know when they're using an instance of `Set` in the manner of an unordered collection and when they're using an instance of `Set` in the manner of an ordered collection.

[...]

Then there are all the methods that imply a specific order of iteration. If the “sequence” is unordered, who knows what you’ll get? It is incredibly easy for an engineer to write a method that implicitly relies on a passed sequence being intrinsically ordered and another engineer to pass it an unordered “sequence.” The first engineer could neglect to document the dependency, or even not notice it; or the second engineer could just fail to read the documentation thoroughly enough. There is currently no way for the compiler to enforce passing only an object that is (or at least claims to be) intrinsically ordered.

It is also incredibly easy for such an engineer to use `for...in` instead to accomplish the same task, generic over ordered and unordered sequences whatever you name such distinguished protocols. I think your beef really still boils down to Set being compatible with `for...in` at all, as Jon acknowledges.

Not providing ordered functions for unordered collections makes the developers think about what they actually need. If any object will do, they can use for…in, .makeIterator().next(), or an `anyObject()` we provide as a convenience. If they actually need the first from some specific order, it’s a reminder they need to sort the objects first to get the right one.

The whole point of protocol hierarchies is to enable useful generic algorithms. Here, the purpose of having a protocol that unites both ordered and unordered collections is to permit the writing of generic algorithms that operate on both; a user would want the first item from an ordered collection or an arbitrary item (but the same one on multiple passes) from an unordered collection. The name for that is currently `first`. Brent demonstrated a trivial one-line example of such a use.

That’s particularly useful for functions that actually need an ordered sequence; using OrderedSequence instead of Iterable (just as placeholders) would be a firm reminder not to pass in an unordered collection.

[…]

As I said, you're welcome to tackle the protocol hierarchy, but I really doubt it's within the realm of realistic endpoints for Swift 5. I'm just trying to propose a narrowly targeted pragmatic solution to one specific limited harm that might be deliverable by the next point release. As a great man once said, Swift is a pragmatic language.

If you want a pragmatic solution, fix the bug in functionality, don’t try and rename the method to something obscure to cover it up.

What I'm arguing is that there *is no bug in functionality*, only a naming problem. It is true that the current protocol hierarchy would not be my preferred design, but that's a matter of taste in terms of, again, where to draw the line between too much modeling or not enough. But that's not tantamount to a *bug*.

If you want to limit the harm, override `equalObjects` on unordered sequences to use `==` (very strongly preferred), or always `false` (less desirable, but at least consistent)

[…]

The Swift stdlib deliberately eschews modeling everything in protocol hierarchies with the highest level of granularity. There's some fudging, deliberately, to find a happy medium between obtuse and approachable, between too many/too specialized and not enough. For example, I pushed for protocols such as `Field` and `Ring` at the top of the numeric hierarchy, which might allow complex number types to slot into the hierarchy more sensibly, for example. But we have a compromise protocol `Numeric` which doesn't quite have the same guarantees but is much more approachable. Notice that we also don't break down numeric protocols into `Addable`, `Subtractable`, etc.; we also have that fudge factor built into `Equatable`, as I mentioned.

Eh, one or two corner cases on a protocol is probably fine. What’s not fine is over half (Sequence) or almost all (Collection) the methods not being applicable. There is a very clear gap there. We don’t need to fix everything, but this is something that can and should be addressed.

This would be based on the premise that an instance of `Set` has no intrinsic order; I disagree for the reasons above.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

adamkemp · October 16, 2017, 6:37am

FWIW, in .Net the equivalent of Sequence is IEnumerable, and the equivalent of IteratorProtocol is IEnumerator. In .Net there are extension methods for IEnumerable for things like First, Last, and SequenceEqual. Like Swift’s Sequence, .Net’s IEnumerable makes no promises about the order of iteration being semantically meaningful (that’s up to the implementation) or non-destructive. That does mean you should be aware of what kind of thing you’re iterating over before using some of the extension methods, but they are still useful. And, as in Swift, .Net’s Set type implements IEnumerable. I don’t think this is a flaw in .Net, and I don’t think it’s a flaw in Swift.

It seems some people are getting hung up on the natural language definition of Sequence and whether that name accurately describes what Swift’s type represents. I don’t think the name is inaccurate. It’s still a Sequence, even if it’s an arbitrary one.

Some people are also getting hung up on whether it makes sense to have these operations on collections that don’t define an order. Maybe not always, and maybe not usually, but I can tell you I’ve at least used First on a Set in .Net, and I’d rather that Set implement IEnumerable and get all the useful things that come with it (even if that means also getting a few that don’t make sense) than have to deal with distinct types that are only occasionally different.

I think plenty of .Net developers have been productive for years without this being a problem. I think Swift developers can handle it too.

I’m not even sure a name change is necessary for this method at all, but I’m not at all in favor of anything beyond that.

xwu · October 16, 2017, 4:54am

We can also document the iteration order as arbitrary (i.e., the status
quo).

···

On Sun, Oct 15, 2017 at 8:35 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 3:41 PM, Xiaodi Wu via swift-evolution < > swift-evolution@swift.org> wrote:

I’ll have to mull this over to see if I can come up with a coherent and

(more) specific requirement for what makes an Iterable a Sequence, since
clearly “documented” isn’t enough. Perhaps something along the lines that
any two Sequences that compare equal must iterate the same.

[…]
Apple documentation calls this one of the "order-dependent" methods. It
is surely acceptable for a type that conforms to an order-dependent
protocol to have methods that are order-dependent; they do, however, have
to be clearly order-dependent to avoid confusion on unordered types.

I’m not clear on what you’re trying to get across here. It seems you’re
saying unordered types shouldn’t have order-dependent methods, which is
exactly what I’ve been arguing.

No, I'm saying, essentially, that there are no truly unordered types in
Swift; `Set` and `Dictionary` lead double lives modeling unordered
collections on the one hand and ordered collections on the other. The
order-dependent methods can continue to exist; they just need to be clearly
named so that users know when they're using an instance of `Set` in the
manner of an unordered collection and when they're using an instance of
`Set` in the manner of an ordered collection.

Is the order of elements returned from Set/Dictionary guaranteed not to
change based on implementation? For instance, the elements of a dictionary
may shift around as I add items, and they may be different in different
runs of a program.

If we document/say that Set will always return elements in the order they
were added, then it may prevent us from using a more efficient
implementation of Sets in a future version of Swift.

Just to check what we are saying. I am saying that we can’t really build
generic algorithms on something which is undefined (because we are leaking
implementation details, and depending on private implementation details
leads to problems). You are saying that the leaking of implementation
details is a feature, not a bug… and that we should just document them and
consider them fixed ABI?

xwu · October 16, 2017, 5:02am

Ok, just to summarize our options so far (in order of effort & theoretical
effectiveness):

1) *Do nothing* - This is the easiest thing to do, but it won’t fix any
problems

2) *Rename the elementsEqual method*: (Again, fairly easy to do, and will
hopefully reduce confusion, but doesn’t fix the underlying issue)
• lexicographicallyEquals
• elementOrderingEqual (this is my favorite of the names)
• sequentiallyEquals
• orderedEqual
• pairwiseEqual
• iterativelyEquals
I might be missing some...

I'll be revising the proposal to incorporate these suggests, settling on
`elementsEqualInIterationOrder`.

3) *Change the implementation of Set* so that it will return the same
order whenever it has the same elements. This solves the issue with set,
but not with unordered sequences in general (e.g. comparing a set with the
keys of a dictionary). Still on balance, this is my current favorite.

As I mentioned earlier, I think 4-6 are wildly out of the scope of this
discussion. The starting premise is that Set : Collection and Sequence :
Collection, and that this will not change for Swift 5.

4) *Add an UnorderedX* (name to be bikeshed) *protocol* *that defines a

···

On Sun, Oct 15, 2017 at 9:36 PM, Jonathan Hull <jhull@gbis.com> wrote:

sequence/collection as unordered*, which provides a method which will *return
a sequence/collection of defined order*. Generic algorithms can check for
conformance to the UnorderedX protocol, and provide different
implementations as needed (mostly by operating on the provided
sequence/collection). The advantage is that generic algorithms can be made
which take whether the sequence is ordered/unordered into account, and we
don’t lose any speed. The downside is that you have to remember to check if
it is ordered or not, and failing to do so may result in generic algorithms
which have the current issues. We could guarantee that the standard
library would be correct though.

5) *Add an UnorderedX* *protocol* *that defines a sequence/collection as
unordered*, takes the unordered elements and uses them to *provide a
partial implementation to Sequence/Collection* where the elements have a
defined order. The advantage to this is that you can now build correct
generic algorithms that depend on a stable/defined ordering that will “just
work" (e.g. elementsEqual will work when comparing a set to the keys of a
dictionary). The disadvantage is that it will be a bit slower for things
that don’t care about ordering (e.g. many things involving for-in) unless
you specifically call a method that says you don’t care about the order.

6) *Rework the Sequence protocols to account for Ordered and Unordered* (e.g.
replacing ‘first’ with ‘any’ and then adding ‘first’ back in only for
ordered sequences). Reworking the protocol hierarchy would be the most
permanent and theoretically “correct” fix, but has the potential to be
massively source-breaking.

Thanks,
Jon

xwu · October 16, 2017, 5:19am

[…]
Sets, as a mathematical concept, have no intrinsic order. However,
instances of `Set`, which can be iterated over, *do* have at least one
order which can be said to be intrinsic in the following sense: as long as
iteration is possible, no API design can prevent that order from being
observed and associated with the instance. Put another way, if you can use
an instance of a type in a for...in loop, intrinsic to that functionality
is a publicly visible order.

You keep saying this, I keep saying it’s only a technical “order” that is
an implementation detail

You keep saying it's an implementation detail, which it emphatically is
*not*. It's a *public guarantee* by the protocol that you can use an
instance of `Set` in a `for...in` loop, thereby exposing a publicly visible
order. An implementation detail is something

Being able to use a Set in a for...in loop does *not* make it ordered! The
purpose is is just being able to do something with each element. That a
for...loop works sequentially is just a side effect. Just imagine we had
parallelized for...in loops.

No, it is not at all a "side effect." A for...in loop is a way of
controlling the flow of code which accesses elements in a sequence one
after another, and the correct behavior of code inside the loop depends on
these semantics. A "parallel for" loop would be a totally different thing;
arbitrary for...in loops can't be automatically "upgraded" to a "parallel
for" loop because they have different semantics, and types that support
"parallel for" would likely have to conform to a protocol other than
`Sequence`.

that could go away with an alternative implementation. By contrast, no
implementation that permits an instance of `Set` being iterated over in a
`for...in` loop can avoid exposing at least one publicly visible order,
because it's not a matter of implementation. Put another way, by allowing
iteration, a type necessarily exposes at least one publicly visible order
and thereby models a sequence in at least one way.

Wrong. I could easily implement Set or another type to iterate in a random
order over its elements, so each iteration would expose a different
(meaningless) order.

No, you cannot implement `Set` in this way because it conforms to
`Collection`, which guarantees a multi-pass sequence. Set *must expose the
same order on every iteration*.

This demonstrates that being able to be used in a for...in loop is about
doing somthing with each element and not about element order.

Again, Sequence *already doesn't* require the order to be meaningful or
even repeatable. Notice that, above, I said at least one order. A
conforming type could expose as many different orders as iterations, but
that changes nothing. I can still map an instance of that type to get an
array of elements, and that array will reveal some order which is the
result of the inner workings of the type.

The latter is an additional property which should be expressed in an

additional protocol like Kevin suggested.

What useful generic algorithms would this protocol support that are not
already possible?

-Thorsten

···

On Sun, Oct 15, 2017 at 11:57 PM, Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 16.10.2017 um 00:41 schrieb Xiaodi Wu via swift-evolution < > swift-evolution@swift.org>:
On Sun, Oct 15, 2017 at 2:32 PM, Kevin Nattinger <swift@nattinger.net> > wrote:

and can’t be relied upon for anything and so we shouldn’t provide methods
that rely on it. I think this part of the discussion has reached the “agree
to disagree” stage.

[…]

You’re a fan of the principal of least surprise. Tell me, which would be

less surprising: Set.dropFirst() actually drops a random element, or Set
doesn’t have a dropFirst()? And if you think dropFirst() removing an
element at random is not surprising, please explain why.

I think Set.dropFirst removing the first element that I observe on
iteration is the least surprising answer, because Swift tells me that the
stdlib Set models a set but that it is also a sequence.

Your logic is backwards. You’re saying it’s “least surprising” because
that’s how it’s currently implemented, not that it should be implemented
that way because it’s least surprising.

No, I'm saying it's least surprising because any type that supports
iterated access thereby exposes an order--not as an implementation detail
but as a matter of public API--and in the absence of any other order,
"first" must refer to that order so exposed.

[…]

And that’s PRECISELY why lexicographicallyEqual does not make sense to
apply to unordered sets. There is no lexicographical comparison possible,
so why do you keep insisting they should have a method that falsely claims
to lexicographically compare them?

I agree! It doesn't make sense if no comparison is possible! But Swift
tells me that a `Set` is a `Sequence`!

You keep making the circular argument that a Set should do things because
it currently does them. If you want to argue against changing things, argue
that things shouldn’t be changed, not that the current implementation is
correct because it is the current implementation.

No, I'm arguing that `Set`, by supporting iterated access, is not wrong to
conform to a protocol called `Sequence` because it does have an intrinsic
and publicly observable order, which is not an accident of a particular
implementation but is inherent to any type that supports iterated access.
Now, whether it's the *best choice* to conform `Set` to `Sequence` and
offer order-dependent functions is a matter of taste, but it isn't *wrong*.
[…]
You will always have to account for this possibility, because Swift's
`Equatable` explicitly allows "special values" to be not equal to
themselves. This is, at least in part, in order to accommodate the IEEE
decree that NaN != NaN:
let x = [Double.nan]
x.elementsEqual(x) // false
NaN is special, one-shot and unordered sequences are not. Unless you
think that all unordered and single-pass sequences should compare false for
`elementsEqual`, this is irrelevant for any sequence that doesn’t contain
NaN and well-defined (false) for any that does.
Certainly, not all single-pass sequences should compare false to
themselves, but some should: for instance, an infinite single-pass stream
of all 1's should compare true to itself, but an infinite single-pass
stream of alternating 1's and 0's should compare false to itself. If you
write generic code that calls `elementsEqual`, it is pervasively incorrect
to test for identity by assuming that elementsEqual will return true on
reflexive comparison. NaN is only one of many reasons why such code would
blow up.

Changing this behavior is way beyond the scope of this thread (and has
been the topic of hours (actually, weeks and weeks) of fun on this list
previously).

Yes, I’ve seen the discussion on NaN and Comparable. It’s not the same
discussion.

[…]

It would be better to say that the iteration order is well-defined. That

will almost always mean documented, and usually predictable though
obviously e.g. RNGs and iterating in random order will not be predictable
by design.

Wouldn't it then suffice to document, say, that a set's iteration order

is the insertion order?

Now this actually gave me pause. I guess it does match what I said, but I
still take issue with the fact that two Sets could compare `==` but not
`elementsEqual`. I think that defining iteration order as insertion order
adds a piece of publicly documented state that goes beyond what a Set
really is. What you describe is really an OrderedSet, just without the
random-access manipulation.

a) There is no semantic requirement on the part of `==` to be equivalent
to an elementwise comparison when it is defined on a collection; in fact,
one could imagine that some exotic sequence might legitimately define
equality in a way that has nothing to do with elementwise comparison. Put
another way, `==` returning `true` does not imply `elementsEqual` returning
`true`, and `elementsEqual` returning `true` does not imply `==` returning
`true`. This applies equally to ordered collections and is independent of
the question of how to model unordered collections.

b) You keep writing that some Foo is really some Bar, but those are really
just names. What would be the harm if Swift's `Set` indeed simply models an
ordered set without random-access manipulation?

I’ll have to mull this over to see if I can come up with a coherent and

(more) specific requirement for what makes an Iterable a Sequence, since
clearly “documented” isn’t enough. Perhaps something along the lines that
any two Sequences that compare equal must iterate the same.

[…]
Apple documentation calls this one of the "order-dependent" methods. It
is surely acceptable for a type that conforms to an order-dependent
protocol to have methods that are order-dependent; they do, however, have
to be clearly order-dependent to avoid confusion on unordered types.

I’m not clear on what you’re trying to get across here. It seems you’re
saying unordered types shouldn’t have order-dependent methods, which is
exactly what I’ve been arguing.

No, I'm saying, essentially, that there are no truly unordered types in
Swift; `Set` and `Dictionary` lead double lives modeling unordered
collections on the one hand and ordered collections on the other. The
order-dependent methods can continue to exist; they just need to be clearly
named so that users know when they're using an instance of `Set` in the
manner of an unordered collection and when they're using an instance of
`Set` in the manner of an ordered collection.

[...]

Then there are all the methods that imply a specific order of iteration.
If the “sequence” is unordered, who knows what you’ll get? It is incredibly
easy for an engineer to write a method that implicitly relies on a passed
sequence being intrinsically ordered and another engineer to pass it an
unordered “sequence.” The first engineer could neglect to document the
dependency, or even not notice it; or the second engineer could just fail
to read the documentation thoroughly enough. There is currently no way for
the compiler to enforce passing only an object that is (or at least claims
to be) intrinsically ordered.

It is also incredibly easy for such an engineer to use `for...in` instead
to accomplish the same task, generic over ordered and unordered sequences
whatever you name such distinguished protocols. I think your beef really
still boils down to Set being compatible with `for...in` at all, as Jon
acknowledges.

Not providing ordered functions for unordered collections makes the
developers think about what they actually need. If any object will do, they
can use for…in, .makeIterator().next(), or an `anyObject()` we provide as a
convenience. If they actually need the first from some specific order, it’s
a reminder they need to sort the objects first to get the right one.

The whole point of protocol hierarchies is to enable useful generic
algorithms. Here, the purpose of having a protocol that unites both ordered
and unordered collections is to permit the writing of generic algorithms
that operate on both; a user would want the first item from an ordered
collection or an arbitrary item (but the same one on multiple passes) from
an unordered collection. The name for that is currently `first`. Brent
demonstrated a trivial one-line example of such a use.

That’s particularly useful for functions that actually need an ordered

sequence; using OrderedSequence instead of Iterable (just as placeholders)
would be a firm reminder not to pass in an unordered collection.

[…]

As I said, you're welcome to tackle the protocol hierarchy, but I really
doubt it's within the realm of realistic endpoints for Swift 5. I'm just
trying to propose a narrowly targeted pragmatic solution to one specific
limited harm that might be deliverable by the next point release. As a
great man once said, Swift is a pragmatic language.

If you want a pragmatic solution, fix the bug in functionality, don’t try
and rename the method to something obscure to cover it up.

What I'm arguing is that there *is no bug in functionality*, only a naming
problem. It is true that the current protocol hierarchy would not be my
preferred design, but that's a matter of taste in terms of, again, where to
draw the line between too much modeling or not enough. But that's not
tantamount to a *bug*.

If you want to limit the harm, override `equalObjects` on unordered
sequences to use `==` (very strongly preferred), or always `false` (less
desirable, but at least consistent)

[…]

The Swift stdlib deliberately eschews modeling everything in protocol
hierarchies with the highest level of granularity. There's some fudging,
deliberately, to find a happy medium between obtuse and approachable,
between too many/too specialized and not enough. For example, I pushed for
protocols such as `Field` and `Ring` at the top of the numeric hierarchy,
which might allow complex number types to slot into the hierarchy more
sensibly, for example. But we have a compromise protocol `Numeric` which
doesn't quite have the same guarantees but is much more approachable.
Notice that we also don't break down numeric protocols into `Addable`,
`Subtractable`, etc.; we also have that fudge factor built into
`Equatable`, as I mentioned.

Eh, one or two corner cases on a protocol is probably fine. What’s not
fine is over half (Sequence) or almost all (Collection) the methods not
being applicable. There is a very clear gap there. We don’t need to fix
everything, but this is something that can and should be addressed.

This would be based on the premise that an instance of `Set` has no
intrinsic order; I disagree for the reasons above.

_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Jon_Hull · October 16, 2017, 10:48am

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries. I don’t see why a non-source-breaking change is suddenly off-limits.

But more than that, any generic algorithm which is assuming that the sequence is coming from an ordered source (i.e. many things using first/last). Some uses of first are ok because the programmer actually means ‘any’, but anywhere where they actually mean first/last may be problematic.

Currently, there is no way to test for ordered-ness, so there is no way for even a careful programmer to mitigate this problem. By adding a protocol which states that something is unordered, we can either branch on it, or create a separate version of an algorithm for things which conform.

Thanks,
Jon

···

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com <mailto:jhull@gbis.com>> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>> wrote:

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yields an arbitrary result which doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

trs · October 16, 2017, 5:11am

Ok, just to summarize our options so far (in order of effort & theoretical effectiveness):

1) Do nothing - This is the easiest thing to do, but it won’t fix any problems

2) Rename the elementsEqual method: (Again, fairly easy to do, and will hopefully reduce confusion, but doesn’t fix the underlying issue)
  • lexicographicallyEquals
  • elementOrderingEqual (this is my favorite of the names)
  • sequentiallyEquals
  • orderedEqual
  • pairwiseEqual
  • iterativelyEquals
  I might be missing some...

This is just wasting time IMO. Lets tackle the real problem.

3) Change the implementation of Set so that it will return the same order whenever it has the same elements. This solves the issue with set, but not with unordered sequences in general (e.g. comparing a set with the keys of a dictionary). Still on balance, this is my current favorite.

4) Add an UnorderedX (name to be bikeshed) protocol that defines a sequence/collection as unordered, which provides a method which will return a sequence/collection of defined order. Generic algorithms can check for conformance to the UnorderedX protocol, and provide different implementations as needed (mostly by operating on the provided sequence/collection). The advantage is that generic algorithms can be made which take whether the sequence is ordered/unordered into account, and we don’t lose any speed. The downside is that you have to remember to check if it is ordered or not, and failing to do so may result in generic algorithms which have the current issues. We could guarantee that the standard library would be correct though.

5) Add an UnorderedX protocol that defines a sequence/collection as unordered, takes the unordered elements and uses them to provide a partial implementation to Sequence/Collection where the elements have a defined order. The advantage to this is that you can now build correct generic algorithms that depend on a stable/defined ordering that will “just work" (e.g. elementsEqual will work when comparing a set to the keys of a dictionary).

Maybe instead of `elementsEqual` we need a `unorderedEquals` method to solve the problems `elementsEqual` was written for?

The disadvantage is that it will be a bit slower for things that don’t care about ordering (e.g. many things involving for-in) unless you specifically call a method that says you don’t care about the order.

6) Rework the Sequence protocols to account for Ordered and Unordered (e.g. replacing ‘first’ with ‘any’ and then adding ‘first’ back in only for ordered sequences). Reworking the protocol hierarchy would be the most permanent and theoretically “correct” fix, but has the potential to be massively source-breaking.

This is what should be fixed IMO.

-Thorsten

···

Am 16.10.2017 um 04:36 schrieb Jonathan Hull via swift-evolution <swift-evolution@swift.org>:

benjamin.g · October 16, 2017, 7:56am

[…]
Sets, as a mathematical concept, have no intrinsic order. However, instances of `Set`, which can be iterated over, *do* have at least one order which can be said to be intrinsic in the following sense: as long as iteration is possible, no API design can prevent that order from being observed and associated with the instance. Put another way, if you can use an instance of a type in a for...in loop, intrinsic to that functionality is a publicly visible order.

You keep saying this, I keep saying it’s only a technical “order” that is an implementation detail

You keep saying it's an implementation detail, which it emphatically is *not*. It's a *public guarantee* by the protocol that you can use an instance of `Set` in a `for...in` loop, thereby exposing a publicly visible order. An implementation detail is something

Being able to use a Set in a for...in loop does *not* make it ordered! The purpose is is just being able to do something with each element. That a for...loop works sequentially is just a side effect. Just imagine we had parallelized for...in loops.

No, it is not at all a "side effect." A for...in loop is a way of controlling the flow of code which accesses elements in a sequence one after another, and the correct behavior of code inside the loop depends on these semantics. A "parallel for" loop would be a totally different thing; arbitrary for...in loops can't be automatically "upgraded" to a "parallel for" loop because they have different semantics, and types that support "parallel for" would likely have to conform to a protocol other than `Sequence`.

that could go away with an alternative implementation. By contrast, no implementation that permits an instance of `Set` being iterated over in a `for...in` loop can avoid exposing at least one publicly visible order, because it's not a matter of implementation. Put another way, by allowing iteration, a type necessarily exposes at least one publicly visible order and thereby models a sequence in at least one way.

Wrong. I could easily implement Set or another type to iterate in a random order over its elements, so each iteration would expose a different (meaningless) order.

No, you cannot implement `Set` in this way because it conforms to `Collection`, which guarantees a multi-pass sequence. Set *must expose the same order on every iteration*.

This demonstrates that being able to be used in a for...in loop is about doing somthing with each element and not about element order.

Again, Sequence *already doesn't* require the order to be meaningful or even repeatable. Notice that, above, I said at least one order. A conforming type could expose as many different orders as iterations, but that changes nothing. I can still map an instance of that type to get an array of elements, and that array will reveal some order which is the result of the inner workings of the type.

The latter is an additional property which should be expressed in an additional protocol like Kevin suggested.

What useful generic algorithms would this protocol support that are not already possible?

Imho , the original issue isn't to create more powerful generic algorithms, but to prevent errors on concrete ones using set.first ( which could mislead people into believing swift sets are ordered sets).
As another person said, it seems sets aren't a sequence, but rather "sequenceable" ( aka : you can spawn a sequence out of it). Whereas arrays are storing their elements as a sequence and so conflating the two notions of sequence and collection isn't as much of a gap (everything that applies to the iterator may very well apply to the array as well).

Do you have a link on a design document that explained the rational for conflating the notion of collections with the iterator ?
At this point i'm not trying to argue anymore because you think pretty sure that it's a good idea, and the fact that .net also has the same design makes me think i may have missed something, so i'm just trying to understand this choice.

···

Le 16 oct. 2017 à 07:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> a écrit :

On Sun, Oct 15, 2017 at 11:57 PM, Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 16.10.2017 um 00:41 schrieb Xiaodi Wu via swift-evolution <swift-evolution@swift.org>:
On Sun, Oct 15, 2017 at 2:32 PM, Kevin Nattinger <swift@nattinger.net> wrote:

-Thorsten
and can’t be relied upon for anything and so we shouldn’t provide methods that rely on it. I think this part of the discussion has reached the “agree to disagree” stage.

[…]
You’re a fan of the principal of least surprise. Tell me, which would be less surprising: Set.dropFirst() actually drops a random element, or Set doesn’t have a dropFirst()? And if you think dropFirst() removing an element at random is not surprising, please explain why.

I think Set.dropFirst removing the first element that I observe on iteration is the least surprising answer, because Swift tells me that the stdlib Set models a set but that it is also a sequence.

Your logic is backwards. You’re saying it’s “least surprising” because that’s how it’s currently implemented, not that it should be implemented that way because it’s least surprising.

No, I'm saying it's least surprising because any type that supports iterated access thereby exposes an order--not as an implementation detail but as a matter of public API--and in the absence of any other order, "first" must refer to that order so exposed.

[…]

And that’s PRECISELY why lexicographicallyEqual does not make sense to apply to unordered sets. There is no lexicographical comparison possible, so why do you keep insisting they should have a method that falsely claims to lexicographically compare them?

I agree! It doesn't make sense if no comparison is possible! But Swift tells me that a `Set` is a `Sequence`!

You keep making the circular argument that a Set should do things because it currently does them. If you want to argue against changing things, argue that things shouldn’t be changed, not that the current implementation is correct because it is the current implementation.

No, I'm arguing that `Set`, by supporting iterated access, is not wrong to conform to a protocol called `Sequence` because it does have an intrinsic and publicly observable order, which is not an accident of a particular implementation but is inherent to any type that supports iterated access. Now, whether it's the *best choice* to conform `Set` to `Sequence` and offer order-dependent functions is a matter of taste, but it isn't *wrong*.
[…]
You will always have to account for this possibility, because Swift's `Equatable` explicitly allows "special values" to be not equal to themselves. This is, at least in part, in order to accommodate the IEEE decree that NaN != NaN:
let x = [Double.nan]
x.elementsEqual(x) // false
NaN is special, one-shot and unordered sequences are not. Unless you think that all unordered and single-pass sequences should compare false for `elementsEqual`, this is irrelevant for any sequence that doesn’t contain NaN and well-defined (false) for any that does.
Certainly, not all single-pass sequences should compare false to themselves, but some should: for instance, an infinite single-pass stream of all 1's should compare true to itself, but an infinite single-pass stream of alternating 1's and 0's should compare false to itself. If you write generic code that calls `elementsEqual`, it is pervasively incorrect to test for identity by assuming that elementsEqual will return true on reflexive comparison. NaN is only one of many reasons why such code would blow up.

Changing this behavior is way beyond the scope of this thread (and has been the topic of hours (actually, weeks and weeks) of fun on this list previously).

Yes, I’ve seen the discussion on NaN and Comparable. It’s not the same discussion.

[…]

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

Wouldn't it then suffice to document, say, that a set's iteration order is the insertion order?

Now this actually gave me pause. I guess it does match what I said, but I still take issue with the fact that two Sets could compare `==` but not `elementsEqual`. I think that defining iteration order as insertion order adds a piece of publicly documented state that goes beyond what a Set really is. What you describe is really an OrderedSet, just without the random-access manipulation.

a) There is no semantic requirement on the part of `==` to be equivalent to an elementwise comparison when it is defined on a collection; in fact, one could imagine that some exotic sequence might legitimately define equality in a way that has nothing to do with elementwise comparison. Put another way, `==` returning `true` does not imply `elementsEqual` returning `true`, and `elementsEqual` returning `true` does not imply `==` returning `true`. This applies equally to ordered collections and is independent of the question of how to model unordered collections.

b) You keep writing that some Foo is really some Bar, but those are really just names. What would be the harm if Swift's `Set` indeed simply models an ordered set without random-access manipulation?

I’ll have to mull this over to see if I can come up with a coherent and (more) specific requirement for what makes an Iterable a Sequence, since clearly “documented” isn’t enough. Perhaps something along the lines that any two Sequences that compare equal must iterate the same.

[…]
Apple documentation calls this one of the "order-dependent" methods. It is surely acceptable for a type that conforms to an order-dependent protocol to have methods that are order-dependent; they do, however, have to be clearly order-dependent to avoid confusion on unordered types.

I’m not clear on what you’re trying to get across here. It seems you’re saying unordered types shouldn’t have order-dependent methods, which is exactly what I’ve been arguing.

No, I'm saying, essentially, that there are no truly unordered types in Swift; `Set` and `Dictionary` lead double lives modeling unordered collections on the one hand and ordered collections on the other. The order-dependent methods can continue to exist; they just need to be clearly named so that users know when they're using an instance of `Set` in the manner of an unordered collection and when they're using an instance of `Set` in the manner of an ordered collection.

[...]

Then there are all the methods that imply a specific order of iteration. If the “sequence” is unordered, who knows what you’ll get? It is incredibly easy for an engineer to write a method that implicitly relies on a passed sequence being intrinsically ordered and another engineer to pass it an unordered “sequence.” The first engineer could neglect to document the dependency, or even not notice it; or the second engineer could just fail to read the documentation thoroughly enough. There is currently no way for the compiler to enforce passing only an object that is (or at least claims to be) intrinsically ordered.

It is also incredibly easy for such an engineer to use `for...in` instead to accomplish the same task, generic over ordered and unordered sequences whatever you name such distinguished protocols. I think your beef really still boils down to Set being compatible with `for...in` at all, as Jon acknowledges.

Not providing ordered functions for unordered collections makes the developers think about what they actually need. If any object will do, they can use for…in, .makeIterator().next(), or an `anyObject()` we provide as a convenience. If they actually need the first from some specific order, it’s a reminder they need to sort the objects first to get the right one.

The whole point of protocol hierarchies is to enable useful generic algorithms. Here, the purpose of having a protocol that unites both ordered and unordered collections is to permit the writing of generic algorithms that operate on both; a user would want the first item from an ordered collection or an arbitrary item (but the same one on multiple passes) from an unordered collection. The name for that is currently `first`. Brent demonstrated a trivial one-line example of such a use.

That’s particularly useful for functions that actually need an ordered sequence; using OrderedSequence instead of Iterable (just as placeholders) would be a firm reminder not to pass in an unordered collection.

[…]

As I said, you're welcome to tackle the protocol hierarchy, but I really doubt it's within the realm of realistic endpoints for Swift 5. I'm just trying to propose a narrowly targeted pragmatic solution to one specific limited harm that might be deliverable by the next point release. As a great man once said, Swift is a pragmatic language.

If you want a pragmatic solution, fix the bug in functionality, don’t try and rename the method to something obscure to cover it up.

What I'm arguing is that there *is no bug in functionality*, only a naming problem. It is true that the current protocol hierarchy would not be my preferred design, but that's a matter of taste in terms of, again, where to draw the line between too much modeling or not enough. But that's not tantamount to a *bug*.

If you want to limit the harm, override `equalObjects` on unordered sequences to use `==` (very strongly preferred), or always `false` (less desirable, but at least consistent)

[…]

The Swift stdlib deliberately eschews modeling everything in protocol hierarchies with the highest level of granularity. There's some fudging, deliberately, to find a happy medium between obtuse and approachable, between too many/too specialized and not enough. For example, I pushed for protocols such as `Field` and `Ring` at the top of the numeric hierarchy, which might allow complex number types to slot into the hierarchy more sensibly, for example. But we have a compromise protocol `Numeric` which doesn't quite have the same guarantees but is much more approachable. Notice that we also don't break down numeric protocols into `Addable`, `Subtractable`, etc.; we also have that fudge factor built into `Equatable`, as I mentioned.

Eh, one or two corner cases on a protocol is probably fine. What’s not fine is over half (Sequence) or almost all (Collection) the methods not being applicable. There is a very clear gap there. We don’t need to fix everything, but this is something that can and should be addressed.

This would be based on the premise that an instance of `Set` has no intrinsic order; I disagree for the reasons above.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

trs · October 16, 2017, 2:13pm

[…]
Sets, as a mathematical concept, have no intrinsic order. However, instances of `Set`, which can be iterated over, *do* have at least one order which can be said to be intrinsic in the following sense: as long as iteration is possible, no API design can prevent that order from being observed and associated with the instance. Put another way, if you can use an instance of a type in a for...in loop, intrinsic to that functionality is a publicly visible order.

You keep saying this, I keep saying it’s only a technical “order” that is an implementation detail

You keep saying it's an implementation detail, which it emphatically is *not*. It's a *public guarantee* by the protocol that you can use an instance of `Set` in a `for...in` loop, thereby exposing a publicly visible order. An implementation detail is something

Being able to use a Set in a for...in loop does *not* make it ordered! The purpose is is just being able to do something with each element. That a for...loop works sequentially is just a side effect. Just imagine we had parallelized for...in loops.

No, it is not at all a "side effect." A for...in loop is a way of controlling the flow of code which accesses elements in a sequence one after another, and the correct behavior of code inside the loop depends on these semantics. A "parallel for" loop would be a totally different thing; arbitrary for...in loops can't be automatically "upgraded" to a "parallel for" loop because they have different semantics, and types that support "parallel for" would likely have to conform to a protocol other than `Sequence`.

Exactly.

that could go away with an alternative implementation. By contrast, no implementation that permits an instance of `Set` being iterated over in a `for...in` loop can avoid exposing at least one publicly visible order, because it's not a matter of implementation. Put another way, by allowing iteration, a type necessarily exposes at least one publicly visible order and thereby models a sequence in at least one way.

Wrong. I could easily implement Set or another type to iterate in a random order over its elements, so each iteration would expose a different (meaningless) order.

No, you cannot implement `Set` in this way because it conforms to `Collection`, which guarantees a multi-pass sequence. Set *must expose the same order on every iteration*.

Set conforming to Collection is even worse than just conforming to Sequence as a quote from the documentation shows: "In addition to the operations that collections inherit from the Sequence protocol, you gain access to methods that depend on accessing an element at a specific position in a collection."
Clearly the elements of a Set do not have specific positions.

Furthermore my argument still stands for types derived from Sequence, so sidestepping it by pointing out that the situation is even worse for the current implementation of Set won't work. Can you explain why a type derived from Sequence which will iterate over its elements in random order should have methods like dropFirst()?

This demonstrates that being able to be used in a for...in loop is about doing somthing with each element and not about element order.

Again, Sequence *already doesn't* require the order to be meaningful or even repeatable.

I know. That's why I am arguing that certain methods of Sequence make no sense. At least we have reached that common ground.
So why do you insist on Sequence having methods like dropFirst() if the notion of "first" does not make any sense?

Notice that, above, I said at least one order. A conforming type could
expose as many different orders as iterations, but that changes nothing. I can still map an instance of that type to get an array of elements, and that array will reveal some order which is the result of the inner workings of the type.

The latter is an additional property which should be expressed in an additional protocol like Kevin suggested.

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten

···

Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com>:
On Sun, Oct 15, 2017 at 11:57 PM, Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 16.10.2017 um 00:41 schrieb Xiaodi Wu via swift-evolution <swift-evolution@swift.org>:
On Sun, Oct 15, 2017 at 2:32 PM, Kevin Nattinger <swift@nattinger.net> wrote:

-Thorsten
and can’t be relied upon for anything and so we shouldn’t provide methods that rely on it. I think this part of the discussion has reached the “agree to disagree” stage.

[…]
You’re a fan of the principal of least surprise. Tell me, which would be less surprising: Set.dropFirst() actually drops a random element, or Set doesn’t have a dropFirst()? And if you think dropFirst() removing an element at random is not surprising, please explain why.

I think Set.dropFirst removing the first element that I observe on iteration is the least surprising answer, because Swift tells me that the stdlib Set models a set but that it is also a sequence.

Your logic is backwards. You’re saying it’s “least surprising” because that’s how it’s currently implemented, not that it should be implemented that way because it’s least surprising.

No, I'm saying it's least surprising because any type that supports iterated access thereby exposes an order--not as an implementation detail but as a matter of public API--and in the absence of any other order, "first" must refer to that order so exposed.

[…]

And that’s PRECISELY why lexicographicallyEqual does not make sense to apply to unordered sets. There is no lexicographical comparison possible, so why do you keep insisting they should have a method that falsely claims to lexicographically compare them?

I agree! It doesn't make sense if no comparison is possible! But Swift tells me that a `Set` is a `Sequence`!

You keep making the circular argument that a Set should do things because it currently does them. If you want to argue against changing things, argue that things shouldn’t be changed, not that the current implementation is correct because it is the current implementation.

No, I'm arguing that `Set`, by supporting iterated access, is not wrong to conform to a protocol called `Sequence` because it does have an intrinsic and publicly observable order, which is not an accident of a particular implementation but is inherent to any type that supports iterated access. Now, whether it's the *best choice* to conform `Set` to `Sequence` and offer order-dependent functions is a matter of taste, but it isn't *wrong*.
[…]
You will always have to account for this possibility, because Swift's `Equatable` explicitly allows "special values" to be not equal to themselves. This is, at least in part, in order to accommodate the IEEE decree that NaN != NaN:
let x = [Double.nan]
x.elementsEqual(x) // false
NaN is special, one-shot and unordered sequences are not. Unless you think that all unordered and single-pass sequences should compare false for `elementsEqual`, this is irrelevant for any sequence that doesn’t contain NaN and well-defined (false) for any that does.
Certainly, not all single-pass sequences should compare false to themselves, but some should: for instance, an infinite single-pass stream of all 1's should compare true to itself, but an infinite single-pass stream of alternating 1's and 0's should compare false to itself. If you write generic code that calls `elementsEqual`, it is pervasively incorrect to test for identity by assuming that elementsEqual will return true on reflexive comparison. NaN is only one of many reasons why such code would blow up.

Changing this behavior is way beyond the scope of this thread (and has been the topic of hours (actually, weeks and weeks) of fun on this list previously).

Yes, I’ve seen the discussion on NaN and Comparable. It’s not the same discussion.

[…]

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

Wouldn't it then suffice to document, say, that a set's iteration order is the insertion order?

Now this actually gave me pause. I guess it does match what I said, but I still take issue with the fact that two Sets could compare `==` but not `elementsEqual`. I think that defining iteration order as insertion order adds a piece of publicly documented state that goes beyond what a Set really is. What you describe is really an OrderedSet, just without the random-access manipulation.

a) There is no semantic requirement on the part of `==` to be equivalent to an elementwise comparison when it is defined on a collection; in fact, one could imagine that some exotic sequence might legitimately define equality in a way that has nothing to do with elementwise comparison. Put another way, `==` returning `true` does not imply `elementsEqual` returning `true`, and `elementsEqual` returning `true` does not imply `==` returning `true`. This applies equally to ordered collections and is independent of the question of how to model unordered collections.

b) You keep writing that some Foo is really some Bar, but those are really just names. What would be the harm if Swift's `Set` indeed simply models an ordered set without random-access manipulation?

I’ll have to mull this over to see if I can come up with a coherent and (more) specific requirement for what makes an Iterable a Sequence, since clearly “documented” isn’t enough. Perhaps something along the lines that any two Sequences that compare equal must iterate the same.

[…]
Apple documentation calls this one of the "order-dependent" methods. It is surely acceptable for a type that conforms to an order-dependent protocol to have methods that are order-dependent; they do, however, have to be clearly order-dependent to avoid confusion on unordered types.

I’m not clear on what you’re trying to get across here. It seems you’re saying unordered types shouldn’t have order-dependent methods, which is exactly what I’ve been arguing.

No, I'm saying, essentially, that there are no truly unordered types in Swift; `Set` and `Dictionary` lead double lives modeling unordered collections on the one hand and ordered collections on the other. The order-dependent methods can continue to exist; they just need to be clearly named so that users know when they're using an instance of `Set` in the manner of an unordered collection and when they're using an instance of `Set` in the manner of an ordered collection.

[...]

Then there are all the methods that imply a specific order of iteration. If the “sequence” is unordered, who knows what you’ll get? It is incredibly easy for an engineer to write a method that implicitly relies on a passed sequence being intrinsically ordered and another engineer to pass it an unordered “sequence.” The first engineer could neglect to document the dependency, or even not notice it; or the second engineer could just fail to read the documentation thoroughly enough. There is currently no way for the compiler to enforce passing only an object that is (or at least claims to be) intrinsically ordered.

It is also incredibly easy for such an engineer to use `for...in` instead to accomplish the same task, generic over ordered and unordered sequences whatever you name such distinguished protocols. I think your beef really still boils down to Set being compatible with `for...in` at all, as Jon acknowledges.

Not providing ordered functions for unordered collections makes the developers think about what they actually need. If any object will do, they can use for…in, .makeIterator().next(), or an `anyObject()` we provide as a convenience. If they actually need the first from some specific order, it’s a reminder they need to sort the objects first to get the right one.

The whole point of protocol hierarchies is to enable useful generic algorithms. Here, the purpose of having a protocol that unites both ordered and unordered collections is to permit the writing of generic algorithms that operate on both; a user would want the first item from an ordered collection or an arbitrary item (but the same one on multiple passes) from an unordered collection. The name for that is currently `first`. Brent demonstrated a trivial one-line example of such a use.

That’s particularly useful for functions that actually need an ordered sequence; using OrderedSequence instead of Iterable (just as placeholders) would be a firm reminder not to pass in an unordered collection.

[…]

As I said, you're welcome to tackle the protocol hierarchy, but I really doubt it's within the realm of realistic endpoints for Swift 5. I'm just trying to propose a narrowly targeted pragmatic solution to one specific limited harm that might be deliverable by the next point release. As a great man once said, Swift is a pragmatic language.

If you want a pragmatic solution, fix the bug in functionality, don’t try and rename the method to something obscure to cover it up.

What I'm arguing is that there *is no bug in functionality*, only a naming problem. It is true that the current protocol hierarchy would not be my preferred design, but that's a matter of taste in terms of, again, where to draw the line between too much modeling or not enough. But that's not tantamount to a *bug*.

If you want to limit the harm, override `equalObjects` on unordered sequences to use `==` (very strongly preferred), or always `false` (less desirable, but at least consistent)

[…]

The Swift stdlib deliberately eschews modeling everything in protocol hierarchies with the highest level of granularity. There's some fudging, deliberately, to find a happy medium between obtuse and approachable, between too many/too specialized and not enough. For example, I pushed for protocols such as `Field` and `Ring` at the top of the numeric hierarchy, which might allow complex number types to slot into the hierarchy more sensibly, for example. But we have a compromise protocol `Numeric` which doesn't quite have the same guarantees but is much more approachable. Notice that we also don't break down numeric protocols into `Addable`, `Subtractable`, etc.; we also have that fudge factor built into `Equatable`, as I mentioned.

Eh, one or two corner cases on a protocol is probably fine. What’s not fine is over half (Sequence) or almost all (Collection) the methods not being applicable. There is a very clear gap there. We don’t need to fix everything, but this is something that can and should be addressed.

This would be based on the premise that an instance of `Set` has no intrinsic order; I disagree for the reasons above.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

Michael_Ilseman · October 16, 2017, 4:19pm

[…]
Sets, as a mathematical concept, have no intrinsic order. However, instances of `Set`, which can be iterated over, *do* have at least one order which can be said to be intrinsic in the following sense: as long as iteration is possible, no API design can prevent that order from being observed and associated with the instance. Put another way, if you can use an instance of a type in a for...in loop, intrinsic to that functionality is a publicly visible order.

You keep saying this, I keep saying it’s only a technical “order” that is an implementation detail

You keep saying it's an implementation detail, which it emphatically is *not*. It's a *public guarantee* by the protocol that you can use an instance of `Set` in a `for...in` loop, thereby exposing a publicly visible order. An implementation detail is something

Being able to use a Set in a for...in loop does *not* make it ordered! The purpose is is just being able to do something with each element. That a for...loop works sequentially is just a side effect. Just imagine we had parallelized for...in loops.

No, it is not at all a "side effect." A for...in loop is a way of controlling the flow of code which accesses elements in a sequence one after another, and the correct behavior of code inside the loop depends on these semantics. A "parallel for" loop would be a totally different thing; arbitrary for...in loops can't be automatically "upgraded" to a "parallel for" loop because they have different semantics, and types that support "parallel for" would likely have to conform to a protocol other than `Sequence`.

Exactly.

that could go away with an alternative implementation. By contrast, no implementation that permits an instance of `Set` being iterated over in a `for...in` loop can avoid exposing at least one publicly visible order, because it's not a matter of implementation. Put another way, by allowing iteration, a type necessarily exposes at least one publicly visible order and thereby models a sequence in at least one way.

Wrong. I could easily implement Set or another type to iterate in a random order over its elements, so each iteration would expose a different (meaningless) order.

No, you cannot implement `Set` in this way because it conforms to `Collection`, which guarantees a multi-pass sequence. Set *must expose the same order on every iteration*.

Set conforming to Collection is even worse than just conforming to Sequence as a quote from the documentation shows: "In addition to the operations that collections inherit from the Sequence protocol, you gain access to methods that depend on accessing an element at a specific position in a collection."
Clearly the elements of a Set do not have specific positions.

That’s not at all clear to me, could you elaborate? My understanding is that elements of Set definitely *do* have a position, and that’s why you can use an index on a set to retrieve the element. The same index on the same set retrieves the same element.

···

On Oct 16, 2017, at 7:20 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:
Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>>:

On Sun, Oct 15, 2017 at 11:57 PM, Thorsten Seitz <tseitz42@icloud.com <mailto:tseitz42@icloud.com>> wrote:
Am 16.10.2017 um 00:41 schrieb Xiaodi Wu via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>>:

On Sun, Oct 15, 2017 at 2:32 PM, Kevin Nattinger <swift@nattinger.net <mailto:swift@nattinger.net>> wrote:

Furthermore my argument still stands for types derived from Sequence, so sidestepping it by pointing out that the situation is even worse for the current implementation of Set won't work. Can you explain why a type derived from Sequence which will iterate over its elements in random order should have methods like dropFirst()?

This demonstrates that being able to be used in a for...in loop is about doing somthing with each element and not about element order.

Again, Sequence *already doesn't* require the order to be meaningful or even repeatable.

I know. That's why I am arguing that certain methods of Sequence make no sense. At least we have reached that common ground.
So why do you insist on Sequence having methods like dropFirst() if the notion of "first" does not make any sense?

Notice that, above, I said at least one order. A conforming type could
expose as many different orders as iterations, but that changes nothing. I can still map an instance of that type to get an array of elements, and that array will reveal some order which is the result of the inner workings of the type.

The latter is an additional property which should be expressed in an additional protocol like Kevin suggested.

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten
-Thorsten
and can’t be relied upon for anything and so we shouldn’t provide methods that rely on it. I think this part of the discussion has reached the “agree to disagree” stage.

[…]
You’re a fan of the principal of least surprise. Tell me, which would be less surprising: Set.dropFirst() actually drops a random element, or Set doesn’t have a dropFirst()? And if you think dropFirst() removing an element at random is not surprising, please explain why.

I think Set.dropFirst removing the first element that I observe on iteration is the least surprising answer, because Swift tells me that the stdlib Set models a set but that it is also a sequence.

Your logic is backwards. You’re saying it’s “least surprising” because that’s how it’s currently implemented, not that it should be implemented that way because it’s least surprising.

No, I'm saying it's least surprising because any type that supports iterated access thereby exposes an order--not as an implementation detail but as a matter of public API--and in the absence of any other order, "first" must refer to that order so exposed.

[…]

And that’s PRECISELY why lexicographicallyEqual does not make sense to apply to unordered sets. There is no lexicographical comparison possible, so why do you keep insisting they should have a method that falsely claims to lexicographically compare them?

I agree! It doesn't make sense if no comparison is possible! But Swift tells me that a `Set` is a `Sequence`!

You keep making the circular argument that a Set should do things because it currently does them. If you want to argue against changing things, argue that things shouldn’t be changed, not that the current implementation is correct because it is the current implementation.

No, I'm arguing that `Set`, by supporting iterated access, is not wrong to conform to a protocol called `Sequence` because it does have an intrinsic and publicly observable order, which is not an accident of a particular implementation but is inherent to any type that supports iterated access. Now, whether it's the *best choice* to conform `Set` to `Sequence` and offer order-dependent functions is a matter of taste, but it isn't *wrong*.

[…]
You will always have to account for this possibility, because Swift's `Equatable` explicitly allows "special values" to be not equal to themselves. This is, at least in part, in order to accommodate the IEEE decree that NaN != NaN:
let x = [Double.nan]
x.elementsEqual(x) // false
NaN is special, one-shot and unordered sequences are not. Unless you think that all unordered and single-pass sequences should compare false for `elementsEqual`, this is irrelevant for any sequence that doesn’t contain NaN and well-defined (false) for any that does.

Certainly, not all single-pass sequences should compare false to themselves, but some should: for instance, an infinite single-pass stream of all 1's should compare true to itself, but an infinite single-pass stream of alternating 1's and 0's should compare false to itself. If you write generic code that calls `elementsEqual`, it is pervasively incorrect to test for identity by assuming that elementsEqual will return true on reflexive comparison. NaN is only one of many reasons why such code would blow up.

Changing this behavior is way beyond the scope of this thread (and has been the topic of hours (actually, weeks and weeks) of fun on this list previously).

Yes, I’ve seen the discussion on NaN and Comparable. It’s not the same discussion.

[…]

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

Wouldn't it then suffice to document, say, that a set's iteration order is the insertion order?

Now this actually gave me pause. I guess it does match what I said, but I still take issue with the fact that two Sets could compare `==` but not `elementsEqual`. I think that defining iteration order as insertion order adds a piece of publicly documented state that goes beyond what a Set really is. What you describe is really an OrderedSet, just without the random-access manipulation.

a) There is no semantic requirement on the part of `==` to be equivalent to an elementwise comparison when it is defined on a collection; in fact, one could imagine that some exotic sequence might legitimately define equality in a way that has nothing to do with elementwise comparison. Put another way, `==` returning `true` does not imply `elementsEqual` returning `true`, and `elementsEqual` returning `true` does not imply `==` returning `true`. This applies equally to ordered collections and is independent of the question of how to model unordered collections.

b) You keep writing that some Foo is really some Bar, but those are really just names. What would be the harm if Swift's `Set` indeed simply models an ordered set without random-access manipulation?

I’ll have to mull this over to see if I can come up with a coherent and (more) specific requirement for what makes an Iterable a Sequence, since clearly “documented” isn’t enough. Perhaps something along the lines that any two Sequences that compare equal must iterate the same.

[…]
Apple documentation calls this one of the "order-dependent" methods. It is surely acceptable for a type that conforms to an order-dependent protocol to have methods that are order-dependent; they do, however, have to be clearly order-dependent to avoid confusion on unordered types.

I’m not clear on what you’re trying to get across here. It seems you’re saying unordered types shouldn’t have order-dependent methods, which is exactly what I’ve been arguing.

No, I'm saying, essentially, that there are no truly unordered types in Swift; `Set` and `Dictionary` lead double lives modeling unordered collections on the one hand and ordered collections on the other. The order-dependent methods can continue to exist; they just need to be clearly named so that users know when they're using an instance of `Set` in the manner of an unordered collection and when they're using an instance of `Set` in the manner of an ordered collection.

[...]
Then there are all the methods that imply a specific order of iteration. If the “sequence” is unordered, who knows what you’ll get? It is incredibly easy for an engineer to write a method that implicitly relies on a passed sequence being intrinsically ordered and another engineer to pass it an unordered “sequence.” The first engineer could neglect to document the dependency, or even not notice it; or the second engineer could just fail to read the documentation thoroughly enough. There is currently no way for the compiler to enforce passing only an object that is (or at least claims to be) intrinsically ordered.

It is also incredibly easy for such an engineer to use `for...in` instead to accomplish the same task, generic over ordered and unordered sequences whatever you name such distinguished protocols. I think your beef really still boils down to Set being compatible with `for...in` at all, as Jon acknowledges.

Not providing ordered functions for unordered collections makes the developers think about what they actually need. If any object will do, they can use for…in, .makeIterator().next(), or an `anyObject()` we provide as a convenience. If they actually need the first from some specific order, it’s a reminder they need to sort the objects first to get the right one.

The whole point of protocol hierarchies is to enable useful generic algorithms. Here, the purpose of having a protocol that unites both ordered and unordered collections is to permit the writing of generic algorithms that operate on both; a user would want the first item from an ordered collection or an arbitrary item (but the same one on multiple passes) from an unordered collection. The name for that is currently `first`. Brent demonstrated a trivial one-line example of such a use.

That’s particularly useful for functions that actually need an ordered sequence; using OrderedSequence instead of Iterable (just as placeholders) would be a firm reminder not to pass in an unordered collection.

[…]

As I said, you're welcome to tackle the protocol hierarchy, but I really doubt it's within the realm of realistic endpoints for Swift 5. I'm just trying to propose a narrowly targeted pragmatic solution to one specific limited harm that might be deliverable by the next point release. As a great man once said, Swift is a pragmatic language.

If you want a pragmatic solution, fix the bug in functionality, don’t try and rename the method to something obscure to cover it up.

What I'm arguing is that there *is no bug in functionality*, only a naming problem. It is true that the current protocol hierarchy would not be my preferred design, but that's a matter of taste in terms of, again, where to draw the line between too much modeling or not enough. But that's not tantamount to a *bug*.

If you want to limit the harm, override `equalObjects` on unordered sequences to use `==` (very strongly preferred), or always `false` (less desirable, but at least consistent)

[…]

The Swift stdlib deliberately eschews modeling everything in protocol hierarchies with the highest level of granularity. There's some fudging, deliberately, to find a happy medium between obtuse and approachable, between too many/too specialized and not enough. For example, I pushed for protocols such as `Field` and `Ring` at the top of the numeric hierarchy, which might allow complex number types to slot into the hierarchy more sensibly, for example. But we have a compromise protocol `Numeric` which doesn't quite have the same guarantees but is much more approachable. Notice that we also don't break down numeric protocols into `Addable`, `Subtractable`, etc.; we also have that fudge factor built into `Equatable`, as I mentioned.

Eh, one or two corner cases on a protocol is probably fine. What’s not fine is over half (Sequence) or almost all (Collection) the methods not being applicable. There is a very clear gap there. We don’t need to fix everything, but this is something that can and should be addressed.

This would be based on the premise that an instance of `Set` has no intrinsic order; I disagree for the reasons above.
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org <mailto:swift-evolution@swift.org>
https://lists.swift.org/mailman/listinfo/swift-evolution
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

knatt · October 16, 2017, 5:19pm

[…]
Set conforming to Collection is even worse than just conforming to Sequence as a quote from the documentation shows: "In addition to the operations that collections inherit from the Sequence protocol, you gain access to methods that depend on accessing an element at a specific position in a collection."
Clearly the elements of a Set do not have specific positions.

That’s not at all clear to me, could you elaborate? My understanding is that elements of Set definitely *do* have a position, and that’s why you can use an index on a set to retrieve the element. The same index on the same set retrieves the same element.

A Set only has "indices" because it confirms to a protocol where almost all the requirements are meaningless for an unordered collection. Or at best, for the same reason it has an "order": as a side-effect of the fact that it can be iterated over, you can give indices for each element on a specific iteration; those indices are meaningless for the set itself, and should not be used for anything but a full, order-independent iteration.

David_Sweeris · October 16, 2017, 6:22pm

Should/could we just rename `Set` to `UniquedArray` or something like that? This is starting to feel a bit like the access control debate.

- Dave Sweeris

···

On Oct 16, 2017, at 10:42, BJ Homer via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 8:20 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com>:

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten

We can already express generic algorithms that depend on an order—any generic algorithm that works on a Sequence works on something that is ordered. A Swift Set has an undefined order right now, but a generic algorithm working on any arbitrary Sequence likely doesn’t care about what the order, just that an order exists. And a Swift Set does indeed have an order. If you have a generic algorithm that only works on inputs sorted in a particular manner, then you’ve likely either documented that or added a “sortedBy” parameter. Otherwise, you probably just want to be able to iterate through everything.

Let’s assume, though, that you wanted to write an algorithm that works only on MeaningfullyOrdered inputs.

func extractInfo<T: MeaningfullyOrdered>(_ input: T) { }
extractInfo(someArray)

What stops the caller from simply wrapping the Set in an Array?

extractInfo(Array(someSet))

The Array constructed here is going to reflect the arbitrary ordering provided by Set, but as far as the type system is concerned, the input is an Array, which is certainly meaningfully-ordered. Have we gained anything by requiring the caller to wrap the input in an array? We’ve made the call site a bit more awkward, and we’ve lost a bit of performance. We certainly need to be able to convert Sets in to Arrays; to eliminate that would be massively source-breaking, and it’s not clear that allowing that conversion is actively harmful, so it’s unlikely to change in Swift 5.

David_Sweeris · October 16, 2017, 6:22pm

Should/could we just rename `Set` to `UniquedArray` or something like that? This is starting to feel a bit like the access control debate.

- Dave Sweeris

···

On Oct 16, 2017, at 10:42, BJ Homer via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 8:20 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com>:

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten

We can already express generic algorithms that depend on an order—any generic algorithm that works on a Sequence works on something that is ordered. A Swift Set has an undefined order right now, but a generic algorithm working on any arbitrary Sequence likely doesn’t care about what the order, just that an order exists. And a Swift Set does indeed have an order. If you have a generic algorithm that only works on inputs sorted in a particular manner, then you’ve likely either documented that or added a “sortedBy” parameter. Otherwise, you probably just want to be able to iterate through everything.

Let’s assume, though, that you wanted to write an algorithm that works only on MeaningfullyOrdered inputs.

func extractInfo<T: MeaningfullyOrdered>(_ input: T) { }
extractInfo(someArray)

What stops the caller from simply wrapping the Set in an Array?

extractInfo(Array(someSet))

The Array constructed here is going to reflect the arbitrary ordering provided by Set, but as far as the type system is concerned, the input is an Array, which is certainly meaningfully-ordered. Have we gained anything by requiring the caller to wrap the input in an array? We’ve made the call site a bit more awkward, and we’ve lost a bit of performance. We certainly need to be able to convert Sets in to Arrays; to eliminate that would be massively source-breaking, and it’s not clear that allowing that conversion is actively harmful, so it’s unlikely to change in Swift 5.

Michael_Ilseman · October 16, 2017, 6:27pm

A Set has indices because they are useful to use with the type. Regardless of whether Set conforms to Collection, or even Sequence, indices are useful and meaningful for Sets. Even if the entire protocol hierarchy were to be redesigned, Set would provide indices. If Set didn’t implement any protocols at all, Set would still provide indices. Independent of whether Set even provides an Iterator, Set would still provide Indices.

···

On Oct 16, 2017, at 10:19 AM, Kevin Nattinger <swift@nattinger.net> wrote:

[…]
Set conforming to Collection is even worse than just conforming to Sequence as a quote from the documentation shows: "In addition to the operations that collections inherit from the Sequence protocol, you gain access to methods that depend on accessing an element at a specific position in a collection."
Clearly the elements of a Set do not have specific positions.

That’s not at all clear to me, could you elaborate? My understanding is that elements of Set definitely *do* have a position, and that’s why you can use an index on a set to retrieve the element. The same index on the same set retrieves the same element.

A Set only has "indices" because it confirms to a protocol where almost all the requirements are meaningless for an unordered collection. Or at best, for the same reason it has an "order": as a side-effect of the fact that it can be iterated over, you can give indices for each element on a specific iteration; those indices are meaningless for the set itself, and should not be used for anything but a full, order-independent iteration.

bjhomer · October 16, 2017, 5:42pm

We can already express generic algorithms that depend on an order—any generic algorithm that works on a Sequence works on something that is ordered. A Swift Set has an undefined order right now, but a generic algorithm working on any arbitrary Sequence likely doesn’t care about what the order, just that an order exists. And a Swift Set does indeed have an order. If you have a generic algorithm that only works on inputs sorted in a particular manner, then you’ve likely either documented that or added a “sortedBy” parameter. Otherwise, you probably just want to be able to iterate through everything.

Let’s assume, though, that you wanted to write an algorithm that works only on MeaningfullyOrdered inputs.

func extractInfo<T: MeaningfullyOrdered>(_ input: T) { }
extractInfo(someArray)

What stops the caller from simply wrapping the Set in an Array?

extractInfo(Array(someSet))

The Array constructed here is going to reflect the arbitrary ordering provided by Set, but as far as the type system is concerned, the input is an Array, which is certainly meaningfully-ordered. Have we gained anything by requiring the caller to wrap the input in an array? We’ve made the call site a bit more awkward, and we’ve lost a bit of performance. We certainly need to be able to convert Sets in to Arrays; to eliminate that would be massively source-breaking, and it’s not clear that allowing that conversion is actively harmful, so it’s unlikely to change in Swift 5.

So I agree with Xiaodi; I don’t see what we would gain by splitting the protocols, other than some conceptual purity. Some have expressed concern over the existence of someSet.first, but even if we removed it, it would still be available as Array(someSet).first. And we still haven't any examples of actual algorithms that would surprise the user by behaving incorrectly when given an arbitrarily-ordered sequence, so it’s hard to make the argument that this restriction is actively harmful.

I agree that isOrderedEqual(to:) is a better name for elementsEqual()

-BJ

···

On Oct 16, 2017, at 8:20 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org> wrote:

Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>>:

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten

knatt · October 16, 2017, 6:57pm

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten

We can already express generic algorithms that depend on an order—any generic algorithm that works on a Sequence works on something that is ordered. A Swift Set has an undefined order right now, but a generic algorithm working on any arbitrary Sequence likely doesn’t care about what the order, just that an order exists. And a Swift Set does indeed have an order. If you have a generic algorithm that only works on inputs sorted in a particular manner, then you’ve likely either documented that or added a “sortedBy” parameter. Otherwise, you probably just want to be able to iterate through everything.

Let’s assume, though, that you wanted to write an algorithm that works only on MeaningfullyOrdered inputs.

func extractInfo<T: MeaningfullyOrdered>(_ input: T) { }
extractInfo(someArray)

What stops the caller from simply wrapping the Set in an Array?

extractInfo(Array(someSet))

The Array constructed here is going to reflect the arbitrary ordering provided by Set, but as far as the type system is concerned, the input is an Array, which is certainly meaningfully-ordered. Have we gained anything by requiring the caller to wrap the input in an array?

We force the user to acknowledge that the unspecified order is acceptable to them, as opposed to sorting. Much like the `uniquingKeysWith` argument on Dictionary.init(keysAndValues:uniquingKeysWith:). That one even has a sane default behavior used in many other languages (`{$1}`) and still doesn't offer that as a default; whereas a set's iteration order has no sane default. I think it would even be acceptable (or even desired) to have the set's Iterator be MeaningfullyOrdered, so the user could bypass the array and use `someSet.makeIterator()`, which would probably have little if any performance or memory overhead.

We’ve made the call site a bit more awkward, and we’ve lost a bit of performance. We certainly need to be able to convert Sets in to Arrays; to eliminate that would be massively source-breaking, and it’s not clear that allowing that conversion is actively harmful, so it’s unlikely to change in Swift 5.

So I agree with Xiaodi; I don’t see what we would gain by splitting the protocols, other than some conceptual purity.

Conceptual purity is, e.g. why protocols with associated types are such a pain in the ass to work with. Sequence<Int> makes plenty of sense and is concise and clear (and used in, e.g. Java, C#), why should I spell it Any<T: Sequence where T.Iterator.Element == Int>, or whatever it ends up being. 4 years into the language and two of the majorly touted benefits of the language (protocols and generics) play so poorly with each other that it is one of if not THE major pain point in the language. All because a couple people wanted conceptual purity.

I'm quite sure I've seen the conceptual purity argument used in favor of many other suggestions on this list, though I don't have the time to go through and find them.

Some have expressed concern over the existence of someSet.first, but even if we removed it, it would still be available as Array(someSet).first. And we still haven't any examples of actual algorithms that would surprise the user by behaving incorrectly when given an arbitrarily-ordered sequence, so it’s hard to make the argument that this restriction is actively harmful.

equalElements(), the original motivation for this proposal, is actively harmful on unordered Sequences. On any two instances of any unordered Sequence that are `==`, it could come back true or false. And adding and removing an element to one of them (so the same instance is back in the same state), could change the result. I don't think anyone new to the language would expect that.

I agree that isOrderedEqual(to:) is a better name for elementsEqual()

Much more acceptable than the proposed name.

···

On Oct 16, 2017, at 10:43 AM, BJ Homer via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 8:20 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>>:

-BJ
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

knatt · October 16, 2017, 7:55pm

What useful generic algorithms would this protocol support that are not already possible?

It would allow expressing generic algorithms depending on an order.

-Thorsten

We can already express generic algorithms that depend on an order—any generic algorithm that works on a Sequence works on something that is ordered. A Swift Set has an undefined order right now, but a generic algorithm working on any arbitrary Sequence likely doesn’t care about what the order, just that an order exists. And a Swift Set does indeed have an order. If you have a generic algorithm that only works on inputs sorted in a particular manner, then you’ve likely either documented that or added a “sortedBy” parameter. Otherwise, you probably just want to be able to iterate through everything.

Let’s assume, though, that you wanted to write an algorithm that works only on MeaningfullyOrdered inputs.

func extractInfo<T: MeaningfullyOrdered>(_ input: T) { }
extractInfo(someArray)

What stops the caller from simply wrapping the Set in an Array?

extractInfo(Array(someSet))

The Array constructed here is going to reflect the arbitrary ordering provided by Set, but as far as the type system is concerned, the input is an Array, which is certainly meaningfully-ordered. Have we gained anything by requiring the caller to wrap the input in an array? We’ve made the call site a bit more awkward, and we’ve lost a bit of performance. We certainly need to be able to convert Sets in to Arrays; to eliminate that would be massively source-breaking, and it’s not clear that allowing that conversion is actively harmful, so it’s unlikely to change in Swift 5.

Should/could we just rename `Set` to `UniquedArray` or something like that? This is starting to feel a bit like the access control debate.

So essentially convert Set to OrderedSet and not offer a theoretical unordered Set? I think that would be acceptable (if we apply it to dictionaries as well), BUT that doesn't address the more general case of other, potentially custom unordered Sequences.

···

On Oct 16, 2017, at 11:23 AM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:
On Oct 16, 2017, at 10:42, BJ Homer via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

On Oct 16, 2017, at 8:20 AM, Thorsten Seitz via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

Am 16.10.2017 um 07:19 schrieb Xiaodi Wu <xiaodi.wu@gmail.com <mailto:xiaodi.wu@gmail.com>>:

- Dave Sweeris
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution