[Draft] Rename Sequence.elementsEqual

David_Sweeris · October 16, 2017, 6:15pm

From the “value semantics” PoV, yes. But from the “unordered collection of values” PoV, Sets/Dictionaries, being unordered, are semantically free to rearrange the in-memory ordering of their elements without user intervention. Maybe we need to add a mechanism for expressing the idea of a “value type which has referential tendencies” for these “managed” types of values?

Perhaps a less abstract example would be a “RemoteDictionary”, which maps each key to a remote server where the value is actually stored... I would expect it to iterate over its values in whatever order it gets them back from all the servers. And since dictionaries are unordered and network conditions & server loads can change (quickly), I wouldn’t expect consecutive iterations to necessarily be in the same order.

- Dave Sweeris

···

On Oct 16, 2017, at 09:21, Michael Ilseman <milseman@apple.com> wrote:

On Oct 16, 2017, at 8:46 AM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 07:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).

I believe it would not be source breaking.
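As a rough illustration of the proposal above, here is a minimal sketch. Only `arbitraryOrder` is a name taken from the discussion; `UnorderedCollection` and `canonicalOrder` are hypothetical, and sorting by `Comparable` is just an assumed stand-in for the "default ordering", since the thread does not say how that ordering would be produced. A real design would change the `Collection` conformance itself, which this sketch cannot do.

```swift
// Hypothetical protocol marking a collection as unordered.
// Nothing declared here exists in the standard library.
protocol UnorderedCollection: Collection {
    associatedtype ArbitraryOrder: Sequence where ArbitraryOrder.Element == Element
    /// Fast access in whatever order the storage happens to use.
    var arbitraryOrder: ArbitraryOrder { get }
}

extension UnorderedCollection where Element: Comparable {
    /// Assumed "default ordering": the same elements always give the same order.
    /// This costs a sort, which is the speed trade-off mentioned in the thread.
    var canonicalOrder: [Element] {
        return arbitraryOrder.sorted()
    }
}

extension Set: UnorderedCollection {
    var arbitraryOrder: Set<Element> {
        return self
    }
}

let setA = Set([1, 2, 3, 4, 5, 6])
let setB = Set([6, 5, 4, 3, 2, 1])

// Deterministic: depends only on the elements.
let sameCanonical = setA.canonicalOrder.elementsEqual(setB.canonicalOrder)
// Unspecified: depends on the internal layout of each set.
let sameArbitrary = setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)

print(sameCanonical)
print(sameArbitrary)
```

Spelling out `arbitraryOrder` at the call site is the point of the design: the reader is told that the comparison depends on internal layout.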

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yield an arbitrary result that doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them to check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the Swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

How is the iteration order of an unordered set or dictionary “publicly observable”? If either is implemented such that it can asynchronously optimize its storage (maybe by rebalancing a tree or merging two non-contiguous array segments or something), its iteration order could change without changing what values it contains. Seems like consecutive calls to “elementsEqual” (or whatever we’re calling it) should return the same answer, if we don’t add, remove, or mutate elements.

Sets are values. If you add, remove, or mutate any elements you have a different Set and thus a potentially different ordering of elements.

Jon_Hull · October 16, 2017, 11:10pm

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to changes in the private/internal structure of sets/dictionaries. Algorithms that rely on that “publicly observable behavior” (i.e. leaking of internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

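func hasPrefix<S: Sequence>(_ other: S) -> Bool where S.Element == Element
func hasSuffix<S: Sequence>(_ other: S) -> Bool where S.Element == Element
func containsSubsequence<S: Sequence>(_ other: S) -> Bool where S.Element == Element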

What do these methods mean with regards to Set’s “publicly observable behavior”?

···

On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

You keep claiming that this bug is a feature because it is the current behavior… but that is tautological reasoning.

Thanks,
Jon

trs · October 17, 2017, 5:55am

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yield an arbitrary result that doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them to check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the Swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But it is a behavior which has absolutely no meaning at all, because the order does not depend on the elements of the set but on the history of how the set reached its current state.
So why should I ever use this method on a set?
What is the use case?

One example: you can use it to check an instance of Set<Float> to determine if it has a NaN value. (The “obvious” way of doing it is not guaranteed to work since NaN != NaN.)

How would I do that? I'd rather expect to use a property isNaN on Float to do that.
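The thread does not spell out the technique, so the following is only one plausible reading of the NaN suggestion: comparing a set with itself via `elementsEqual` fails when any element is not equal to itself. The variable names are illustrative. The last line shows the `isNaN` alternative raised in the reply.

```swift
let values: Set<Float> = [1.0, 2.0, .nan]
let clean: Set<Float> = [1.0, 2.0]

// The "obvious" lookup fails because NaN != NaN.
let lookupFindsNaN = values.contains(.nan)

// Pairwise comparison of a set with itself: every element is compared
// with itself, so a NaN element makes the result false.
let selfComparison = values.elementsEqual(values)
let cleanSelfComparison = clean.elementsEqual(clean)

// The more direct spelling, using the isNaN property on Float.
let explicitCheck = values.contains(where: { $0.isNaN })

print(lookupFindsNaN, selfComparison, cleanSelfComparison, explicitCheck)
```

Both approaches detect the NaN here; the `isNaN` version states the intent directly and does not rely on iteration order at all.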

I don’t see why a non-source-breaking change is suddenly off-limits.

But more than that, any generic algorithm which is assuming that the sequence is coming from an ordered source (i.e. many things using first/last). Some uses of first are ok because the programmer actually means ‘any’, but anywhere where they actually mean first/last may be problematic.

Such as...?

Currently, there is no way to test for ordered-ness, so there is no way for even a careful programmer to mitigate this problem. By adding a protocol which states that something is unordered, we can either branch on it, or create a separate version of an algorithm for things which conform.
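A minimal sketch of "a separate version of an algorithm for things which conform", assuming a marker protocol. `UnorderedSequence`, `occurrenceCounts`, and `haveSameElements` are hypothetical names, and requiring `Hashable` elements is an assumption made to keep the example short.

```swift
// Hypothetical marker protocol: conforming types promise no meaningful order.
protocol UnorderedSequence: Sequence {}
extension Set: UnorderedSequence {}

// Helper: how many times each element occurs.
func occurrenceCounts<S: Sequence>(_ s: S) -> [S.Element: Int] where S.Element: Hashable {
    var counts: [S.Element: Int] = [:]
    for element in s {
        counts[element, default: 0] += 1
    }
    return counts
}

// Version for ordered sources: position matters.
func haveSameElements<A: Sequence, B: Sequence>(_ a: A, _ b: B) -> Bool
    where A.Element == B.Element, A.Element: Hashable {
    return a.elementsEqual(b)
}

// More specific overload, chosen when both arguments are statically unordered:
// position is ignored and only the element counts are compared.
func haveSameElements<A: UnorderedSequence, B: UnorderedSequence>(_ a: A, _ b: B) -> Bool
    where A.Element == B.Element, A.Element: Hashable {
    return occurrenceCounts(a) == occurrenceCounts(b)
}

let firstSet = Set([1, 2, 3])
let secondSet = Set([3, 2, 1])
let firstArray = [1, 2, 3]
let secondArray = [3, 2, 1]

print(haveSameElements(firstSet, secondSet))     // unordered overload
print(haveSameElements(firstArray, secondArray)) // ordered overload
```

The overload is selected statically, so a set passed through a generic `Sequence` parameter would still take the ordered path; that is part of the ergonomic cost discussed above.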

It is clearly the case that Swift’s protocol hierarchy fits sets and collections imperfectly; however, it is in the nature of modeling that imperfections are present. The question is not whether it is possible to incur performance, API surface area, and other trade-offs to make the model more faithful, but rather whether this usefully solves any problem. What is the problem being mitigated? As I write above, Swift’s Set and Dictionary types meet the semantic requirements for Collection and moonlight as ordered collections. What is a generic algorithm on an ordered collection that is “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said, is not such an example.)

On the contrary, `elementsEqual` is exactly such an example, because it makes no sense to use it on a Set.

let s1 = Set([1,2,3,4,5,6])
let s2 = Set([6,5,4,3,2,1])

Both sets have different iteration orders. Comparing those sets with some other collection using `elementsEqual` will give no meaningful result because the order - and therefore the result of `elementsEqual` - is in effect random.
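To make the example concrete, here is a small sketch that contrasts the comparisons under discussion. Whether the iteration orders actually differ depends on the standard library's implementation, so the `elementsEqual` result is deliberately left unstated.

```swift
let s1 = Set([1, 2, 3, 4, 5, 6])
let s2 = Set([6, 5, 4, 3, 2, 1])

// Set equality ignores order: same members, so equal.
let equalAsSets = (s1 == s2)

// Compares elements pairwise in iteration order; the result is whatever
// the two internal layouts happen to produce.
let equalAsSequences = s1.elementsEqual(s2)

// Imposing an explicit order first makes the comparison meaningful again.
let equalWhenSorted = s1.sorted().elementsEqual(s2.sorted())

print(equalAsSets, equalAsSequences, equalWhenSorted)
```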

No, it is not such an example; it’s misleadingly named but works correctly—that is, its behavior matches exactly the documented behavior, which relies on only the semantic guarantees of Sequence, which Set correctly fulfills.

Fulfills to the letter. Again, what can you do with it if the result is random??

-Thorsten

···

Am 17.10.2017 um 00:13 schrieb Xiaodi Wu <xiaodi.wu@gmail.com>:
On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 16.10.2017 um 16:20 schrieb Xiaodi Wu via swift-evolution <swift-evolution@swift.org>:
On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

knatt · October 17, 2017, 5:41pm

Once we allow covariant functions to satisfy protocol requirements and have generalized existentials and recursive protocol requirements, wouldn't we be able to update thusly:

protocol Unordered {
    func map<T>(…) -> Any<U: Unordered where U.Element == T>
}
protocol Ordered: Unordered {
    func map<T>(…) -> Any<O: Ordered where O.Element == T>
}

Now apply that to every order-preserving function that takes a Sequence and returns another Sequence. You’ve moved the burden from users of API to implementers of API. It reminds me of the const/non-const split that C++ developers have to deal with, where a lot of functions end up being implemented twice so that you can have a const version and a non-const version (generally one just calls the other). It’s a pain. I don’t want that when working with Sequences. I don’t think it’s worth it. And FWIW, when I was programming in C# I wrote functions that took an IEnumerable<T> and returned another IEnumerable<T> very often. It’s a powerful feature that would have been ruined by a change like this.

The idea is that covariance would mean you only need to implement the function once.

As I've been saying all along, elementsEqual returning a functionally random result when an unordered type is involved is a problem.

In theory. Where is the evidence that this leads to a significant number of real-world bugs? All you’ve done is describe a conceptual problem, but you haven’t connected the dots to real-world problems. Again, I can point to .NET, which has a much larger community of developers who have been working with the same “problem” since version 2.0 released in 2005. If this is a significant source of bugs then there should be evidence of that. Where is that evidence?

If there are no real-world problems, why do we feel the need to change the function name in the first place?

···

On Oct 17, 2017, at 10:36 AM, Adam Kemp <adam_kemp@apple.com> wrote:

On Oct 17, 2017, at 10:00 AM, Kevin Nattinger <swift@nattinger.net> wrote:

Michael_Ilseman · October 16, 2017, 6:23pm

Sets being values are not an implementation detail. They have value semantics, and that is part of the guarantee of the type. This is perhaps the most important concept in the standard library.
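A short sketch of what value semantics means in practice here: mutating a copy never disturbs the original value, so the original's iteration order cannot change behind the caller's back. The variable names are illustrative.

```swift
let original: Set = [1, 2, 3, 4, 5]
var copy = original

// Record the iteration order of the original value.
let orderBefore = Array(original)

// Mutating the copy produces a different value; the original is untouched.
copy.insert(6)
copy.remove(1)

let orderAfter = Array(original)

print(orderBefore == orderAfter)
print(original.count, copy.count)
```

What the order of `original` actually is remains unspecified; the point is only that it stays stable for as long as the value itself is unchanged.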

···

On Oct 16, 2017, at 10:27 AM, Kevin Nattinger <swift@nattinger.net> wrote:

How is the iteration order of an unordered set or dictionary “publicly observable”? If either is implemented such that it can asynchronously optimize its storage (maybe by rebalancing a tree or merging two non-contiguous array segments or something), its iteration order could change without changing what values it contains. Seems like consecutive calls to “elementsEqual” (or whatever we’re calling it) should return the same answer, if we don’t add, remove, or mutate elements.

Sets are values. If you add, remove, or mutate any elements you have a different Set and thus a potentially different ordering of elements.

An implementation detail. We could make it a class* and AFAICT that wouldn't break any guarantees on Sequence; and the argument applies equally well to any other unordered Sequence, which has no value type or semantics constraint.

*: obviously we won't, I don't think anyone is advocating that.

Jon_Hull · October 16, 2017, 11:40pm

To expand on this, Set([1,2,3,4,5]).hasPrefix([1,2,3]) currently returns true. But let’s say a year from now, we change Set to return an ordering based on hash values (which is entirely reasonable). Suddenly the same code may return true or false.

No guarantees will be broken by doing that, but the result has still changed because we are building on top of undefined behavior. Collection says nothing about the ordering over different builds of a program.
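`hasPrefix` is not a `Sequence` method in the standard library; the closest existing call is `starts(with:)`, which the sketch below swaps in. It shows why the answer depends on iteration order, and no particular result is claimed for the unsorted case.

```swift
let numbers = Set([1, 2, 3, 4, 5])

// Depends on the set's iteration order, which is unspecified and may
// differ between standard library versions or program runs.
let prefixInIterationOrder = numbers.starts(with: [1, 2, 3])

// Depends only on the elements once an explicit order is imposed.
let prefixInSortedOrder = numbers.sorted().starts(with: [1, 2, 3])

print(prefixInIterationOrder, prefixInSortedOrder)
```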

Thanks,
Jon

···

On Oct 16, 2017, at 4:11 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to changes in the private/internal structure of sets/dictionaries. Algorithms that rely on that “publicly observable behavior” (i.e. leaking of internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

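func hasPrefix<S: Sequence>(_ other: S) -> Bool where S.Element == Element
func hasSuffix<S: Sequence>(_ other: S) -> Bool where S.Element == Element
func containsSubsequence<S: Sequence>(_ other: S) -> Bool where S.Element == Element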

What do these methods mean with regards to Set’s “publicly observable behavior”?

You keep claiming that this bug is a feature because it is the current behavior… but that is tautological reasoning.

Thanks,
Jon


xwu · October 16, 2017, 11:42pm

In what way do these algorithms break? They would continue to
determine--correctly--whether an instance of Set, when iterated, begins
with, ends with, or contains (respectively) a subsequence that matches the
argument.

···

On Mon, Oct 16, 2017 at 6:10 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

To start with, the one you gave as an example at the beginning of this
discussion: Two sets with identical elements which have different internal
storage and thus give different orderings as sequences. You yourself have
argued that the confusion around this is enough of a problem that we need
to make a source-breaking change (renaming it) to warn people that the
results of the ‘elementsEqual’ algorithm are undefined for sets and
dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
problem with its name; the result of this operation is not at all undefined
for two sets but actually clearly defined: it returns true if two sets have
the same elements in the same iteration order, which is a publicly
observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to
changes in the private/internal structure of sets/dictionaries. Algorithms
that rely on that “publicly observable behavior” (i.e. leaking of
internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

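func hasPrefix<S: Sequence>(_ other: S) -> Bool where S.Element == Element
func hasSuffix<S: Sequence>(_ other: S) -> Bool where S.Element == Element
func containsSubsequence<S: Sequence>(_ other: S) -> Bool where S.Element == Element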

What do these methods mean with regards to Set’s “publicly observable
behavior”?

xwu · October 16, 2017, 8:07pm

That ordering can be arbitrary, but it shouldn’t leak internal
representation such that the method used to create identical things affects
the outcome of generic methods because of differences in internal
representation.

It would be better to say that the iteration order is well-defined.
That will almost always mean documented, and usually predictable though
obviously e.g. RNGs and iterating in random order will not be predictable
by design.

That's actually more semantically constrained than what Swift calls
a `Collection` (which requires conforming types to be multi-pass and(?)
finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your
nonsense by now. Yes, the current Swift protocol named Sequence allows
unordered types. You seem to keep asserting that but not actually
addressing my argument, which is *that allowing Sequences to be
unordered with the current API is undesired and actively harmful, and
should therefore be changed*.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered
sequences are leaking internal representation (In your example, this is
causing people to be surprised when two sets with identical elements are
generating different sequences/orderings based on how they were created).
You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration
involves retrieving one element after another; if you're allowed to do that
with Set, then the elements of a Set are observably ordered in some way.
Since it's not an OrderedSet--i.e., order doesn't matter--then the only
sensible conclusion is that the order of elements obtained in a for...in
loop must be arbitrary. If you think this is harmful, then you must believe
that one should be prohibited from iterating over an instance of Set.
Otherwise, Set is inescapably a Sequence by the Swift definition of
Sequence. All extension methods on Sequence like drop(while:) are really
just conveniences for common things that you can do with iterated access;
to my mind, they're essentially just alternative ways of spelling various
for...in loops.
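To make the "alternative spelling of a for...in loop" point concrete, here is a minimal sketch. `dropWhileByHand` is a hypothetical helper written only for illustration (it is not a standard library API); it is meant to mirror what `drop(while:)` does for any `Sequence`.

```swift
// Hypothetical helper: a hand-written for...in spelling of drop(while:).
func dropWhileByHand<S: Sequence>(
    _ source: S,
    while predicate: (S.Element) -> Bool
) -> [S.Element] {
    var result: [S.Element] = []
    var dropping = true
    for element in source {
        // Skip leading elements until the predicate fails for the first time.
        if dropping && predicate(element) { continue }
        dropping = false
        result.append(element)
    }
    return result
}

let numbers = [1, 2, 3, 10, 1]
let viaLoop = dropWhileByHand(numbers) { $0 < 3 }
let viaMethod = Array(numbers.drop(while: { $0 < 3 }))
// Both are [3, 10, 1] for this ordered input.

let bag: Set<Int> = [1, 2, 3, 10]
// The two spellings still agree on a Set, but which elements are dropped
// depends on the set's arbitrary iteration order.
let setViaLoop = dropWhileByHand(bag) { $0 < 3 }
let setViaMethod = Array(bag.drop(while: { $0 < 3 }))
```

The loop and the method agree on the Set as well; what is arbitrary is the iteration order they both observe.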

I think an argument could be made that you shouldn’t be able to
iterate over a set without first defining an ordering on it (even if that
ordering is somewhat arbitrary). Maybe we have something like a
“Sequenc(e)able” protocol which defines things which can be turned into a
sequence when combined with some sort of ordering. One possible ordering
could be the internal representation (At least in that case we are calling
it out specifically). If I had to say
“setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely
be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that
`Set` and `Dictionary` should not conform to `Sequence` altogether. That's
fine (it's also a direction that some of us explored off-list a while ago),
but at this point in Swift's evolution, realistically, it's not within the
realm of possible changes.

I am actually suggesting something slightly different. Basically, Set
and Dictionary’s conformance to Collection would have a different
implementation. They would conform to another protocol declaring that they
are unordered. That protocol would fill in part of the conformance to
sequence/collection using a default ordering, which is mostly arbitrary,
but guaranteed to produce the same ordering for the same list of elements
(even across collection types). This would be safer, but a tiny bit slower
than what we have now (We could also potentially develop a way for
collections like set to amortize the cost). For those who need to recover
speed, the new protocol would also define a property which quickly returns
a sequence/iterator using the internal ordering (I arbitrarily called it
.arbitraryOrder).
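A rough sketch of what this proposal might look like, under stated assumptions: the protocol name `UnorderedCollection` and the property `canonicalOrder` are hypothetical, and sorting `Comparable` elements is just one possible "default ordering" (the post does not say how the ordering would be derived). Only the name `arbitraryOrder` comes from the post.

```swift
// Hypothetical marker protocol; nothing like this exists in the standard library.
protocol UnorderedCollection: Collection {}

extension UnorderedCollection {
    // Escape hatch named in the post: the internal (arbitrary) iteration order.
    // A real design would presumably avoid the copy made here.
    var arbitraryOrder: [Element] { Array(self) }
}

extension UnorderedCollection where Element: Comparable {
    // Assumed default ordering: sort the elements. This costs O(n log n)
    // and is only available when the elements are Comparable.
    var canonicalOrder: [Element] { sorted() }
}

extension Set: UnorderedCollection {}

let setA: Set = [1, 2, 3, 4, 5, 6]
let setB: Set = [6, 5, 4, 3, 2, 1]

// Same elements give the same canonical order, however the sets were built.
let sameCanonical = setA.canonicalOrder.elementsEqual(setB.canonicalOrder)

// May be true or false: it reflects each set's internal layout.
let sameArbitrary = setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)
```

The sketch also shows where the performance cost mentioned above would come from: the deterministic order has to be computed, whereas the arbitrary one is already there.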

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were
suggesting--Set and Dictionary would each have a member like `collection`,
which would expose the underlying data as a `SetCollection` or
`DictionaryCollection` that in turn would conform to `Collection`;
meanwhile, Set and Dictionary themselves would not offer methods such as
`prefix`, or indexing by subscript, which are not compatible with being
unordered. For those who want a particular ordering, there'd be something
like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) ->
{Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking.
However, I'm unsure of where these guarantees provide benefit to justify
the performance cost. Certainly not for `first` or `dropFirst(_:)`, which
still yields an arbitrary result which doesn't make sense for something
_unordered_. We *could* have an underscored customization point named
something like `_customOrderingPass` that is only invoked from
`elementsEqual` or other such methods to pre-rearrange the internal
ordering of unordered collections in some deterministic way before
comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between
speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to
recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness."
It trades one arbitrary iteration order for another on a type that tries to
model an unordered collection.

We could do something like you suggest. I don’t think the method would
need to be underscored… the ordering pass could just be a method on the
protocol which defines it as unordered. Then we could provide a special
conformance for things where order really matters based on adherence to
that protocol. That might be an acceptable tradeoff. It would give us
speed at the cost of having the correct implementation being less ergonomic
and more error prone (you have to remember to check that it is unordered
and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic
algorithms based on expecting an order from unordered things, but at least
it would be possible for them to check and handle it correctly. I think I
could get behind that tradeoff/compromise, given where we are in the Swift
process and Swift's obsession with speed (though I still slightly prefer
the safer default). At least the standard library would handle all the
things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be
made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this
discussion: Two sets with identical elements which have different internal
storage and thus give different orderings as sequences. You yourself have
argued that the confusion around this is enough of a problem that we need
to make a source-breaking change (renaming it) to warn people that the
results of the ‘elementsEqual’ algorithm are undefined for sets and
dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
problem with its name; the result of this operation is not at all undefined
for two sets but actually clearly defined: it returns true if two sets have
the same elements in the same iteration order, which is a publicly
observable behavior of sets (likewise dictionaries).
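The semantics described here can be spelled out as a position-by-position comparison. The function `pairwiseEqual` below is a hypothetical re-implementation for illustration only; the standard library already provides `elementsEqual`, and this sketch is merely intended to behave the same way as its documented contract.

```swift
// Hypothetical re-implementation; the standard library already has elementsEqual.
func pairwiseEqual<A: Sequence, B: Sequence>(_ a: A, _ b: B) -> Bool
where A.Element == B.Element, A.Element: Equatable {
    var left = a.makeIterator()
    var right = b.makeIterator()
    while true {
        switch (left.next(), right.next()) {
        case let (x?, y?):
            if x != y { return false }   // mismatch at the same position
        case (nil, nil):
            return true                  // both sequences ended together
        default:
            return false                 // one sequence is longer
        }
    }
}

let ascending = [1, 2, 3]
let descending = [3, 2, 1]

let orderSensitive = pairwiseEqual(ascending, descending)   // false: same elements, different order
let samePositions = pairwiseEqual(ascending, [1, 2, 3])     // true
```

Nothing in the comparison looks at membership alone; only the order in which each iterator hands back elements matters.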

How is the iteration order of an unordered set or dictionary “publicly
observable”? If either is implemented such that it can asynchronously
optimize its storage (maybe by rebalancing a tree or merging two
non-contiguous array segments or something), its iteration order could
change *without* changing what values it contains. Seems like consecutive
calls to “elementsEqual” (or whatever we’re calling it) should return the
same answer, if we don’t add, remove, or mutate elements.

Sets are values. If you add, remove, or mutate any elements you have a
different Set and thus a potentially different ordering of elements.

From the “value semantics” PoV, yes. But from the “unordered collection of
values” PoV, Sets/Dictionaries, being unordered, are semantically free to
rearrange the in-memory ordering of their elements *without* user
intervention.

No, they are not semantically free to do so. The semantics of Collection
forbid it, because the iteration order must be multi-pass. As long as the
value is unchanged, the iteration order is unchanged. That is a documented,
public guarantee of the API.
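A small check of the multi-pass claim, as a sketch: it only observes the behavior for one native `Set` value and is not a proof of the guarantee.

```swift
let letters: Set<Character> = ["a", "b", "c", "d"]

// Two passes over the same, unmodified value.
let firstPass = Array(letters)
let secondPass = Array(letters)
let stableAcrossPasses = firstPass == secondPass

// Consequently, comparing the value with itself position by position succeeds.
let selfComparison = letters.elementsEqual(letters)
```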

···

On Mon, Oct 16, 2017 at 13:15 David Sweeris <davesweeris@mac.com> wrote:

On Oct 16, 2017, at 09:21, Michael Ilseman <milseman@apple.com> wrote:
On Oct 16, 2017, at 8:46 AM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:
On Oct 16, 2017, at 07:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

Maybe we need to add a mechanism for expressing the idea of a “value type
which has referential tendencies” for these “managed” types of values?

Perhaps a less abstract example would be a “RemoteDictionary”, which maps
each key to a remote server where the value is actually stored... I would
expect it to iterate over its values in whatever order it gets them back
from all the servers. And since dictionaries are unordered and network
conditions & server loads can change (quickly), I wouldn’t expect
consecutive iterations to necessarily be in the same order.

- Dave Sweeris

Adam_Kemp1 · October 17, 2017, 5:54pm

Once we allow covariant functions to satisfy protocol requirements and have generalized existentials and recursive protocol requirements, wouldn't we be able to update thusly:

protocol Unordered {
    func map<T>(…) -> Any<U: Unordered where U.Element == T>
}
protocol Ordered: Unordered {
    func map<T>(…) -> Any<O: Ordered where O.Element == T>
}

Now apply that to every order-preserving function that takes a Sequence and returns another Sequence. You’ve moved the burden from users of API to implementers of API. It reminds me of the const/non-const split that C++ developers have to deal with, where a lot of functions end up being implemented twice so that you can have a const version and a non-const version (generally one just calls the other). It’s a pain. I don’t want that when working with Sequences. I don’t think it’s worth it. And FWIW, when I was programming in C# I wrote functions that took an IEnumerable<T> and returned another IEnumerable<T> very often. It’s a powerful feature that would have been ruined by a change like this.

The idea is that covariance would mean you only need to implement the function once.

In the example you showed above map is written twice. Maybe the two protocols can share an implementation, but you still have to have two versions declared somewhere.

What does it look like if you’re just writing a single function somewhere that takes in a Sequence and returns another Sequence? How do you make that function take both ordered and unordered Sequences? To make it concrete, say you write a function that just wraps map:

func firstNames(ofPeople people: Sequence<Person>) -> Sequence<Person> {
    return people.map { $0.firstName }
}

I want that function to work on both ordered and unordered Sequences, and if you start with an ordered Sequence you should end up with an ordered Sequence. How do I make that work?
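For comparison, here is a sketch of how such a function can be written once in current Swift, with no ordered/unordered split. The `Person` type and its `lastName` field are assumptions made for the example, and `firstName` is assumed to be a `String`.

```swift
// Hypothetical model type for the example.
struct Person: Hashable {
    var firstName: String
    var lastName: String
}

// One generic function covers Array, Set, and any other Sequence of Person.
func firstNames<S: Sequence>(ofPeople people: S) -> [String] where S.Element == Person {
    return people.map { $0.firstName }
}

let ada = Person(firstName: "Ada", lastName: "Lovelace")
let alan = Person(firstName: "Alan", lastName: "Turing")

let fromArray = firstNames(ofPeople: [ada, alan])      // ["Ada", "Alan"]
let fromSet = firstNames(ofPeople: Set([ada, alan]))   // same names, arbitrary order
```

Note that the result is an Array in both cases, so the static type does not record whether the input had a meaningful order. That is exactly the information an Ordered/Unordered split would have to carry through every such function.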

If there are no real-world problems, why do we feel the need to change the function name in the first place?

To quote myself from an earlier email: "I’m not even sure a name change is necessary for this method at all, but I’m not at all in favor of anything beyond that.”

To extend that, I’m content with doing nothing. I’m not sure there’s cause for doing anything at all, and I’m very sure that no one on this list has demonstrated any need for a major change to the library, let alone new language features.

···

On Oct 17, 2017, at 10:41 AM, Kevin Nattinger via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 17, 2017, at 10:36 AM, Adam Kemp <adam_kemp@apple.com> wrote:

On Oct 17, 2017, at 10:00 AM, Kevin Nattinger <swift@nattinger.net> wrote:

xwu · October 17, 2017, 12:44pm

That ordering can be arbitrary, but it shouldn’t leak internal
representation such that the method used to create identical things affects
the outcome of generic methods because of differences in internal
representation.

It would be better to say that the iteration order is well-defined.
That will almost always mean documented, and usually predictable though
obviously e.g. RNGs and iterating in random order will not be predictable
by design.

That's actually more semantically constrained than what Swift calls
a `Collection` (which requires conforming types to be multi-pass and(?)
finite). By contrast, Swift's `SpongeBob` protocol explicitly permits
conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your
nonsense by now. Yes, the current Swift protocol named Sequence allows
unordered types. You seem to keep asserting that but not actually
addressing my argument, which is *that allowing Sequences to be
unordered with the current API is undesired and actively harmful, and
should therefore be changed*.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered
sequences are leaking internal representation (In your example, this is
causing people to be surprised when two sets with identical elements are
generating different sequences/orderings based on how they were created).
You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration
involves retrieving one element after another; if you're allowed to do that
with Set, then the elements of a Set are observably ordered in some way.
Since it's not an OrderedSet--i.e., order doesn't matter--then the only
sensible conclusion is that the order of elements obtained in a for...in
loop must be arbitrary. If you think this is harmful, then you must believe
that one should be prohibited from iterating over an instance of Set.
Otherwise, Set is inescapably a Sequence by the Swift definition of
Sequence. All extension methods on Sequence like drop(while:) are really
just conveniences for common things that you can do with iterated access;
to my mind, they're essentially just alternative ways of spelling various
for...in loops.

I think an argument could be made that you shouldn’t be able to
iterate over a set without first defining an ordering on it (even if that
ordering is somewhat arbitrary). Maybe we have something like a
“Sequenc(e)able” protocol which defines things which can be turned into a
sequence when combined with some sort of ordering. One possible ordering
could be the internal representation (At least in that case we are calling
it out specifically). If I had to say
“setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely
be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that
`Set` and `Dictionary` should not conform to `Sequence` altogether. That's
fine (it's also a direction that some of us explored off-list a while ago),
but at this point in Swift's evolution, realistically, it's not within the
realm of possible changes.

I am actually suggesting something slightly different. Basically, Set
and Dictionary’s conformance to Collection would have a different
implementation. They would conform to another protocol declaring that they
are unordered. That protocol would fill in part of the conformance to
sequence/collection using a default ordering, which is mostly arbitrary,
but guaranteed to produce the same ordering for the same list of elements
(even across collection types). This would be safer, but a tiny bit slower
than what we have now (We could also potentially develop a way for
collections like set to amortize the cost). For those who need to recover
speed, the new protocol would also define a property which quickly returns
a sequence/iterator using the internal ordering (I arbitrarily called it
.arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were
suggesting--Set and Dictionary would each have a member like `collection`,
which would expose the underlying data as a `SetCollection` or
`DictionaryCollection` that in turn would conform to `Collection`;
meanwhile, Set and Dictionary themselves would not offer methods such as
`prefix`, or indexing by subscript, which are not compatible with being
unordered. For those who want a particular ordering, there'd be something
like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) ->
{Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking.
However, I'm unsure of where these guarantees provide benefit to justify
the performance cost. Certainly not for `first` or `dropFirst(_:)`, which
still yields an arbitrary result which doesn't make sense for something
_unordered_. We *could* have an underscored customization point named
something like `_customOrderingPass` that is only invoked from
`elementsEqual` or other such methods to pre-rearrange the internal
ordering of unordered collections in some deterministic way before
comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between
speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to
recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness."
It trades one arbitrary iteration order for another on a type that tries to
model an unordered collection.

We could do something like you suggest. I don’t think the method would
need to be underscored… the ordering pass could just be a method on the
protocol which defines it as unordered. Then we could provide a special
conformance for things where order really matters based on adherence to
that protocol. That might be an acceptable tradeoff. It would give us
speed at the cost of having the correct implementation being less ergonomic
and more error prone (you have to remember to check that it is unordered
and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic
algorithms based on expecting an order from unordered things, but at least
it would be possible for them to check and handle it correctly. I think I
could get behind that tradeoff/compromise, given where we are in the Swift
process and Swift's obsession with speed (though I still slightly prefer
the safer default). At least the standard library would handle all the
things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would
be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this
discussion: Two sets with identical elements which have different internal
storage and thus give different orderings as sequences. You yourself have
argued that the confusion around this is enough of a problem that we need
to make a source-breaking change (renaming it) to warn people that the
results of the ‘elementsEqual’ algorithm are undefined for sets and
dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
problem with its name; the result of this operation is not at all undefined
for two sets but actually clearly defined: it returns true if two sets have
the same elements in the same iteration order, which is a publicly
observable behavior of sets (likewise dictionaries).

But it is a behavior which has absolutely no meaning at all because the
order does not depend on the elements of the set but on the history of how
the set has reached its current state.
So why should I ever use this method on a set?
What is the use case?

One example: you can use it to check an instance of Set<Float> to
determine if it has a NaN value. (The “obvious” way of doing it is not
guaranteed to work since NaN != NaN.)

How would I do that? I'd rather expect to use a property isNaN on Float to
do that.

set.elementsEqual(set)
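A sketch of that check, together with the `isNaN` spelling asked about above. It relies on a set iterating in the same order on both sides when compared with itself, so the only way a position can mismatch is an element that is not equal to itself.

```swift
let withNaN: Set<Float> = [1.0, 2.0, .nan]
let withoutNaN: Set<Float> = [1.0, 2.0]

// The "obvious" membership test fails because NaN != NaN.
let containsFindsNaN = withNaN.contains(.nan)                 // false

// Comparing a set with itself fails only if some element is not equal to itself.
let hasNaN = !withNaN.elementsEqual(withNaN)                  // true
let cleanHasNaN = !withoutNaN.elementsEqual(withoutNaN)       // false

// The more direct spelling using the isNaN property.
let viaIsNaN = withNaN.contains(where: { $0.isNaN })          // true
```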

I don’t see why a non-source-breaking change is suddenly off-limits.

But more than that, any generic algorithm which assumes that the
sequence comes from an ordered source (e.g. many things using
first/last) has the same problem. Some uses of first are ok because the
programmer actually means ‘any’, but anywhere where they actually mean
first/last may be problematic.

Such as...?

Currently, there is no way to test for ordered-ness, so there is no way
for even a careful programmer to mitigate this problem. By adding a
protocol which states that something is unordered, we can either branch on
it, or create a separate version of an algorithm for things which conform.
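Both options can be sketched as follows. Everything here is hypothetical: the marker protocol `UnorderedSequence`, the function `hasMeaningfulFirst`, and the method `sameElements(as:)` are invented names, not existing or proposed standard library API.

```swift
// Hypothetical marker protocol; not part of the standard library.
protocol UnorderedSequence {}
extension Set: UnorderedSequence {}
extension Dictionary: UnorderedSequence {}

// Option 1: branch at run time on the marker.
func hasMeaningfulFirst<S: Sequence>(_ source: S) -> Bool {
    return !(source is UnorderedSequence)
}

// Option 2: an order-insensitive algorithm offered only to conforming types.
extension Sequence where Self: UnorderedSequence, Element: Hashable {
    func sameElements<Other: Sequence>(as other: Other) -> Bool
    where Other.Element == Element {
        // Compares membership only, ignoring iteration order.
        return Set(self) == Set(other)
    }
}

let arrayCase = hasMeaningfulFirst([1, 2, 3])        // true
let setCase = hasMeaningfulFirst(Set([1, 2, 3]))     // false

let numbers: Set = [1, 2, 3]
let same = numbers.sameElements(as: [3, 2, 1])       // true
```

In this sketch the marker adds no requirements of its own; it only gives generic code something to test or constrain on.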

It is clearly the case that Swift’s protocol hierarchy fits sets and
collections imperfectly; however, it is in the nature of modeling that
imperfections are present. The question is not whether it is possible to
incur performance, API surface area, and other trade-offs to make the model
more faithful, but rather whether this usefully solves any problem. What is
the problem being mitigated? As I write above, Swift’s Set and Dictionary
types meet the semantic requirements for Collection and moonlight as
ordered collections. What is a generic algorithm on an ordered collection
that is “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said,
is not such an example.)

On the contrary, `elementsEqual` is exactly such an example, because it
makes no sense to use it on a Set.

let s1 = Set([1,2,3,4,5,6])
let s2 = Set([6,5,4,3,2,1])

Both sets have different iteration orders. Comparing those sets with some
other collection using `elementsEqual` will give no meaningful result
because the order - and therefore the result of `elementsEqual` - is in
effect random.
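The example can be extended into a small sketch that contrasts the three comparisons. No result is claimed for the raw `elementsEqual` call, since it depends on how each set happens to lay out its elements.

```swift
let s1 = Set([1, 2, 3, 4, 5, 6])
let s2 = Set([6, 5, 4, 3, 2, 1])

// Set equality ignores order.
let equalAsSets = s1 == s2                                          // true

// Position-by-position comparison follows each set's iteration order,
// so the outcome is not determined by the elements alone.
let equalInIterationOrder = s1.elementsEqual(s2)

// Imposing an explicit order first makes the comparison meaningful.
let equalOnceSorted = s1.sorted().elementsEqual(s2.sorted())        // true
let matchesArray = s1.sorted().elementsEqual([1, 2, 3, 4, 5, 6])    // true
```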

No, it is not such an example; it’s misleadingly named but works
correctly—that is, its behavior matches exactly the documented behavior,
which relies on only the semantic guarantees of Sequence, which Set
correctly fulfills.

Fulfills to the letter. Again, what can you do with it if the result is
random??

The result is not random.

···

On Tue, Oct 17, 2017 at 00:56 Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 17.10.2017 um 00:13 schrieb Xiaodi Wu <xiaodi.wu@gmail.com>:
On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseitz42@icloud.com> wrote:

Am 16.10.2017 um 16:20 schrieb Xiaodi Wu via swift-evolution <swift-evolution@swift.org>:
On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

knatt · October 16, 2017, 8:09pm

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yields an arbitrary result which doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them to check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the Swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

How is the iteration order of an unordered set or dictionary “publicly observable”? If either is implemented such that it can asynchronously optimize its storage (maybe by rebalancing a tree or merging two non-contiguous array segments or something), its iteration order could change without changing what values it contains. Seems like consecutive calls to “elementsEqual” (or whatever we’re calling it) should return the same answer, if we don’t add, remove, or mutate elements.

Sets are values. If you add, remove, or mutate any elements you have a different Set and thus a potentially different ordering of elements.

From the “value semantics” PoV, yes. But from the “unordered collection of values” PoV, Sets/Dictionaries, being unordered, are semantically free to rearrange the in-memory ordering of their elements without user intervention.

No, they are not semantically free to do so. The semantics of Collection forbid it, because the iteration order must be multi-pass. As long as the value is unchanged, the iteration order is unchanged. That is a documented, public guarantee of the API.

Of the Collection API. But `elementsEqual` is on Sequence, which carries no such guarantee.
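To illustrate the difference, here is a sketch of a type that satisfies `Sequence` but gives no multi-pass guarantee. `Countdown` is a hypothetical type written for this example; it acts as its own iterator, so iterating it consumes it.

```swift
// Hypothetical single-pass sequence: the instance is its own iterator.
final class Countdown: Sequence, IteratorProtocol {
    private var remaining: Int
    init(from start: Int) { remaining = start }

    func next() -> Int? {
        guard remaining > 0 else { return nil }
        defer { remaining -= 1 }
        return remaining
    }
}

let countdown = Countdown(from: 3)
let firstPass = Array(countdown)     // [3, 2, 1]
let secondPass = Array(countdown)    // []: the first pass consumed everything
```

A generic algorithm constrained only to `Sequence` cannot assume that two passes over the same instance see the same elements, let alone the same order.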

···

On Oct 16, 2017, at 1:08 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Mon, Oct 16, 2017 at 13:15 David Sweeris <davesweeris@mac.com> wrote:
On Oct 16, 2017, at 09:21, Michael Ilseman <milseman@apple.com> wrote:

On Oct 16, 2017, at 8:46 AM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:
On Oct 16, 2017, at 07:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

Maybe we need to add a mechanism for expressing the idea of a “value type which has referential tendencies” for these “managed” types of values?

Perhaps a less abstract example would be a “RemoteDictionary”, which maps each key to a remote server where the value is actually stored... I would expect it to iterate over its values in whatever order it gets them back from all the servers. And since dictionaries are unordered and network conditions & server loads can change (quickly), I wouldn’t expect consecutive iterations to necessarily be in the same order.

- Dave Sweeris
_______________________________________________
swift-evolution mailing list
swift-evolution@swift.org
https://lists.swift.org/mailman/listinfo/swift-evolution

gparker42 · October 16, 2017, 8:31pm

Even if a Set value has a fixed order, a copy of that value may have a *different* order. How many generic algorithm implementations are going to be confused by that?

···

On Oct 16, 2017, at 1:08 PM, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Mon, Oct 16, 2017 at 13:15 David Sweeris <davesweeris@mac.com> wrote:

On Oct 16, 2017, at 09:21, Michael Ilseman <milseman@apple.com> wrote:

Sets are values. If you add, remove, or mutate any elements you have a different Set and thus a potentially different ordering of elements.

From the “value semantics” PoV, yes. But from the “unordered collection of values” PoV, Sets/Dictionaries, being unordered, are semantically free to rearrange the in-memory ordering of their elements without user intervention.

No, they are not semantically free to do so. The semantics of Collection forbid it, because the iteration order must be multi-pass. As long as the value is unchanged, the iteration order is unchanged. That is a documented, public guarantee of the API.

--
Greg Parker gparker@apple.com Runtime Wrangler

David_Sweeris · October 17, 2017, 1:03am

If the semantics of unordered collections (with a lowercase c) and of `Collection` (with an uppercase C) differ on such a basic level, then why are we trying to force them together? I mean, I understand that source-compatibility is important, but so is correctly modeling that which we claim to model.

- Dave Sweeris

···

On Oct 16, 2017, at 1:07 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 13:15 David Sweeris <davesweeris@mac.com> wrote:

On Oct 16, 2017, at 09:21, Michael Ilseman <milseman@apple.com> wrote:

On Oct 16, 2017, at 8:46 AM, David Sweeris via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 07:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yields an arbitrary result which doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them to check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the Swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

How is the iteration order of an unordered set or dictionary “publicly observable”? If either is implemented such that it can asynchronously optimize its storage (maybe by rebalancing a tree or merging two non-contiguous array segments or something), its iteration order could change without changing what values it contains. Seems like consecutive calls to “elementsEquals” (or whatever we’re calling it) should return the same answer, if we don’t add, remove, or mutate elements.

Sets are values. If you add, remove, or mutate any elements you have a different Set and thus a potentially different ordering of elements.

From the “value semantics” PoV, yes. But from the “unordered collection of values” PoV, Sets/Dictionaries, being unordered, are semantically free to rearrange the in-memory ordering of their elements without user intervention.

No, they are not semantically free to do so. The semantics of Collection forbid it, because the iteration order must be multi-pass. As long as the value is unchanged, the iteration order is unchanged. That is a documented, public guarantee of the API.

xwu · October 16, 2017, 11:46pm

To expand on this, Set([1,2,3,4,5]).hasPrefix([1,2,3]) currently returns
true. But let’s say a year from now, we change Set to return an ordering
based on hash values (which is entirely reasonable). Suddenly the same code
may return true or false.

No guarantees will be broken by doing that, but the result has still
changed because we are building on top of undefined behavior. Collection
says nothing about the ordering over different builds of a program.

So that's not breakage. Results are allowed to change; `hasPrefix` should
change if the iteration order changes, and the iteration order is allowed
to change over different builds. Just like how the memory layout is allowed
to change, or anything else not explicitly guaranteed by public API
semantics is allowed to change.
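
For concreteness: `hasPrefix` as written in the quoted example is not a standard library method; the closest existing API is `starts(with:)` on Sequence. A minimal sketch, assuming that substitution, of which questions about a Set depend on iteration order and which do not:

```swift
let numbers: Set<Int> = [1, 2, 3, 4, 5]

// Depends on iteration order, which is unspecified for Set and may differ
// between runs or builds, so no expected value is claimed here.
let orderDependent = numbers.starts(with: [1, 2, 3])
print("starts(with:) on the raw set:", orderDependent)

// Independent of iteration order: pure membership.
let containsAll = numbers.isSuperset(of: [1, 2, 3])

// Also independent: impose an explicit order first, then ask the prefix question.
let sortedPrefix = numbers.sorted().starts(with: [1, 2, 3])
print(containsAll, sortedPrefix)
```

Only the first call can change its answer when the iteration order changes; the other two are determined by the set's contents alone.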

To start with, the one you gave as an example at the beginning of this
discussion: Two sets with identical elements which have different internal
storage and thus give different orderings as sequences. You yourself have
argued that the confusion around this is enough of a problem that we need
to make a source-breaking change (renaming it) to warn people that the
results of the ‘elementsEqual’ algorithm are undefined for sets and
dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
problem with its name; the result of this operation is not at all undefined
for two sets but actually clearly defined: it returns true if two sets have
the same elements in the same iteration order, which is a publicly
observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to
changes in the private/internal structure of sets/dictionaries. Algorithms
that rely on that “publicly observable behavior” (i.e. leaking of
internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

func hasPrefix(Sequence)->Bool
func hasSuffix(Sequence)->Bool
func containsSubsequence(Sequence)->Bool

What do these methods mean with regards to Set’s “publicly observable
behavior”?

You keep claiming that this bug is a feature because it is the current
behavior… but that is tautological reasoning.

Thanks,
Jon

···

On Mon, Oct 16, 2017 at 6:40 PM, Jonathan Hull <jhull@gbis.com> wrote:
On Oct 16, 2017, at 4:11 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:
On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:


Jon_Hull · October 17, 2017, 2:42pm

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.
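
To illustrate the "alternative spelling of a loop" point, here is a minimal sketch of an `elementsEqual`-style comparison written directly against iterators. The name `lockstepEqual` is hypothetical, and this is not the standard library's actual implementation, only the behavior described in this thread: two sequences are walked in lockstep and compared element by element.

```swift
// Hypothetical free function; compares two sequences in iteration order.
func lockstepEqual<A: Sequence, B: Sequence>(_ a: A, _ b: B) -> Bool
    where A.Element == B.Element, A.Element: Equatable {
    var i = a.makeIterator()
    var j = b.makeIterator()
    while true {
        switch (i.next(), j.next()) {
        case let (x?, y?):
            // Both sequences still have elements: they must match pairwise.
            if x != y { return false }
        case (nil, nil):
            // Both ran out at the same time with no mismatch.
            return true
        default:
            // One sequence is longer than the other.
            return false
        }
    }
}

print(lockstepEqual([1, 2, 3], [1, 2, 3]))
print(lockstepEqual([1, 2, 3], [1, 3, 2]))
```

Nothing in the loop knows or cares whether the order it observes is meaningful, which is why the result for two Sets tracks their iteration order rather than their contents.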

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yields an arbitrary result which doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it mattered).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them to check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the Swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But it is a behavior which has absolutely no meaning at all because the order does not depend on the elements of the set but on the history of how the set reached its current state.
So why should I ever use this method on a set?
What is the use case?

One example: you can use it to check an instance of Set<Float> to determine if it has a NaN value. (The “obvious” way of doing it is not guaranteed to work since NaN != NaN.)

How would I do that? I'd rather expect to use a property isNaN on Float to do that.

set.elementsEqual(set)

I see why that would work (thanks to Set being a collection vs a sequence), but it still feels like a hack. I definitely wouldn’t want to be maintaining code with that in it. Especially when compared to something like:

set.contains(where: {$0.isNaN})
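
A small sketch of the two spellings side by side, using only existing standard library calls (the variable names are illustrative). The self-comparison works because the same Set value is iterated twice in the same order, so the NaN ends up compared with itself:

```swift
let values: Set<Float> = [1.0, 2.5, .nan]
let clean: Set<Float> = [1.0, 2.5]

// elementsEqual compares pairwise with ==, and NaN != NaN,
// so a set containing NaN is not "elementsEqual" to itself.
let selfEqual = values.elementsEqual(values)       // false
let cleanSelfEqual = clean.elementsEqual(clean)    // true

// The direct spelling states the intent explicitly.
let hasNaN = values.contains(where: { $0.isNaN })  // true
print(selfEqual, cleanSelfEqual, hasNaN)
```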

I don’t see why a non-source-breaking change is suddenly off-limits.

But more than that, any generic algorithm which is assuming that the sequence is coming from an ordered source (i.e. many things using first/last). Some uses of first are ok because the programmer actually means ‘any’, but anywhere where they actually mean first/last may be problematic.

Such as...?

Currently, there is no way to test for ordered-ness, so there is no way for even a careful programmer to mitigate this problem. By adding a protocol which states that something is unordered, we can either branch on it, or create a separate version of an algorithm for things which conform.
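
As a rough sketch of what such a protocol could look like: the names `UnorderedCollection` and `hasSameElements(as:)` below are hypothetical and exist nowhere in the standard library. The idea is that conforming types get an order-insensitive comparison as their "separate version" of the algorithm:

```swift
// Hypothetical marker protocol declaring that iteration order carries no meaning.
protocol UnorderedCollection: Collection where Element: Hashable {}
extension Set: UnorderedCollection {}

extension UnorderedCollection {
    /// Order-insensitive comparison: counts occurrences instead of
    /// walking both collections in lockstep.
    func hasSameElements<Other: UnorderedCollection>(as other: Other) -> Bool
        where Other.Element == Element {
        var counts: [Element: Int] = [:]
        for element in self { counts[element, default: 0] += 1 }
        for element in other {
            guard let n = counts[element], n > 0 else { return false }
            counts[element] = n - 1
        }
        // Every occurrence in self must have been matched by one in other.
        return counts.values.allSatisfy { $0 == 0 }
    }
}

print(Set([1, 2, 3]).hasSameElements(as: Set([3, 2, 1])))
```

Generic code could then constrain on `UnorderedCollection` to select this version, while everything else keeps the order-sensitive behavior.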

It is clearly the case that Swift’s protocol hierarchy fits sets and collections imperfectly; however, it is in the nature of modeling that imperfections are present. The question is not whether it is possible to incur performance, API surface area, and other trade-offs to make the model more faithful, but rather whether this usefully solves any problem. What is the problem being mitigated? As I write above, Swift’s Set and Dictionary types meet the semantic requirements for Collection and moonlight as ordered collections. What is a generic algorithm on an ordered collection that is “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said, is not such an example.)

On the contrary, `elementsEqual` is exactly such an example, because it makes no sense to use it on a Set.

let s1 = Set([1,2,3,4,5,6])
let s2 = Set([6,5,4,3,2,1])

Both sets have different iteration orders. Comparing those sets with some other collection using `elementsEqual` will give no meaningful result because the order - and therefore the result of `elementsEqual` - is in effect random.
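
A sketch of that situation, using the two sets above. Whether the two iteration orders actually differ is up to the implementation, so no expected value is claimed for the `elementsEqual` call:

```swift
let s1 = Set([1, 2, 3, 4, 5, 6])
let s2 = Set([6, 5, 4, 3, 2, 1])

// Set equality ignores iteration order.
let equalAsSets = (s1 == s2)   // true

// This depends on the two iteration orders, which the standard library
// leaves unspecified; it may be true or false.
let equalInIterationOrder = s1.elementsEqual(s2)
print(equalAsSets, equalInIterationOrder)
```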

No, it is not such an example; it’s misleadingly named but works correctly—that is, its behavior matches exactly the documented behavior, which relies on only the semantic guarantees of Sequence, which Set correctly fulfills.

Fulfills to the letter. Again, what can you do with it if the result is random??

The result is not random.

It is undefined though. As you said earlier, by the guarantees we have been given, it may shift over different builds/runs of a program. Thus in one run, it might return true and then false in another (without changing our code). As Greg pointed out, it is also possible with the guarantees we are given, for set/dict to have different orderings with copies of themselves. (It will happen for sure when deep copying a dictionary with reference-type keys).

As I keep saying, relying on the behavior of a leaking internal implementation is a bad plan.

We should add an additional guarantee to set/dict that the order returned will be the same for the same contents regardless of history (but can be otherwise arbitrary). That will fix the behavior for algorithms like elementsEqual (i.e. They will return the same result across builds/runs). It will also implicitly provide as a result, the constraint you were arguing is needed across copies of a collection type. I agree that that is an important guarantee. Why not fix both issues with a single non-source-breaking change?
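
One way to picture the proposed guarantee, as a sketch only: the `canonicalOrder` property below is a hypothetical name, and sorting is just one possible history-independent order (it requires `Comparable` elements and costs O(n log n) per call, which is the speed tradeoff mentioned earlier).

```swift
extension Set where Element: Comparable {
    /// Hypothetical helper: an order that depends only on the contents,
    /// not on how the set was built.
    var canonicalOrder: [Element] { return sorted() }
}

let a = Set([1, 2, 3, 4, 5, 6])
let b = Set([6, 5, 4, 3, 2, 1])

// Same contents give the same canonical order, regardless of history.
let sameCanonical = a.canonicalOrder.elementsEqual(b.canonicalOrder)  // true
print(sameCanonical)
```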

Why is the source-breaking change of renaming things better?

Thanks,
Jon

···

On Oct 17, 2017, at 5:44 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Tue, Oct 17, 2017 at 00:56 Thorsten Seitz <tseitz42@icloud.com> wrote:
On Oct 17, 2017, at 00:13, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseitz42@icloud.com> wrote:

On Oct 16, 2017, at 16:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

trs · October 17, 2017, 6:01am

Even worse, Set([5,4,3,2,1]) will probably return false although it contains the same elements.

-Thorsten

···

On Oct 17, 2017, at 01:40, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

To expand on this, Set([1,2,3,4,5]).hasPrefix([1,2,3]) currently returns true. But let’s say a year from now, we change Set to return an ordering based on hash values (which is entirely reasonable). Suddenly the same code may return true or false.

No guarantees will be broken by doing that, but the result has still changed because we are building on top of undefined behavior. Collection says nothing about the ordering over different builds of a program.

Thanks,
Jon

On Oct 16, 2017, at 4:11 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to changes in the private/internal structure of sets/dictionaries. Algorithms that rely on that “publicly observable behavior” (i.e. leaking of internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

func hasPrefix(Sequence)->Bool
func hasSuffix(Sequence)->Bool
func containsSubsequence(Sequence)->Bool

What do these methods mean with regards to Set’s “publicly observable behavior”?

You keep claiming that this bug is a feature because it is the current behavior… but that is tautological reasoning.

Thanks,
Jon


trs · October 17, 2017, 6:03am

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to changes in the private/internal structure of sets/dictionaries. Algorithms that rely on that “publicly observable behavior” (i.e. leaking of internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

func hasPrefix(Sequence)->Bool
func hasSuffix(Sequence)->Bool
func containsSubsequence(Sequence)->Bool

What do these methods mean with regards to Set’s “publicly observable behavior”?

In what way do these algorithms break? They would continue to determine--correctly--whether an instance of Set, when iterated, begins with, ends with, or contains (respectively) a subsequence that matches the argument.

Why do you not answer the question of what these methods *mean* for a Set?
Still waiting for a use case.

-Thorsten

···

On Oct 17, 2017, at 01:43, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:

On Mon, Oct 16, 2017 at 6:10 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:


hlovatt · October 16, 2017, 11:58pm

My preferences in order would be:

1. Split Iterable/ForEachable (whatever the name) out of Sequence and
have Set and Dictionary conform to this new protocol instead of Sequence,
with further protocol splits into other 'mixin' protocols to keep the
order of iteration undefined.

2. Rename elementsEqual to iterableOrderEqual and change the definitions of
all the order-dependent methods in the 'collections' hierarchy to
explicitly say "based on iteration order" and to explicitly say that "the
method iterates over the collection to produce its result, and if the
collection can only be iterated once then subsequent calls will cause a
fatal error".

There is only one reason that I can see for rejecting my 1st option - it's
just too much effort. I don't accept the argument that this is a breaking
change in the true sense of the word, because algorithms over
Set/Dictionary that rely on order are already broken, and therefore
explicitly causing a compile-time error for them is good. Arguing that it
isn't good to deliberately break these algorithms is like saying that if
there is a bug in the compiler that accepts faulty code we shouldn't fix it
because that will break someone's code - no, a bug should be flagged.

-- Howard.

···

On 17 October 2017 at 10:40, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

To expand on this, Set([1,2,3,4,5]).hasPrefix([1,2,3]) currently returns
true. But let’s say a year from now, we change Set to return an ordering
based on hash values (which is entirely reasonable). Suddenly the same code
may return true or false.

No guarantees will be broken by doing that, but the result has still
changed because we are building on top of undefined behavior. Collection
says nothing about the ordering over different builds of a program.
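To make the point above concrete, here is a minimal sketch. Note that `hasPrefix` is not a standard library method; the sketch assumes `starts(with:)`, the closest existing `Sequence` API, as a stand-in. The result for the `Set` is deliberately left without an expected value, because it depends on an iteration order that is not specified.

```swift
// `starts(with:)` stands in for the hypothetical `hasPrefix` from the thread.
let array = [1, 2, 3, 4, 5]
let set = Set(array)

// Arrays iterate in index order, so this result is stable.
let arrayResult = array.starts(with: [1, 2, 3])

// Depends on Set's iteration order, which may differ between runs or releases.
let setResult = set.starts(with: [1, 2, 3])

// Fixing an order first (here: sorting) makes the question well-defined again.
let sortedResult = set.sorted().starts(with: [1, 2, 3])

print("Array:", arrayResult)
print("Set (not guaranteed):", setResult)
print("Sorted set:", sortedResult)
```

Only the first and third results can be relied upon; the second is whatever the current storage layout happens to produce.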

Thanks,
Jon

On Oct 16, 2017, at 4:11 PM, Jonathan Hull via swift-evolution <swift-evolution@swift.org> wrote:

On Oct 16, 2017, at 1:05 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Mon, Oct 16, 2017 at 10:49 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 16, 2017, at 7:20 AM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

To start with, the one you gave as an example at the beginning of this
discussion: Two sets with identical elements which have different internal
storage and thus give different orderings as sequences. You yourself have
argued that the confusion around this is enough of a problem that we need
to make a source-breaking change (renaming it) to warn people that the
results of the ‘elementsEqual’ algorithm are undefined for sets and
dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a
problem with its name; the result of this operation is not at all undefined
for two sets but actually clearly defined: it returns true if two sets have
the same elements in the same iteration order, which is a publicly
observable behavior of sets (likewise dictionaries).

But that iteration order is undefined and could easily change due to
changes in the private/internal structure of sets/dictionaries. Algorithms
that rely on that “publicly observable behavior” (i.e. leaking of
internals) will suddenly break.

And an algorithm in which such “sudden breakage” would occur is…?

Here are a few off the top of my head:

func hasPrefix(Sequence)->Bool
func hasSuffix(Sequence)->Bool
func containsSubsequence(Sequence)->Bool

What do these methods mean with regards to Set’s “publicly observable
behavior”?

You keep claiming that this bug is a feature because it is the current
behavior… but that is tautological reasoning.

Thanks,
Jon


knatt · October 17, 2017, 6:20pm

Once we allow covariant functions to satisfy protocol requirements and have generalized existentials and recursive protocol requirements, wouldn't we be able to update thusly:

protocol Unordered {
func map<T>(…) -> Any<U: Unordered where U.Element == T>
}
protocol Ordered: Unordered {
func map<T>(…) -> Any<O: Ordered where O.Element == T>
}

Now apply that to every order-preserving function that takes a Sequence and returns another Sequence. You’ve moved the burden from users of API to implementers of API. It reminds me of the const/non-const split that C++ developers have to deal with, where a lot of functions end up being implemented twice so that you can have a const version and a non-const version (generally one just calls the other). It’s a pain. I don’t want that when working with Sequences. I don’t think it’s worth it. And FWIW, when I was programming in C# I wrote functions that took an IEnumerable<T> and return another IEnumerable<T> very often. It’s a powerful feature that would have been ruined by a change like this.

The idea is that covariance would mean you only need to implement the function once.

In the example you showed above map is written twice. Maybe the two protocols can share an implementation, but you still have to have two versions declared somewhere.

Perhaps I'm wrong, but (once we allow covariant conformance) wouldn't a single implementation of the `Ordered` version be covariant and thus satisfy the `Unordered` requirement without even having a dummy implementation forwarding to it? That's what I'm aiming for.

What does it look like if you’re just writing a single function somewhere that takes in a Sequence and returns another Sequence? How do you make that function take both ordered and unordered Sequences? To make it concrete, say you write a function that just wraps map:

func firstNames(ofPeople people: Sequence<Person>) -> Sequence<String> {
    return people.map { $0.firstName }
}

Hmm, I'm not sure that would work with the covariant requirement. The associated type one could:
func firstNames<U: Unordered>(ofPeople people: U<MapResultType: Person>) -> U.MapResultType<Element: String> {
return people.map { $0.firstName }
}

If the sequence you put in maps to an ordered sequence, you get the ordered sequence out.
That said, I can see how the generics there could get out-of-hand as you add more operations… -> U.MapResultType.FilterResultType.MapResultType…<Element: String>
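For comparison, the wrapper discussed above can already be written in Swift as it stands, without any of the proposed features. The sketch below assumes a hypothetical `Person` type; it relies only on the fact that `Sequence.map` returns an `Array`, so one generic function accepts both ordered and unordered inputs, and the output order simply mirrors the input's iteration order.

```swift
// Hypothetical model type, introduced only for illustration.
struct Person: Hashable {
    let firstName: String
    let lastName: String
}

// One implementation for every Sequence of Person, ordered or not.
// The result is an Array whose order follows the input's iteration order.
func firstNames<S: Sequence>(ofPeople people: S) -> [String] where S.Element == Person {
    return people.map { $0.firstName }
}

let people = [
    Person(firstName: "Ada", lastName: "Lovelace"),
    Person(firstName: "Grace", lastName: "Hopper"),
]

let fromArray = firstNames(ofPeople: people)      // order follows the array
let fromSet = firstNames(ofPeople: Set(people))   // order follows the set's iteration
print(fromArray, fromSet)
```

What this version cannot express is the distinction the thread is after: the return type is the same `[String]` whether or not the input order was meaningful.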

I'm planning on playing with this all before opening a real proposal, if/when I can figure out how, so I'm sure I'll have to deal with these and similar issues. Definitely good things to keep an eye out for, thanks.

···

On Oct 17, 2017, at 10:54 AM, Adam Kemp <adam_kemp@apple.com> wrote:

On Oct 17, 2017, at 10:41 AM, Kevin Nattinger via swift-evolution <swift-evolution@swift.org <mailto:swift-evolution@swift.org>> wrote:

On Oct 17, 2017, at 10:36 AM, Adam Kemp <adam_kemp@apple.com <mailto:adam_kemp@apple.com>> wrote:

On Oct 17, 2017, at 10:00 AM, Kevin Nattinger <swift@nattinger.net <mailto:swift@nattinger.net>> wrote:

I want that function to work on both ordered and unordered Sequences, and if you start with an ordered Sequence you should end up with an ordered Sequence. How do I make that work?

If there are no real-world problems, why do we feel the need to change the function name in the first place?

To quote myself from an earlier email: "I’m not even sure a name change is necessary for this method at all, but I’m not at all in favor of anything beyond that.”

To extend that, I’m content with doing nothing. I’m not sure there’s cause for doing anything at all, and I’m very sure that no one on this list has demonstrated any need for a major change to the library, let alone new language features.

trs · October 17, 2017, 3:58pm

That ordering can be arbitrary, but it shouldn’t leak internal representation such that the method used to create identical things affects the outcome of generic methods because of differences in internal representation.

It would be better to say that the iteration order is well-defined. That will almost always mean documented, and usually predictable though obviously e.g. RNGs and iterating in random order will not be predictable by design.

That's actually more semantically constrained than what Swift calls a `Collection` (which requires conforming types to be multi-pass and(?) finite). By contrast, Swift's `SpongeBob` protocol explicitly permits conforming single-pass, infinite, and/or unordered types.

I think you’re talking about Sequence here, I’ve lost track of your nonsense by now. Yes, the current Swift protocol named Sequence allows unordered types. You seem to keep asserting that but not actually addressing my argument, which is that allowing Sequences to be unordered with the current API is undesired and actively harmful, and should therefore be changed.

What is harmful about it?

After thinking about it, I think the harmful bit is that unordered sequences are leaking internal representation (In your example, this is causing people to be surprised when two sets with identical elements are generating different sequences/orderings based on how they were created). You are correct when you say that this problem is even true for for-in.

I would not say it is a problem. Rather, by definition, iteration involves retrieving one element after another; if you're allowed to do that with Set, then the elements of a Set are observably ordered in some way. Since it's not an OrderedSet--i.e., order doesn't matter--then the only sensible conclusion is that the order of elements obtained in a for...in loop must be arbitrary. If you think this is harmful, then you must believe that one should be prohibited from iterating over an instance of Set. Otherwise, Set is inescapably a Sequence by the Swift definition of Sequence. All extension methods on Sequence like drop(while:) are really just conveniences for common things that you can do with iterated access; to my mind, they're essentially just alternative ways of spelling various for...in loops.

I think an argument could be made that you shouldn’t be able to iterate over a set without first defining an ordering on it (even if that ordering is somewhat arbitrary). Maybe we have something like a “Sequenc(e)able” protocol which defines things which can be turned into a sequence when combined with some sort of ordering. One possible ordering could be the internal representation (At least in that case we are calling it out specifically). If I had to say “setA.arbitraryOrder.elementsEqual(setB.arbitraryOrder)” I would definitely be less surprised when it returns false even though setA == setB.

Well, that's a totally different direction, then; you're arguing that `Set` and `Dictionary` should not conform to `Sequence` altogether. That's fine (it's also a direction that some of us explored off-list a while ago), but at this point in Swift's evolution, realistically, it's not within the realm of possible changes.

I am actually suggesting something slightly different. Basically, Set and Dictionary’s conformance to Collection would have a different implementation. They would conform to another protocol declaring that they are unordered. That protocol would fill in part of the conformance to sequence/collection using a default ordering, which is mostly arbitrary, but guaranteed to produce the same ordering for the same list of elements (even across collection types). This would be safer, but a tiny bit slower than what we have now (We could also potentially develop a way for collections like set to amortize the cost). For those who need to recover speed, the new protocol would also define a property which quickly returns a sequence/iterator using the internal ordering (I arbitrarily called it .arbitraryOrder).
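A rough sketch of that idea, under stated assumptions: `UnorderedCollection`, `canonicalOrder`, and `canonicalElementsEqual` are hypothetical names, and sorting (which requires `Comparable` elements) is used as one possible "same elements, same order" default. Only `arbitraryOrder` is a name taken from the discussion. The actual proposal might look quite different.

```swift
// Hypothetical marker protocol for collections whose order carries no meaning.
protocol UnorderedCollection: Collection {}

extension UnorderedCollection {
    // Fast path: expose the storage-dependent order, but only by explicit request.
    var arbitraryOrder: Self { return self }
}

extension UnorderedCollection where Element: Comparable {
    // One possible canonical ordering: depends only on the elements,
    // not on how the collection was built. Costs O(n log n).
    var canonicalOrder: [Element] { return sorted() }

    // Order-sensitive comparison performed on the canonical ordering.
    func canonicalElementsEqual<Other: UnorderedCollection>(_ other: Other) -> Bool
        where Other.Element == Element {
        return canonicalOrder.elementsEqual(other.canonicalOrder)
    }
}

extension Set: UnorderedCollection {}

let a = Set([1, 2, 3, 4, 5, 6])
let b = Set([6, 5, 4, 3, 2, 1])

let canonicalMatch = a.canonicalElementsEqual(b)
let arbitraryMatch = a.arbitraryOrder.elementsEqual(b.arbitraryOrder) // not guaranteed
print(canonicalMatch, arbitraryMatch)
```

The sort is where the "tiny bit slower" cost mentioned above would come from in this particular sketch.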

I believe it would not be source breaking.

That is indeed something slightly different.

In an ideal world--and my initial understanding of what you were suggesting--Set and Dictionary would each have a member like `collection`, which would expose the underlying data as a `SetCollection` or `DictionaryCollection` that in turn would conform to `Collection`; meanwhile, Set and Dictionary themselves would not offer methods such as `prefix`, or indexing by subscript, which are not compatible with being unordered. For those who want a particular ordering, there'd be something like `collection(ordered areInIncreasingOrder: (T, T) -> Bool) -> {Set|Dictionary}Collection`.
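As an illustration only, that "ideal world" shape could look roughly like the wrapper below. `UnorderedSet` is a hypothetical type, and the sketch substitutes the existing `Set` and `Array` for the `SetCollection` type mentioned above; the point is just that the unordered type itself exposes no order-dependent API.

```swift
// Hypothetical unordered container: no Sequence conformance, no `prefix`,
// no subscripting by index.
struct UnorderedSet<Element: Hashable> {
    private let storage: Set<Element>

    init<S: Sequence>(_ elements: S) where S.Element == Element {
        storage = Set(elements)
    }

    var count: Int { return storage.count }

    func contains(_ element: Element) -> Bool {
        return storage.contains(element)
    }

    // Explicit opt-in to iteration in an arbitrary, storage-dependent order.
    var collection: Set<Element> { return storage }

    // Explicit opt-in to iteration in a caller-chosen order.
    func collection(ordered areInIncreasingOrder: (Element, Element) -> Bool) -> [Element] {
        return storage.sorted(by: areInIncreasingOrder)
    }
}

let numbers = UnorderedSet([3, 1, 2])
let ascending = numbers.collection(ordered: <)
let descending = numbers.collection(ordered: >)
print(ascending, descending)
```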

What you suggest here instead would be minimally source-breaking. However, I'm unsure of where these guarantees provide benefit to justify the performance cost. Certainly not for `first` or `dropFirst(_:)`, which still yield an arbitrary result which doesn't make sense for something _unordered_. We *could* have an underscored customization point named something like `_customOrderingPass` that is only invoked from `elementsEqual` or other such methods to pre-rearrange the internal ordering of unordered collections in some deterministic way before comparison. Is that what you have in mind?

Something like that. Whatever we do, there will be a tradeoff between speed, correctness, and ergonomics.

My suggestion trades speed for correctness, and provides a way to recover speed through additional typing (which is slightly less ergonomic).

You haven't convinced me that this is at all improved in "correctness." It trades one arbitrary iteration order for another on a type that tries to model an unordered collection.

We could do something like you suggest. I don’t think the method would need to be underscored… the ordering pass could just be a method on the protocol which defines it as unordered. Then we could provide a special conformance for things where order really matters based on adherence to that protocol. That might be an acceptable tradeoff. It would give us speed at the cost of having the correct implementation being less ergonomic and more error prone (you have to remember to check that it is unordered and call the ordering method when it matters).

I’d still be a bit worried that people would make incorrect generic algorithms based on expecting an order from unordered things, but at least it would be possible for them to check and handle it correctly. I think I could get behind that tradeoff/compromise, given where we are in the Swift process and Swift's obsession with speed (though I still slightly prefer the safer default). At least the standard library would handle all the things correctly, and that is what will affect the majority of programmers.

What is an example of such an "incorrect" generic algorithm that would be made correct by such a scheme?

To start with, the one you gave as an example at the beginning of this discussion: Two sets with identical elements which have different internal storage and thus give different orderings as sequences. You yourself have argued that the confusion around this is enough of a problem that we need to make a source-breaking change (renaming it) to warn people that the results of the ‘elementsEqual’ algorithm are undefined for sets and dictionaries.

No, I am arguing that the confusion about ‘elementsEqual’ is foremost a problem with its name; the result of this operation is not at all undefined for two sets but actually clearly defined: it returns true if two sets have the same elements in the same iteration order, which is a publicly observable behavior of sets (likewise dictionaries).

But it is a behavior which has absolutely no meaning at all, because the order does not depend on the elements of the set but on the history of how the set reached its current state.
So why should I ever use this method on a set?
What is the use case?

One example: you can use it to check an instance of Set<Float> to determine if it has a NaN value. (The “obvious” way of doing it is not guaranteed to work since NaN != NaN.)

How would I do that? I'd rather expect to use a property isNaN on Float to do that.

set.elementsEqual(set)

If this is the only use case for `elementsEqual` I suggest we remove that method and just use Float.isNaN:

set.contains { $0.isNaN }

which I argue is far more readable (intention revealing).
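The two approaches being compared can be put side by side. This is a small sketch with hypothetical helper names; it relies on `elementsEqual` comparing elements with `==`, under which a NaN never equals itself.

```swift
// Self-comparison trick: iterating the same set twice yields the same order,
// so the only way this can fail is an element that is not equal to itself.
func containsNaNViaSelfComparison(_ values: Set<Float>) -> Bool {
    return !values.elementsEqual(values)
}

// Direct check using Float.isNaN.
func containsNaNViaPredicate(_ values: Set<Float>) -> Bool {
    return values.contains { $0.isNaN }
}

let withNaN: Set<Float> = [1.0, 2.5, .nan]
let withoutNaN: Set<Float> = [1.0, 2.5]

print(containsNaNViaSelfComparison(withNaN), containsNaNViaPredicate(withNaN))
print(containsNaNViaSelfComparison(withoutNaN), containsNaNViaPredicate(withoutNaN))
```

Both functions agree on these inputs; the second states its intent directly, while the first only works as a side effect of floating-point equality.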

-Thorsten

···

On 17.10.2017 at 14:44, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Tue, Oct 17, 2017 at 00:56 Thorsten Seitz <tseitz42@icloud.com> wrote:

On 17.10.2017 at 00:13, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:
On Mon, Oct 16, 2017 at 14:21 Thorsten Seitz <tseitz42@icloud.com> wrote:

On 16.10.2017 at 16:20, Xiaodi Wu via swift-evolution <swift-evolution@swift.org> wrote:
On Mon, Oct 16, 2017 at 05:48 Jonathan Hull <jhull@gbis.com> wrote:

On Oct 15, 2017, at 9:58 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

On Sun, Oct 15, 2017 at 8:51 PM, Jonathan Hull <jhull@gbis.com> wrote:

On Oct 14, 2017, at 10:48 PM, Xiaodi Wu <xiaodi.wu@gmail.com> wrote:

I don’t see why a non-source-breaking change is suddenly off-limits.

But more than that, it affects any generic algorithm which assumes that the sequence is coming from an ordered source (e.g. many things using first/last). Some uses of first are ok because the programmer actually means ‘any’, but anywhere where they actually mean first/last may be problematic.

Such as...?

Currently, there is no way to test for ordered-ness, so there is no way for even a careful programmer to mitigate this problem. By adding a protocol which states that something is unordered, we can either branch on it, or create a separate version of an algorithm for things which conform.
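One way such a test could look, sketched with a hypothetical marker protocol (`UnorderedMarker` and `hasMeaningfulOrder` are invented names, not existing or proposed API): types opt in to being unordered, and generic code branches on the conformance at run time.

```swift
// Hypothetical marker: conforming types promise nothing about iteration order.
protocol UnorderedMarker {}

extension Set: UnorderedMarker {}
extension Dictionary: UnorderedMarker {}

// Generic code can now ask whether positional notions like first/last mean anything.
func hasMeaningfulOrder<S: Sequence>(_ sequence: S) -> Bool {
    return !(sequence is UnorderedMarker)
}

let arrayIsOrdered = hasMeaningfulOrder([1, 2, 3])
let setIsOrdered = hasMeaningfulOrder(Set([1, 2, 3]))
let dictionaryIsOrdered = hasMeaningfulOrder(["a": 1])
print(arrayIsOrdered, setIsOrdered, dictionaryIsOrdered)
```

A dynamic check is only one option; the "separate version of an algorithm" route mentioned above would instead add an overload or constrained extension for conforming types.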

It is clearly the case that Swift’s protocol hierarchy fits sets and collections imperfectly; however, it is in the nature of modeling that imperfections are present. The question is not whether it is possible to incur performance, API surface area, and other trade-offs to make the model more faithful, but rather whether this usefully solves any problem. What is the problem being mitigated? As I write above, Swift’s Set and Dictionary types meet the semantic requirements for Collection and moonlight as ordered collections. What is a generic algorithm on an ordered collection that is “not OK” for Set and Dictionary? (“elementsEqual”, as I’ve said, is not such an example.)

On the contrary, `elementsEqual` is exactly such an example, because it makes no sense to use it on a Set.

let s1 = Set([1,2,3,4,5,6])
let s2 = Set([6,5,4,3,2,1])

Both sets have different iteration orders. Comparing those sets with some other collection using `elementsEqual` will give no meaningful result because the order - and therefore the result of `elementsEqual` - is in effect random.
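The situation described here can be sketched as follows. The array `reference` is an added assumption for illustration; the order-dependent result is deliberately not given an expected value, since it can vary with storage layout and, in current Swift releases, between program runs.

```swift
let s1 = Set([1, 2, 3, 4, 5, 6])
let s2 = Set([6, 5, 4, 3, 2, 1])
let reference = [1, 2, 3, 4, 5, 6]   // hypothetical "other collection"

// Set equality ignores order entirely.
let equalAsSets = (s1 == s2)

// Order-sensitive comparisons against the sets' own iteration order:
// the results depend on how each set happens to be laid out.
let s1MatchesReference = s1.elementsEqual(reference)
let s1MatchesS2 = s1.elementsEqual(s2)

// Imposing an order first gives a result that depends only on the elements.
let sortedMatchesReference = s1.sorted().elementsEqual(reference)

print("==:", equalAsSets)
print("elementsEqual (unspecified):", s1MatchesReference, s1MatchesS2)
print("sorted elementsEqual:", sortedMatchesReference)
```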

No, it is not such an example; it’s misleadingly named but works correctly—that is, its behavior matches exactly the documented behavior, which relies on only the semantic guarantees of Sequence, which Set correctly fulfills.

Fulfills to the letter. Again, what can you do with it if the result is random??

The result is not random.