SE-0203 — Rename Sequence.elementsEqual


(Tino) #55

I don’t think it’s confusing that Set is a Collection - it’s confusing that every Collection is a Sequence.
Obviously, an Iterator protocol would be sufficient to be used in for ... in, but it wouldn’t have the methods that depend on the actual order.
Note that this not only includes elementsEqual but also reversed, starts(with:), lexicographicallyPrecedes and others.
Renaming one of those is like putting a warning sign in front of a big hole, and it may save some people from falling into that trap. But shouldn’t it at least be evaluated if it’s not better to fill that hole?

The only arguments against that have been “it has to be that way” and “this is out of scope” - and if that’s enough to not even start a real discussion, imho Swift evolution has failed.


(Adam Kemp) #56

2 A set of related events, movements, or items that follow each other in a particular order.

If you iterate over a Set you will get a particular order. It may not be the order you want, but it will be some particular order. That’s a sequence.

As @jawbroken said, this argument is pointless. This proposal isn’t about Sequence.


(Tino) #57

The Oxford dictionary supports my standpoint as well:
There’s a difference between something can be transformed into a sequence and something is a sequence:
We can easily transform a Float into a SignedInteger. This works even better than turning a set into a sequence, because there are simple and transparent rules for that transformation, whereas the order of Set is a black box.
But would it make sense if Float conformed to SignedInteger?

I couldn’t find a reference, but I always considered a protocol to be more than a bunch of methods, something that has a meaning.
Sequence should be an abstraction that enables us to write algorithms that allow its user to choose freely wether they want to store their items in an array, a linked list, or something else.
But when I use this abstraction (something like stepsHappenedInRightOrder), I automatically allow people to use my API with Set - although I know that this absolutely makes no sense!
Someone could use it with Set, could write some unit tests which succeed by coincidence, and end up with a program that is inherently wrong, just like the use of elementsEqual in the past.


#58

Every Collection has comparable and advanceable indices so they’re all inherently sequenced, right?

I’m not sure I understand this at all. In what way is for-in loop itself not effectively a method that depends on the actual order. Or a next() method on an Iterator?

I’m not seeing great harm in any of these other methods. reversed and starts(with:) are probably not particularly useful on a Dictionary or Set, but I can’t see it being common for them to be used incorrectly. Have you seen them causing confusion or bugs? lexicographicallyPrecedes seems slightly more likely to be misused, as someone might feed it two Dictionary or Set instances like elementsEqual, but it has an unusual and precise enough name that I’m not concerned.

Even if this discussion was starting from first principles I’m not sure it wouldn’t end up with a similar design. Scanning the documentation, there’s a lot of useful Sequence API that makes sense on Dictionary and Set, with a couple less useful parts and a single particularly harmful method with an unfortunate name. It might be simpler just to rename the harmful part than to have some more complex protocol redesign (e.g. somehow separating Sequence and Collection) or making things clunkier to use (e.g. hiding the Sequence and Collection conformances inside a view).


(Tino) #59

Questions - great! ;-)

Absolutely right - and because of the way our computers work, they can’t even represent something that’s truly unordered. The difference is wether that order exists because some programmer decided intentionally that he needs it (array, list…), or if that order happens more or less accidentally (set, dictionary).

Right as well. But the iterator doesn’t care wether next() returns something that was put into its position consciously, or a random element.

But it makes a difference for the algorithms we write.

Let’s take a simple example where the source of order matters:
It is quite useful that I can search a substring in a string, which is nothing but a bunch of ordered characters (let’s ignore Unicode for a while ;-).
You could perform the same search in a set of characters… but why should you ever want to do so?
Would it be useful if CharacterSet had all methods of String, or would it be harmful and confusing?


#60

If we rename this at all, then we should call it lexicographicallyEquals. We already have lexicographicallyPrecedes when Element is Comparable, and this is the natural relaxation of that to Equatable.

The objections in the proposal don’t carry water. In particular,

Right, the proper term is “equals”.

This is a good thing. We don’t want people reaching for this method when “==” is more appropriate.

Irrelevant. If I see two words (sequences of characters) in a foreign language, I may have no idea which one comes first alphabetically (so they are not Comparable to me) but I can still tell whether they are the same word.

That is, I can tell whether the words are lexicographically equal, even when I don’t know which one precedes the other.

And this is not just a statement about my ignorance. We can imagine a language which *does not have* a canonical ordering of its alphabet. So two different lexicographers in that language could arrange their dictionaries differently.

In that case, it is meaningless to ask if one word lexicographically precedes the other (except with respect to a provided dictionary), but it is *still meaningful* to ask if two words are lexicographically equal. Because equal words are equal in every dictionary, and distinct words are distinct.

No more so than the existing lexicographicallyPrecedes. I don’t think this is a real-world concern. A name like lexicographicallyEquals is sufficiently esoteric that people will consult the documentation to make sure they are using it correctly, unlike elementsEqual.


(Ted Kremenek) #61

I appreciate that broader concerns have been brought up on this review thread. Those can be factored in to deciding whether a rename is warranted. However, changing Sequence or Collection in any way beyond what this proposal outlines — which is focused in the renaming of a single method — is very much beyond the scope of this proposal review. At this point such discussion is likely adding diminishing signal to the review of this particular proposal.

I recommend that broader discussion about the nature of the Sequence and Collection protocols — and any broader ideas about how they could be changed — should be taken off this thread and brought back to the general evolution discussion or pitches categories. I am recommending this not to shoot down ideas, but rather steer that discussion to the right place in the forums and allow this thread to focus on the specific proposed change in the review.

The review period for this proposal is almost over. What discussion on this thread should be focused on at this point is whether or not the method elementsEqual in Sequence should be renamed, and if so what justifies that change.

When considering the rename itself, I’d like to call out that we are beyond the point of just renaming methods just for the sake of polishing up APIs. Here are the source compatibility goals for Swift 5 (and hence Swift 4.2):

Source-breaking changes in Swift 5 will have an even higher bar than in Swift 4, following these guidelines:

  • The current syntax/API must be shown to actively cause problems for users.
  • The new syntax/API must be clearly better and must not conflict with existing Swift syntax.
  • There must be a reasonably automated migration path for existing code.

For those in favor of a rename, please keep in mind that this is the core criteria for any source-breaking change.

Thank you to all who have put so much energy into the review discussion!


Is a set a sequence?
(Xiaodi Wu) #62

Like you, I did not think that misunderstanding of the word “lexicographically” was a real-world concern. However, during the pitch phase, several users commented that they would not consult the documentation to make sure that they would use this method correctly; instead, they would confidently deduce that the method reorders elements before comparison. :man_shrugging:


#63

Overly complex method names are likely to hide overly complex algorithms :-D


(Benjamin Mayo) #64

I can’t confidently say that we should rename this method, given that the biggest cases of misuse (i.e. set equality) have been dealt with by conditional conformances. There are obviously a lot of other potential danger zones here, I don’t think it is that common in reality though.

If you just look at the name in isolation, ignoring the source-compatibility and migration costs, it definitely should be renamed as it is not as clear as it could be. To that effect, I’d throw my hat in the ring in support of sequentiallyEquals(by:) or lexicographicallyEquals(by:), backing up suggestions from others upthread.

In short, I guess this is a +0.5 to the overall proposal? I apologise for sort-of sitting on the fence on this one.


(Tino) #65

That sounds a little bit like “security by obscurity”… afaics, the current name is absolutely fine as long as you only use the method on “real” sequences.
I don’t think it’s necessary to irritate people who want to compare an array with a list (I also don’t think it’s necessary to break their code ;-).


(Saagar Jha) #66

I just took the time to read the proposal and glance through the discussion, and I thought I’d pitch an alternative that I didn’t see anyone else propose (maybe because it’s a dumb idea :slight_smile:). I think we should deprecate Sequence.elementsEqual(_:) and not rename it at all. As others have said in this thread, this sort of operation is relatively infrequent and its use is generally a code smell in “beginner” code, as the examples cited above show. However, we then need to deal with this:

elementsEqual(_:) is a generic method on Sequence, we can use it to compare an instance of UnsafeBufferPointer<Int> to an instance of [Int]. This is a potentially useful and non-redundant feature which would be eliminated if the method is removed altogether.

I think the solution here is not to provide a one-shot method that does this; rather, a possible solution I thought up of is to provide functionality so that this operation can be achieved in a relatively short composition.

Intuitively, I’d lump elementsEqual(_:) as a zip(_:_:) derivative, since it’s doing something similar. Unfortunately, zip, as it is currently available, has some annoyances that make it unsuitable as a replacement. In particular, it silently drops the suffix of the longer Sequence, which makes it impossible to distinguish from say ([1, 2], [1, 2]) and ([1, 2, 3], [1, 2]). However, I think we could add a new method that does some sort of padding and returns a sequence out of that. If we create a method like this, say paddedZip(_:_:), now elementsEqual(_:) is relatively simple to write: we could just write it as paddedZip(sequence1, sequence 2).first(where: { $0 != $1 }) == nil. If we add mismatch to the standard library, this would be even more concise.

Overall, my thoughts were to make this entire class of operations easier to do, rather than trying to “baby-proof” a special-case operation.This is just a rough sketch a possible solution I came up with; of course, things like naming or semantics are free to change based on your feedback.


Is a set a sequence?
#67

It is actually called “progressive disclosure”.

A less intrusive change would be to prevent the use of elementsEqual when == is available. Specifically, when the argument and receiver have the same type, and that type conforms to Equatable, we could make it a compiler error to call elementsEqual, similar to what we do for subscripting a Range.


(Erica Sadun) #68

I don’t think it’s a dumb idea and I think it should be considered.

Sets are inherently unordered collections. Placing any guarantees at all about iteration order (outside of OrderedSet) may not be in line with the type’s fundamental semantics.


On the elementsEqual problem, or “[Pitch] Set and Dictionary should not be Sequences”
(Ben Cohen) #69

I would be opposed to solving the problem this way. I disagree with the sentiment that elementsEqual is uncommon. I find myself comparing different sequence types frequently – it’s very useful when testing a new type of sequence you just wrote, for example – and believe we need such a method in the standard library (though I’m pretty flexible on what it is called; my concern with this proposal is that the renaming isn’t sufficiently better to justify the high bar for a rename).

As you point out, the logic for elementsEqual is deceptively tricky, and as such, even if we did add a paddedZip, there would be users that miss the issue with just using zip and do that instead – easily as bad a source of bugs as the Set problem that kicked all this off IMO. If equalElements didn’t already exist, it would easily qualify for inclusion in the std lib on the grounds of commonality, readability, and difficulty of implementation, even if we did bikeshed ourselves to death trying to name it :)

Additionally, with elementsEqual being an actual method in the std lib, at least we have documentation of it where we can put the warning that you need to be sure not to use it on Set, unlike the composed alternative.

p.s. paddedZip would make a great proposal – though I’d prefer a name that started with zip. Python calls it zip_longest.


Is a set a sequence?
Add a padded version of zip to complement the current zip
Add Various Zip-Related Types and Operations
(Adam Kemp) #70

As I pointed out in the pitch discussion, API names are UX. A UX that requires consulting documentation for trivial use cases is a bad UX. A method that does a simple thing that still requires consulting documentation to understand is a bad API. An API that does a simple thing that no one can find because its name is opaque to many people is a bad API.

The name “lexicographicallyEquals” was thoroughly discussed already. There were multiple issues with it, including but not limited to the fact that its name doesn’t mean anything to many people. Let’s not rehash this again and again, and please let’s stop pretending that it’s a good thing to have to look up documentation to understand a simple API. That is a recipe for a library no one will want to use.


(Vladimir) #71

IMO elementsEqual is confusing by its own, without any relation to ordered/unordered sequences/sets/etc. So my +1 to rename to elementsSequentiallyEqual or to suggested elementsEqualInIterationOrder

We still have a question what to do with all others methods on Sequence protocol(prefix()/suffix() etc) that assume that order is essential and meaningful feature of target “bag of items”, and which use on unordered sequences like Set/Dictionary could produce hard-to-find unstable bugs. [But this probably should be discussed in “Is a set a sequence?” thread]


(Saagar Jha) #72

Hmm, that’s an interesting point. Personally, I think that if you’re explicitly reaching for elementsEqual, you’d probably know the difference, but I do concede that this is only an assumption, and it would be a hard thing to spot a mistake if this did happen even if you did know what you’re doing, since it doesn’t show its semantics as clearly as a function call.

My thought behind this was that if you’re going to write something like what I suggested above, you already know that it won’t work on Set, because you understand what the code is actually doing–you wrote it yourself. Personally, I think that’s better than any documentation.

Just finished writing this up:


(Ted Kremenek) #73

Proposal Declined

The Core Team weighed all the feedback on this review thread, and decided to ultimately decline to make the change proposed in this proposal.

Rationale

There was enough consensus on the review thread that the current name of the API can be misinterpreted. When assessing this kind of API change, we weighed the following factors:

  1. How harmful is the current name of the API?
  2. How much improvement is possible by changing the API name?
  3. How much harm does the change associated with improvement cause?

For #3, the main harm is unnecessary source churn for users.

For #1, much of the harm that motivated this proposal is mitigated somewhat by the ability to consistently use == now that conditional conformances have been adopted (where appropriate) for Standard Library types.

For #2, the benefits can outweigh the concerns for #3 if it is clear the newly proposed name is a clearly better solution and the “right” API. For this there was no clear consensus on the review. Neither the API name in the proposal or alternatives proposed during the review seemed like clearly better APIs that both delivered benefits that outweighed the source churn this change would require.

These kind of API discussions are a delicate balance. As Swift continues to mature the bar to rename APIs will get higher. Ultimately the perspective from the Core Team is that this renaming did not meet the current bar.

Thank you to everyone who participated in this review!

Ted Kremenek
Review Manager


Rejected: SE-0203 — rename Sequence.elementsEqual
On the elementsEqual problem, or “[Pitch] Set and Dictionary should not be Sequences”
(Ted Kremenek) #74