SE-0207: Add a containsOnly algorithm to Sequence

ddunbar · April 8, 2018, 3:34am

What is your evaluation of the proposal?

I like the feature.

I really don't like the name. In particular, one helper method that I often find myself wanting is "does this collection contain exactly one match for predicate". It is possible to read containsOnly as that method (in fact, I was excited to see the proposal because at first I thought that is what it would be). After thought, I agree Only does match the use here, but I think it is subtle enough that it makes me rather unhappy with the naming.

Is the problem being addressed significant enough to warrant a change to Swift?

Yes, it is a useful function.

Does this proposal fit well with the feel and direction of Swift?

Sure, if an acceptable name is found.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

No strong opinion, it is hard to reconcile naming in other languages with Swift's naming conventions.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

A quick reading.

bzamayo · April 8, 2018, 9:54am

What is your evaluation of the proposal?
Strongly in favour of this addition. It encapsulates a useful predicate that is confusing and laborious to write succinctly in an efficient manner. In regard to naming, I haven’t seen a better suggestion than containsOnly. The suggested ambiguity of the name as ‘containing only one element which matches the predicate’ doesn’t hold water to me. If you need to be fastidious, you could name the method containsOnly(elementsWhere:) and containsOnly(elementsEqualTo:). I don’t think this is necessary personally.

I think grouping this under the other related methods is important, i.e. maintaining a ‘contains’ prefix in the name. Therefore, a containsAll base name is an alternative that I would support, if people really do not like containsOnly.

Is the problem being addressed significant enough to warrant a change to Swift?
It’s an additive change to provide commonly used functionality that is non-trivial to write manually. Fine by me.

Does this proposal fit well with the feel and direction of Swift?
Sure, presuming it is named likewise.

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
Other languages offer the same functionality with names like all. This proposal tackles the same functionality in accordance with Swift naming guidelines.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
Studied the various threads and considered alternative names.

Karl · April 8, 2018, 9:35pm

What is your evaluation of the proposal?
+1

I think the name creates a potential ambiguity, but it's not significant and can be remedied with an additional method.

Is the problem being addressed significant enough to warrant a change to Swift?
Yup

Does this proposal fit well with the feel and direction of Swift?
Yup

If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
I don't remember. any/all/none seem pretty familiar.

How much effort did you put into your review? A glance, a quick reading, or an in-depth study?
Took part in original discussion (ages ago). Browsed it again briefly, read the comments here, etc.

Somebody proposed a count(where:) method. Maybe that would be a better fit for your problem?

contains doesn't tell you anything about the total number of elements in the collection. Only what proportion of them pass a given predicate.

Ben_Cohen · April 11, 2018, 1:43am

The trouble with that is that it doesn't bail early when the count exceeds the target, which is a real shame when you are checking for a count of 1.

You'd need a slightly different collection of methods to handle that, which took the exact target count as an argument (unless you want to get up to shenanigans with overloading ==):

a.count(exactly: 1) // bails as soon as count > 1
a.count(exactly: 1, of: "foo")  // Equatable exact count
a.count(exactly: 1, where: { $0 < 5 }) // predicate version
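
A minimal sketch of how such early-exit methods might look. The count(exactly:where:) and count(exactly:of:) spellings come from the post above; the bodies are an assumption for illustration, not an accepted or existing standard library API.

```swift
extension Sequence {
    /// Hypothetical helper: true when exactly `target` elements satisfy `predicate`.
    /// Iteration stops as soon as the running count exceeds `target`.
    func count(exactly target: Int, where predicate: (Element) throws -> Bool) rethrows -> Bool {
        var matches = 0
        for element in self {
            if try predicate(element) {
                matches += 1
                // Early exit: the answer can no longer be true.
                if matches > target { return false }
            }
        }
        return matches == target
    }
}

extension Sequence where Element: Equatable {
    /// Hypothetical Equatable variant, forwarding to the predicate version.
    func count(exactly target: Int, of element: Element) -> Bool {
        return count(exactly: target, where: { $0 == element })
    }
}

// Count predicate calls to observe the early exit.
var predicateCalls = 0
let exactlyOneSmall = [1, 2, 3, 4, 5].count(exactly: 1, where: { value in
    predicateCalls += 1
    return value < 5
})
// exactlyOneSmall is false, and the predicate ran only twice.

let exactlyOneFoo = ["foo", "bar", "baz"].count(exactly: 1, of: "foo")
// exactlyOneFoo is true.
```

By contrast, a filter-then-count approach would visit all five elements before comparing the total.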

griotspeak · April 11, 2018, 4:04am

+1

I read the proposal and have used my own version of this for a while.

dabrahams · April 11, 2018, 2:39pm

I agree that the latter is definitely not necessary, since equality notionally implies identity (except for class types, which don't count).

I have been thinking ever since I read the proposal that elementsWhere: would be a better argument label for the closure version, though. If we are to prefer usage to be something like grammatical English, you can't leave out the noun and have it read correctly. “This list of families contains only [elements] where the number of children is three.”

Maybe the appearance of the plural noun “elements” also helps a bit to clear up @ddunbar's concern about what this method means, at least when reading its signature?

cal · April 11, 2018, 5:51pm

Very much +1. Niceties like this are great additions to the standard library.

dabrahams · April 12, 2018, 4:08am

SE-0207 Core Team Interim Report / Feedback Request

In today's core team meeting, we discussed three possible variations of the overload that takes a predicate:

1. studentBody.containsOnly(where: gradeIsAPlus)
2. studentBody.containsOnly(elementsWhere: gradeIsAPlus)
3. studentBody.containsOnly(gradeIsAPlus)

These choices reflect different prioritizations of principled values, and we are trying to decide which principles should win out. The eventual decision will set a precedent that will help us evaluate similar bikesheds in the future. The core team sent me back to the evolution list to try to gather relevant signal.

The main argument for #1 (what's in the proposal) is that in the standard library, all unary predicates are currently labeled with where: or while:, or are unlabeled. The principle is that surface uniformity of similar constructs is valuable; it makes code (including new APIs) easier to write and APIs easier to find.

The main argument for #2 is that our API guidelines say we should “prefer method and function names that make use sites form grammatical English phrases.” That principle was followed in choosing the standard library's current names, and the existing uses of where: don't have a basic grammaticality problem, but usage #1 does: a noun is missing.

The main argument for #3 is that usage is still quite clear, and it would be better not to try to achieve grammaticality than to go partway there but fail (as in #1). [Note that as far as the compiler is concerned, this choice is unambiguous with the other overload because closures are not Equatable]. The principle here is that in the absence of a difference in clarity, shorter/smaller APIs are better.
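
To illustrate the bracketed note about overload resolution, here is a rough sketch with both overloads declared unlabeled. The name containsOnlySketch and both bodies are assumptions made for this example; they are not the proposal's implementation.

```swift
extension Sequence where Element: Equatable {
    /// Hypothetical value overload: every element equals `element`.
    func containsOnlySketch(_ element: Element) -> Bool {
        for candidate in self {
            if candidate != element { return false }
        }
        return true
    }
}

extension Sequence {
    /// Hypothetical predicate overload: every element satisfies `predicate`.
    func containsOnlySketch(_ predicate: (Element) throws -> Bool) rethrows -> Bool {
        for candidate in self {
            if try !predicate(candidate) { return false }
        }
        return true
    }
}

let reportCard = [4, 4, 4]

// An Int argument can only match the Equatable overload.
let onlyFours = reportCard.containsOnlySketch(4)

// A closure argument can only match the predicate overload,
// because function types do not conform to Equatable.
let onlyPassing = reportCard.containsOnlySketch({ $0 >= 3 })
```

Both calls type-check without a label because no argument can satisfy both signatures at once.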

The core team would very much appreciate your input on how to weigh these principles.

Thanks,
Dave

xwu · April 12, 2018, 4:33am

Principle #1, internal consistency, promotes clarity for all users, and the rewards only increase with increasing familiarity with the standard library. Principle #2 promotes clarity for English speakers, and the rewards increase with increased English fluency. Principle #3 comes into play when all other factors are equal, as embodied in the exhortation to omit needless words (words that promote principles #1 and 2 being, of course, needful).

Given that none of the proposed names are offensively hostile to the English language or actively confusing, I'd argue that principle #1 is paramount. Moreover, Swift naming guidelines have permitted eliding words such as "with" when translating from Obj-C, so it doesn't seem out of place to elide an obvious word here.

cal · April 12, 2018, 2:48pm

+1 to Option 2

I think that making the call site read naturally (the "grammatical English phrase" philosophy) is one of the most important aspects we should consider when designing new APIs. It's one of Swift's unique and defining characteristics, so we should give it a rather high (although not ultimate) precedence.

I also think containsOnly(elementsWhere:) is reasonably consistent with the where: precedent that already exists in the standard library. In contrast, the unlabeled option (Option 3) seems contrary to existing patterns, and containsOnly(where:) doesn't read very naturally.

nnnnnnnn · April 12, 2018, 3:37pm

I agree with Xiaodi here. The consistency is important, and use of where: as a label is already strongly established:

contains(where:) (the direct counterpart)
first(where:)
index(where:)
last(where:)

I don't find containsOnly(where:) any less grammatical than contains(where:)—in both cases you have to supply the omitted noun:

// Tell me whether 'names' contains (an element) where the first letter is 'N'
names.contains(where: { $0.first == "N" })
// Tell me whether 'names' contains only (elements) where the first letter is 'N'
names.containsOnly(where: { $0.first == "N" })

The additional length of containsOnly(elementsWhere:) doesn't feel justified.

nuclearace · April 12, 2018, 3:50pm

Yes, I second this notion. It's not really "leaving" the noun out, but just moving it to be after.

bzamayo · April 12, 2018, 4:15pm

FWIW, I brought up containsOnly(elementsWhere:) not to be grammar precise necessarily, but to clarify the ambiguity raised by some that containsOnly(where:) might be confused to mean 'return true if the sequence only contains a single element that matches the predicate'. I don't find this particularly convincing, but I raised the suggestion all the same.

I think containsOnly(where:) is within reason grammatically accurate, and is consistent with other predicates in the standard library. That's the standout winner in my books.

cal · April 12, 2018, 4:24pm

nnnnnnnn:

// Tell me whether 'names' contains (an element) where the first letter is 'N'
names.contains(where: { $0.first == "N" })
// Tell me whether 'names' contains only (elements) where the first letter is 'N'
names.containsOnly(where: { $0.first == "N" })

Never mind -- this example sold me on Option 1. If it's good enough for contains, it's good enough for containsOnly.

Ben_Cohen · April 12, 2018, 4:48pm

We also have (admittedly unreleased, but accepted) removeAll(where:) as precedent for a similar situation for where: as the label even when it isn't quite grammatical (i.e. removeAll(where: isEven) rather than removeAll(elementsWhere: isEven)).

What's more, as can be seen from the naming of the predicate in that example, chasing grammar at the call site is always going to be a best-efforts thing.

I feel pretty strongly that varying argument labels on a case by case basis in this way will feel, to many developers – including native English speakers – to be strange, surprising, and annoying. Maybe they will come to learn eventually of the naming guideline reasons behind the choice, at which point I doubt they will then feel more warmly about having to remember which label to use when.

Nevin · April 12, 2018, 4:49pm

Am I to understand that the core team did not consider any alternative spelling of the base name, for example allMatch(_:)?

I think it is important to consider the following situations from a clarity and fluency perspective:

1. The sequence has a plural name, such as “students”
2. The sequence has a singular name, such as “studentBody”
3. The sequence is a literal, such as [harry, ron, hermione]
4. The sequence is named self, such as in an extension
5. The sequence name is omitted, such as in an extension

And for each of those, the following scenarios:

a. The predicate is a named function such as “gradeIsAPlus”
b. The predicate is a closure literal such as “{ $0.grade == .aPlus }”

dabrahams · April 12, 2018, 6:23pm

Okay, I hate to disagree with almost everybody, and to throw a wrench in the review works with a late, strong opinion (from the review manager no less!), but I was turning this over in my head last night while trying to sleep—yes, I still do that with Swift design problems—and it struck me that maybe we were asking the wrong questions. I for one was missing the point in most of the grumbling about the proposed names, and the questions we asked don't address that point.

I now see that

grades.containsOnly(4)

is easily misread as meaning "grades contains just one element equal to 4." And when I look back at the principles that drive naming, just one is paramount: understandability at the use site. That principle outweighs the benefits of grammaticality, consistency, and discoverability, among others.

When I thought about it, even spelling the full semantics of this function out as "c contains only elements equal to 4" leaves the reader jumping through mental hoops: it is really just a convoluted way of saying what I really mean: "all of c's elements are equal to 4." Attempting to group these methods into a family with contains sort of begs the original problem we're trying to solve: you can implement equivalent logic today in terms of contains, but doing so requires mental gymnastics that make your code unclear.
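
As a small illustration of the "mental gymnastics" mentioned above, the following sketch expresses "all elements equal 4" using only the existing contains(where:) method. The variable names are made up for the example.

```swift
let quizGrades = [4, 4, 3]

// "Every grade equals 4" rewritten as
// "there is no grade that does not equal 4".
let everyGradeIsFour = !quizGrades.contains(where: { $0 != 4 })
// everyGradeIsFour is false, because of the 3.

let perfectGrades = [4, 4, 4]
let everyPerfectGradeIsFour = !perfectGrades.contains(where: { $0 != 4 })
// everyPerfectGradeIsFour is true.
```

The double negation still stops at the first mismatch, but the reader has to unwind two negations to recover the intent.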

For me, this makes the right choices for names in this family obvious:

grades.allEqual(4)
grades.allSatisfy(isAtLeastBPlus)

Note: I really love the way the calls look when "Equal" and "Satisfy" are turned into argument labels, but can find no principled justification for doing so. If you have one, so much the better.
grades.all(equal: 4)
grades.all(satisfy: isAtLeastBPlus)
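
For comparing the two spellings at the call site, here is a rough sketch of both. Every name carries a Sketch suffix to mark it as hypothetical, and the isAtLeastBPlus body and 4-point grade scale are assumptions for illustration only.

```swift
extension Sequence {
    // Base-name spelling, in the style of grades.allSatisfy(isAtLeastBPlus).
    func allSatisfySketch(_ predicate: (Element) throws -> Bool) rethrows -> Bool {
        for element in self {
            if try !predicate(element) { return false }
        }
        return true
    }

    // Argument-label spelling, in the style of grades.all(satisfy: isAtLeastBPlus).
    func allSketch(satisfy predicate: (Element) throws -> Bool) rethrows -> Bool {
        return try allSatisfySketch(predicate)
    }
}

extension Sequence where Element: Equatable {
    // Base-name spelling, in the style of grades.allEqual(4).
    func allEqualSketch(_ value: Element) -> Bool {
        return allSatisfySketch { $0 == value }
    }

    // Argument-label spelling, in the style of grades.all(equal: 4).
    func allSketch(equal value: Element) -> Bool {
        return allEqualSketch(value)
    }
}

// Hypothetical predicate on an assumed 4-point scale.
func isAtLeastBPlus(_ grade: Double) -> Bool {
    return grade >= 3.3
}

let termGrades = [4.0, 3.7, 3.3]
let everyoneAtLeastBPlus = termGrades.allSatisfySketch(isAtLeastBPlus)
let everyoneAtLeastBPlusLabeled = termGrades.allSketch(satisfy: isAtLeastBPlus)
let everyoneHasFour = termGrades.allEqualSketch(4)
let everyoneHasFourLabeled = termGrades.allSketch(equal: 4)
```

Both spellings behave identically here; the difference is only in how the call reads.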

Erica_Sadun · April 12, 2018, 6:31pm

As someone who has been biting her tongue and deliberately not participating: this. Very much this.

pyrtsa · April 12, 2018, 6:37pm

I agree, my initial response to ”xs contains only y” is definitely that xs is a list of one y.

But not only that, because by definition, the result of [].containsOnly(42) is also going to be true – which, I think, is less intuitive than at least these alternatives that have been suggested earlier:

[].containsAll(42),
[].allEqual(42).

(I also like all(equal:) and all(satisfy:).)
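
The empty-sequence behaviour can be checked with existing standard library methods alone. This sketch assumes the proposed method is equivalent to negating contains(where:) with an inverted predicate.

```swift
let noGrades: [Int] = []

// "All elements equal 42" holds vacuously when there are no elements.
let emptyAllEqual = !noGrades.contains(where: { $0 != 42 })
// emptyAllEqual is true.

// Yet the same empty array does not contain 42.
let emptyContains = noGrades.contains(42)
// emptyContains is false.
```

So under that assumption an empty sequence would "contain only 42" without containing 42, which is the mismatch being pointed out.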

Ben_Cohen · April 12, 2018, 6:40pm

My take is that this isn't easily misread – but rather, is of the order of "yeah, I guess you could interpret it that way now you mention it". This is of course subjective. But I think you need to weight the applicability of understandability at the call site by the likelihood of misinterpretation rather than treat it as a binary thing.

I think earlier someone suggested that all variants like grades.all(satisfy: isAtLeastBPlus) could be interpreted as a filter rather than a test. Again, I think this is a stretch – of about the same order as thinking containsOnly means contains only one element. Though at least in this case the types will help clear things up – but that doesn't help when you're reading as much as writing.

Finally, while I agree clarity at the call site is most important, I again think it's a matter of overweighting rather than trumping the other guidelines. Deviating this significantly from an otherwise consistent family of names will be very jarring.