SE-0270 (Review #3): Add Collection Operations on Noncontiguous Elements

The third review of SE-0270: Add Collection Operations on Noncontiguous Elements begins now and runs through January 28th, 2020. The first review was here, and the second is here.

I want to apologize for the murky status of the proposal over the past month. The review period ended on December 19th, but clearly the conversation has continued, and attentive readers will have picked up on the fact that revisions were apparently being planned.

The Core Team has received a lot of great feedback on this proposal. A substantial amount of that feedback came after the review period was technically over, and it came largely in the form of a series of counter-proposals. The goal of proposal review is to guide the decision-making process in a way that hopefully leads to the best result, not to reach a formal yes/no decision on a specific proposal. While the Core Team wants the review process to be respectful of the author's time, and while we understand that many counter-proposals are fairly insubstantial, we also feel it is important to treat well-thought-out counterproposals seriously: the evolution process is not meant to be a "first to propose, wins" process. We have therefore been allowing these conversations to play out naturally, but the result is perhaps slightly unclear. Let me try to clarify it now.

There has been quite a lot of discussion about nearly every aspect of this proposal, from names down to its deepest structure. In consideration of this feedback, the proposal author has elected to revise the proposal to change a few names and to remove some of the more debatable aspects, leaving them for future consideration. You can see a full difference of the changes here. If you are interested, you may also find it valuable to read some of Dave Abrahams' counter-proposals and the ensuing discussion in the previous review thread; I believe Dave considers some but not all of those counter-proposals to still apply to this revision.

As before, the Core Team has elected not to partially accept any aspects of the proposal; this is again a de novo review. I will be trying to collate feedback from both this and the previous (second) round of review, so if you feel that you have nothing more to say about this proposal, please feel free to ignore this re-review.

Reviews are an important part of the Swift evolution process. All review feedback should be either on this forum thread or, if you would like to keep your feedback private, directly to me as the review manager via email or direct message on the forums. If you send me email, please put "SE-0270" somewhere in the subject line.

What goes into a review of a proposal?

The goal of the review process is to improve the proposal under review through constructive criticism and, eventually, determine the direction of Swift.

When reviewing a proposal, here are some questions to consider:

  • What is your evaluation of the proposal?
  • Is the problem being addressed significant enough to warrant a change to Swift?
  • Does this proposal fit well with the feel and direction of Swift?
  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?
  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

As always, thank you for contributing to Swift.

John McCall
Review Manager

10 Likes
  • What is your evaluation of the proposal?

I was aware of the other review iterations but by the time I got to them they were already quite lengthy so I decided not to participate. With that in mind I looked through this proposal with fresh eyes and no other preconceptions from other discussions. I'm a +1.

I wasn't sure about the RangeSet idea when I first started reading the proposal, but it makes absolute sense as to why that direction was chosen. I believe there are some useful future directions for this type and keeping the proposal focused on the core functionality is a good decision.

  • Is the problem being addressed significant enough to warrant a change to Swift?

Yes. I've built out my own solutions to the problem this is trying to address (but with a more limited scope and focused on Indices, not the proposed RangeSet) in my own projects, so I will be happy to finally have it available within the standard library itself.

  • Does this proposal fit well with the feel and direction of Swift?

Yes, it feels well thought out and fits nicely with the ergonomics of the language.

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

Other languages I've used for these operations don't tend to consider ranges, they only typically work with individual indices. While that is a valid direction, this proposed solution feels more forward facing and solves more issues within the same API.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I've read through the proposal in its current form. I was aware of previous reviews and discussions but did not participate.

The methods now read much better with subrange than they did with range, I think this change brings the proposal into better alignment with Swift’s existing naming conventions.

• • •

I’m curious about the MutableCollection subscript. Specifically, its return type is DiscontiguousSlice, which means the only type which can be assigned through that subscript is DiscontiguousSlice. I’m not per se opposed to that subscript, but it seems extremely niche and unlikely to be used much in practice.

My understanding of MutableCollection is that one-to-one element-replacing assignments are allowed in general, which means from a high-level perspective it ought to be possible to assign any Sequence through either a Slice or DiscontiguousSlice subscript, provided that the number of elements is the same.

This is a bit of a tangent, and I understand that the existing Slice subscript has the same behavior, but I find that equally troublesome:

var nums = Array(0..<5)
var x = [6, 7, 8]
nums[1..<4] = x       // error
nums[1..<4] = x[...]  // success

Rather than further splinter the assignable-subscript API surface for MutableCollection, perhaps we ought to consider making it generic instead?

4 Likes

I'm not sure what I think about this revision. I'm finding it a bit tiring to keep coming back to find a new revised proposal under review, which I then need to read through from beginning to end and consider freshly (no, looking at diffs is not enough to assess the integrity of the overall proposal), so I haven't done it for this revision yet. I think it would be much more efficient if each of the sub-threads I started in the last review contained some discussion of the specific points raised and a message describing whether—and if so, how—the author intends to address those points. Then we could have a discussion of the trade-offs and hopefully reach some consensus on the direction before going through a whole 'nother review cycle. That's why I went to the trouble to break up the posts the way I did.

So this isn't a review, and I doubt I'll have time to do a complete one during the review period. That said, I can address one of the prominent changes I see here. Let me first refer to my definition of “subrange”: a range of (consecutive) elements in a collection (probably I should have said a “sequence” (lowercase s) rather than a “range” just to avoid confusion).

This usage is not consistent with that definition:

let indicesOfEvens = numbers.subranges(where: { $0.isMultiple(of: 2) })

The method would have to return something like a collection of slices to be consistent with the definition. Now, my definition of “subranges” isn't necessarily gospel, and maybe someone else can come up with a definition that makes everything consistent again, but this also points to another problem that seems to be persisting, so I'll ask again:

What is this thing the latter method is returning, semantically?

If its primary semantic role is as a collection of index ranges or slices, then it should conform to collection and not have top-level set APIs, and something like the proposed naming direction makes sense. If its primary semantic role will be to act as a set of indices, then the proposed naming direction makes no sense to me.

In fact, you see this semantic clash in action in this example:

let rangeOfEvens = numbers.moveSubranges(indicesOfEvens, to: numbers.startIndex)

Here we clearly have a set of indices being used as an input. The neither the author nor the reader of this code care about the representation of that set as a sorted collection of ranges.

moveSubanges(_:to:) seems to be an improvement over gather to me!

  • What is your evaluation of the proposal?

My evaluation of the proposal reads +1.

  • Is the problem being addressed significant enough to warrant a change to Swift?

Yes, Swift deserves a type comparable to IndexSet which lacks a lot of niceity that RangeSet does not.

  • Does this proposal fit well with the feel and direction of Swift?

Yes, in particular, adding and removing ranges from a RangeSet works the way you would expect it to; It works smarter than a dumb Set.

  • If you have used other languages or libraries with a similar feature, how do you feel that this proposal compares to those?

I have not seen this in other languages or libraries.

  • How much effort did you put into your review? A glance, a quick reading, or an in-depth study?

I have reviewed the past two discussions and skimmed over most of this third proposal.

I advocated for embracing subrange in this revision of the proposal in large part because of your observation that "subrange" is the existing term of art for "a range of positions in a collection":

A subrange is not "a range of elements in a collection":

  1. Swift already has SubSequence and Slice for that.
  2. If it were, removeSubrange wouldn't be any clearer than removeRange.

I think of "subrange" now as "a range of positions that represent consecutive elements in a collection". So numbers.subranges { $0.isMultiple(of: 2) } is returning "the positions of the elements that are a multiple of 2".

I think there's a lot of nice synergy with removeSubrange(_:) and replaceSubrange(_:with:) in naming the algorithms subranges(where:), removeSubranges(_:), and moveSubranges(_:to:). (Also I imagine we'll be adding moveSubrange(_:to:) soon, in a follow up proposal.)

1 Like

And I suppose removeSubrange(_:) is semantically "removing the elements represented by the given range of positions from the collection"

Well, it's very obvious that I misspoke/mis-analyzed things (a.k.a. “was wrong” :wink:) when I said that. Okay, this is a little subtle, but only slightly. We used "subrange" in two places: replaceSubrange and removeSubrange. That's all we really have to go on when thinking about the meaning of "subrange." Original intent wouldn't matter much even if we had an official definition somewhere—but we don't even have that.

If I ask ten people what either of these methods do, they're going to tell me something like ”it removes/replaces the elements delimited by the given bounds.” That is to say, it removes or replaces elements, not positions. I really don't think it's reasonable to say replaceSubrange is replacing one set of positions with another, since the thing after with: is not a representation of new positions, but of elements.

You can go ahead and stretch to say removeSubrange is really removing positions, or stretch even further and say something really complicated (!) like:

But I hope we can agree that there's nothing extant that pushes anyone to think of replaceSubrange(i..<j, with: x) as "replace the elements represented by the range of positions from i to j” as opposed to the much simpler "replace the elements in positions from i to j.” As in science, the simplest interpretation that fits the evidence wins.

A subrange is not "a range of elements in a collection":

  1. Swift already has SubSequence and Slice for that.
  2. If it were, removeSubrange wouldn't be any clearer than removeRange .

I've been trying to tell you guys: “subrange” was deliberately chosen not to be the name of an existing type, so that it doesn't imply something incorrect about the type of the collection's elements. Whatever you think of the importance or effectiveness of the choice, we didn't use words like “range,” "slice," or “subsequence” because as I said above, when you take the most straightforward plain english description of the semantics of those methods—and really the only simple one that works for replaceSubrange(_:with:)—the word is clearly referring to a contiguous subset of elements, in-situ, and any of those words would have suggested that the elements have that type.

It seems to me you're really missing my point here. Let me try one last time: clients of this method do not care that the return value has a property that is a collection of ranges, or even that you can efficiently derive a collection of what either of us is calling a “subrange” from it. Clients of this method care that the result represents a set of positions in the collection. If you put the word "subrange" in the name, you are injecting irrelevant information that's at the wrong level of abstraction for what they're trying to communicate with their code. You can easily see that clients are thinking of this thing as a set of positions by looking at the name of the first argument to moveSubranges in this example from the proposal:

rangeOfEvens = numbers.moveSubranges(indicesOfEvens, to: numbers.startIndex)

This client wants to move the elements at positions in indicesOfEvens. They do not want to move a bunch of subranges; they've forgotten that there is a way to derive subranges—whichever definition you choose—from indicesOfEvens, and rightly so. That's why @Chris_Lattner3's suggestion of moveElements made sense.

FWIW, to me this sounds like a classic example of falling into the “a name is good because it sounds like other names we've got” trap.

1 Like

I can't help myself, but the more I read about subranges, the more I dislike that term:
It makes sense to speak of something like a subset or substring, as it implies that this is derived from a base set/string.
Here, however, the base is a Collection, and I can just speak of a "range of elements in the collection" - the "sub"-prefix is redundant (at least for me - maybe there's a subtile difference that's obvious for native speakers).

What pushes me to think of "subrange" as "a range of positions" is the subrange argument label on replaceSubrange (and removeSubrange):

mutating func replaceSubrange<C>(
    _ subrange: Range<Index>,
    with newElements: C
) where C: Collection, C.Element == Element

If "subrange" were intended to be interpreted as "a range of elements" I would expect a declaration more like:

mutating func replaceSubrange<C>(
    at bounds: Range<Index>,
    with newElements: C
) where C: Collection, C.Element == Element

What exactly is the difference between the two interpretations? They read identically to me. If there is a difference, some documentation should clear it up. We can't expect to fully explicate a method in its name.

1 Like

I am looking forward to seeing this functionality land in the standard library!

However, I have to agree with the other commenters that the proposed API is extremely confusing. I would expect the API to look like the other standard library APIs related to indices:

let indices = numbers.indices { element in 
    element.isMultiple(of: 2)
}

Similarly, e.g. rangeSet.insert(contentsOf: range) is documented as "Adds the values represented by the given range to the range set”. From what I understand, this method inserts a new range into a range set. I would expect this API to be similar to the standard Set with an API like rangeSet.insert(range).

No, it adds the values represented by the range, to the range set. That is, if the inserted range overlaps and already existing range in the set, that range is instead extended.

This was the name in the original proposal, however it's my understanding that similar to count(where:), indices(where:) represents a technical challenge for type checker performance because it conflicts with the existing Collection.indices property.

I think that's an uncharitable interpretation of my post.

It seems to me that consistent use of terms in method names is an important contributor to clarity. It aids in understanding by evoking other, similar methods you've run across.

I think it would be a missed opportunity to leverage the existing connotations of "subrange", were we to choose a name different than moveSubrange(_:to) when we already have replaceSubrange(_:with:) and removeSubrange(_:).

Similarly, I think it would be more appropriate to use the name moveAll(to:where:) than moveElements(to:where:) in order to be consistent with removeAll(where:). If we didn't, I imagine the discerning reader would feel obligated to pause and investigate the distinction we were trying to make by using "elements" in one name and "all" in the other.

Consistent use of terms allows the reader to more accurately extrapolate the meaning of all the methods in a family after encountering only a small subset. For example, a reader who is familiar with removeAll(where:) and moveSubrange(_:to:) would probably be able to anticipate the existence and meaning of removeSubrange(_:) and moveAll(to:where:).

Having sunk the cost into establishing the "subrange" term, I think the names we can choose for this proposal with the richest connotations are subranges(where:), removeSubranges(_:), and moveSubranges(_:to:).

2 Likes

SE-0270 has been accepted; thank you all for your feedback.

4 Likes