SE-0270: Add Collection Operations on Noncontiguous Elements

I like the name Selection. It’s easy for people to understand and remember.

Oh, but it might be confusing, like if you’re doing UI programming :confused:

Thanks for the continued feedback! In regards to naming, a couple quick notes:

  • We have a name for a set of ranges: Set<Range<T>>. Because the semantics of that aren't the semantics of this proposed RangeSet<T>, using one of those is confusing and not terribly useful.
  • The name RangeSet isn't completely novel — it's been in the Guava library since 2013 with essentially the same semantics, and you can find third-party implementations under the same name for a variety of languages.

A Range always contains the values within its boundaries, it just doesn't store them individually. Every range has the contains(_:) method, not just those that have Collection conformance. RangeSet uses these same terms and semantics.

This example shows why the inverted method takes a collection — you always need the actual range of seats passed in from the outside. Inverting the set of occupied seats in the span of representable Ints would always give you way too many seats available!

2 Likes

Countability or enumeratability are not required characteristics of a set. Swift.Set needs to be countable and enumeratable, but it's not a general set. This new type is more general in some ways (compact representations for some very large sets of Double), while being less in others.

1 Like

Yes but I think it should return a wrapper view rather than copying a new rangeset. In a simple case like that, you just want to query each row from the perspective of the holes. That should be a really simple operation.

Perhaps “inverted” could return a wrapper, and we could add an initialiser on RangeSet if you need an independent copy.

This is why I’d like it to go in the preview package.

1 Like

Just my quick 2cents on naming:

An old fashioned "Multi-" prefix sometimes works when extending an existing types to act like multiple instances of itself. How about MultiRange?

1 Like

To reemphasize my earlier point: these names aren't better, just weirder. That is to say, IMO they do nothing for someone not already familiar with the type other than make them thing "huh, that's a weird name." I consider that first impression harmful and think we should be more mindful of this harm.

RangeSet has prior art in other libraries, and echoes IndexSet, with which Cocoa programmers are already familiar. These are strong recommendations as they allow porting of previous experience. Allusions to mathematical terms or specific use cases don't bring similar benefits, no matter how satisfyingly apt they may seem in an evolution thread.

6 Likes

I agree that RangeSet is an excellent name for this feature.

I also agree that it would be nice to have an InvertedRangeSet type, much as we have ReversedCollection.

This would be misleading me -- I expect the Multi prefix to appear on constructs that support duplicate values. (Which definitely have a role in/near the standard library.)

I like both RangeSet and Ranges.


I am not convinced that the complement of a RangeSet (or indeed, a Range) is something we need (or want) to model with a dedicated type. AFAICS, the relative complement implemented by subtract (and the convenience shorthand inverted(within:)) covers most (all?) practical use cases; dealing with a separate type seems to introduce more problems than it answers.

To be clear, I think RangeSet is fine as a name.

If we are to explore other options, however, I think Ranges is unsatisfying for the reasons I mention—and because it is weird (not least because it is a plural, of an existing type no less).

2 Likes

Overall, I really like the proposal. I had the need for it just the other day and went to borrow some code from @nnnnnnnn's proposal (namely, I the extremely good documentation :wink:), so that definitely speaks to its quality. I agree, though, that I'd like to see this in i.e., a preview package and exercise it myself more.

The model proposed composes nicely with the collection system. I'm excited about how removeAll(at:) and DiscontiguousSlice might make the best-performing code also be the most readable, just as two examples.

within: in the init arguments to RangeSet is a little awkward when a parts of the stdlib already make use of in:.

I'm perfectly happy with the spelling RangeSet, with Ranges a close second.

I like that we have a bundle of types that have Set in the name with reasonably distinct uses. I prefer this to approach to, say, ending up like Java where you get cargo-culting that everyone on a project must use AtomicDoubleEdgedSwordHoldTheMustardLinkedList<T> because the senior dev says so.

Building off of that, though, I'm bummed like some others are to not have SetAlgebra conformance on a type named Set. Seems a little bit like the whole range expression/offset/offset expression build-up where we end up splintering protocol requirements. In particular, I think it likely that a project will end up needing to write a protocol SetAlgebraButActuallyJustTheSubsetThatRangeSetImplements { } and using that for as a bound in a generic algorithm.

1 Like

This would be a good time to remind people of how we arrived at the Swift API Guidelines, particularly the first naming guideline, “Include all the words needed to avoid ambiguity for a person reading code where the name is used.” Following that principle leads directly to many of the naming choices in the standard library, some of which Ben is complaining about here.

The naming guidelines are a product of work from people from across the spectrum of subjective viewpoints, from those inclined toward minimal names like those in C++, to those steeped in the rather verbose tradition of Cocoa. Given that diversity, it's remarkable that we eventually achieved a consensus that everybody involved could buy into.

At the time, the tradition of Objective-C naming patterns was so heavily ingrained in the culture of the participants, that had we not de-emphasized subjective criteria like “mouth feel,” we would have ended up with standard library APIs in Swift that most of us would regard today as wildly inappropriate for Swift. Code that read simply as a.contains(b) could elicit a strong “yuck” reaction from an Objective-C programmer who expected each parameter to be introduced with a noun describing its type. Furthermore, emphasizing “mouth feel” over disambiguation can be harmful: we found examples of several Objective-C API that would “ring nicely in the ear” for someone steeped in that tradition, but in practice resulted in misleading code.

It is perfectly appropriate to spend a few words, and even risk disrupting a reader's sense of pronounceable code, to avoid an easily made misinterpretation at the use site of an API. That has always been a principle of naming in Swift's standard library, and discarding it now would not only create inconsistency, it would undermine one of the library's core design principles.

I can hardly think of a case that warrants this sort of treatment more than one where a reader will see a Range being inserted into an instance of RangeSet, which is not actually modeling a set of ranges.

10 Likes

I'm bummed like some others are to not have SetAlgebra conformance on a type named Set .

Unfortunately SetAlgebra seems irretrievably flawed. There other set types that also cannot conform to it (such as a predicate set or an inverted wrapper set) and it also cannot support move-only types when we have them.

I would love to see a pitch for a new set protocol that was limited to the core capabilities of a set type (I think this may just be an element associated type and a contains algorithm) that could replace it.

5 Likes

Try not to be bummed :slightly_smiling_face:. It's important to realize that SetAlgebra was never supposed to support all set types. When we designed it, we did a fairly complete exploration of the different kinds of possible sets, and worked out a complete hierarchy of protocols to accommodate them. But it's also a principle that you don't introduce a protocol unless you have a family of actual model types that can conform to that protocol, and by “have” I mean more than just “have imagined:” I mean that you have implementations and evidence that they're useful in real code. At the time, we had Set and OptionSet, so we could only really justify introducing the most refined protocol that covered those examples. That was SetAlgebra.

The intention was always that, as more set types were actually implemented, we'd add less-refined protocols to the hierarchy to account for them. So this isn't a bad thing at all; the discovery of a new, useful set type that didn't fit SetAlgebra was always expected.

I'm not sure whether we can figure out how to inject less-refined protocols into SetAlgebra's hierarchy now that we have ABI stability. If we can't, finding an alternative coding pattern that works will be an interesting problem, because the later discovery of these protocols is going to be a pattern in generic library evolution.

Hope this helps,
Dave

5 Likes

SE-0241 also proposed within: labels originally, and updated to in: labels during the review (in commit 14a1643).

3 Likes

I prefer to have a method to normalize / canonicalize the RangeSet if it is not supposed to be by default.

+1 on the proposal. If we're bikeshedding names, SegmentSet might get us away from the semi overload of Range along with encloses(_) or encompasses

In mathematics and computing, most sets are not a finite, explicit constructive collection of elements — but rather a possibly-infinite, implicit non-constructive set defined through a predicate that decides wether or not an element is a member of the set.

In Swift the type Set must be created by enumerating all its elements. You add elements to the set, and the set is defined in terms of its members. If an arbitrary other value is equal to a member, it is considered to be contained in the set.

However, in some programming languages there are PredicateSet which does not have a finite, well-defined list of elements. Instead, they have a rule that defines wether a given value is included or not. Adding to the set, means OR-ing two a predicate to the existing. Removing from the set, means AND NOT-ing to the existing predicate. A predicate set is not a set-of-predicates, but a set-defined-by-a-predicate. In fact, there once was a pitch to add PredicateSet to Swift, if I recall correctly.

RangeSet fits perfectly, IMHO. I don't have problem with the name at all. It's not merely a set-of-ranges, but a set-defined-by-ranges. When you add to the set, you extend its definition.

(Similarly, Foundation already has CharacterSet. It is kinda like a predicate set, but kinda also like an element set. I don't know how it is implemented. It allows you to add specific characters, but you can't enumerate its members. I suspect it uses ranges under the hood.)

7 Likes

What you say is correct, but I think you're conflating two separate issues.

The problem with RangeSet is not that it's an inappropriate name for the proposed type, but that it's too general. Swift keeps shooting itself in the foot by using up general names for specific things. If you want a good example of this, go upthread to the comments about SetAlgebra. This gist of these is that it's not really the thing that we want to call SetAlgebra, but it uses up the more general name, yet it's too late to rename it.

The second issue is whether the semantics of "operations on noncontiguous elements" should be formally specified in terms of an underlying implementation that uses ranges. In that case, it's not an issue of what the thing is called, but the semantics of its behavior.

I would be happy with the naming if we compromised with, say, IndexRangeSet, but I'm not yet convinced that the type itself is the right design abstraction for the proposed APIs. (Clearly I'm in the minority here, but that's OK.)

SE-0270 was returned for revision, and it is now up for a second round of review.

Terms of Service

Privacy Policy

Cookie Policy