SE-0270 (Review #2): Add Collection Operations on Noncontiguous Elements

RangeSet treats Ranges as sets; should we name operations accordingly?

[Almost done with these posts; I have just one more on the stack—so far!]

Welcome back, @maven! Actually, this is logically a “set of indices”, or rather, a “set of anything comparable”. I know it can be confusing: here's the place where I finally realized it myself.

I had planned to let someone else answer this post, but I was just re-reading it and after refactoring your statement above I noticed you were saying Range<Bound> is a bit like Set<Bound>, which is an excellent point: Ranges are treated by RangeSet as contiguous sets of the Bound type. Once you realize that, a number of the APIs bear re-examination:

1.

    /// Creates a range set containing the values in the given range.
    public init(_ range: Range<Bound>)
RangeSet(0..<10)

LGTM; it's an information- and semantics- preserving construction of one set from another.

2.

    /// Creates a range set containing the values in the given ranges.
    public init<S: Sequence>(_ ranges: S) where S.Element == Range<Bound>

Hmm,

RangeSet([0..<10, 10..<20])

Doesn't look symmetric with the previous one, and seems to imply we're building a set of ranges. Should it be:

RangeSet(union: [0..<10, 10..<20])

? That doesn't read beautifully to me, but it's at least trying to address the issue. Open to other ideas.

3.

    /// Adds the values represented by the given range to the range set.
    public mutating func insert(contentsOf range: Range<Bound>)
        
    /// Removes the given range of values from the range set.
    public mutating func remove(contentsOf range: Range<Bound>)

Now that we have the contentsOf: label, these are less suggestive than they used to be of the idea that RangeSet is a set of ranges, but the use sites are now longer, and these APIs fail to acknowledge the set-like role played by parameter. The obvious alternatives are s.formUnion(someRange) and s.subtract(someRange), which are both consistent with the set role of the parameter and more compact at the use site. I used to be rather ambivalent about suggesting this change, but seeing the usage written out, it now looks like a no-brainer to me.

Full disclosure: changing these names as I've suggested implies the possiblility of a whole pile of related API, with usage like

let s1 = s.union(someRange)        // non-mutating
let s2 = otherRange.subtracting(s) // non-mutating, reversing order

etc. However, I don't think that implication obliges us to add any of those other APIs today. We should start conservatively and only expand when we have real use cases demonstrating a benefit.

4.

    public struct Ranges: RandomAccessCollection {
        public var startIndex: Int { get }
        public var endIndex: Int { get }
        public subscript(i: Int) -> Range<Bound>
    }
    
    /// A collection of the ranges that make up the range set.
    public var ranges: Ranges { get }

Seems to me contiguousSubsets is a better name for this, despite its length. This is not a name that will be uttered often, and where it is useful, IMO, there's usually a better way. For example, we should discourage

for r in s.ranges {
    for i in r {
        ... // something with i
    }
}

in favor of

for i in c.indices[s] {
    ... // something with i

}

because the latter is simpler, but also because the former is wrong with an arbitrary collection (the inner loop needs to be “for i in a.indices[r]”).

[Also, the property should be documented as being sorted by lowerBound and having non-overlapping elements.]

Conclusion

The proposed naming of these APIs may be a vestige of our initial failure to clearly answer the “what the heck is it?” question, and we should consider going all the way to reflecting a consistent view of the roles of RangeSet and Range. If we decide not to make the changes, we should at least understand why not.

7 Likes