RangeSet
treats Range
s as sets; should we name operations accordingly?
[Almost done with these posts; I have just one more on the stack—so far!]
Welcome back, @maven! Actually, this is logically a “set of indices”, or rather, a “set of anything comparable”. I know it can be confusing: here's the place where I finally realized it myself.
I had planned to let someone else answer this post, but I was just re-reading it and after refactoring your statement above I noticed you were saying Range<Bound>
is a bit like Set<Bound>
, which is an excellent point: Ranges
are treated by RangeSet
as contiguous sets of the Bound
type. Once you realize that, a number of the APIs bear re-examination:
1.
/// Creates a range set containing the values in the given range.
public init(_ range: Range<Bound>)
RangeSet(0..<10)
LGTM; it's an information- and semantics- preserving construction of one set from another.
2.
/// Creates a range set containing the values in the given ranges.
public init<S: Sequence>(_ ranges: S) where S.Element == Range<Bound>
Hmm,
RangeSet([0..<10, 10..<20])
Doesn't look symmetric with the previous one, and seems to imply we're building a set of ranges. Should it be:
RangeSet(union: [0..<10, 10..<20])
? That doesn't read beautifully to me, but it's at least trying to address the issue. Open to other ideas.
3.
/// Adds the values represented by the given range to the range set.
public mutating func insert(contentsOf range: Range<Bound>)
/// Removes the given range of values from the range set.
public mutating func remove(contentsOf range: Range<Bound>)
Now that we have the contentsOf:
label, these are less suggestive than they used to be of the idea that RangeSet
is a set of ranges, but the use sites are now longer, and these APIs fail to acknowledge the set-like role played by parameter. The obvious alternatives are s.formUnion(someRange)
and s.subtract(someRange)
, which are both consistent with the set role of the parameter and more compact at the use site. I used to be rather ambivalent about suggesting this change, but seeing the usage written out, it now looks like a no-brainer to me.
Full disclosure: changing these names as I've suggested implies the possiblility of a whole pile of related API, with usage like
let s1 = s.union(someRange) // non-mutating
let s2 = otherRange.subtracting(s) // non-mutating, reversing order
etc. However, I don't think that implication obliges us to add any of those other APIs today. We should start conservatively and only expand when we have real use cases demonstrating a benefit.
4.
public struct Ranges: RandomAccessCollection {
public var startIndex: Int { get }
public var endIndex: Int { get }
public subscript(i: Int) -> Range<Bound>
}
/// A collection of the ranges that make up the range set.
public var ranges: Ranges { get }
Seems to me contiguousSubsets
is a better name for this, despite its length. This is not a name that will be uttered often, and where it is useful, IMO, there's usually a better way. For example, we should discourage
for r in s.ranges {
for i in r {
... // something with i
}
}
in favor of
for i in c.indices[s] {
... // something with i
}
because the latter is simpler, but also because the former is wrong with an arbitrary collection (the inner loop needs to be “for i in a.indices[r]
”).
[Also, the property should be documented as being sorted by lowerBound
and having non-overlapping elements.]
Conclusion
The proposed naming of these APIs may be a vestige of our initial failure to clearly answer the “what the heck is it?” question, and we should consider going all the way to reflecting a consistent view of the roles of RangeSet
and Range
. If we decide not to make the changes, we should at least understand why not.