[Pitch] RangeSet/IndexSet Conversion

jmschonfeld · January 12, 2024, 6:03pm

Hi everyone,

I have a new pitch ready that proposes additional API for Foundation that allows conversion between the new RangeSet type in the standard library and Foundation's IndexSet type. While IndexSet is not available in swift-foundation at the moment, I'd still appreciate any feedback that you have on this API addition! I've included a snippet of the pitch below but you can check out the full proposal here.

`RangeSet`/`IndexSet` Conversion

Proposal: SF-NNNN
Author(s): Jeremy Schonfeld
Status: Pitch
Implementation: coming soon

Introduction/Motivation

With the latest revision of SE-0270, we've landed the RangeSet type in the standard library. Foundation already has an existing type called IndexSet used in various (mainly Objective-C-based) APIs to represent a collection of integer indices. These types are fairly similar in purpose, however IndexSet is constrained to non-negative, integer indices (as required by Objective-C collections) whereas RangeSet is generic over any Comparable index type. There are some cases when developers may wish to convert a set of indices between the IndexSet and RangeSet representations. For example, library authors writing new swift APIs that wrap or interact with existing Objective-C APIs may wish to expose RangeSet-based entry points while occasionally calling to IndexSet-based Objective-C APIs as an implementation detail. Additionally, developers may wish to write apps using the new RangeSet type while still interfacing with existing SDK APIs that use IndexSet from Foundation. Foundation has the opportunity to provide these conversion initializers as a convenience to ensure both correctness and performance while moving between the two types.

Proposed solution and example

We will add new initializers in Foundation on the IndexSet and RangeSet types that allow conversion between the two when the RangeSet's Bound == Int. For examples, developers could write the following code:

extension UICollectionView {
	func reloadSections(_ sections: RangeSet<Int>) {
		guard let indexSet = IndexSet(integersIn: sections) else {
			fatalError("Invalid section numbers passed to reloadSections(_:). Sections must be non-negative integers")
		}
		self.reloadSections(indexSet) // Call to existing ObjC API
	}
}

Check out the full proposal on GitHub

itaiferber · January 12, 2024, 8:30pm

This looks good, and seems like a no-brainer convenience addition to me!

Another point for the RangeSet → IndexSet conversion having an argument label while the IndexSet → RangeSet one not: it's not a hard and fast rule, but lossless (non-failable) conversion initializers typically don't have a label, while lossy (failable) ones do. [Typically, such a label for failable initializers tends to be slightly more descriptive about the cause of failure (e.g., Int.init(exactly:)), but I can buy the argument for consistency with the existing integersIn: label elsewhere]

Karl · January 14, 2024, 12:21pm

Personally, failable/non-failable is not part of the criteria I use to decide whether a parameter should have a label -- instead, it's solely about clarity at the point of use, and whether the label adds value or adds noise.

If the label describes what would otherwise be a non-obvious failure condition, it can more strongly justify its presence. If it describes the obvious behaviour, it can make code harder to read because I'm expecting it to tell me about some non-obvious quirk to its behaviour.

For instance, the standard library's String -> Int conversion initialiser does not have a label, and it's fine because it's obvious that it is attempting to parse an integer from a string, and that not every string will contain an integer. If we decided that this was non-obvious and needed a label, things might look like:

if let i = Int(integer: someString) { ... }
if let i = Int(number: someString) { ... }
if let i = Int(numericValue: someString) { ... }

And (again, according to my personal value judgement), I don't think those labels communicate anything meaningful; they're noise. When I read those labels, I pause for a fraction of a second to think "okay, why is that label numericValue there? What is being communicated by the phrase [Int, numericValue]?"

When it doesn't have those labels, I don't have that pause. It's doing a straightforward conversion.

if let i = Int(someString) { ... }

--

Anyway, in my opinion, the proposed integersIn label also does not add value. If you remove the label, it reads just as well, if not better.

func reloadSections(_ sections: RangeSet<Int>) {
	guard let indexSet = IndexSet(sections) else {
		fatalError("Invalid section numbers passed to reloadSections(_:). Sections must be non-negative integers")
	}
	// ...
}

What about a different label, then? Well, the fact that IndexSet can only contain non-negative integers might seem non-obvious and worthy of calling out at an abstract level, but the name IndexSet already suggests that the integers have to be valid indexes in a collection, which is reinforced by the type's documentation summary:

A collection of unique integer values that represent the indexes of elements in another collection.

The conversion will only fail when the RangeSet includes invalid indexes (unsupported by IndexSet), so from that perspective I think the conversion and cause of failure are straightforward enough not to need a label.

itaiferber · January 14, 2024, 3:33pm

Great point — I had hoped to express something similar, but your description is clearer and much more explicit.

I agree that this is non-obvious and should be called out, but I disagree that this fact is quite clear enough to benefit from dropping the label. In general, Swift's definition of Index differs pretty significantly from IndexSet's (obviously, a consequence of origins), and I personally think that IndexSet is confusing enough having its NSUInteger origins imported as Int into Swift; but more specifically, negative integers can be totally valid as Collection indices — there's just a mismatch in definition between that and what IndexSet attempts to represent.

To that end, I think a label could be helpful, but have struggled to find one that I think is sufficiently descriptive without being overly verbose. IndexSet(nonnegativeInd{ex,ic}es:) might be the closest I can think of toward that goal, but it's far from perfect.

(My post should have clarified this: given the difficulty of naming and IndexSet's existing problems, I could be convinced that keeping the [currently not-very-good] integersIn: label in the interest of consistency, at least, would be good enough — at least in the interest of sending folks off to documentation.)

In general, though, I do agree with you.

jmschonfeld · January 16, 2024, 11:45pm

Thanks for the insight! In general, I do agree with you in a similar sense to @itaiferber's reply that in general I'd use a label mainly to introduce clarity at the call site. However I do agree with Itai's comment below:

Coming from the swift collection world I think it is entirely valid for a an index to be negative unlike the traditional Cocoa collections that IndexSet was developed for originally. To denote this, we could perhaps use IndexSet(nonnegativeIntegersIn:) however I find that a bit problematic because: 1) it's quite verbose as Itai also mentioned, and 2) I read that as an initializer that initializes with non-negative integers and drops negative integers during conversion which I think makes the behavior a bit more muddy than clear here.

For those reasons I tend to lean back to integersIn: for the sake of consistency despite it not being entirely ideal, but I'm open to suggestions if we're able to find a label that improves the clarity enough to warrant diverging from the existing API.

jrose · January 17, 2024, 12:28am

Some other bikeshed colors, though I think I personally agree with Karl that the type is signal enough that not all values are valid:

init(validating:)
init(indexesIn:)
init(zeroBasedIndexes:)

The last is “inspired” by Data having an Int index that may not start at zero, which means selecting a RangeSet<Int> from a Data may produce different results from selecting an IndexSet from an NSData. This is no different from integer or range operations on Data/NSData, but still.