Collections of ranges are clunky

Swift has (from what I can tell), 5 range operators, corresponding 5 range types:

  • ...0 produces a PartialRangeThrough<Int>
  • ..<0 produces a PartialRangeUpTo<Int>
  • 0..<10 produces a Range<Int>
  • 0...10 produces a ClosedRange<Int>
  • 10... produces a PartialRangeFrom<Int>

Switch statements can handle heterogeneous case types just fine, e.g.:

switch 123 {
	case ...0: print("...0")
	case ..<0: print("..<0")
	case 0..<10: print("0..<10")
	case 0...10: print("0...10")
	case 10...: print("10...")
	default: break
}

However, making collections of heterogeneous range types is really hard, because they're mutually incompatible types. They're all subtypes of RangeExpression, but that protocol has an associated type (Bound), so it can't be used in a collection with first cooking up an AnyRange type eraser, and wrapping all the values in it. Consider this example, of trying to encode Canada's federal income tax margins:

postfix operator %
postfix func % (d: Double) -> Double { return d / 100.0 }

let federalMargins: KeyValuePairs<ClosedRange<Int>, Double> = [
	0 ... 47_630: 15.0% ,
	47_630 ... 95_259: 20.5%,
	95_259 ... 147_667: 26.0%,
	147_667 ... 210_371: 29.0%,
	210_371 ... Int.max: 33.0%,
]

The last line can't be 210_371..., because you get a type error:

error: cannot convert value of type 'PartialRangeFrom' to expected dictionary key type 'ClosedRange'

I understand that the current implementation has some pleasant performance upsides. The open vs. closed characteristic is encoded in the type, so each partial range instance only needs to store one T property, and each "normal" range only needs to store two T properties, without needing any extra space (and associated necessary padding) for storing flags like isOpen.

However, there are usability issues that I can't find any workarounds for. What would you guys recommend here? Am I missing something, or does this area of Swift need some TLC?

1 Like

Since you posted under “Using Swift”, I’m treating this as a request for ideas about how best to implement your particular example. But parts of what you wrote sound more like a desire to brainstorm for pitching improved language features. If that was your intent, we can move this to the “Evolution → Pitches” category instead.


There is a lot of duplication in the way you have defined it, with every bound repeated. That style is particularly vulnerable to typos that create gaps or overlaps. I would have handled the data in terms of thresholds instead of ranges anyway:

// Defined like this:
let federalThresholds: [(threshold: Int, rate: Double)] = [
	(0, 15.0%),
	(47_630, 20.5%),
	(95_259, 26.0%),
	(147_667, 29.0%),
	(210_371, 33.0%),
]

// Used in ways like this:
func rate(forIncome income: Int) -> Double {
    return federalThresholds.last(where: { $0.threshold <= income })!.rate
}
3 Likes

Indeed, but only in the case that I'm doing something totally wrong.


Actually I rather liked the verbosity, because it makes the mapping to the spec easier, although it's admittedly not idea.

Interestingly your rate(forIncome:) function is encoding a very common misconception. Marginal tax rates work by taxing different brackets of your income at different rates. Your function is picking the top marginal rate for given income, but that's not sufficient to calculate the income tax.

If I were to follow your approach of only specifying the upper bounds of each margin, then I would write some code to transform that into (lower, upper) pairs (ranges). That would then be used to compute how much money is in each margin, and tax that sub amount by the corresponding marginal rate.

But this regardless, this is just a toy example. Heterogeneous ranges come up occasionally, and working around them is rather difficult.

1 Like

RangeExpression already exists, which lets you use contains on any of the 5 range types, but the type system still needs work to let you nicely deal with the associated type. This would help a lot in the general case, but I'm not sure that completely works out for your example, because you'll probably need operations other than contains, e.g. finding the upper bounds of the closed ranges so you can calculate the effective tax rate from the marginal tax rates. So perhaps you are at the level of complexity where you should encapsulate the information in a specifically designed type rather than piecing together an ad-hoc one.

I could say that’s what the CRA told me, but I won’t. :stuck_out_tongue: Instead I’ll let you have that one since I live in Canada, deal with the very tax from your example, and should know better (or rather should have taken more care with my own toy example; I do know that). Had I been from anywhere else though, there would have been no reason for me to infer those details. Not all countries even have income tax whatsoever.

Thanks for the reminder to watch that contrived examples don’t inadvertently teach readers bad habits or ideas about other subjects.

The thing that gets closest to what you want with using mentioned future Swift features would be something like:

typealias TaxBracket = any RangeExpression<.Bound = Int> & Hashable
let pairs: [TaxBracket: Int] = [ /* insert same pairs here */ ]

So what you want requires more features on the generics side of things :(

Though, type-wise, these solutions still allow for overlapping brackets, which I don’t think it’s desirable... I suppose a threshold list as mentioned above might be more “exact”

Although RangeExpression exists and an existential would cover over the type mismatch, it feels like overkill for me as far as abstraction cost goes. I think it'd be more efficient and more immediately useful to have a convenient way of converting RangeExpressions of numeric type to a common concrete range type. ClosedRange and OpenRange are different types not only for performance reasons, but because there'd otherwise be no generic way to represent all possible ranges including Int.max (since a ClosedRange can't represent empty ranges, and an OpenRange can't represent a range with the largest possible value). That's a concern for generic use cases like slicing collections, but less likely to be a concern in a concrete situation like the OP.

Maybe we could add a static method to ClosedRange and OpenRange that converts a RangeExpression of the element type to the corresponding concrete range? That would allow:

let federalMargins: KeyValuePairs<ClosedRange<Int>, Double> = [
	.for(0 ..< 47_630): 15.0% , // Converts OpenRange to ClosedRange(0...47_629)
	.for(47_630 ..< 95_259): 20.5%,
	.for(95_259 ..< 147_667): 26.0%,
	.for(147_667 ..< 210_371): 29.0%,
	.for(210_371...): 33.0%, // Converts PartialRangeFrom to ClosedRange(210_371 ... .max)
]
2 Likes

This is made a little trickier by the detail that ClosedRange has its own index type, rather than using the bound type. Since RangeExpression.relative(to:) requires that the range expression matches the passed collection's index type, there isn't an affordance right now that can perform that translation.

You could perhaps go through a Range, but I don't think we have a way to distinguish between expressions like 5... and 5..<Int.max in that case, and 5...Int.max placed into a Range traps outright.

1 Like

I don't know if it makes sense for this use case, but if all you care about after constructing the values of the different types is applying a specific operation to them, then another useful trick for the toolbox can be to use a closure to erase the type:

func inRange<R: RangeExpression>(_ range: R) -> (Double) -> Bool
where R.Bound == Double {
  return { value in range.contains(value) }
}

let federalMargins: KeyValuePairs<(Double) -> Bool, Double> = [
  inRange(0...47_630): 15.0%,
  inRange(47_630...95_259): 20.5%,
  inRange(95_259...147_667): 26.0%,
  inRange(147_667...210_371): 29.0%,
  inRange(210_371...): 33.0%,
]

for (check, rate) in federalMargins {
  if check(100_000) {
    print(rate)  // 0.26
  }
}

Whether that's simpler/better than creating an Any* wrapper probably depends on the situation.

Edit: I guess in this particular case, even this works just as well, without the helper function:

let federalMargins: KeyValuePairs<(Double) -> Bool, Double> = [
  (0...47_630).contains: 15.0%,
  (47_630...95_259).contains: 20.5%,
  (95_259...147_667).contains: 26.0%,
  (147_667...210_371).contains: 29.0%,
  (210_371...).contains: 33.0%,
]
10 Likes

That's clever! This still seems like something that shouldn't require introducing virtual dispatch, ideally.

3 Likes

You can take advantage of the fact that Range is sliceable, and slices share indexes with their parent (though this has the boundary problem with Int.max):

let ints = Int.min ..< Int.max

let federalMargins: KeyValuePairs<Range<Int>, Double> = [
  ints[0...47_630]: 15.0%,
  ints[47_630...95_259]: 20.5%,
  ints[95_259...147_667]: 26.0%,
  ints[147_667...210_371]: 29.0%,
  ints[210_371...]: 33.0%,
]
1 Like

Indeed, it seems like a custom type would be appropriate here, but that's unfortunately, because ranges are a nice "common currency" type, and would be perfect to express something like this, if only they were more capable.

Heh, you're welcome. I've come to more carefully label toy examples, over-simplifications, etc. in my toy examples. Because:

  1. Inevitably someone will freak out over them. E.g. "you should never be force unwrapping", ya man, but error handling isn't the point of my post, this is quick back-of-the-napkin-code.
  2. I don't want to unintentionally give passers-by an idea that code should be written as I showed it

Hey Joe,

I've been working on some alternate implementations of ranges. I've come up with two variants so far:

  1. SimpleAnyRange, which is implemented similarly to the current ClosedRange, and OpenRange, with lowerBound and upperBound properties. But rather than those properties being merely a T: Comparable, they're a RangeBound<T: Comparable>
    • RangeBound is an enum with 3 cases: .unbounded, .included(T), .excluded(T). SimpleAnyRange can model all 9 possible combinations of these.
    • SimpleAnyRange is 25 bytes (32 after padding)
  2. PackedAnyRange, which uses a pretty nasty 9-case enum to explicitly enumerate the 9 combination of 2 range bounds.

Both of these types act like universal range types, but will probably perform way better than mere type erasers (pending testing), because they're like hand-rolled specializations. There's no virtual dispatch, heap allocation, etc.

I think that types like this could drastically simplify the whole range type "landscape", with minimal performance cost (1 word extra, for the case of PackedAnyRange). I haven't seen people storing large collections of ranges, so that seems like an acceptable trade-off. Even then, when necessary, you could have 2-word sized specializations like ClosedRange and AnyRange, which lift their bounds' details into the type system to save space per-instance.

What are your thoughts on something like this?

2 Likes

Oh this is cool. I never thought to use closures like this, as an ad-hoc type eraser. It works great when there's only a single API that you need to access.

Yeah, it'd be interesting to see how much cost using a single concrete type like this has in real code. We might be able to use optimization passes to specialize out branches for the unused cases when we're able to track a range down to its original literal value (which is what type specialization manages to do for us today). In the cases where we can't specialize, and we go through RangeExpression witness tables at runtime, then a concrete type would likely end up faster.

Do you have any suggestions for how to benchmark these new types?

Also, do you know how I can run the same set of unit tests, on two different type? I've used parameterized tests in JUnit before, but I couldn't figure out a similar thing for XCTest. Worst-case I might end up using a gyb template file

I would try a generic subclass of XCTestCase with the desired parametric tests, and then a normal subclass that instantiates the generic one with the desired types and runs its tests

Why couldn’t we define all ranges with a single type like this (pseudo code)? Is it because performance reasons?

enum LowerBound<Bound> {
    case none
    case from(Bound)
}
enum UpperBound<Bound> {
    case none
    case upTo(Bound)
    case through(Bound)
}
struct Range<Bound> {
    let lowerBound: LowerBound<Bound>
    let upperBound: UpperBound<Bound>
}
0...5 // .from(0) .through(5)
0..<5 // .from(0) .upTo(5)
0... // .from(0) .none
...5 // .none .through(5)
..<5 // .none .upTo(5)

edit: oh, I’ve just realized it’s actually really similar to what suggests @AlexanderM :sweat_smile:

1 Like