# Some light background

There currently exists 6 types in the standard library that represent intervals.

5 of them conform to `RangeExpression`:

1. `Range`
2. `ClosedRange`
3. `PartialRangeFrom`
4. `PartialRangeThrough`
5. `PartialRangeUpTo`

And a 6th `UnboundedRange` that is like a mystery.

# My beefs with ranges

1. They do not represent all kinds of intervals. Here are the ones they are able to represent:

``````let r1 = 0...1 // r1 is [0, 1]
let r2 = 0..<1 // r2 is [0, 1)
let r3 = 0...  // r3 is [0, ∞)
let r4 = ...1  // r4 is (-∞, 1]
let r5 = ..<1  // r5 is (-∞, 1)
let r6 = ...   // r6 is (-∞, ∞), I'm not sure if the assignment works tho
``````

These "range" types cover only 6 kinds of intervals.

An interval is defined by its endpoints' values, and the openness of its boundaries. An endpoint can be either bounded (non-infinite) or unbounded (infinite), and a boundary either open or closed. Thus, there are either 8 (if an infinite endpoint cannot have a closed boundary) or 16 (if an infinite endpoint can have a closed boundary) kinds of intervals. No matter which ever rule we use, the "range" types are at least 2 short.

2. You can not easily iterate a `RangeExpression`-conforming type backwards.

Since Swift removed C-style for-loops in favour of for-in loops, it has become harder to iterate descendingly than ascendingly. However, sometimes it is convenient and beneficial to do something like this:

``````for index in 5..>0 {
// for-loop block...
}
``````
3. All 5 existing `RangeExpression`-conforming types are highly similar in terms of both their use cases and their structures. However, the only thing they share code-wise is the `RangeExpression`-conformance. Combined with that `RangeExpression` can only be used as a generic constraint because of its associated type `Bound`, it leads to lots of boilerplate code when something needs to work with more than 1 "range" type:

For example, starting from line 943 of Range.swift are these 2 functions (with comments omitted):

``````@inlinable
public func overlaps(_ other: Range<Bound>) -> Bool {
let isDisjoint = other.upperBound <= self.lowerBound
|| self.upperBound <= other.lowerBound
|| self.isEmpty || other.isEmpty
return !isDisjoint
}

@inlinable
public func overlaps(_ other: ClosedRange<Bound>) -> Bool {
let isDisjoint = other.upperBound < self.lowerBound
|| self.upperBound <= other.lowerBound
|| self.isEmpty
return !isDisjoint
}
``````

Then, starting from line 444 of ClosedRange.swift are 2 more of these functions (again with comments omitted):

``````@inlinable
public func overlaps(_ other: ClosedRange<Bound>) -> Bool {
let isDisjoint = other.upperBound < self.lowerBound
|| self.upperBound < other.lowerBound
return !isDisjoint
}

@inlinable
public func overlaps(_ other: Range<Bound>) -> Bool {
return other.overlaps(self)
}
``````

There will need to be at least 12 more near-identical instance methods to cover all possible overlap tests between "range" types if we ignore polarity (e.g. only `closedRange.overlaps(partialRangeFrom)`, but no `partialRangeFrom.overlaps(closedRange)`) and `UnboundedRange`. Then 18 more on top of that if we include `UnboundedRange`, or 33 more if we both include both polarity and `UnboundedRange`.

In addition to all this, it is even harder to represent an interval that can be either bounded or unbounded or open or closed, using just a variable (or constant). You have to work around it with either a conditional logic or a wrapping type.

For example, starting from line 13 of VersionSetSpecifier.swift in Swift Package Manager is this enum:

``````/// An abstract definition for a set of versions.
public enum VersionSetSpecifier: Hashable {
/// The universal set.
case any
/// The empty set.
case empty
/// A non-empty range of version.
case range(Range<Version>)
/// The exact version that is required.
case exact(Version)
/// A range of disjoint versions (sorted).
case ranges([Range<Version>])
}
``````

It needs a custom enum just to work with different kinds of version intervals. If it needs to work with closed intervals, it will have to either add a new case, or convert a `ClosedRange` instance to a `Range` instance (and this only works if `Bound` is `Strideable`). Also, `.ranges` only works with `Range`. If the set ever needs to contain both `ClosedRange` and `Range`, then there needs to be another custom type and/or logic.

4. `Range` is ambiguous. It at least misled me to think that it's a big umbrella over all "range" types. All other "range" types have descriptive names that explain what kind of ranges they are. `Range` is probably better called `LeftClosedRightOpenRange`, or something similar.

5. `UnboundedRange` is a mystery. It works in many places where other "range" types work, but there is no indication of how it works. It's a type alias of (`UnboundedRange_`) -> (), which in turn is a complete mystery and has almost no footprint in the entire codebase. The unbounded range operator `...` is declared as a postfix operator but acts kind of like an expression. And it's not callable. The only documentation it has (or at least the only bit I can find) is a `TODO`: "How does the `...` ranges work?"

UPDATE: As it turns out, `UnboundedRange` isn't that mysterious after all. See Standard Library thread: How does UnboundedRange work?

# My solution

I started a draft pull request that introduces a new generic type `Interval`:

This new type should be void of all listed problems above.

It's able to represent all kinds of intervals by holding 4 constants:

``````public let lowerBoundary: Boundary
public let lowerEndpoint: Endpoint
public let upperEndpoint: Endpoint
public let upperBoundary: Boundary
``````

It can be iterated backwards by using a set of inverse interval operators, or by manually setting a flag:

``````public var isInverse: Bool
``````

An unbounded interval is not much different from other intervals. It just has unbounded endpoints and open boundaries. It's debatable whether an unbounded interval can be closed. Also, because an unbounded interval is not of a special type, it contains the same wrapped type information as the non-unbounded intervals.

Lots of basic things are already implemented in the draft pull request, but there is still a lot to work on. There are a lot of bikesheddable things too. For example, the interval operators are a bit unwieldy, compared to range operators. And the current `Interval` design doesn't allow all the "range" types to become a type alias, like how `CountableRange` works.

Overall, I would like to seek the community's opinions and feedbacks, and improve it, before turning it into a proper pitch.

5 Likes

You can iterate over `Range` or `CountableRange` backwards by using `.reversed()`. `Range` is a bidirectional collection so this generates efficient code, as shown in this compiler explorer demo.

You cannot iterate the other range types because they are infinite in size.

9 Likes

This is true, but comes with drawbacks.

First, it's harder (albeit slightly) to iterate descendingly than ascendingly.

In a C-style for-loop, both ascending and descending iterations have the same complexity:

``````for(var i = 0; i < 10; i++) {
// ...
}

for(var i = 9; i >= 0; i--) {
// ...
}
``````

Whereas with a for-in-loop using a range expression, iterating descendingly is slightly more complex than ascendingly:

``````for i in 1..<10 {
// ...
}

for i in (1..<10).reversed() {
// ...
}
``````

Second, `range.reversed()` returns a `ReversedCollection<Range>`, not `Range`, which means it does not work in every place where `Range` works.

There are some technical barriers to this change, like source and ABI compatibility, but I’m going to put those aside, because there’s a more important issue: we wouldn’t want this design. We’ve considered similar ones in the past, but we rejected them in favor of the current solution, where decisions about boundary and direction are baked into the type. Let me explain why.

1. The current design enforces the correctness of iteration.

For instance, suppose you try to do this:

``````let myRange = ..<3
for i in myRange { ... }
``````

This iteration is impossible to perform because it has no starting value—the last value will be 3, but what will the first value be? Fortunately, the compiler knows that it can’t iterate over a `PartialRangeTo`, so it rejects this code.

But if this was done with your `Interval` type:

``````let myInterval = Interval(fromUnboundedTo: 3, .exclusive)
for i in myInterval { ... }
``````

Then there would be no indication in the type system that this iteration is impossible to perform, so the mistake could only be caught by a runtime crash instead of a compile-time diagnostic.

2. The current design makes loops over ranges much faster.

For instance, currently when the compiler sees this:

``````for i in myRange { ... }
``````

It knows that it can turn that into something like:

``````var i = myRange.lowerBound
while i < myRange.upperBound {
...
i &+= 1
}
``````

And in general, it actually does—a Swift `for i in someRange` loop is nearly always as fast as a C `for` loop.

With your interval type, Swift would have to generate more flexible—and therefore much slower—code to account for all the possible variations in an `Interval<Int>`:

``````var i: Int
let endpoint: Interval<Int>.Endpoint
let startBoundary, endBoundary: Interval<Int>.Boundary
let step: Int
if myInterval.isInverse {
step = -1
startBoundary = myInterval.upperBoundary
endBoundary = myInterval.lowerBoundary
endpoint = myInterval.lowerEndpoint
switch myInterval.upperEndpoint {
case .bounded(let bound):
i = bound
case .unbounded:
fatalError(“Can’t iterate from an unbounded endpoint”)
}
}
else {
step = 1
startBoundary = myInterval.lowerBoundary
endBoundary = myInterval.upperBoundary
endpoint = myInterval.upperEndpoint
switch myInterval.lowerEndpoint {
case .bounded(let bound):
i = bound
case .unbounded:
fatalError(“Can’t iterate from an unbounded endpoint”)
}
}

if startBoundary == .open {
i += step
}

while ({ () -> Bool in
guard case .bounded(let end) = endpoint else { return true }
switch (myInterval.isInverse, endBoundary) {
// I actually may have flipped some of the tests here, but you know what I mean.
case (true, .open):
return i > end
case (true, .closed):
return i >= end
case (false, .open):
return i < end
case (false, .closed):
return i <= end
}
}()) {
...
i += step
}
``````

(Note the `fatalError` calls—those are point 1.)

Even if you provide methods to perform these operations, they will still end up executing the same amount of code—just better encapsulated. The only way to actually reduce the amount of code executed when iterating over an `Interval` would be by constant folding (the optimizer determining, for instance, that `isInterval` will always be `false` for a particular loop and removing code that only runs when it’s `true`), but unlike information embedded in the individual types like `Range`, information from constant folding can’t cross function call boundaries unless the loop is inlined into the caller. So loops over `Interval`s would frequently be much slower than loops over `Range`s.

It’s tempting to make a maximally configurable replacement for the `Range` types—I’ve had the same idea in the past. But these types have the design they have for specific, compelling reasons.

27 Likes

Thanks for the detailed feedback.

These are definitely 2 points that I neglected when designing `Interval`, and it seems like it's hard to improve or get around them.

^ I do have a question for (or perhaps an argument against) the first point, though. I think the compiler proves the correctness of iteration by checking if the type conforms to `Sequence`? Left-unbounded ranges fail at compile time if you try to iterate over it, because they don't conform to `Sequence`. But I think it becomes problematic when it comes to `PartialRangeFrom`. `PartialRangeFrom` conforms to `Sequence`, and so does `ReversedCollection<PartialRangeFrom>`. As a result, you can have something like this:

``````for i in (1...).reversed() {
print(i)
}
``````

It compiles, and runs non-stop. Instead of throwing an error, it just keeps running and never enters the for-loop body. Is this behaviour fixable with the current design? And if it's not, would the current design still be better than `Interval` in terms of point 1?

Here is the assembly code:

It's quite long, and I don't know much of x86 instruction set to understand it.

`PartialRangeFrom` is not a `BidirectionalCollection`, so calling `reversed()` on it does not produce a `ReversedCollection`.

Instead it produces an `Array`.

The program hangs while attempting to materialize that array. This is exactly the same behavior you would see when converting any other very long (or endless) sequence into an array.

• • •

To “fix” the behavior, we could introduce a deprecated overload:

``````extension PartialRangeFrom where Bound: Strideable, Bound.Stride: SignedInteger {
@available(*, deprecated, message: "Cannot reverse a PartialRangeFrom")
func reversed() -> [Bound] { fatalError() }
}
``````

That way attempting to call `reversed` on a `PartialRangeFrom` would give a compile-time warning and trap at runtime, instead of hanging.

The hang would still happen if `reversed` was called on such a value in a generic context, but such is life.

• • •

Interestingly, this code traps at runtime for me (Xcode 11.3.1, Swift 5.1.3):

``````let a = (1 as Int8)...
let b = a.reversed()  // Execution was interrupted, reason: EXC_BAD_INSTRUCTION
print(b.count)
``````

And indeed so does this:

``````let a = Array((1 as Int8)...)
``````

Or even this:

``````let a = ((1 as Int8)...).max()
``````

The implication is clearly that partial ranges in fact model truly unbounded, one-sided intervals, not restricted to the set of representable values for the element type.

7 Likes
``````0...10
0<..10
0..<10
0<.<10

...10
..<10

0...
0<..

...
``````

I'm not sure which 8 or 16 you meant, but these are the nine we should have. There have been discussions here before about the lack of ranges with open lower bounds, but no real reason was provided for their absence as far as I remember.

Your count of 9 is correct. My math must be off.

I got the 16 from (2²)², because there are 2 endpoints, and each can be either bounded or unbounded, and open or closed. But that includes things like [0, ∞], which need to be removed from the count (unless we allow closed unbounded intervals). And that must be where my calculation went wrong. I probably counted [-∞, ∞] twice.

I don't know if it's much of a reason, but as far as the operators grammar goes, `<..` and `<.<` are invalid.

You can also define custom operators that begin with a dot ( `.` ). These operators can contain additional dots. For example, `.+.` is treated as a single operator. If an operator doesn’t begin with a dot, it can’t contain a dot elsewhere. For example, `+.+` is treated as the `+` operator followed by the `.+` operator.

Also because of this rule, I had to make up some quite awkward opertors that uses "~" in place of ".":

Ah I didn't realise operators had to start with a . if they wanted to contain one. Let's hope that's not the reason we don't have `<..` though!

By the way, your operator `>=~~~` should probably be written something like `~~~~~` since it would correspond to `...`, no? In normal swift `.` signifies a range that is closed on that side.

I've considered something similar. In the TODO comment right above operator declarations, I left some notes on using bullet points instead of dots to circumvent the grammar, and shorten the operator to 3-character long. The problem is operators like `...`, `~~~`, and `•••` lack the directionality that `<=~<=` and `>=~>=` have.

1 and 2 seem like arguments not to unify the partial (unbounded) and bounded families. But I'd like to take a weaker position; why not unify `ClosedRange` and `Range`? Or, an even weaker position: unifying ClosedRange and Range expressions.

In the spirit of "my beefs with ranges", let me lay out the drawbacks of ClosedRange/Range:

1. `Range` is ambiguous. Following other Swift type families like `Int32`/`Int64`/`Int`, I might expect types like `OpenRange`, `ClosedRange` and `Range` (which is preferred). I understand we don't need a machine-dependent semantics like we do for `Int`, and we want to indicate which type is preferred by giving it a short name, but the inability to clearly indicate off-by-1 bugs in a code review is a real bother.
2. Conversions are hard. Consider this OS function:
``````extension MTLComputeCommandEncoder {
public func setBuffers(_ buffers: [MTLBuffer?], offsets: [Int], range: Range<Int>)
}
``````

I would argue that it is equally sensible for a client to use either syntax:

``````instance.setBuffers(buffers, offsets: offsets, range: 0..<1)
instance.setBuffers(buffers, offsets: offsets, range: 0...0)
``````

However, only one syntax is allowed here. Someone could write their function `<B: RangeExpression>` but in practice they didn't, everyone just uses `Range`. If range expressions were convertible or we had an `ExpressibleByRangeLiteral` type conformance it would go a long way.

Although this is more an argument in favor of a broader unification, I understand in theory why we want a expressive and therefore safer type system, but in practice I don't think we live in that world. We have a preferred type, everyone uses `Range` and lives dangerously. Presenting again the OS function:

``````extension MTLComputeCommandEncoder {
public func setBuffers(_ buffers: [MTLBuffer?], offsets: [Int], range: Range<Int>)
}
``````

I can call this with

``````encoder.setBuffers([],offset:[],range:0..<25)
``````

and not only will it compile, it actually UBs with an out-of-bounds read inside Metal. (For those unfamiliar with Metal, the function is supposed to assign the elements of the first parameter to an internal array with the indices in the last parameter, but here the sizes mismatch.)

Arguably this function should take a `PartialRangeFrom` and infer the final index from the array count rather than having 2 parameters that may disagree. In doing so it would be more obvious when implementing this function that you can't iterate over the range parameter without first converting to a complete/iterable type by supplying an end index from the first parameter. I believe this is the bug @beccadax envisions as being prevented by an expressive typesystem.

I believe there are ObjC bridging reasons to use `Range` here. It's unfair of me to present some ObjC bug as not being fixed by Swift, but it illustrates how these types are actually used in the wild. When there's better tooling for preferred types, it tilts the playing field in favor of their use, away from what might be the "natural" type we might prefer.

Separately from bridging, there is also a potential callside argument for preferring "the unnatural type" here. That is, if Metal takes an unbounded range and infers the end index from the array's count, it's easier to implement the function correctly, but calling it becomes harder:

``````encoder.setBuffers(someBuffers,offset:someBuffers.map{0},range:0...)
encoder.setBuffers(moreBuffers,offset:moreBuffers.map{0},range:<#?#>) //what goes here again?
``````

vs by requiring redundant information, we have to remember to check it's the same inside the function, but callers can more easily reason about the arguments:

``````encoder.setBuffers(someBuffers,offset:someBuffers.map{0},range:0..<5)
encoder.setBuffers(moreBuffers,offset:moreBuffers.map{0},range:5..<10) //copy from line above
encoder.setBuffers(evenMoreBuffers,offset:moreBuffers.map{0},range:10..<17)
``````

I can see arguments for either style. In any case, I think most uses of `Range` does not truly consider which type really ought to be used. In cases where it is considered, I think the considerations may be at some variance with what choice might produce the best local typesafety.

I do think a lot of this can be fixed without compromising type expressivity or deeply overhauling the stdlib. For example,

• we could support implicit conversions between (some) ranges (or perhaps range expressions) which would "unify" the syntax callside, without compromising the codegen or safety
• Provide a more thorough ObjC bridge for the non-preferred ranges
• `typealias OpenRange = Range`

I think the Metal example shows a plausible motivation for unifying bounded and unbounded ranges (or expressions) as well, but this is perhaps a bit more specialized, and may be better handled by generics or overloads than new language surface area.