Unify and expand Range and its likes

beccadax · July 14, 2020, 7:52pm

wowbagger:

It's able to represent all kinds of intervals by holding 4 constants:
public let lowerBoundary: Boundary
public let lowerEndpoint: Endpoint
public let upperEndpoint: Endpoint
public let upperBoundary: Boundary
It can be iterated backwards by using a set of inverse interval operators, or by manually setting a flag:
public var isInverse: Bool

There are some technical barriers to this change, like source and ABI compatibility, but I’m going to put those aside, because there’s a more important issue: we wouldn’t want this design. We’ve considered similar ones in the past, but we rejected them in favor of the current solution, where decisions about boundary and direction are baked into the type. Let me explain why.

1. The current design enforces the correctness of iteration.

For instance, suppose you try to do this:

let myRange = ..<3
for i in myRange { ... }

This iteration is impossible to perform because it has no starting value—the last value will be 3, but what will the first value be? Fortunately, the compiler knows that it can’t iterate over a PartialRangeTo, so it rejects this code.

But if this was done with your Interval type:

let myInterval = Interval(fromUnboundedTo: 3, .exclusive)
for i in myInterval { ... }

Then there would be no indication in the type system that this iteration is impossible to perform, so the mistake could only be caught by a runtime crash instead of a compile-time diagnostic.

2. The current design makes loops over ranges much faster.

For instance, currently when the compiler sees this:

for i in myRange { ... }

It knows that it can turn that into something like:

var i = myRange.lowerBound
while i < myRange.upperBound {
  ...
  i &+= 1
}

And in general, it actually does—a Swift for i in someRange loop is nearly always as fast as a C for loop.

With your interval type, Swift would have to generate more flexible—and therefore much slower—code to account for all the possible variations in an Interval<Int>:

var i: Int
let endpoint: Interval<Int>.Endpoint
let startBoundary, endBoundary: Interval<Int>.Boundary
let step: Int
if myInterval.isInverse {
  step = -1
  startBoundary = myInterval.upperBoundary
  endBoundary = myInterval.lowerBoundary
  endpoint = myInterval.lowerEndpoint
  switch myInterval.upperEndpoint {
  case .bounded(let bound):
    i = bound
  case .unbounded:
    fatalError(“Can’t iterate from an unbounded endpoint”)
  }
}
else {
  step = 1
  startBoundary = myInterval.lowerBoundary
  endBoundary = myInterval.upperBoundary
  endpoint = myInterval.upperEndpoint
  switch myInterval.lowerEndpoint {
  case .bounded(let bound):
    i = bound
  case .unbounded:
    fatalError(“Can’t iterate from an unbounded endpoint”)
  }
}

if startBoundary == .open {
  i += step
}

while ({ () -> Bool in
  guard case .bounded(let end) = endpoint else { return true }
  switch (myInterval.isInverse, endBoundary) {
  // I actually may have flipped some of the tests here, but you know what I mean.
  case (true, .open):
    return i > end
  case (true, .closed):
    return i >= end
  case (false, .open):
    return i < end
  case (false, .closed):
    return i <= end
  }
}()) {
  ...
  i += step
}

(Note the fatalError calls—those are point 1.)

Even if you provide methods to perform these operations, they will still end up executing the same amount of code—just better encapsulated. The only way to actually reduce the amount of code executed when iterating over an Interval would be by constant folding (the optimizer determining, for instance, that isInterval will always be false for a particular loop and removing code that only runs when it’s true), but unlike information embedded in the individual types like Range, information from constant folding can’t cross function call boundaries unless the loop is inlined into the caller. So loops over Intervals would frequently be much slower than loops over Ranges.

It’s tempting to make a maximally configurable replacement for the Range types—I’ve had the same idea in the past. But these types have the design they have for specific, compelling reasons.