What's the purpose of making `ClosedRange.Index` some opaque type, not `Int` like `Array.Index`?

I was surprised that this does not work:

let range = 1...100
let slice = range[...5]  // error: No exact matches in call to subscript

// so ClosedRange<Int>.Index is not `Int`
// had to use index calculation?
let anotherSlice = range[...range.index(range.startIndex, offsetBy: 5)]

Okay, so maybe ClosedRange<T>.Index is some opaque type, like String.Index, for iterating through ClosedRange<Character>, but that doesn't work either:

let characterRange = Character("A")..."Z"
for letter in characterRange {  // Protocol 'Sequence' requires that 'Character' conform to 'Strideable'
    print(letter)
}

Nothing I tried works: .forEach(), .makeIterator(). Is there no way to iterate a ClosedRange<Character> (other than turning it into an Array)?

So what's the reason for making ClosedRange<T>.Index opaque, not Int?

A range is not necessarily composed of discrete units; that depends on the underlying type. For example:

let doubleRange = (1.0)...(2.5)

for x in doubleRange {
  // What would this even mean? What are the distinct 
  // values that would live in this range?
}

You can only enumerate the members of the range if the underlying type is Strideable, which is the concept of being able to say "what comes before/after this element". That's why you can't iterate through a range of Characters: Characters are not Strideable. It's not obvious what the individual steps should be. This might seem surprising at first glance; clearly we know that "B" comes after "A". But Character does not just encompass the English alphabet; it covers any unique grapheme cluster, which may be composed of multiple Unicode code points. For example, what comes after an emoji such as "🏴‍☠️"?
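To make the grapheme-cluster point concrete, here is a small sketch that counts the Unicode scalars inside a few Characters. The variable names are only for illustration.

```swift
let plain: Character = "A"
let accented: Character = "e\u{301}"           // "e" followed by a combining acute accent
let jpFlag: Character = "\u{1F1EF}\u{1F1F5}"   // two regional indicators forming one flag

// Each value is a single Character, but the number of scalars behind it varies.
print(plain.unicodeScalars.count)     // 1
print(accented.unicodeScalars.count)  // 2
print(jpFlag.unicodeScalars.count)    // 2
```

Since one Character can stand for an arbitrary sequence of scalars, there is no single obvious "next" Character to step to.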

If you're specifically wanting to enumerate the English alphabet, you may just want to define it like this:

let chars = Array("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

In the case of something like Int, which does have a well-defined concept of "what's next", if you want to get the first five items I'd recommend using something like range.prefix(5). That will work regardless of what the underlying index type is.
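A minimal sketch of the prefix approach, assuming the same 1...100 range as in the question:

```swift
let range = 1...100

// prefix(_:) counts elements, so it does not depend on the index type.
let firstFive = Array(range.prefix(5))
print(firstFive)  // [1, 2, 3, 4, 5]

// For integer bounds, building a new range from the bounds is another option.
let sameValues = range.lowerBound...(range.lowerBound + 4)
print(Array(sameValues) == firstFive)  // true
```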


:pray:

Is there any simple way to:

Character("A")..."Z"

turn this into:

Array("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

It seems the problem is that there is no way to iterate through a range of a non-Strideable type. From a Unicode character, can you get the next character? OK, maybe this is not possible with a Unicode Character, but limited to ASCII characters it is possible.

Characters are not Strideable, but UnicodeScalars are:

1> UnicodeScalar("A")...UnicodeScalar("Z")
$R1: ClosedRange<UnicodeScalar> = {
  lowerBound = U'A'
  upperBound = U'Z'
}
    for i in UnicodeScalar("A")...UnicodeScalar("Z") {  // Protocol 'Sequence' requires that 'UnicodeScalar' (aka 'Unicode.Scalar') conform to 'Strideable'

    }

??

Sorry, I incorrectly inferred that the fact that ... worked implied Strideable conformance; it only implies Comparable conformance.
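A short sketch of that distinction, as I understand it: Comparable bounds are enough to build the range and ask about containment, while iteration needs more.

```swift
// Comparable is all that `...` needs, so this compiles.
let letters: ClosedRange<Unicode.Scalar> = "A"..."Z"

// Containment only compares against the bounds.
print(letters.contains("M"))  // true
print(letters.contains("m"))  // false

// Iteration needs Strideable bounds, so this line would not compile as-is:
// for scalar in letters { print(scalar) }
```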

Conceptually, it doesn’t make sense to stride between Characters, but each UnicodeScalar wraps an integer code point (its value is a UInt32), and thus the type could support Strideable conformance. You can add conformance in an extension:

extension UnicodeScalar: Strideable {
    // Traps if the result is not a valid scalar (for example, a surrogate code point).
    public func advanced(by n: Int64) -> UnicodeScalar {
        return UnicodeScalar(UInt32(Int64(self.value) + n))!
    }
    // Convert before subtracting so a negative distance does not trap on UInt32 underflow.
    public func distance(to other: UnicodeScalar) -> Int64 {
        return Int64(other.value) - Int64(self.value)
    }
}
for i in UnicodeScalar("A")...UnicodeScalar("Z") { print("\(i)") }
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z
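If you would rather not add a conformance to a standard library type, one alternative is to iterate over the raw UInt32 values instead. This is only a sketch; `scalars(from:through:)` is a hypothetical helper name, not an existing API.

```swift
// Hypothetical helper, not part of the standard library.
func scalars(from first: Unicode.Scalar, through last: Unicode.Scalar) -> [Unicode.Scalar] {
    // A ClosedRange traps if the lower bound exceeds the upper bound.
    guard first <= last else { return [] }
    // UInt32 is Strideable, so a range of raw values can be iterated.
    // The failable initializer returns nil for surrogate code points; compactMap drops those.
    return (first.value...last.value).compactMap { Unicode.Scalar($0) }
}

let alphabet = String(scalars(from: "A", through: "Z").map { Character($0) })
print(alphabet)  // ABCDEFGHIJKLMNOPQRSTUVWXYZ
```

This avoids the force unwrap, at the cost of silently skipping values that are not valid scalars.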

Well, this would be tough for ℝ, but don't forget that Double is much more limited, so actually, you could iterate (Apple Developer Documentation).
However, that's probably not what ranges are designed for.
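I am assuming the link above points at something like `nextUp`, which returns the next representable value. A rough sketch of stepping through Doubles that way:

```swift
// Walk the first few representable Doubles starting at 1.0.
var x = 1.0
for _ in 0..<3 {
    print(x)
    x = x.nextUp
}

// For positive finite Doubles, consecutive values have consecutive bit patterns,
// so this counts the nextUp steps needed to get from 1.0 to 2.5.
let stepsBetween = (2.5).bitPattern - (1.0).bitPattern
print(stepsBetween)
```

The step count is enormous, which is one more hint that ranges of Double are not meant to be enumerated.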


One reason (not sure if there are others) is that ClosedRange should be able to contain Int.max:

let range = 0...Int.max
range.endIndex // no Int left for endIndex

Since the endIndex must represent the past-the-end position, the natural endIndex for this range would be Int.max + 1, which would overflow.

It makes sense then that ClosedRange.Index is an enum that includes an extra case to represent the end index:

extension ClosedRange where Bound: Strideable, Bound.Stride: SignedInteger {
  public enum Index {
    case pastEnd
    case inRange(Bound)
  }
}
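A small sketch of how that looks from the caller's side, assuming the enum cases shown above are visible to client code:

```swift
let range = 0...Int.max

// Adding 1 to Int.max overflows, so Int itself cannot hold a past-the-end position.
let (_, overflowed) = Int.max.addingReportingOverflow(1)
print(overflowed)  // true

// The end index is the dedicated extra case rather than a wrapped bound.
switch range.endIndex {
case .pastEnd:
    print("endIndex is the past-the-end case")
case .inRange(let value):
    print("endIndex wraps \(value)")
}

// The start index wraps the lower bound.
if case .inRange(let first) = range.startIndex {
    print(first)  // 0
}
```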

Making UnicodeScalar Strideable is misleading. Striding from 'A' to 'z' gives the sequence ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz, which is not obvious.

For better or worse, that matches the behavior of grep and other tools that accept [A-z] as a regular expression range.
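To see that sequence without adding any conformance, here is a sketch that walks the raw code point values from "A" to "z":

```swift
let upperA: Unicode.Scalar = "A"
let lowerZ: Unicode.Scalar = "z"

// Iterate the UInt32 values and turn each valid one back into a Character.
let span = (upperA.value...lowerZ.value)
    .compactMap { Unicode.Scalar($0) }
    .map { Character($0) }

print(String(span))  // uppercase letters, six punctuation characters, lowercase letters
print(span.count)    // 58
```

The six extra characters are the ASCII code points that sit between "Z" and "a".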


Could it be that UnicodeScalar is not Strideable out of the box because its values are sparse, with "invalid" values in between, so it cannot be 100% correct as Strideable?

This is not an oversight, I think.
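The gap is easy to observe with the failable UInt32 initializer. A quick sketch:

```swift
// Values just outside the surrogate block are valid scalars.
print(Unicode.Scalar(UInt32(0xD7FF)) != nil)    // true
print(Unicode.Scalar(UInt32(0xE000)) != nil)    // true

// Surrogate code points are rejected.
print(Unicode.Scalar(UInt32(0xD800)) == nil)    // true
print(Unicode.Scalar(UInt32(0xDFFF)) == nil)    // true

// So is anything beyond the last code point, U+10FFFF.
print(Unicode.Scalar(UInt32(0x110000)) == nil)  // true
```

An `advanced(by:)` that force-unwraps would trap if a stride landed inside that block.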

How to convert ClosedRange<Character>:

Character("A")..."Z"

into a String of these Characters?

Character is not Strideable, but UnicodeScalar can be made Strideable, as shown by @ksluder. This is what I came up with:

extension Character {
    var unicodeScalar: UnicodeScalar {
        unicodeScalars.first!
    }
}

extension ClosedRange where Bound == Character {
    var asUnicodeScalarRange: ClosedRange<UnicodeScalar> {
        lowerBound.unicodeScalar ... upperBound.unicodeScalar
    }
}

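extension UnicodeScalar: Strideable {
    // Traps if the result is not a valid scalar (for example, a surrogate code point).
    public func advanced(by n: Int64) -> UnicodeScalar {
        return UnicodeScalar(UInt32(Int64(self.value) + n))!
    }
    // Convert before subtracting so a negative distance does not trap on UInt32 underflow.
    public func distance(to other: UnicodeScalar) -> Int64 {
        return Int64(other.value) - Int64(self.value)
    }
}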

extension String {
    init(_ range: ClosedRange<Character>) {
        self.init(range.asUnicodeScalarRange)
    }

    init<S>(_ scalars: S) where S: Sequence, S.Element == UnicodeScalar {
        self.init(scalars.map { Character($0) })
    }
}


print("\n\nTurn Character(\"A\")...\"Z\" into a String")
print(String(Character("A")..."Z"))

FYI, Character("A")..."Z" contains infinitely many Characters.

(Character("A")..."Z").contains(Character("A")) // -> true
(Character("A")..."Z").contains(Character("A\u{20DD}")) // -> true
(Character("A")..."Z").contains(Character("A\u{20DD}\u{20DD}")) // -> true
(Character("A")..."Z").contains(Character("A\u{20DD}\u{20DD}\u{20DD}")) // -> true
// You can append as many combining scalars as you want.

:face_with_raised_eyebrow: Yes, since Swift.Character represents a Unicode grapheme cluster, it can hold more than one UnicodeScalar value. But I only use the .first element of a Character's unicodeScalars, so no matter how many scalars a Character contains, only its first scalar value is used. This is how it works:

print(String(Character("A\u{20DD}\u{20DD}\u{20DD}") ... "Z\u{20DD}\u{20DD}\u{20DD}"))
// prints: ABCDEFGHIJKLMNOPQRSTUVWXYZ

Anyway, I don't mean for this to be generally useful. I am learning why ClosedRange<Character> is not iterable and how to work around that by iterating indirectly via UnicodeScalar.

Thanks for pointing things out!
