Give Int an initializer that takes a single Character

As noted by @cukr above, we can already do this:

let reallyLongNumber = "12940214297391293"
let digits = reallyLongNumber.map { $0.wholeNumberValue! }

I know; I was just pointing out that it might be more intuitive to provide an initializer that does this, matching the one that takes a string. An initializer could also provide a few things that the current solution does not.

Firstly, the behaviour of wholeNumberValue does not match that of Int's string initializer:

let char: Character = "①"
print(char.wholeNumberValue)
// Prints "Optional(1)"
print(Int("\(char)"))
// Prints "nil"

This creates a bit of a discrepancy, because parsing an integer from a string behaves differently from deriving an integer from each individual character of that string. Specifically, wholeNumberValue is less restrictive.

Also, using wholeNumberValue does not allow you to get the value of the character in any base other than ten:

let char: Character = "f"
print(char.wholeNumberValue)
// Prints "nil"
print(Int("\(char)", radix: 16))
// Prints "Optional(15)"

Because of this, I think Character deserves an initializer on FixedWidthInteger with the same behaviour as the one that takes a string and a radix (defaulting to 10). Also, initializers seem to be the preferred way to do type conversion throughout the standard library, rather than properties. For example, we have an initializer on Double that takes an Int rather than an asDouble property on Int, and the same is true for almost all type conversions.
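To make the pitch concrete, here is a minimal sketch of what such an initializer might look like. It is only an illustration, not a proposed implementation: the signature is an assumption modelled on the existing string-and-radix initializer, and it simply forwards to that initializer via String(character).

```swift
extension FixedWidthInteger {
    /// Hypothetical initializer (not in the standard library).
    /// Forwards to the existing `init?(_:radix:)` that takes a string,
    /// so it accepts exactly what that initializer accepts.
    init?(_ character: Character, radix: Int = 10) {
        self.init(String(character), radix: radix)
    }
}

let hexValue = Int("f" as Character, radix: 16)   // Optional(15)
let circledOne = Int("①" as Character)            // nil, same as Int("①")

// Characters that are not plain digits are skipped by compactMap.
let digits = "12940214297391293".compactMap { (c: Character) in Int(c) }
```

Because it delegates to the string initializer, this version is consistent with Int(String) by construction, which is the property wholeNumberValue lacks.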


Your point still stands, an initializer would be better, but there's also another property specifically for base 16: hexDigitValue.
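Assuming the base-16 property meant here is Character.hexDigitValue, a quick sketch of how it behaves next to wholeNumberValue:

```swift
let hexChar: Character = "f"
let asWholeNumber = hexChar.wholeNumberValue   // nil: "f" is not a decimal digit
let asHexDigit = hexChar.hexDigitValue         // Optional(15)

let notHex: Character = "g"
let outOfRange = notHex.hexDigitValue          // nil: "g" is not a hex digit
```

This covers base 16, but there is still no property for an arbitrary radix.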

FWIW, since String has an inline representation for small strings, there shouldn't be an allocation (memory) overhead for any Character -> String conversion that's just a single Unicode scalar (or even a few Unicode scalars).


Actually, there should be zero allocations in any scenario for Character -> String since that will at worst be a retain. An allocation would only occur if a CoW is triggered. A Character is basically just a String of length 1.

And even then, an allocation would only happen if the contents of the Character exceeded 15 UTF-8 code units in size (otherwise it will be in small form as Jordan pointed out).


Good to hear!

Perf concerns aside, I think there might still be value in providing the initializer. What do you think?

If the principle being invoked here is that APIs which take an instance of a StringProtocol-conforming type should also take a Character, I agree that it sounds perfectly reasonable, but I think we should then do it systematically and not one API at a time.

Sounds reasonable at first glance; see below.

It might vary on an API-by-API basis. This does come up for APIs that take Collection and corresponding APIs that take Element, such as append(_:) vs append(contentsOf:).
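For reference, this is the existing pattern on String, where the Element-taking and Collection-taking variants sit side by side (a small illustration using only current standard library APIs):

```swift
var text = "12"
let digit: Character = "3"

text.append(digit)               // takes a single Element (Character)
text.append(contentsOf: "45")    // takes a collection of Characters
```

After both calls, text is "12345".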

The same arguments could be made for Unicode.Scalar (or even UTF{N}.CodeUnit were that a strong type instead of a typealias). We do need to draw the line somewhere, and I have no opinion currently where that line happens to fall.

@lancep would you be interested in a brief survey of StringProtocol APIs and how reasonable it is to add overloads taking various view elements?

Sounds like a good plan! I'll try to tackle a survey this weekend.