I find string handling in Swift 4 a bit too weird. I don't want this to be a generic complaint so I will give two specific examples.
Note that none of these have anything to do with Index or Substring. It's just the way Unicode code points are handled.
Concrete example #1
I want to create an attributed string with an icon attachment. To insert an attachment, the string needs a specific character whose Unicode value is given by NSAttachmentCharacter.
In general you don't have to use that because the API provides the following convenience constructor:
NSAttributedString(attachment: attachment)
But I want a bit more control. For instance, I want to add other attributes alongside the attachment:
let iconAttrs: [NSAttributedStringKey : Any] = [
    .attachment: attachment,
    my_custom_key: my_custom_value
]
So I need to create the string manually, but I can't just do this:
NSAttributedString(string: " ", attributes: iconAttrs) // won't work!
The string has to be the attachment character! So how do I put it in there?
I don't know, but the minimum thing I could manage to make work was this:
String(Character(Unicode.Scalar(NSAttachmentCharacter)!))
I don't know about you but this seems insane?
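For comparison, here is a minimal sketch of the same conversion next to a string-literal escape. It assumes the attachment character is U+FFFC (OBJECT REPLACEMENT CHARACTER); `attachmentCharacterCode` is a hypothetical stand-in for NSAttachmentCharacter so the snippet runs without AppKit or UIKit.

```swift
// Hypothetical stand-in for NSAttachmentCharacter (assumed to be U+FFFC).
let attachmentCharacterCode: UInt32 = 0xFFFC

// The long form from above: number -> scalar -> Character -> String.
// The initializer is failable because not every number is a valid scalar.
let viaScalar = String(Character(Unicode.Scalar(attachmentCharacterCode)!))

// The same one-character string written as a literal escape.
let viaLiteral = "\u{FFFC}"

print(viaScalar == viaLiteral)
```

The literal only helps if you are willing to hard-code the value; going through the named constant still needs the scalar conversion.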
Concrete example #2
Given a Character, how do we get the Unicode value?
This is what I managed to make:
func charCode(char a: Character) -> Int {
    return Int(a.unicodeScalars.first!.value)
}
Why is Character not mapped directly to a Unicode rune?
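One likely part of the answer is that a Character is a grapheme cluster, which can be made of several scalars. The sketch below illustrates this with "é" written two ways; the variable names are only for illustration.

```swift
// "é" as a single precomposed scalar (U+00E9).
let precomposed: Character = "\u{E9}"
// "é" as "e" followed by a combining acute accent (U+0301).
let decomposed: Character = "e\u{301}"

let precomposedCount = precomposed.unicodeScalars.count
let decomposedCount = decomposed.unicodeScalars.count

// Swift compares characters by canonical equivalence, so the two
// spellings are equal even though their scalars differ.
let sameCharacter = precomposed == decomposed

print(precomposedCount, decomposedCount, sameCharacter)
```

This is why a helper like charCode has to pick one scalar (here, the first) and silently drops the rest.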
Why do I care about this? Because I want to convert between "cases" of non-Latin text (specifically, convert Japanese text between Hiragana and Katakana in both directions).
I need to take the code point for a character, check whether it falls within a certain Unicode block, and if so, convert it to the other block by adding the "offset" value.
But somehow there's too much ceremony for what should otherwise be very simple.
For example, once I have the code point that I want to offset, how do I add the offset to it?
This is what I managed to make:
func scalar(_ a: UnicodeScalar, add b: Int) -> UnicodeScalar {
    let newCode = Int(a.value) + b
    return UnicodeScalar(UInt32(newCode))!
}
This function should not even exist. I'm just adding two numbers together.
Why isn't the UnicodeScalar just a number?
Now, part of the boilerplate here is that I'm converting the UInt32 to Int, but that's because the thing I want to add is an offset between two Unicode blocks, and it could be a negative number.
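Part of the ceremony comes from the fact that not every number is a valid scalar: surrogate code points and values above U+10FFFF are rejected, which is why the initializer is failable. As a sketch, a variant of the helper above could return an optional instead of force-unwrapping; the name `offset(_:by:)` is hypothetical.

```swift
// Hypothetical helper: shift a scalar by a signed offset, returning nil
// when the result is not a valid Unicode scalar value.
func offset(_ scalar: Unicode.Scalar, by delta: Int) -> Unicode.Scalar? {
    let (sum, overflow) = Int(scalar.value).addingReportingOverflow(delta)
    // UInt32(exactly:) rejects negative results and values that do not fit.
    guard !overflow, let code = UInt32(exactly: sum) else { return nil }
    // Unicode.Scalar(_: UInt32) rejects surrogates and values above U+10FFFF.
    return Unicode.Scalar(code)
}

let katakanaA: Unicode.Scalar = "ア"  // U+30A2
let hiraganaA: Unicode.Scalar = "あ"  // U+3042
let shifted = offset(katakanaA, by: -0x60)

// U+D7FF + 1 lands on U+D800, a surrogate, so there is no valid result.
let lastBeforeSurrogates = Unicode.Scalar(UInt32(0xD7FF))!
let invalid = offset(lastBeforeSurrogates, by: 1)
```

The caller then decides what an invalid result means, instead of the program crashing on the forced unwrap.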
So to put all the things together, my solution looks like this:
// Signed distance between the two blocks; charCode requires the `char:` label.
let hiraganaMinusKatakana = charCode(char: "あ") - charCode(char: "ア")
func katakanaToHiragana(_ c: UnicodeScalar) -> UnicodeScalar {
    return scalar(c, add: hiraganaMinusKatakana)
}
func hiraganaToKatakana(_ c: UnicodeScalar) -> UnicodeScalar {
    return scalar(c, add: -hiraganaMinusKatakana)
}
func normalize_to_hiragana(_ input: String) -> String {
    var out = "".unicodeScalars
    for c in input.unicodeScalars {
        if KatakanaBlock.contains(Character(c)) {
            out.append(katakanaToHiragana(c))
        } else {
            out.append(c)
        }
    }
    return String(out)
}
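Since KatakanaBlock is not shown above, here is a self-contained sketch of the same idea that can be run as-is. It assumes the convertible Katakana letters are U+30A1 through U+30F6 and that each sits exactly 0x60 above its Hiragana counterpart; `katakanaLetters`, `kanaOffset`, and `normalizeToHiragana` are hypothetical names, not part of any API.

```swift
// Assumption: Katakana letters with a Hiragana counterpart (ァ ... ヶ).
let katakanaLetters: ClosedRange<UInt32> = 0x30A1...0x30F6
// Assumption: distance between a Katakana letter and its Hiragana counterpart.
let kanaOffset: UInt32 = 0x60

func normalizeToHiragana(_ input: String) -> String {
    var out = String.UnicodeScalarView()
    for scalar in input.unicodeScalars {
        // Work on the raw UInt32 value, so no Character round trip is needed.
        if katakanaLetters.contains(scalar.value),
           let shifted = Unicode.Scalar(scalar.value - kanaOffset) {
            out.append(shifted)
        } else {
            // Anything outside the range (Latin, kanji, ー, ...) is copied as-is.
            out.append(scalar)
        }
    }
    return String(out)
}

print(normalizeToHiragana("カタカナ abc ひらがな"))
```

Checking the range against `scalar.value` avoids building a Character for every scalar. On Apple platforms, Foundation's `applyingTransform(_:reverse:)` with `.hiraganaToKatakana` may be a ready-made alternative worth comparing against.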
I'm probably doing some wasteful temporary allocations here somewhere.
The API is too weird, but it's not weird because it's trying to help you write performant code, it's just weird.
So when you do something like this, the most obvious solution is probably not the most performant solution.