I accept this argument and retract my previous argument about alternative encodings.
I would support an ExpressibleByUnicodeScalarLiteral improvement with new _ExpressibleByBuiltinUnicodeScalarLiteral conformances:
extension UTF8 {
    // Get rid of the old typealias to UInt8. Leave UInt8 alone!
    struct CodeUnit: _ExpressibleByBuiltinUnicodeScalarLiteral, ExpressibleByUnicodeScalarLiteral {
        // 8-bit only, compiler-enforced. Custom types can also use
        // UTF8.CodeUnit as their UnicodeScalarLiteralType.
        typealias UnicodeScalarLiteralType = CodeUnit
        var value: UInt8
    }
}
This would use the well-known double quotes. It would add compiler-enforced 8- and 16-bit code unit types.
It would not pollute Integer APIs at all.
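To make the idea concrete, here is a rough user-space approximation that compiles with today's standard library. The name ASCIICodeUnit is hypothetical and not part of the proposal. Note the big difference: without a new builtin conformance the range check can only happen at run time (a precondition), not at compile time as suggested above.

```swift
// Hypothetical stand-in for the proposed UTF8.CodeUnit (name is an assumption).
struct ASCIICodeUnit: ExpressibleByUnicodeScalarLiteral, Equatable {
    var value: UInt8

    init(value: UInt8) {
        self.value = value
    }

    // UnicodeScalarLiteralType is inferred as Unicode.Scalar here.
    // The check below runs at run time; the proposal asks the compiler to do it.
    init(unicodeScalarLiteral scalar: Unicode.Scalar) {
        precondition(scalar.isASCII, "literal does not fit in a single UTF-8 code unit")
        self.value = UInt8(scalar.value)
    }
}

// Double-quoted literals, no integer APIs involved.
let newline: ASCIICodeUnit = "\n"
let units: [ASCIICodeUnit] = ["a", "b", "c"]
```

With this sketch, UInt8 itself gains no literal conformance, so something like `let x: UInt8 = "a"` still does not compile.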
The only problem, of course, is that changing the Element of String.UTF8View etc. would be a breaking change (it currently uses UTF8.CodeUnit). Maybe there needs to be a String.betterUTF8View (or whatever other name), and the old utf8View etc. would just be deprecated.
Everyone who wants to mess around with code units can then use types like [UTF8.CodeUnit] instead of [UInt8].
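A minimal sketch of how such an additional view could sit next to the existing one, assuming a wrapper type (UTF8Unit is a hypothetical name) and using a plain array instead of a real lazy view type for brevity:

```swift
// Hypothetical wrapper standing in for the proposed UTF8.CodeUnit struct.
struct UTF8Unit: Equatable {
    var value: UInt8
}

extension String {
    // Hypothetical property name taken from the post above.
    // The existing `utf8` view is left untouched, so nothing breaks.
    var betterUTF8View: [UTF8Unit] {
        utf8.map { UTF8Unit(value: $0) }
    }
}

// "h" is one code unit, U+00E9 encodes as two (0xC3 0xA9).
let codeUnits: [UTF8Unit] = "h\u{E9}".betterUTF8View
let rawBytes: [UInt8] = codeUnits.map { $0.value }
```

A real implementation would presumably be a proper Collection view rather than an eagerly built array; this only shows the shape of the API.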