SE-0243: Codepoint and Character Literals

Surely we can’t avoid working on an area of the language just because many community members lack the background expertise to understand the problem? We don’t declare all of FloatingPoint a no-go zone just because Steve is the only person here who understands floats.

The problem is that we don’t have a way to express integer values with textual semantics, using an appropriate textual literal syntax, that doesn’t cause additional issues in the rest of the language (e.g., x.isMultiple(of: 'a')). More broadly, we don’t have a “safe” and “readable” way to process and generate ASCII bytestrings. I don’t think anyone has lost track of that.

IHDR is just a concrete example of something that is very difficult to safely and efficiently express with existing language tools. If you want a sampling of “pain points”, I would say that any proposed solution must address the following in a safe manner:

// storing a bytestring value 
static 
var liga:(Int8, Int8, Int8, Int8) 
{
    return (108, 105, 103, 97) // ('l', 'i', 'g', 'a')
}
// appending an ASCII scalar to mixed UTF-8/ASCII text
var xml:[UInt8] = ...
xml.append(47) // '/'
xml.append(62) // '>'
// ASCII range operations 
let current:UnsafePointer<Int8> = ...
if 97 ... 122 ~= current.pointee // 'a' ... 'z'
{
    ...
}
// ASCII arithmetic operations 
let year:ArraySlice<Int8> = ...
var value:Int = 0
for digit:Int8 in year 
{
    guard 48 ... 57 ~= digit // '0' ... '9'
    else 
    {
        ...
    }

    value = value * 10 + .init(digit - 48) // digit - '0'
}
// finding an ASCII scalar in mixed UTF-8/ASCII text
let xml:[Int8] = ... 
if let i:Int = xml.firstIndex(of: 60) // '<'
{
    ...
}
// matching ASCII signatures 
let c:UnsafePointer<UInt8> = ...
if (c[0], c[1], c[2], c[3]) == (80, 76, 84, 69) // ('P', 'L', 'T', 'E')
{
    ...
}
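
For comparison, here is a rough sketch of how a few of the cases above look with what the standard library offers today, namely `UInt8(ascii:)`. The helper names `asciiYear` and `hasSignature` are made up for illustration and are not part of any proposal. As far as I know, the initializer exists only on `UInt8`, so the `Int8` cases above still need a conversion, and a non-ASCII scalar is rejected by a runtime precondition rather than at compile time.

```swift
// Sketch only: `asciiYear` and `hasSignature` are hypothetical helpers.

/// Parses a run of ASCII digits into an Int.
/// Returns nil on any non-digit byte. Overflow on very long input is not handled.
func asciiYear(_ bytes: [UInt8]) -> Int? {
    let zero = UInt8(ascii: "0"), nine = UInt8(ascii: "9")
    var value = 0
    for byte in bytes {
        guard zero ... nine ~= byte else { return nil }
        value = value * 10 + Int(byte - zero) // digit - '0'
    }
    return value
}

/// Checks whether `bytes` begins with the UTF-8 bytes of `signature`.
func hasSignature(_ bytes: [UInt8], _ signature: String) -> Bool {
    return bytes.starts(with: signature.utf8)
}

// Appending ASCII scalars to a UTF-8 buffer.
var xml: [UInt8] = Array("<a".utf8)
xml.append(UInt8(ascii: "/"))
xml.append(UInt8(ascii: ">"))
```

This is readable enough, but every scalar costs an initializer call at the use site, and nothing here covers the tuple-of-`Int8` or pattern-matching cases, which is the gap the examples above are meant to show.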