Single Quoted Character Literals (Why yes, again)

taylorswift · December 8, 2022, 11:11pm

beccadax:

The way this section is written is incredibly confusing—I actually read your protocol hierarchy backwards at first (because inheritance is usually drawn the other way, with more-derived classes below less-derived classes!) and spent half an hour writing increasingly confused critiques of the backwards design. Even now that I've figured that out, though, I still don't quite see how this design is supposed to work, particularly in terms of initializing types that only conform to the marker protocols.

To get everyone on the same page, I strongly recommend you write the actual declarations for the new protocols along with first cuts at their doc comments, and describe any modifications to the semantics of existing protocols. (For instance, which protocols imply support for which syntaxes?) I would also write examples of the code the compiler should generate when a single-quoted literal is used for a type that only conforms to the new protocols. (Just Swift expressions—the equivalent of saying ""a" as Unicode.Scalar lowers to Unicode.Scalar(unicodeScalarLiteral: 97)"). And I would specify which types will gain conformances to which protocols.

unless i am misunderstanding the proposal in its newest iteration, @_marker protocols cannot declare requirements, so user-defined types cannot implement ExpressibleBySingleQuotedLiteral alone; the conformances for Unicode.Scalar, Character, UInt8, etc would have to be baked into the compiler, or rely on ExpressibleByUnicodeScalarLiteral.

from what i recall during the first review, one of the more widespread criticisms of the original proposal was this:

One concern raised during the review was that because ExpressibleByStringLiteral refines ExpressibleByExtendedGraphemeClusterLiteral, then type context will allow expressions like 'x' + 'y' == "xy".

which does not coexist happily with 'x' + 'y' == 241.
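to make the clash concrete, here is a minimal sketch in today's swift. it assumes `UInt8(ascii:)` as a stand-in for the proposed single-quoted syntax, which does not compile yet; the variable names are only for illustration.

```swift
// today's spelling of the ASCII code points; single-quoted literals are not available yet
let x = UInt8(ascii: "x")       // 120
let y = UInt8(ascii: "y")       // 121

// integer reading: `+` adds the two code points
let integerSum = x + y          // 241, which still fits in a UInt8

// string reading: `+` concatenates
let concatenated = "x" + "y"    // "xy"
```

the same expression shape means two unrelated things depending on which type context wins, which is the ambiguity the review complained about.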

with that in mind, could we simply create a new, unrelated hierarchy for ExpressibleByCharacterLiteral? (which is a serendipitously unclaimed name in the standard library.)

@_marker
protocol _ExpressibleByBuiltinCharacterLiteral
{
}

extension Unicode.Scalar:_ExpressibleByBuiltinCharacterLiteral {}
extension Character:_ExpressibleByBuiltinCharacterLiteral {}

protocol ExpressibleByASCIILiteral
{
    init(asciiLiteral:UInt8)
}

protocol ExpressibleByCharacterLiteral:ExpressibleByASCIILiteral
{
    associatedtype CharacterLiteralType:_ExpressibleByBuiltinCharacterLiteral
    init(characterLiteral:CharacterLiteralType)
}
extension ExpressibleByCharacterLiteral
    where CharacterLiteralType == Unicode.Scalar
{
    init(asciiLiteral:UInt8)
    {
        self.init(characterLiteral: .init(asciiLiteral))
    }
}
extension ExpressibleByCharacterLiteral
    where CharacterLiteralType == Character
{
    init(asciiLiteral:UInt8)
    {
        self.init(characterLiteral: .init(.init(asciiLiteral)))
    }
}

extension UInt8:ExpressibleByASCIILiteral
{
    init(asciiLiteral:UInt8) { self = asciiLiteral }
}
extension Unicode.Scalar:ExpressibleByCharacterLiteral
{
    init(characterLiteral:Self) { self = characterLiteral }
}
extension Character:ExpressibleByCharacterLiteral
{
    init(characterLiteral:Self) { self = characterLiteral }
}

the key thing to note here is that String does not conform to ExpressibleByCharacterLiteral. so we would not have the situation where 'x' + 'y' == "xy" can occur.
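as a rough illustration of what the compiler might emit for a single-quoted literal under this design, here is a hedged sketch. `ExpressibleByASCIILiteral` is redeclared so the snippet compiles on its own, `ASCIIByte` is a made-up user type, and the lowering shown in the comments is my assumption, not something the proposal specifies.

```swift
// hypothetical protocol from the sketch above, redeclared so this stands alone
protocol ExpressibleByASCIILiteral {
    init(asciiLiteral: UInt8)
}

extension UInt8: ExpressibleByASCIILiteral {
    init(asciiLiteral: UInt8) { self = asciiLiteral }
}

// a made-up user-defined type that opts in to ASCII literals only
struct ASCIIByte: ExpressibleByASCIILiteral, Equatable {
    var value: UInt8
    init(asciiLiteral: UInt8) { self.value = asciiLiteral }
}

// assumed lowering of `'a' as UInt8`
let lowercaseA = UInt8(asciiLiteral: 97)

// assumed lowering of `'b' as ASCIIByte`
let custom = ASCIIByte(asciiLiteral: 98)

// String has no conformance here, so `'a' as String` would have nothing to lower to
```

a user type only has to supply `init(asciiLiteral:)`; nothing in this sketch needs compiler-known conformances beyond the literal-to-`UInt8` step.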

ExpressibleByExtendedGraphemeClusterLiteral and ExpressibleByUnicodeScalarLiteral could then continue to exist unchanged with the double-quoted syntax, and the language could deprecate them at whatever pace people are comfortable with, which may very well be “never”.

behavioral changes i can foresee:

Basic type identities

('€')                   → ('€' as Character)

// compilation error
('€' as String)         → Never 

("1" + "1")             → ("11" as String)

// compilation error, because `+ (lhs:String, rhs:Character)` does not exist
("1" + '€')             → Never 

// compilation error, because `+ (lhs:Character, rhs:Character)` does not exist
('1' + '1' as String)   → Never

// compilation error, because `UInt8` is not implicitly convertible to `Int`
('1' + '1' as Int)      → Never

Initializers of integers

Int.init("0123")        → (123 as Int?)
// compilation error, because `Int.init(_:Character)` does not exist
// compilation error, because `Int.init(_:Unicode.Scalar)` does not exist
// compilation error, because `Int.init(_:UInt8)` exists but `'€'` is not ASCII
Int.init('€')           → Never

Int.init('3')           → Int.init(51 as UInt8) → (51 as Int)
(['a', 'b'] as [Int8])  → ([97, 98] as [Int8])
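the integer results above can be checked against today's swift. this is a sketch that assumes `UInt8(ascii:)` stands in for the single-quoted literal; the `Int8` array is built by explicit conversion, since the proposed literal syntax does not exist yet.

```swift
// parsing decimal digits from a string: leading zero is accepted
let parsed = Int("0123")            // Optional(123)

// the ASCII code point of "3", spelled the way current Swift allows
let three = UInt8(ascii: "3")       // 51

// integer-to-integer conversion, not digit parsing
let widened = Int(three)            // 51

// what `['a', 'b'] as [Int8]` would be expected to contain
let bytes: [Int8] = (["a", "b"] as [Unicode.Scalar]).map { Int8(UInt8(ascii: $0)) }
```

the point worth noticing is that `Int.init("3")` and `Int.init('3')` would give 3 and 51 respectively, because one parses and the other converts.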

More arithmetic

('a' + 1)           → (98 as UInt8)
('b' - 'a' + 10)    → (11 as UInt8)
// runtime error, from integer overflow
('a' * 'b')         → Never
("123".firstIndex(of: '2')) → (String.Index.init(_rawBits: 65799) as String.Index?)
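the arithmetic cases can be reproduced the same way. this sketch again assumes `UInt8(ascii:)` in place of the single-quoted syntax, and uses `multipliedReportingOverflow(by:)` to observe the overflow without trapping.

```swift
let a = UInt8(ascii: "a")       // 97
let b = UInt8(ascii: "b")       // 98

let next = a + 1                // 98
let offset = b - a + 10         // 11

// `a * b` would trap at runtime: 97 * 98 = 9506, which exceeds UInt8.max (255).
// the reporting variant exposes the overflow flag instead of trapping.
let product = a.multipliedReportingOverflow(by: b)

// finding a character in a string; the raw bits of the index are an
// implementation detail, so only the character offset is inspected here
let digits = "123"
let index = digits.firstIndex(of: "2")
let distance = index.map { digits.distance(from: digits.startIndex, to: $0) }
```

`product.overflow` is true and `distance` is `Optional(1)`, matching the table above.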