Okay. So in your opinion, having character literals would not be useful without having integer conformances to the character-literal protocols?
I’m not sure I think they’re needed at all, but I don’t consider the Collection/Element problem enough motivation on its own to make a change like this. (OptionSets have the same problem.) So a proposal that just changes Character and UnicodeScalar is uninteresting to me; I basically never use those types to begin with*, and so a syntax to construct only them makes the language a little more complicated for a use case I’ll never see.
* and generally one shouldn’t, because even grapheme clusters usually can’t be manipulated independently. Consider uppercasing “ß”, which at least historically has produced “SS”.
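A small illustration of that footnote, using only today's double-quoted literals. The standard library's `Character.uppercased()` returns a `String` rather than a `Character`, because the result can be longer than one character. The variable names here are just for illustration.

```swift
// U+00DF LATIN SMALL LETTER SHARP S, written as an escape so the
// example does not depend on how the source file is encoded.
let sharpS: Character = "\u{DF}"

// uppercased() returns a String, since one Character may map to several.
let upper: String = sharpS.uppercased()

print(upper)        // SS
print(upper.count)  // 2
```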
My take:
- As expressed by @John_McCall, “enabling the syntax, have it require one of the two existing protocols (extended grapheme cluster / unicode scalar), and make it use `Character` as its default literal type”.
- Analogous to `let d: Double = 2`, the compiler might accept `let i: UInt8 = 'a'` (or `let myArray: [UInt8] = ['a','b']`), figuring out the right integer value, and giving sense to `digit + 10 - 'a'`. (Any opinion on this?)
- Please no further magic with types, keep it simple. Once something is a Character or UnicodeScalar, do not subtract it from an integer.
- Is then anything missing to make real use cases easier?
- Please do not confound Unicode scalars in a limited range expressible by UInt8 or UInt16 with UTF-8 encoding or UTF-16 encoding. Note that you need 21 bits to express any Unicode code point as the corresponding number (which makes it expressible by UInt32).
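To make the last two points concrete, here is a minimal sketch in today's Swift (double-quoted literals, no single-quote syntax). It shows that a scalar's numeric value and its UTF-8 or UTF-16 encoding are different things, and how the `digit + 10 - 'a'` arithmetic is currently spelled with `UInt8(ascii:)`. All variable names are made up for illustration.

```swift
// U+00E9 (é) fits in 8 bits as a number...
let eAcute: Unicode.Scalar = "\u{E9}"
let asNumber: UInt32 = eAcute.value                  // 233
// ...but its UTF-8 encoding is two code units, not the single byte 0xE9.
let asUTF8 = Array(String(Character(eAcute)).utf8)   // [0xC3, 0xA9]
let asUTF16 = Array(String(Character(eAcute)).utf16) // [0x00E9]

// U+1F600 needs more than 16 bits as a number, and UTF-16 encodes it
// as a surrogate pair rather than as one code unit.
let grin: Unicode.Scalar = "\u{1F600}"
let grinUTF16 = Array(String(Character(grin)).utf16) // [0xD83D, 0xDE00]

// Today's spelling of `digit + 10 - 'a'` for an ASCII hex digit.
let digit = UInt8(ascii: "c")
let hexValue = digit + 10 - UInt8(ascii: "a")        // 12
```

With an integer conformance as discussed above, the last line would presumably read `digit + 10 - 'a'`; that spelling is part of the pitch and does not compile today.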
i remember one of the things that sank the 2018 proposal was the `ExpressibleByExtendedGraphemeClusterLiteral` → `ExpressibleByStringLiteral` implication.
so this is why i recreated separate `ExpressibleByCodepointLiteral` and `ExpressibleByCharacterLiteral` protocols in the current proposal. but i accept that people do not like the idea of introducing two new protocols that do very similar things to two existing protocols we already have today. so in light of that, i would like to gauge everyone’s feelings about a potential alternative because we do have a feature in the compiler today that we did not have four years ago: marker protocols.
because the whole reason `ExpressibleByCharacterLiteral` exists in the proposal is so that `ExpressibleByStringLiteral` won’t inherit from it, but if you take away that reason it doesn’t really make sense to have an `ExpressibleByCharacterLiteral` that is just a duplicate of `ExpressibleByExtendedGraphemeClusterLiteral`. i think a better way to approach this is to instead say which types we are going to opt in to the single quoted syntax, and that we will not be enabling this syntax for `String` and `StaticString`, the same way we added `Sendable` but said some of the types like `UnsafePointer` are not going to be `Sendable` by default.
so if we add a `@_marker` protocol `ExpressibleBySingleQuotedLiteral` that types like `Character` and `Unicode.Scalar` conform to but types like `String` and `StaticString` don’t conform to, then we can continue to use the `ExpressibleByExtendedGraphemeClusterLiteral` and `ExpressibleByUnicodeScalarLiteral` protocols.
that way single quoted `Character` literals won’t be range-limited and can contain extended grapheme clusters.
'🇨🇦'.property
'🇺🇸'.function()
and the new `ExpressibleByASCIILiteral`/`ExpressibleByBMPLiteral` protocols would be orthogonal to `ExpressibleBySingleQuotedLiteral` and types would have to conform to both.
because it is a marker protocol it isn’t in the ABI so it would backdeploy.
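As a sketch of why a range limit would not suit `Character`: a flag such as 🇨🇦 is one extended grapheme cluster made of two Unicode scalars, so it cannot be described by a single scalar value. The example uses today's double-quoted literal and escapes; the names are illustrative only.

```swift
// REGIONAL INDICATOR SYMBOL LETTER C + LETTER A form one grapheme cluster.
let flag: Character = "\u{1F1E8}\u{1F1E6}"   // 🇨🇦

// One Character from the user's point of view...
let characterCount = String(flag).count                  // 1
// ...but two scalars underneath, each outside the BMP.
let scalarValues = flag.unicodeScalars.map { $0.value }  // [0x1F1E8, 0x1F1E6]

print(characterCount, scalarValues.count)  // 1 2
```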
I agree, I was just thinking this yesterday. As this is coming up again and again, I've pushed a commit that splits the new _ExpressibleByASCIILiteral and _ExpressibleBySingleQuotedLiteral marker protocols off, rather than grafting them onto the existing hierarchy. You can then manually conform Character and Unicode.Scalar to these protocols directly.
@_marker _ExpressibleByASCIILiteral
↳ @_marker _ExpressibleBySingleQuotedLiteral
ExpressibleByUnicodeScalarLiteral
↳ ExpressibleByExtendedGraphemeClusterLiteral
↳ ExpressibleByStringLiteral
This doesn't really change the situation that I wouldn't want to see single quoted character literals accepted without the integer conversions and it seems the language working group has already adjudicated on that.
I'd find that a procedural anomaly if it weren't for the fact that it seems to be a reflection of the view of the broader Swift community.
thank you john! but it isn’t quite what i was suggesting, rather i was envisioning that `ExpressibleBySingleQuotedLiteral` would be a syntactical marker protocol, and it would be orthogonal to the literal expression domain protocols, which have runtime impact and cannot be `@_marker`. so, it would look like:
// new
@_marker ExpressibleBySingleQuotedLiteral
↳ ExpressibleByASCIILiteral // only for ASCII-restricted domains
↳ ExpressibleByBMPLiteral // only for BMP-restricted domains
// existing
ExpressibleByUnicodeScalarLiteral
↳ ExpressibleByExtendedGraphemeClusterLiteral
↳ ExpressibleByStringLiteral
extension Unicode.Scalar:ExpressibleBySingleQuotedLiteral
{
}
extension Character:ExpressibleBySingleQuotedLiteral
{
}
thoughts?
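For readers unfamiliar with the existing literal protocols, here is a rough sketch of what "runtime impact" means for a domain-restricted type, using only `ExpressibleByUnicodeScalarLiteral` as it exists today. `ASCIIByte` is a hypothetical type invented for this example, and the domain check in it runs at run time; the sketch says nothing about how the pitched `ExpressibleByASCIILiteral` would diagnose out-of-range literals.

```swift
// Hypothetical ASCII-restricted type built on an existing protocol.
struct ASCIIByte: ExpressibleByUnicodeScalarLiteral, Equatable {
    let value: UInt8

    // The conformance carries an initializer that actually runs,
    // which is why it cannot be a pure marker.
    init(unicodeScalarLiteral scalar: Unicode.Scalar) {
        precondition(scalar.isASCII, "literal is outside the ASCII range")
        self.value = UInt8(scalar.value)
    }
}

let a: ASCIIByte = "a"   // today's spelling uses double quotes
print(a.value)           // 97
```

A syntactic marker, by contrast, would only decide whether `'a'` is an accepted spelling for the type, without adding any requirements of its own.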
so, i was thinking about the “we don’t want users writing `'x' + 'y'`” problem, and i realized this problem isn’t actually exclusive to UTF-8/UTF-16, we have a lot of types in the ecosystem today that suffer from a similar problem.
at the risk of going off topic, i want to talk about `FilePath`, because i think `FilePath` is a good case study for where we have a similar problem that does not have to do with code units.
because we usually want to think of `FilePath` as a collection of path components, and this kind of abstraction probably wants to support concatenation with `+` like:
let directory:FilePath = "Sources"
let fileID:FilePath = "Foo.swift"
let file:FilePath = directory + fileID // returns "Sources/Foo.swift"
and `FilePath` is also stringlike so we probably want to make it `ExpressibleByStringLiteral` so you could do:
self.load(textures: ["albedo.png", "specular.png", "normals.png"])
but then users would be able to write nonsensical things like
self.parse(sourceFile: "Sources" + "Foo.swift")
and we don’t want this because what is `sourceFile`? is it `"SourcesFoo.swift"` or is it `"Sources/Foo.swift"`?
and what we really need is to be able to add a little bit of friction to the literal type inference so that you would always have to write:
self.load(textures: ["albedo.png", "specular.png"] as [FilePath])
self.parse(sourceFile: ("Sources" as FilePath) + ("Foo.swift" as FilePath))
and i think this is actually a more general feature that we need, and maybe it could look like an attribute on an `ExpressibleBy` conformance like:
extension FilePath:@noninferred ExpressibleByStringLiteral
{
}
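The ambiguity described above can be reproduced today with a toy type. This is only a sketch: `DemoPath` and `sourceFile(_:)` are hypothetical stand-ins, not the real `FilePath` API, and the `@noninferred` attribute does not exist, so the sketch shows the current behavior rather than the suggested fix.

```swift
// Hypothetical path-like type: string-literal expressible, with `+`
// meaning "join as path components".
struct DemoPath: ExpressibleByStringLiteral, Equatable {
    var text: String
    init(text: String) { self.text = text }
    init(stringLiteral value: String) { self.text = value }

    static func + (lhs: DemoPath, rhs: DemoPath) -> DemoPath {
        DemoPath(text: "\(lhs.text)/\(rhs.text)")
    }
}

// Hypothetical consumer that expects a path.
func sourceFile(_ path: DemoPath) -> String { path.text }

// Same tokens, different meaning depending on the inferred literal type.
let viaPath = sourceFile("Sources" + "Foo.swift")  // "Sources/Foo.swift"
let viaString = "Sources" + "Foo.swift"            // "SourcesFoo.swift"

// The explicit spelling that the extra friction would require.
let explicit = ("Sources" as DemoPath) + ("Foo.swift" as DemoPath)
print(viaPath, viaString, explicit.text)
```

The first call type-checks because the parameter type pushes both literals to `DemoPath`; nothing at the call site tells the reader which `+` was chosen.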
There was also a prior review here. The current draft proposal has an extended reverie explaining why the Core Team was wrong, and of course you’re entitled to think that and argue it in your proposal, but I don’t think you can be too surprised that the Language Workgroup still believes now what many of its members fairly clearly believed then.
although i agree with @johnno1962’s viewpoint technically, i think it is obvious that bare conformances to `ExpressibleByASCIILiteral` for `UInt8` are controversial and detracting from the proposal.
so regardless of whether one would describe the proposal as a “reverie” (my oh my everyone's delusional except for me!) i think the only productive way forward is to limit the proposed changes to areas where there is broad agreement, namely:
@_marker ExpressibleBySingleQuotedLiteral
↳ ExpressibleByASCIILiteral // only for ASCII-restricted domains
↳ ExpressibleByBMPLiteral // only for BMP-restricted domains
and conformances to `ExpressibleBySingleQuotedLiteral` for `Character` and `Unicode.Scalar` that allow expressing the full range of those types with single quoted literals.
i have put a lot of thought into formulating this design in a manner that would not block off future directions for UTF string processing, and in my view at least, continuing to try and push through an omnibus bill does not make sense.